That seems like sage advice :)
Thanks
Stefan
On 29 July 2016 at 22:06, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
Having experienced some frustration myself when I first started with R many years ago, I can relate to your apparent frustration. However, if you would like to succeed in using R, I strongly recommend learning R and not trying to write Haskell or Erlang or C or Fortran or any other language when writing in R. I am sure there are many things R could do better, and once you understand how R actually works you might even be in a position to contribute some improvements. But thinking in those other languages with an R interpreter in front of you is going to just make you more frustrated.

For one thing, everything in R is a vector... even lists. Appending to a list is not O(1) as it would be for a linked list. Thus it is preferred to find algorithms that pre-allocate memory for results. Map (lapply) is 1:1 to encourage that. Reduce is N:1 because it is simpler that way. Use Map to make a grouping vector that you can use to select which elements you want to process, and then map over that subset of your input data or aggregate over the whole thing.

Also, names are attributes of the list vector... one name per element. Not all list operations maintain that attribute, so you often have to explicitly copy names from source to destination. Oh, and "source" is a common base R function... and so it is generally advised to not re-use common names in the global environment.
--
Sent from my phone. Please excuse my brevity.

On July 29, 2016 8:43:16 AM PDT, Stefan Kruger <stefan.kruger at gmail.com> wrote:
I still don't understand why you want Reduce to do lapply's job. Reduce maps many to one and lapply maps many to many.

Say you want to map a function over a subset of a vector or list? With the generalised version of Reduce you map many-to-one, but the one can be a 'complex' structure. lapply() and friends not only map many-to-many, but X-to-X - the resulting list will be the same length as the source. This frequently gets used in Elixir, Erlang, Haskell etc as a means of processing a pipeline or stream - start with a vector, select a subset based on some predicate, turn this subset into an entirely different object/list/
In iterative-fashion pseudo code:
source = list(c(1,2,3,4), c(8,7,6,5,4,3,7), c(5,4))
result = { }
foreach (item in source) {
    if (length(item) > 2) {
        result[generate_some_name()] = length(item)
    }
}
That's an example of what I want to do. It maps many (a subset of the vectors in source) to one (the result named list). It's a map-filter - but even more general than your typical map-filter in that you can change the data structure - e.g. map a function over a vector, use a subset of the results, and turn those into a list or S3 object.
Stefan
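The pseudo code above can be sketched in R in two ways: the select-then-map style suggested at the top of this thread, and a Reduce() whose accumulator is a named list. This is only an illustration under stated assumptions: the variable name `input` and the generated names "item1", "item2" are hypothetical stand-ins for `source` and generate_some_name().

```r
# Hypothetical input, called `input` rather than `source` so that it does not
# mask base::source().
input <- list(c(1, 2, 3, 4), c(8, 7, 6, 5, 4, 3, 7), c(5, 4))

# Select-then-map: build a logical selection vector, subset, then map.
keep <- lengths(input) > 2
result <- setNames(lapply(input[keep], length),
                   paste0("item", which(keep)))  # made-up names

# The same result with Reduce() and a list accumulator, iterating over indices.
result_reduce <- Reduce(function(acc, i) {
  if (length(input[[i]]) > 2) {
    acc[[paste0("item", i)]] <- length(input[[i]])
  }
  acc  # always return the accumulator, even when the item is skipped
}, seq_along(input), list())

str(result)
```

The first form never grows the result element by element; the second does, which is fine for short inputs but is the pattern the advice about pre-allocation warns against for long ones.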
On 29 July 2016 at 15:54, William Dunlap <wdunlap at tibco.com> wrote:
Reduce (like lapply) apparently uses the [[ operator to extract components from the list given to it. X[[i]] does not attach names(X)[i] to its output (where would it put it?). Hence your se

To help understand what these functions are doing, try putting print statements in your test functions:
data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
r <- Reduce(function(acc, item) { cat("acc="); str(acc); cat("item="); str(item); length(item) }, data, init=list())
acc= list()
item= num [1:2] 1 1
acc= int 2
item= num 3
acc= int 1
item= num [1:2] 2 2
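To make the point about [[ concrete, here is a minimal sketch (not taken from the thread) that compares single and double brackets on the same list, and then reduces over names(data) so that the function still has each name available. The argument name `nm` is just an illustrative choice.

```r
data <- list(one = c(1, 1), three = c(3), two = c(2, 2))

names(data["one"])    # single bracket: a one-element list that keeps its name
names(data[["one"]])  # double bracket: the bare vector, so the name is gone

# Reduce over the names and look each element up inside the function.
r <- Reduce(function(acc, nm) {
  acc[[nm]] <- length(data[[nm]])
  acc
}, names(data), list())

str(r)
```

Iterating over names (or over seq_along(data)) rather than over the list itself is one way to carry the per-element name into the reducing function.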
data2 <- list(one = c(oneA=1, onB=1), three = c(threeA=3), two = c(twoA=2, twoB=2))
r <- Reduce(function(acc, item) { cat("acc="); str(acc); cat("item="); str(item); length(item) }, data2, init=list())
acc= list()
item= Named num [1:2] 1 1
 - attr(*, "names")= chr [1:2] "oneA" "onB"
acc= int 2
item= Named num 3
 - attr(*, "names")= chr "threeA"
acc= int 1
item= Named num [1:2] 2 2
 - attr(*, "names")= chr [1:2] "twoA" "twoB"

I still don't understand why you want Reduce to do lapply's job. Reduce maps many to one and lapply maps many to many.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Jul 29, 2016 at 1:37 AM, Stefan Kruger <stefan.kruger at gmail.com> wrote:
Jeremiah - neat - that's one step closer, but one small thing I still don't understand:
data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
r = Reduce(function(acc, item) { append(acc, setNames(length(item), names(item))) }, data, list())
str(r)
List of 3
 $ : int 2
 $ : int 1
 $ : int 2

I wanted the names to remain, but it seems like the "data" parameter loses its names when consumed by the Reduce()? If I print "item" inside the reducing function, it's not got the names. I'm probably missing some central tenet of R here.

As to your comment of this being lapply() implemented by Reduce() - as I understand lapply() (or map() in other functional languages), it's limited to returning a list/vector of the same length as the original. Consider this contrived example:
r = Reduce(function(acc, item) { if (length(item) > 1) {append(acc, setNames(length(item), names(item)))} }, data, list())
str(r)
 int 2
r
[1] 2

I don't think you could achieve that with lapply()?

Thanks
Stefan

On 28 July 2016 at 20:19, jeremiah rounds <roundsjeremiah at gmail.com> wrote:
Basically using Reduce as an lapply in that example, but I think that was caused by how people started talking about things in the first place =) But the point is the accumulator can be anything as far as I can tell.

On Thu, Jul 28, 2016 at 12:14 PM, jeremiah rounds <roundsjeremiah at gmail.com> wrote:
Re: "What I'm trying to work out is how to have the accumulator in Reduce not be the same type as the elements of the vector/list being reduced - ideally it could be an S3 instance, list, vector, or data frame."

Pretty sure that is not true. See code that follows. I would never solve this task in this way though, so no comment on the use of Reduce for what you described. (Note the accumulation of "functions" in a list is just a demo of possibilities). You could accumulate in an environment too and potentially gain a lot of copy efficiency.
lookup = list()
lookup[[as.character(1)]] = function() print("1")
lookup[[as.character(2)]] = function() print("2")
lookup[[as.character(3)]] = function() print("3")
data = list(c(1,2), c(1,4), c(3,3), c(2,30))
r = Reduce(function(acc, item) {
  append(acc, list(lookup[[as.character(min(item))]]))
}, data, list())
r
for(f in r) f()
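The remark about accumulating in an environment could look roughly like the sketch below. It is an assumption-laden illustration rather than the code the author had in mind: the keying scheme (the minimum of each item) is borrowed from the example above, while storing the sum of each item is made up for the demo.

```r
data <- list(c(1, 2), c(1, 4), c(3, 3), c(2, 30))

# The accumulator is an environment, which is modified in place rather than
# copied on each step.
acc_env <- new.env()
invisible(Reduce(function(env, item) {
  key <- as.character(min(item))
  # env[[key]] is NULL the first time a key is seen, so c() starts a new vector.
  env[[key]] <- c(env[[key]], sum(item))
  env
}, data, acc_env))

# Collect the contents into an ordinary named list.
result <- mget(sort(ls(acc_env)), envir = acc_env)
str(result)
```

Because environments have reference semantics, the reducing function only needs to hand the same environment back each time; whether this saves anything noticeable depends on the size of the data.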
On Thu, Jul 28, 2016 at 5:09 AM, Stefan Kruger <stefan.kruger at gmail.com> wrote:
Ulrik - many thanks for your reply.

I'm aware of many simple solutions as the one you suggest, both iterative and functional style - but I'm trying to learn how to bend Reduce() for the purpose of using it in more complex processing tasks. What I'm trying to work out is how to have the accumulator in Reduce not be the same type as the elements of the vector/list being reduced - ideally it could be an S3 instance, list, vector, or data frame.

Here's a more realistic example (in Elixir, sorry).

Given two lists:
1. data: maps an id string to a vector of revision strings
2. dict: maps known id/revision pairs as a string to true (or 1)
find the items in data not already in dict, returned as a named list.
```elixir
data = %{
  "id1" => ["rev1.1", "rev1.2"],
  "id2" => ["rev2.1"],
  "id3" => ["rev3.1", "rev3.2", "rev3.3"]
}
dict = %{
  "id1/rev1.1" => 1,
  "id1/rev1.2" => 1,
  "id3/rev3.1" => 1
}
# Find the items in data not already in dict. Return as a grouped map
Map.keys(data)
|> Enum.flat_map(fn id -> Enum.map(data[id], fn rev -> {id, rev} end) end)
|> Enum.filter(fn {id, rev} -> !Dict.has_key?(dict, "#{id}/#{rev}") end)
|> Enum.reduce(%{}, fn ({k, v}, d) -> Map.update(d, k, [v], &[v|&1]) end)
```

On 28 July 2016 at 12:03, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
Hi Stefan,
in that case, lapply(data, length) should do the trick.
Best wishes,
Ulrik

On Thu, 28 Jul 2016 at 12:57 Stefan Kruger <stefan.kruger at gmail.com> wrote:
David - many thanks for your response.

What I tried to do was to turn
data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
into
result <- list(one = 2, three = 1, two = 2)
that is, creating a new list which has the same names as the first, but where the values are the vector lengths.

I know there are many other (and better) trivial ways of achieving this - my aim is less the task itself, and more figuring out if this can be done using Reduce() in the fashion I showed in the other examples I gave. It's a building block of doing map-filter-reduce type pipelines that I'd like to understand how to do in R.

Fumbling in the dark, I tried:
Reduce(function(acc, item) { setNames(c(acc, length(data[item])), item) }, names(data), accumulate=TRUE)
but setNames sets all the names, not adding one - and acc is still a vector, not a list.

It looks like 'lambda.tools.fold()' and possibly 'purrr.reduce()' aim at doing what I'd like to do - but I've not been able to figure out quite how.

Thanks
Stefan

On 27 July 2016 at 20:35, David Winsemius <dwinsemius at comcast.net> wrote:
On Jul 27, 2016, at 8:20 AM, Stefan Kruger <stefan.kruger at gmail.com> wrote:
Hi - I'm new to R.

In other functional languages I'm familiar with you can often seed a call to reduce() with a custom accumulator. Here's an example in Elixir:
map = %{"one" => [1, 1], "three" => [3], "two" => [2, 2]}
map |> Enum.reduce(%{}, fn ({k,v}, acc) -> Map.update(acc, k, Enum.count(v), nil) end)
# %{"one" => 2, "three" => 1, "two" => 2}

In R-terms that's reducing a list of vectors to become a new list mapping the names to the vector lengths.

Even in JavaScript, you can do similar things:
list = { one: [1, 1], three: [3], two: [2, 2] };
var result = Object.keys(list).reduceRight(function (acc, item) {
  acc[item] = list[item].length;
  return acc;
}, {});
// result == { two: 2, three: 1, one: 2 }

In R, from what I can gather, Reduce() is restricted such that any init value you feed it is required to be of the same type as the elements of the vector you're reducing -- so I can't build up. So whilst I can do, say
Reduce(function(acc, item) { acc + item }, c(1,2,3,4,5), 96)
[1] 111
I can't use Reduce to build up a list, vector or data frame?

What am I missing?

Many thanks for any pointers,
This builds a list:
Reduce(function(acc, item) { c(acc , item) }, c(1,2,3,4,5), 96, accumulate=TRUE)
[[1]]
[1] 96

[[2]]
[1] 96  1

[[3]]
[1] 96  1  2

[[4]]
[1] 96  1  2  3

[[5]]
[1] 96  1  2  3  4

[[6]]
[1] 96  1  2  3  4  5

But you are not saying what you want. The other examples were doing something with names but you provided no names for the R example.

This would return a list of named vectors:
Reduce(function(acc, item) { setNames( c(acc,item), 1:(item+1)) }, c(1,2,3,4,5), 96, accumulate=TRUE)
[[1]]
[1] 96

[[2]]
 1  2
96  1

[[3]]
 1  2  3
96  1  2

[[4]]
 1  2  3  4
96  1  2  3

[[5]]
 1  2  3  4  5
96  1  2  3  4

[[6]]
 1  2  3  4  5  6
96  1  2  3  4  5
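On the question of whether init must have the same type as the elements being reduced: as far as I can tell it need not. The sketch below is a hypothetical illustration (the column names `name` and `n` are made up) that seeds Reduce() with an empty data frame and adds one row per list element.

```r
data <- list(one = c(1, 1), three = c(3), two = c(2, 2))

# Seed: an empty data frame with the columns we intend to fill.
init <- data.frame(name = character(0), n = integer(0),
                   stringsAsFactors = FALSE)

# Each step appends one row describing the current element.
df <- Reduce(function(acc, nm) {
  rbind(acc, data.frame(name = nm, n = length(data[[nm]]),
                        stringsAsFactors = FALSE))
}, names(data), init)

df
```

Growing a data frame row by row copies it on every step, so for anything large it is usually better to build the columns first and construct the data frame once.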
Stefan
--
Stefan Kruger <stefan.kruger at gmail.com>
[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA