Message-ID: <CAAxdm-7kn39DLsv1o984Mb5SnHPX6+JkOfJzyuXPK46WG2KVrw@mail.gmail.com>
Date: 2012-07-03T21:05:23Z
From: jim holtman
Subject: Data manipulation with aggregate
In-Reply-To: <1341331470144-4635298.post@n4.nabble.com>

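Try this. It splits the data frame by Name, summarises each piece, and
binds the one-row results back together: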

> myData = data.frame(Name = c('a', 'a', 'b', 'b'), length = c(1, 2, 3, 4),
+                     type = c('x', 'x', 'y', 'z'))
>
> # one data frame per Name; each is reduced to a single summary row
> result <- do.call(rbind, lapply(split(myData, myData$Name), function(.name) {
+     data.frame(Name = .name$Name[1L],
+                length = mean(.name$length),
+                # keep 'type' only when every row in the group agrees
+                type = if (all(.name$type[1L] == .name$type)) .name$type[1L] else NA)
+ }))
> result
  Name length type
a    a    1.5    x
b    b    3.5 <NA>
>
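
If you would rather stay with aggregate(), something along these lines
may also work. It is only a sketch: unique_or_na is a made-up helper name,
not a base R function, and stringsAsFactors = FALSE is assumed so that
'type' is handled as plain character data.

```r
myData <- data.frame(Name = c("a", "a", "b", "b"),
                     length = c(1, 2, 3, 4),
                     type = c("x", "x", "y", "z"),
                     stringsAsFactors = FALSE)

# Hypothetical helper: return the single value shared by the whole group,
# or NA when the group contains more than one distinct value.
unique_or_na <- function(x) {
  u <- unique(x)
  if (length(u) == 1L) u else NA_character_
}

# Aggregate each column with the function that suits it, then join on Name.
means <- aggregate(length ~ Name, data = myData, FUN = mean)
types <- aggregate(type ~ Name, data = myData, FUN = unique_or_na)
result2 <- merge(means, types, by = "Name")
result2
```

The idea is that each column gets its own summary function, and merge()
puts the two summaries back into one row per Name.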



On Tue, Jul 3, 2012 at 12:04 PM, Filoche <pmassicotte at hotmail.com> wrote:
> Hi everyone.
>
> I have these data :
>
> myData = data.frame(Name = c('a', 'a', 'b', 'b'), length = c(1,2,3,4), type
> = c('x','x','y','z'))
>
> which gives me:
>
>   Name length type
> 1    a      1    x
> 2    a      2    x
> 3    b      3    y
> 4    b      4    z
>
> I would like to aggregate this DF by mean, using 'Name' as the grouping
> factor. However, one field ('type') is a string. I would like to keep its
> value when it is unique within a group (i.e. when all the 'type' values in
> that group are the same) and replace it with NA when the group has several.
>
> In fact, I would like to obtain this:
>
>   Name length type
> 1    a    1.5    x
> 2    b    3.5   NA
>
> For instance, I was using this command:
>
> aggregate(list(myData$length, myData$type), list(myData$Name), FUN = mean)
>
> But it can't deal with string data.
>
> I hope I have been clear enough.
>
> With regards,
> Phil
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Data-manipulation-with-aggregate-tp4635298.html
> Sent from the R help mailing list archive at Nabble.com.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.