how to subset unique factor combinations from a data frame.
Hi

You probably did not notice xtabs, which I mentioned before:

as.data.frame(xtabs(~x+xx))

u <- as.data.frame(table(x, xx))
head(u)
  x xx Freq
1 A  a   18
2 B  a   27
3 C  a   30
4 D  a   30
5 E  a   27
6 F  a   18

v <- as.data.frame(xtabs(~x+xx))
head(v)
  x xx Freq
1 A  a   18
2 B  a   27
3 C  a   30
4 D  a   30
5 E  a   27
6 F  a   18

Regards
Petr

r-help-bounces at r-project.org wrote on 05.01.2011 08:46:21:
Hi Dennis,

It worked! This is what I am looking for. Many thanks.

Rgds,
SNVK

_____

From: Dennis Murphy [mailto:djmuser at gmail.com]
Sent: Tuesday, January 04, 2011 9:07 PM
To: SNV Krishna
Cc: r-help at r-project.org
Subject: Re: [R] how to subset unique factor combinations from a data frame.
Hi:

Did you try something like the following?

summdf <- as.data.frame(with(df, table(Commodity, Attribute, Unit)))

The rows of the table should represent the unique combinations of the three variables.... Here's a simple toy example to illustrate:
x <- sample(LETTERS[1:6], 1000, replace = TRUE)
xx <- sample(letters[1:6], 1000, replace = TRUE)
u <- as.data.frame(table(x, xx))
dim(u)
[1] 36  3
head(u)
  x xx Freq
1 A  a   26
2 B  a   29
3 C  a   25
4 D  a   25
5 E  a   27
6 F  a   29

HTH,
Dennis

On Tue, Jan 4, 2011 at 2:19 AM, SNV Krishna <krishna at primps.com.sg> wrote:
Hi,

Sorry that my example is not clear. I will give an example of what each variable holds. I hope this clearly explains the case.

Names of the dataframe (df) and description

Year :- Year is calendar year, from 1980 to 2010
Country :- is the country name, total no. (levels) of countries is ~ 190
Commodity :- Crude oil, Sugar, Rubber, Coffee .... No. (levels) of commodities is 20
Attribute :- Production, Consumption, Stock, Import, Export... Levels ~ 20
Unit :- this is actually not a factor. It describes the unit of Attribute. Say the unit for Coffee (commodity) - Production (attribute) is 60 kgs, while the unit for Crude oil - Production is 1000 barrels
Value :- value
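
A minimal sketch of the layout described above, together with one direct way to pull the distinct Commodity/Attribute/Unit rows using base R's unique(). The toy data frame and every value in it are invented for illustration; only the column names come from the description.

```r
# Hypothetical toy data: column names follow the description above,
# all values are made up.
df <- data.frame(
  Year      = c(1980, 1981, 1980, 1980, 1980, 1981),
  Country   = c("India", "India", "Brazil", "India", "India", "Brazil"),
  Commodity = c("Coffee", "Coffee", "Coffee", "Crude oil", "Crude oil", "Crude oil"),
  Attribute = c("Production", "Production", "Production",
                "Production", "Import", "Import"),
  Unit      = c("60 kg", "60 kg", "60 kg",
                "1000 barrels", "1000 barrels", "1000 barrels"),
  Value     = c(10, 12, 7, 100, 40, 35),
  stringsAsFactors = TRUE
)

# unique() on a data frame drops repeated rows, so restricting it to the
# three columns of interest leaves one row per observed combination.
combos <- unique(df[, c("Commodity", "Attribute", "Unit")])
rownames(combos) <- NULL   # tidy up the row labels left over from df
print(combos)
```

Because unique() only looks at rows that exist, combinations that never occur in the data do not appear in the result.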
tail(df, n = 10)  # example data

Year  Country         Commodity     Attribute        Unit       Value
1991  United Kingdom  Wheat, Durum  Total Supply     (1000 MT)     70
1991  United Kingdom  Wheat, Durum  TY Exports       (1000 MT)      0
1991  United Kingdom  Wheat, Durum  TY Imp. from U   (1000 MT)      0
1991  United Kingdom  Wheat, Durum  TY Imports       (1000 MT)     60
1991  United Kingdom  Wheat, Durum  Yield            (MT/HA)        5

I hope this is clear. Any suggestions?

Regards,
SNVK

-----Original Message-----
From: Petr PIKAL [mailto:petr.pikal at precheza.cz]
Sent: Tuesday, January 04, 2011 4:06 PM
To: SNV Krishna
Cc: r-help at r-project.org
Subject: Odp: [R] how to subset unique factor combinations from a data frame.

Hi

r-help-bounces at r-project.org wrote on 04.01.2011 05:21:25:
Hi All

I have these questions and request members' expert views on them.

a) I have a dataframe (df) with five factors (identity variables) and a value (measured value). The id variables are Year, Country, Commodity, Attribute, Unit. Value is a value for each combination of these.

I would like to get just the unique combinations of Commodity, Attribute and Unit. I just need the unique factor combinations in a dataframe or a table. I know aggregate and subset but don't know how to use them in this context.

aggregate(Value, list(Commodity, Attribute, Unit), function)
b) Is it possible to include non-aggregate columns with the aggregate function? Say, in the above case,

aggregate(Value ~ Commodity + Attribute, data = df, FUN = count)

The use of count(Value) is just a roundabout way to return the combinations of Commodity & Attribute, and I would like to include the 'Unit' column in the returned data frame.
Hm. Maybe xtabs? But without any example it is only a guess.
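
For question (b), a minimal sketch of two base R options, under the assumption that Unit is determined by Commodity and Attribute, so it can simply join the grouping variables. length() is used here as a stand-in for count(), which base R does not provide. The toy data frame is invented for illustration.

```r
# Hypothetical toy data; all values are made up.
df <- data.frame(
  Commodity = c("Coffee", "Coffee", "Coffee", "Crude oil", "Crude oil", "Crude oil"),
  Attribute = c("Production", "Production", "Production",
                "Production", "Import", "Import"),
  Unit      = c("60 kg", "60 kg", "60 kg",
                "1000 barrels", "1000 barrels", "1000 barrels"),
  Value     = c(10, 12, 7, 100, 40, 35),
  stringsAsFactors = TRUE
)

# Option 1: put Unit on the right-hand side of the formula and count rows.
# aggregate() returns one row per combination that occurs in the data.
counts <- aggregate(Value ~ Commodity + Attribute + Unit, data = df, FUN = length)
names(counts)[names(counts) == "Value"] <- "n"   # the column now holds row counts
print(counts)

# Option 2: xtabs tabulates every combination of factor levels, including
# those that never occur, so the empty cells are dropped afterwards.
tab <- as.data.frame(xtabs(~ Commodity + Attribute + Unit, data = df))
observed <- tab[tab$Freq > 0, ]
print(observed)
```

The same filtering on Freq > 0 applies to as.data.frame(table(...)), which also returns the full cross of levels.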
c) Is it possible to subset based on unique combinations, something like this?

subset(df, unique(Commodity), select = c(Commodity, Attribute, Unit))

I know this is not correct, as it returns an error 'subset needs a logical evaluation'. I am trying various ways to accomplish the task.
Probably the sqldf package has tools for doing it, but I do not use it, so you have to try it yourself.

df[df$Commodity == something, c("Commodity", "Attribute", "Unit")]

can be another way.
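
A minimal sketch of the indexing idea above, plus one way to give subset-style code the logical vector it needs for question (c). It assumes that keeping the first row of each combination is acceptable; duplicated() marks every later repeat. The toy data frame and the name keys are invented for illustration.

```r
# Hypothetical toy data; all values are made up.
df <- data.frame(
  Year      = c(1980, 1981, 1980, 1980, 1980, 1981),
  Commodity = c("Coffee", "Coffee", "Coffee", "Crude oil", "Crude oil", "Crude oil"),
  Attribute = c("Production", "Production", "Production",
                "Production", "Import", "Import"),
  Unit      = c("60 kg", "60 kg", "60 kg",
                "1000 barrels", "1000 barrels", "1000 barrels"),
  Value     = c(10, 12, 7, 100, 40, 35)
)

keys <- c("Commodity", "Attribute", "Unit")

# Logical indexing: the column must be qualified with df$ inside the brackets.
coffee <- df[df$Commodity == "Coffee", keys]

# duplicated() returns a logical vector flagging repeated key combinations,
# so negating it keeps the first row of each combination, with all columns.
first_rows <- df[!duplicated(df[keys]), ]
print(first_rows)

# The same condition is a valid logical argument for subset().
first_keys <- subset(df, !duplicated(df[keys]), select = keys)
```

Whether the first row is the right one to keep depends on what the remaining columns (Year, Value) are meant to show.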
Anyway, your explanation is ambiguous. Let's say you have three rows with the same Commodity. Which row do you want to select?

Regards
Petr
I will be grateful for any ideas and help.

Regards,
SNVK
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.