Hi,
Sorry that my example was not clear. Below I describe what each variable
holds; I hope this explains the case.
Columns of the data frame (df) and their descriptions:
Year :- calendar year, from 1980 to 2010
Country :- country name; total no. of levels is ~190
Commodity :- Crude oil, Sugar, Rubber, Coffee, ...; no. of levels is 20
Attribute :- Production, Consumption, Stock, Import, Export, ...; no. of levels is ~20
Unit :- this is actually not a factor. It describes the unit of the Attribute.
For example, the unit for Coffee (Commodity) - Production (Attribute) is 60 kgs,
while the unit for Crude oil - Production is 1000 barrels.
Value :- the measured value
tail(df, n = 10)  # example data
Year  Country         Commodity     Attribute       Unit       Value
1991  United Kingdom  Wheat, Durum  Total Supply    (1000 MT)     70
1991  United Kingdom  Wheat, Durum  TY Exports      (1000 MT)      0
1991  United Kingdom  Wheat, Durum  TY Imp. from U  (1000 MT)      0
1991  United Kingdom  Wheat, Durum  TY Imports      (1000 MT)     60
1991  United Kingdom  Wheat, Durum  Yield           (MT/HA)        5
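For anyone who wants to reproduce the case, the rows printed above could be rebuilt roughly as follows. This is only a sketch: it assumes the id columns are stored as factors and Value as numeric, and it keeps the truncated attribute name exactly as printed.

```r
# Illustrative reconstruction of the five rows shown above.
# Column types are an assumption; the real df may differ.
df <- data.frame(
  Year      = rep(1991L, 5),
  Country   = rep("United Kingdom", 5),
  Commodity = rep("Wheat, Durum", 5),
  Attribute = c("Total Supply", "TY Exports", "TY Imp. from U",
                "TY Imports", "Yield"),
  Unit      = c("(1000 MT)", "(1000 MT)", "(1000 MT)",
                "(1000 MT)", "(MT/HA)"),
  Value     = c(70, 0, 0, 60, 5),
  stringsAsFactors = TRUE   # store the character columns as factors
)

str(df)  # shows the column types and factor levels
```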
I hope this is clear. Any suggestions?
Regards,
SNVK
-----Original Message-----
From: Petr PIKAL [mailto:petr.pikal at precheza.cz]
Sent: Tuesday, January 04, 2011 4:06 PM
To: SNV Krishna
Cc: r-help at r-project.org
Subject: Odp: [R] how to subset unique factor combinations from a data frame.
Hi
r-help-bounces at r-project.org wrote on 04.01.2011 05:21:25:
Hi All
I have these questions and request members' expert views on them.
a) I have a data frame (df) with five factors (identity variables) and a
value (measured value). The id variables are Year, Country, Commodity,
Attribute, Unit. Value is a value for each combination of these.
I would like to get just the unique combinations of Commodity, Attribute
and Unit. I just need the unique factor combinations in a data frame or a
table.
I know aggregate and subset but don't know how to use them in this context.
I know this is not correct as it returns an error 'subset needs a
logical evaluation'. Trying various ways to accomplish the task.
Probably the sqldf package has tools for doing it, but I do not use it, so
you will have to try it yourself.
df[df$Commodity == something, c("Commodity", "Attribute", "Unit")]
could be another way.
Anyway, your explanation is ambiguous. Let's say you have three rows with the
same Commodity. Which row do you want to select?
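If any one row per combination will do, a minimal sketch with base R's unique() follows. The toy data frame and its values are made up for illustration; only the column names come from the description in this thread.

```r
# Toy data (hypothetical values): the same Commodity/Attribute/Unit
# combination repeats across years and countries.
df <- data.frame(
  Year      = c(1990, 1991, 1990, 1991, 1991),
  Country   = c("United Kingdom", "United Kingdom",
                "France", "France", "France"),
  Commodity = rep("Wheat, Durum", 5),
  Attribute = c("Yield", "Yield", "Yield", "TY Imports", "TY Imports"),
  Unit      = c("(MT/HA)", "(MT/HA)", "(MT/HA)", "(1000 MT)", "(1000 MT)"),
  Value     = c(5, 5, 6, 60, 40),
  stringsAsFactors = TRUE
)

# Keep only the three id columns of interest, then drop duplicated rows.
combos <- unique(df[, c("Commodity", "Attribute", "Unit")])
rownames(combos) <- NULL  # renumber rows after dropping duplicates

combos
```

unique() on a data frame keeps the first occurrence of each distinct row, so selecting the three columns before calling it is what collapses the repeats over Year and Country.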
Regards
Petr
I will be grateful for any ideas and help.
Regards,
SNVK