-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of Jerry Floren
Sent: Friday, 22 January 2010 9:53 a.m.
To: r-help at r-project.org
Subject: Re: [R] Help with subset
Thank you, Peter. I am really new to this. The spreadsheet I
am working with has 12,379 rows: the first row holds the
variable names, and the remaining 12,378 rows are data. There
are seven columns, and the 7th column ("Results") is the only
one with numerical data.
I need to match up the variable Results with the variable
"Anlysis_Soil", which is the type of test performed by the
labs on one of 20 different soil samples. Here are some
examples of the Anlysis_Soil variable:
Anlysis_Soil
Bases-Aluminum KCL Extr-2008-116
Bases-Aluminum KCL Extr-2008-116
Bases-Aluminum KCL Extr-2008-117
Bases-Aluminum KCL Extr-2008-118
Bases-Aluminum KCL Extr-2008-118
Bases-Aluminum KCL Extr-2008-119
Bases-Aluminum KCL Extr-2008-120
Bases-Aluminum KCL Extr-2008-120
Bases-Aluminum KCL Extr-2009-101
Actually, I am not interested in any of the above, because
each has too few results (fewer than 9).
I think I need to first identify the unique Anlysis_Soil from
the entire list, and I thought using "list" might work:
anlyses <- list(Anlysis_Soil)
str(anlyses)
List of 1
 $ : Factor w/ 1695 levels "Bases-Aluminum KCL Extr-2008-116",..: 1 1 2 3 3 4 5 5 6 6 ...
It does correctly identify that there are 1695 unique values
of "Anlysis_Soil". However, "anlyses" still contains all
12,378 "Anlysis_Soil" entries. For example:
print(anlyses)
...
...
...
[12374] Soil pH & EC-Soil EC (1to2)-2009-115
[12375] Soil pH & EC-Soil EC (1to2)-2009-115
[12376] Soil pH & EC-Soil EC (1to2)-2009-115
[12377] Soil pH & EC-Soil EC (1to2)-2009-115
[12378] Soil pH & EC-Soil EC (1to2)-2009-115
1695 Levels: Bases-Aluminum KCL Extr-2008-116 ...
And once again it correctly shows that there are 1,695 unique
"Anlysis_Soil" values.
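A minimal sketch of why this happens, using a tiny made-up data frame (the name `soil` and its values are hypothetical stand-ins for the real spreadsheet): list() only wraps the whole column in a one-element list, so every row is still there, whereas levels() (or unique()) would return just the distinct values.

```r
# Toy stand-in for the real data; `soil` and its values are made up for illustration.
soil <- data.frame(
  Anlysis_Soil = factor(c("Test-A-2008-116", "Test-A-2008-116",
                          "Test-B-2009-115", "Test-B-2009-115",
                          "Test-B-2009-115")),
  Results = c(1.2, 1.4, 7.1, 7.3, 6.9)
)

wrapped  <- list(soil$Anlysis_Soil)    # a list of length 1 holding the whole column
distinct <- levels(soil$Anlysis_Soil)  # character vector of the distinct values only

length(wrapped)             # the list itself has a single element
length(wrapped[[1]])        # that element still has one entry per row
length(distinct)            # number of distinct values
nlevels(soil$Anlysis_Soil)  # same count, taken directly from the factor
```

Note that levels() also reports any levels that no longer occur in the data; unique(soil$Anlysis_Soil) lists only the values actually present.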
Once the unique Anlysis_Soil values are identified, I need
to determine the ones with more than 8 Results, and I see how
that could be done with your code.
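One possible way to do this counting step, sketched with made-up data (the names `soil`, `counts`, `keep`, and `soil_sub` are hypothetical, and this is not necessarily the approach in Peter's code): table() counts the rows for each Anlysis_Soil value, and the names with a count above 8 can then be used to subset the data.

```r
# Toy data: one analysis with 3 results and one with 10 (values are made up).
soil <- data.frame(
  Anlysis_Soil = factor(rep(c("Test-A-2008-116", "Test-B-2009-115"),
                            times = c(3, 10))),
  Results = seq_len(13)
)

counts <- table(soil$Anlysis_Soil)   # number of rows per Anlysis_Soil value
keep   <- names(counts)[counts > 8]  # values with at least 9 results

# Keep only the rows for those values and drop the now-unused factor levels.
soil_sub <- droplevels(soil[soil$Anlysis_Soil %in% keep, ])

counts
keep
nrow(soil_sub)
```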
I am not clear what you mean by "for (myV in myVars)". Is
myV the name of one of the unique values that has at least
9 Results? Is myVars the entire column of "Anlysis_Soil"?
I am not sure if this is any clearer.
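For reference, here is how a loop of that shape behaves in general. Peter's code is not reproduced in this message, so the following is only a guess at the pattern: it assumes myVars is a character vector of the Anlysis_Soil values that have at least 9 Results (not the entire column), and that myV takes each of those values in turn. The data and the summary printed are made up for illustration.

```r
# Toy data: three analyses with 3, 10, and 9 results (values are made up).
soil <- data.frame(
  Anlysis_Soil = rep(c("Test-A-2008-116", "Test-B-2009-115", "Test-C-2009-101"),
                     times = c(3, 10, 9)),
  Results = c(1:3, 11:20, 31:39)
)

counts <- table(soil$Anlysis_Soil)
myVars <- names(counts)[counts > 8]  # assumed meaning: values with at least 9 results

# On each pass, myV holds one element of myVars (a single character string).
for (myV in myVars) {
  vals <- soil$Results[soil$Anlysis_Soil == myV]  # Results for this analysis only
  cat(myV, ": n =", length(vals), " median =", median(vals), "\n")
}
```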
Thanks,
Jerry Floren
Minnesota Department of Agriculture