Associative array?
I am using a simple R statement to read in the file:
a <- read.csv("Sample.dat", header=TRUE)
There is a lot of data, but the first few lines look like:
DayOfYear,Quantity,Fraction,Category,SubCategory
1,82,0.0000390392720794458,(Unknown),(Unknown)
2,78,0.0000371349173438631,(Unknown),(Unknown)
. . .
71,2,0.0000009521773677913,WOMEN,Piratesses
72,4,0.0000019043547355827,WOMEN,Piratesses
73,3,0.0000014282660516870,WOMEN,Piratesses
74,14,0.0000066652415745395,WOMEN,Piratesses
75,2,0.0000009521773677913,WOMEN,Piratesses
If I read the data in as above, the command
a[1]
results in the output
[ reached getOption("max.print") -- omitted 16193 rows ]]
Shouldn't this be the first row?
a$Category[1]
results in the output
[1] (Unknown)
4464 Levels: Tags ... WOMEN
But
a$Category[365]
gives me:
[1] 7 Plates (Dessert),Western\n120,5,0.0000023804434194784,7 Plates (Dessert)
4464 Levels: Tags ... WOMEN
There is something fundamental about either vectors or the read.csv command that I am missing here.
Thank you.
Kevin
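Two things seem to be going on in the message above, and the sketch below tries both on a tiny stand-in for Sample.dat. First, a single index on a data frame, a[1], selects the first column (hence the long printout); the first row is a[1, ]. Second, the level printed for a$Category[365] contains an embedded newline, which is what tends to happen when a field holds an unbalanced double quote, for example an inch mark. That cause is only a guess: the inch mark and the values in row 119 are assumptions made up for illustration, not taken from the real file. Passing quote = "" to read.csv turns quote processing off.

```r
# Hypothetical sample: the inch mark (") and the values in row 119 are
# assumptions, chosen to resemble the level printed for a$Category[365].
csv_text <- c(
  'DayOfYear,Quantity,Fraction,Category,SubCategory',
  '1,82,0.0000390392720794458,(Unknown),(Unknown)',
  '119,4,0.0000019043547355827,7" Plates (Dessert),Western',
  '120,5,0.0000023804434194784,7" Plates (Dessert),Western'
)

# quote = "" disables quote processing, so a lone " stays part of the field
a <- read.csv(text = csv_text, header = TRUE, quote = "")

first_column <- a[1]    # single index: selects column 1, still a data frame
first_row    <- a[1, ]  # row index with empty column index: selects row 1

dim(first_column)            # rows x columns of the one-column data frame
dim(first_row)               # rows x columns of the one-row data frame
as.character(a$Category[3])  # the category keeps its inch mark
```

If the real file does contain legitimately quoted fields, quote = "" would be the wrong choice, and the stray quote would have to be fixed in the data instead.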
---- jim holtman <jholtman at gmail.com> wrote:
Please provide commented, minimal, self-contained, reproducible code, or at least a before/after of what your data would look like. Taking a guess at what you are asking, here is one way of doing it:
x <- data.frame(cat=sample(LETTERS[1:3],20,TRUE),a=1:20, b=runif(20))
x

   cat  a          b
1    B  1 0.65472393
2    C  2 0.35319727
3    B  3 0.27026015
4    A  4 0.99268406
5    C  5 0.63349326
6    A  6 0.21320814
7    C  7 0.12937235
8    A  8 0.47811803
9    A  9 0.92407447
10   A 10 0.59876097
11   A 11 0.97617069
12   A 12 0.73179251
13   B 13 0.35672691
14   C 14 0.43147369
15   C 15 0.14821156
16   C 16 0.01307758
17   B 17 0.71556607
18   B 18 0.10318424
19   C 19 0.44628435
20   B 20 0.64010105

# create a list of the indices of the data grouped by 'cat'
split(seq(nrow(x)), x$cat)

$A
[1]  4  6  8  9 10 11 12

$B
[1]  1  3 13 17 18 20

$C
[1]  2  5  7 14 15 16 19

# or do you want the data
split(x, x$cat)

$A
   cat  a         b
4    A  4 0.9926841
6    A  6 0.2132081
8    A  8 0.4781180
9    A  9 0.9240745
10   A 10 0.5987610
11   A 11 0.9761707
12   A 12 0.7317925

$B
   cat  a         b
1    B  1 0.6547239
3    B  3 0.2702601
13   B 13 0.3567269
17   B 17 0.7155661
18   B 18 0.1031842
20   B 20 0.6401010

$C
   cat  a          b
2    C  2 0.35319727
5    C  5 0.63349326
7    C  7 0.12937235
14   C 14 0.43147369
15   C 15 0.14821156
16   C 16 0.01307758
19   C 19 0.44628435

On Sat, Jul 12, 2008 at 3:32 AM, <rkevinburton at charter.net> wrote:
I have searched the archive and could not find what I need, so I will try to ask the question here.

I read a table in (read.table):

a <- read.table(.....)

The table has column names like DayOfYear, Quantity, and Category. The values in the row for Category are strings (characters). I want to get all of the rows grouped by Category. The number of unique category names could be around 50. Say for argument's sake the number of categories is exactly 50. Can I somehow get a vector of length 50 containing the rows corresponding to the category (another vector)? I realize I can access any row a[i]$Category (right?). But I want a vector containing the rows corresponding to each distinct Category name.

Thank you.

Kevin
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
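On the subject line itself: the list returned by split() is named by category, so it can be indexed by name much like an associative array. A minimal sketch, assuming a small data frame built from a few of the sample rows quoted in this thread (the name by_category is made up for illustration):

```r
# Small stand-in built from sample rows quoted earlier in the thread
a <- data.frame(
  DayOfYear = c(1, 2, 71, 72, 73),
  Quantity  = c(82, 78, 2, 4, 3),
  Category  = c("(Unknown)", "(Unknown)", "WOMEN", "WOMEN", "WOMEN")
)

# One list element per distinct Category, each holding that category's rows
by_category <- split(a, a$Category)

names(by_category)          # the keys: one per distinct category
women <- by_category[["WOMEN"]]   # look up the rows for one key
women$Quantity

# Apply a function per key, here the total Quantity in each category
totals <- sapply(by_category, function(d) sum(d$Quantity))
totals
```

With around 50 categories this gives a list of length 50, one data frame per category name.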