Re: Book: The New S Language
Hi
r-help-bounces at r-project.org wrote on 12.12.2007 07:45:54:
I'm struggling a bit with R, with understanding functions and what's going on under the hood.
For example, I was given this snippet of code via this mailing list:
DF <- data.frame(A =c("A", "A", "A", "B", "C"), B=c(1,1,2,2,0))
g <- paste(DF$A, DF$B)
s <- split(DF, g)
split appears to be taking the string provided by paste, e.g. "A 1", and then deciding that the first character "A" in this string relates to DF$A, and the second character "1" to DF$B. The split help pages don't really tell me why split is working like this - maybe I'm just looking in the wrong place in the help page.
Maybe you misunderstand what split does:
DF <- data.frame(A =c("A", "A", "A", "B", "C"), B=c(1,1,2,2,0))
a data frame with 2 columns
g <- paste(DF$A, DF$B)
a character vector (which is internally turned into a factor). split divides your data frame DF into groups according to the levels of "g". It does not look into the columns of your data frame at all (and the help page gives no hint that it should behave like that). The first sentence of the help page tells you: split divides the data in the vector x into the groups defined by f. Compare your result with
g.f <- factor(g)
str(g.f)
Factor w/ 4 levels "A 1","A 2","B 2",..: 1 1 2 3 4
g.n <- as.numeric(g.f)
g.n
[1] 1 1 2 3 4
split(DF, g.f)
split(DF, g.n)
And try to explain, using your idea of how split works, this:
DF$A <- rnorm(5)
s <- split(DF, g)
Regards
Petr
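The point of the reply can be checked with a short sketch: split only uses the grouping vector and never inspects the column contents. The names `labels`, `DF2`, `rows_by_g`, `rows_by_labels`, `rows_after` and the two result flags are made up here for illustration and do not appear in the original post.

```r
# Same data and grouping vector as in the thread.
DF <- data.frame(A = c("A", "A", "A", "B", "C"), B = c(1, 1, 2, 2, 0))
g <- paste(DF$A, DF$B)

# Hypothetical labels that say nothing about the columns, but follow the
# same pattern as g: rows 1 and 2 together, rows 3, 4 and 5 on their own.
labels <- c("p", "p", "q", "r", "s")

# Collect the row names that end up in each piece of the split.
rows_by_g      <- lapply(split(DF, g), rownames)
rows_by_labels <- lapply(split(DF, labels), rownames)

# Ignoring the group names, both splits put the same rows together.
same_as_labels <- identical(unname(rows_by_g), unname(rows_by_labels))

# Overwrite column A with random numbers in a copy, then split by the old g.
DF2 <- DF
DF2$A <- rnorm(5)
rows_after <- lapply(split(DF2, g), rownames)

# The groups are unchanged, although "A 1" no longer matches anything in DF2$A.
same_after_overwrite <- identical(rows_by_g, rows_after)

print(same_as_labels)        # TRUE
print(same_after_overwrite)  # TRUE
```

If split really parsed "A 1" back into column values, neither comparison could come out TRUE.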
I've got plenty of books on R, but only one of them gives a bit of info on the use of split - not in enough detail though. Does the book referenced in the help pages ("The New S Language") give more insight into this and other functions?
Will I just get an Aha! moment with R after using it for a while?
Thanks in advance...
Chris