Getting subsets of a data frame
I am reading as fast as I can! Just started with R five days ago.

I found the following in the documentation:

"Although the default for 'drop' is 'TRUE', the default behaviour when only one _row_ is left is equivalent to specifying 'drop = FALSE'. To drop from a data frame to a list, 'drop = FALSE' has to (sic) specified explicitly."

I think the exception mentioned in the first sentence is the reason for my confusion. I also think the second sentence is wrong and should have 'TRUE' instead of 'FALSE'.

While it is true that a data frame is a list, it is not a list of numbers, but rather a list of columns, which, if I understand correctly, can be either vectors or matrices. So regardless of the value assigned to 'drop' the returned object is a list. When I asked "why isn't sw[1, ] a list?" I should have asked instead "why isn't sw[1, ] a list of vectors?"

I did some experiments with a data frame a, where the columns are vectors (no matrix columns):
is.data.frame(a)   # just checking
[1] TRUE
a1 <- a[3, ]
is.data.frame(a1)
[1] TRUE   (did not stop being a data frame)
is.list(a1)
[1] TRUE   (but it is a list)
a2 <- a[3, , drop = TRUE]
is.data.frame(a2)
[1] FALSE  (no longer a data frame)
is.list(a2)
[1] TRUE   (but it is a list)
a3 <- a[3, , drop = FALSE]
is.data.frame(a3)
[1] TRUE   (still a data frame)
is.list(a3)
[1] TRUE   (but it is a list)

I also tried:

a2[1]
$dates.num
[1] 477032400

a3[1]
  dates.num
3 477032400   (notice the row name)

attributes(a3[1])
$names
[1] "dates.num"

$class
[1] "data.frame"

$row.names
[1] "3"

attributes(a2[1])
$names
[1] "dates.num"

FS
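The experiment above can be repeated without the original data. The post does not show the data frame a, so the one built below is a hypothetical stand-in: only the column name dates.num and the value in row 3 come from the post, and the second column and the other values are made up for illustration.

```r
# Hypothetical stand-in for the poster's data frame `a`.
a <- data.frame(dates.num = c(476859600, 476946000, 477032400),
                value     = c(1.5, 2.5, 3.5))

a1 <- a[3, ]                # 'drop' left at its default
a2 <- a[3, , drop = TRUE]   # explicitly ask for the single row to be dropped
a3 <- a[3, , drop = FALSE]  # explicitly keep the data frame

is.data.frame(a1)           # a single row is not dropped by default
is.data.frame(a2)           # no longer a data frame
is.list(a2)                 # a plain list, one element per column
is.data.frame(a3)           # still a data frame

names(attributes(a2))       # class and row.names are gone, names remain
rownames(a3)                # the original row label is kept
```

Seen this way, drop = TRUE on a single row appears to strip the class and row.names attributes and leave the underlying list of columns.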
On 4/16/05, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
On Sat, 16 Apr 2005, Prof Brian Ripley wrote:
Perhaps Fernando will also note that is documented in ?"[.data.frame", a slightly more appropriate reference than Bill's. It would be a good idea to read a good account of R's indexing: Bill Venables and I know of a couple you will find in the R FAQ.
BTW,

sw <- swiss
sw[1,,drop=TRUE] *is* a list (not as claimed, but as documented)
sw[1, ] is a data frame
sw[, 1] is a numeric vector.

I should have pointed out that "[.data.frame" is in the See Also of Bill's reference.

BTW to Andy: a list is a vector, and Kurt and I recently have been trying to correct documentation that means `atomic vector' when it says `vector'. (Long ago lists in R were pairlists and not vectors.)
is.vector(list(a=1))
[1] TRUE
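The statements in this message can be checked on the built-in swiss data set. The sketch below also uses is.atomic(), which the message itself does not mention, as one way to see the difference between a vector and an atomic vector; the names row_list and x are chosen here for illustration.

```r
sw <- swiss

row_list <- sw[1, , drop = TRUE]
is.list(row_list)        # a list ...
is.data.frame(row_list)  # ... and not a data frame
is.data.frame(sw[1, ])   # one row with the default 'drop' stays a data frame
is.numeric(sw[, 1])      # one column is a numeric vector

x <- list(a = 1)
is.vector(x)             # a list counts as a vector
is.atomic(x)             # but it is not an atomic vector
is.atomic(sw[, 1])       # a numeric column is atomic
```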
On Sat, 16 Apr 2005, Liaw, Andy wrote:
Because a data frame can hold different data types (even matrices) in different variables, one row of it can not be converted to a vector in general (where all elements need to be of the same type). Andy
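A small illustration of this point, using a made-up two-column data frame (df, row1 and flat are hypothetical names): the row can be held as a list with each element keeping its own type, but forcing it into an atomic vector coerces everything to a common type.

```r
# Hypothetical data frame with columns of different types.
df <- data.frame(n = c(1L, 2L), s = c("a", "b"), stringsAsFactors = FALSE)

row1 <- df[1, , drop = TRUE]   # a list: each element keeps its own type
sapply(row1, class)            # one class per column

flat <- unlist(row1)           # forcing an atomic vector coerces the elements
class(flat)                    # everything has become character
flat[["n"]]                    # the integer 1 is now a string
```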
From: Fernando Saldanha

Thanks, it's interesting reading. I also noticed that sw[, 1, drop = TRUE] is a vector (coerces to the lowest dimension) but sw[1, , drop = TRUE] is a one-row data frame (does not convert it into a list or vector)

FS

On 4/16/05, Bill.Venables at csiro.au <Bill.Venables at csiro.au> wrote:
You should look at
?"["
and look very carefully at the "drop" argument. For your example
sw[, 1]
is the first component of the data frame, but
sw[, 1, drop = FALSE]
is a data frame consisting of just the first component, as mathematically fastidious people would expect.

This is a convention, and like most arbitrary conventions it can be very useful most of the time, but some of the time it can be a very nasty trap. Caveat emptor.

Bill Venables.

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Fernando Saldanha
Sent: Saturday, 16 April 2005 1:07 PM
To: Submissions to R help
Subject: [R] Getting subsets of a data frame

I was reading in the Reference Manual about Extract.data.frame. There is a list of examples of expressions using [ and [[, with the outcomes. I was puzzled by the fact that, if sw is a data frame, then sw[, 1:3] is also a data frame, but sw[, 1] is just a vector.

Since R has no scalars, it must be the case that 1 and 1:1 are the same:
1 == 1:1
[1] TRUE

Then why isn't sw[,1] = sw[, 1:1] a data frame?

FS
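A quick check on the built-in swiss data, offered as a sketch rather than a full account of "[.data.frame": the result seems to depend on how many columns are left and on the drop argument, not on whether the index is written as 1 or 1:1.

```r
sw <- swiss

class(sw[, 1:3])               # more than one column left: a data frame
class(sw[, 1])                 # a single column is dropped to a vector
class(sw[, 1:1])               # 1:1 selects the same single column
class(sw[, 1, drop = FALSE])   # dropping switched off: a data frame

identical(sw[, 1], sw[, 1:1])  # the two spellings give the same result
```

So sw[, 1] and sw[, 1:1] do behave alike; both give a vector, and drop = FALSE is what keeps the data frame.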
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595