problem for strsplit function
Thanks Bert, I'm reading some books now. But it takes me a while to get familiar with R.

Best,
Kai

On Friday, July 9, 2021, 03:06:11 PM PDT, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
On 09/07/2021 5:51 p.m., Jeff Newmiller wrote:
"Strictly speaking", Greg is correct, Bert. https://cran.r-project.org/doc/manuals/r-release/R-lang.html#List-objects Lists in R are vectors. What we colloquially refer to as "vectors" are more precisely referred to as "atomic vectors". And without a doubt, this "vector" nature of lists is a key underlying concept that explains why adding a dim attribute creates a matrix that can hold data frames. It is also a stumbling block for programmers from other languages that have things like linked lists.
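The point about lists being vectors can be checked at the console. The following is only a small sketch in base R; the object names (x, v, cells) are made up for illustration and do not come from the thread.

```r
# A list is a vector in R's sense, but it is not an atomic vector.
x <- list(1, "a", TRUE)
list_is_vector <- is.vector(x)    # TRUE
list_is_atomic <- is.atomic(x)    # FALSE

# An ordinary numeric vector is both.
v <- c(1.5, 2.5)
atomic_is_vector <- is.vector(v)  # TRUE
atomic_is_atomic <- is.atomic(v)  # TRUE

# Because a list is a vector, giving it a dim attribute turns it into a
# matrix whose cells can hold arbitrary objects, including a data frame.
cells <- list(1:3, letters[1:2], data.frame(a = 1:2), "text")
dim(cells) <- c(2, 2)
cells_is_matrix <- is.matrix(cells)

# Cells are filled column by column, so the third element sits at [1, 2].
cell_12 <- cells[[1, 2]]   # the data frame
cell_22 <- cells[[2, 2]]   # the string "text"
```

Here is.vector() is called with its default mode = "any", which is why it accepts both the list and the numeric vector.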
I would also object to v3 (below) as "extracting" a column from d. "d[2]" doesn't extract anything, it "subsets" the data frame, so the result is a data frame, not what you get when you extract something from a data frame.

People don't realize that "x <- 1:10; y <- x[[3]]" is perfectly legal. That extracts the 3rd element (the number 3). The problem is that R has no way to represent a scalar number, only a vector of numbers, so x[[3]] gets promoted to a vector containing that number when it is returned and assigned to y.

Lists are vectors of R objects, so if x is a list, x[[3]] is something that can be returned, and it is different from x[3].

Duncan Murdoch
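The distinction between subsetting with '[' and extracting with '[[' can be tried out as follows. This is a minimal sketch in base R; lst and the other result names are invented for illustration, and stringsAsFactors = FALSE is set explicitly so that the column is a character vector on older R versions too.

```r
# On an atomic vector, [[ is legal and returns a length-one vector,
# because R has no separate scalar type.
x <- 1:10
y <- x[[3]]
y_length <- length(y)                        # 1
same_for_atomic <- identical(x[[3]], x[3])   # no visible difference here

# On a list the two operators differ.
lst <- list(a = 1, b = "two", c = 3:5)
sub <- lst[3]     # subset: still a list, of length one
elt <- lst[[3]]   # extract: the element itself, an integer vector

# A data frame is a list of columns, so the same rule applies.
d <- data.frame(col1 = 1:3, col2 = letters[1:3], stringsAsFactors = FALSE)
d_sub <- d[2]     # a one-column data frame
d_elt <- d[[2]]   # the column as a character vector
```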
On July 9, 2021 2:36:19 PM PDT, Bert Gunter <bgunter.4567 at gmail.com> wrote:
"1.  a column, when extracted from a data frame, *is* a vector."

Strictly speaking, this is false; it depends on exactly what is meant by "extracted." e.g.:

d <- data.frame(col1 = 1:3, col2 = letters[1:3])
v1 <- d[,2]  ## a vector
v2 <- d[[2]]  ## the same, i.e.
identical(v1,v2)
[1] TRUE
v3 <- d[2]  ## a data.frame
v1
[1] "a" "b" "c"  ## a character vector
v3
  col2
1    a
2    b
3    c
is.vector(v1)
[1] TRUE
is.vector(v3)
[1] FALSE
class(v3)  ## data.frame
[1] "data.frame"
## but
is.list(v3)
[1] TRUE

which is simply explained in ?data.frame (where else?!) by:

"A data frame is a **list** [emphasis added] of variables of the same number of rows with unique row names, given class "data.frame". If no variables are included, the row names determine the number of rows."

"2.  maybe your question is "is a given function for a vector, or for a data frame/matrix/array?".  if so, i think the only way is reading the help information (?foo)."

Indeed! Is this not what the Help system is for?! But note also that the S3 class system may somewhat blur the issue: foo() may work appropriately and differently for different (S3) classes of objects. A detailed explanation of this behavior can be found in appropriate resources or (more tersely) via ?UseMethod.

"you might find reading ?"[" and ?"[.data.frame" useful"

Not just "useful" -- **essential** if you want to work in R, unless one gets this information via any of the numerous online tutorials, courses, or books that are available. The Help system is accurate and authoritative, but terse. I happen to like this mode of documentation, but others may prefer more extended expositions. I stand by this claim even if one chooses to use the "Tidyverse", data.table package, or other alternative frameworks for handling data. Again, others may disagree, but R is structured around these basics, and imo one remains ignorant of them at their peril.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Fri, Jul 9, 2021 at 11:57 AM Greg Minshall <minshall at umich.edu> wrote:
Kai,
one more question, how can I know if the function is for column manipulations or for vector?
i still stumble around R code.  but, i'd say the following (and look forward to being corrected! :):

1.  a column, when extracted from a data frame, *is* a vector.

2.  maybe your question is "is a given function for a vector, or for a
    data frame/matrix/array?".  if so, i think the only way is reading
    the help information (?foo).

3.  sometimes, extracting the column as a vector from a data frame-like
    object might be non-intuitive.  you might find reading ?"[" and
    ?"[.data.frame" useful (as well as ?"[.data.table" if you use that
    package).  also, the str() command can be helpful in understanding
    what is happening.  (the lobstr:: package's sxp() function, as well
    as more verbose .Internal(inspect()) can also give you insight.)

    with the data.table:: package, for example, if "DT" is a data.table
    object, with "x2" as a column, adding or leaving off quotation marks
    for the column name can make all the difference between ending up
    with a vector, or with a (much reduced) data table:
----
is.vector(DT[, x2])
[1] TRUE
str(DT[, x2])
 num [1:9] 32 32 32 32 32 32 32 32 32
is.vector(DT[, "x2"])
[1] FALSE
str(DT[, "x2"])
Classes 'data.table' and 'data.frame':  9 obs. of  1 variable:
 $ x2: num  32 32 32 32 32 32 32 32 32
 - attr(*, ".internal.selfref")=<externalptr>
----

    a second level of indexing may or may not help, mostly depending on
    the use of '[' versus '[['.  this can sometimes cause confusion
    when you are learning the language.
----
str(DT[, "x2"][1])
Classes 'data.table' and 'data.frame':  1 obs. of  1 variable:
 $ x2: num 32
 - attr(*, ".internal.selfref")=<externalptr>
str(DT[, "x2"][[1]])
 num [1:9] 32 32 32 32 32 32 32 32 32
----

    the tibble:: package (used in, e.g., the dplyr:: package) also
    (always?) returns a single column as a non-vector.  again, a
    second indexing with double '[[]]' can produce a vector.
----
DP <- tibble(DT)
is.vector(DP[, "x2"])
[1] FALSE
is.vector(DP[, "x2"][[1]])
[1] TRUE
----

    but, note that a list of lists is also a vector:
is.vector(list(list(1), list(1,2,3)))
[1] TRUE
str(list(list(1), list(1,2,3)))
List of 2
 $ :List of 1
  ..$ : num 1
 $ :List of 3
  ..$ : num 1
  ..$ : num 2
  ..$ : num 3

    etc.

hth.  good luck learning!

cheers, Greg
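For comparison with the data.table and tibble transcripts above, the same checks can be run on a plain data frame using base R only. This is just a sketch: the data frame d and its values are invented for illustration (only the column name x2 echoes the example above), and it covers base data frames, not data.table or tibble objects.

```r
# A small made-up data frame with a numeric column named x2.
d <- data.frame(x1 = 1:3, x2 = c(32, 32, 32))

# Matrix-style indexing of a single column drops to a vector by default ...
a <- d[, "x2"]
# ... unless drop = FALSE is given, which keeps a one-column data frame.
b <- d[, "x2", drop = FALSE]

# List-style indexing: [ keeps a data frame, [[ and $ give the column itself.
c2 <- d["x2"]
e <- d[["x2"]]
f <- d$x2

# str() shows at a glance which kind of object came back.
str(a)
str(b)
```

Running is.vector() on a and b, as in the transcripts above, gives TRUE and FALSE respectively.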
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.