Message-ID: <812547.1625856998@apollo2.minshall.org>
Date: 2021-07-09T18:56:38Z
From: Greg Minshall
Subject: problem for strsplit function
In-Reply-To: Your message of "Thu, 08 Jul 2021 15:24:25 +0000." <762548966.2994133.1625757865539@mail.yahoo.com>

Kai,

> one more question, how can I know if the function is for column
> manipulations or for vector?

i still stumble around R code.  but, i'd say the following (and look
forward to being corrected! :):

1.  a column, when extracted from a data frame, *is* a vector.

2.  maybe your question is "is a given function for a vector, or for a
    data frame/matrix/array?".  if so, i think the only reliable way
    is reading the help page (?foo), in particular what it says about
    the function's arguments.

3.  sometimes, extracting the column as a vector from a data frame-like
    object might be non-intuitive.  you might find reading ?"[" and
    ?"[.data.frame" useful (as well as ?"[.data.table" if you use that
    package).  also, the str() command can be helpful in understanding
    what is happening.  (the lobstr:: package's sxp() function, as well
    as the more verbose .Internal(inspect(x)), can also give you
    insight.)

    with the data.table:: package, for example, if "DT" is a data.table
    object, with "x2" as a column, adding or leaving off quotation marks
    for the column name can make all the difference between ending up
    with a vector, or with a (much reduced) data table:
----
> is.vector(DT[, x2])
[1] TRUE
> str(DT[, x2])
 num [1:9] 32 32 32 32 32 32 32 32 32
>
> is.vector(DT[, "x2"])
[1] FALSE
> str(DT[, "x2"])
Classes 'data.table' and 'data.frame':  9 obs. of  1 variable:
 $ x2: num  32 32 32 32 32 32 32 32 32
 - attr(*, ".internal.selfref")=<externalptr>
----

    a second level of indexing may or may not help, mostly depending on
    the use of '[' versus '[['.  this can sometimes cause confusion
    when you are learning the language.
----
> str(DT[, "x2"][1])
Classes 'data.table' and 'data.frame':  1 obs. of  1 variable:
 $ x2: num 32
 - attr(*, ".internal.selfref")=<externalptr>
> str(DT[, "x2"][[1]])
 num [1:9] 32 32 32 32 32 32 32 32 32
----
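
    for comparison, here is a rough sketch of the same distinction in
    base R, with no extra packages.  "df" and its contents are made up
    for illustration (it is not the DT above), and i have used
    strsplit() only because that is where this thread started:

```r
# 'df' is a made-up data frame for illustration, not the DT used above
df <- data.frame(x1 = c("a_1", "b_2", "c_3"),
                 x2 = c(32, 32, 32),
                 stringsAsFactors = FALSE)

v1 <- df[, "x2"]                 # matrix-style '[', drop = TRUE by default: vector
v2 <- df[["x2"]]                 # '[[' extracts the column itself: vector
v3 <- df$x2                      # '$' does the same for a data frame
d1 <- df["x2"]                   # list-style '[': still a data frame
d2 <- df[, "x2", drop = FALSE]   # drop = FALSE keeps the data frame

is.vector(v1)                    # TRUE
is.data.frame(d1)                # TRUE

# a vector function such as strsplit() is happy with the extracted column ...
parts <- strsplit(df[["x1"]], "_")

# ... but not with the one-column data frame
res <- try(strsplit(df["x1"], "_"), silent = TRUE)
failed <- inherits(res, "try-error")   # TRUE: strsplit() wants a character vector
```

    so, with a plain data frame, df[, "x2"] already gives a vector,
    unlike the data.table example above; df["x2"] does not.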

    the tibble:: package (used in, e.g., the dplyr:: package) also
    (always, by default, as far as i can tell) returns a single column
    selected with '[' as a tibble, not as a bare vector.  again, a
    second indexing with '[[' can produce a vector.
----
> DP <- tibble(DT)
> is.vector(DP[, "x2"])
[1] FALSE
> is.vector(DP[, "x2"][[1]])
[1] TRUE
----

    but, note that a list of lists is also a vector:
----
> is.vector(list(list(1), list(1,2,3)))
[1] TRUE
> str(list(list(1), list(1,2,3)))
List of 2
 $ :List of 1
  ..$ : num 1
 $ :List of 3
  ..$ : num 1
  ..$ : num 2
  ..$ : num 3
----

    etc.
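
    one caveat about is.vector(), as a small sketch (all of the
    objects below are made up for illustration): it returns FALSE for
    anything carrying attributes other than names, so a FALSE does not
    by itself mean the object is not a vector or list underneath.
    is.atomic() and is.list() look only at the underlying type:

```r
x <- c(a = 1, b = 2)             # names are the one attribute is.vector() tolerates
plain <- is.vector(x)            # TRUE
attr(x, "units") <- "cm"         # "units" is a made-up attribute
tagged <- is.vector(x)           # FALSE: an attribute other than names

f  <- factor(c("u", "v"))        # carries class and levels attributes
l  <- list(list(1), list(1, 2, 3))
df <- data.frame(x2 = c(32, 32)) # a data frame is a list with attributes

c(is.vector(f),  is.atomic(f))   # FALSE TRUE
c(is.vector(l),  is.list(l))     # TRUE  TRUE
c(is.vector(df), is.list(df))    # FALSE TRUE
```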

hth.  good luck learning!

cheers, Greg