
Reshape:cast; error using "..." in formula expression.

5 messages · misterbray, Dennis Murphy, Hadley Wickham

#
Whenever I use "..." in the formula of the cast function, from the reshape
package, I get the following error:

Error in `[.data.frame`(data, , variables, drop = FALSE) : 
  undefined columns selected


For example:

data(french_fries) #available in the reshape package
time treatment subject rep potato buttery grassy rancid painty
61    1         1       3   1    2.9     0.0    0.0    0.0    5.5
25    1         1       3   2   14.0     0.0    0.0    1.1    0.0
62    1         1      10   1   11.0     6.4    0.0    0.0    0.0
26    1         1      10   2    9.9     5.9    2.9    2.2    0.0
63    1         1      15   1    1.2     0.1    0.0    1.1    5.1
27    1         1      15   2    8.8     3.0    3.6    1.5    2.3
Using painty as value column.  Use the value argument to cast to override
this choice
Error in `[.data.frame`(data, , variables, drop = FALSE) : 
  undefined columns selected


 

--
View this message in context: http://r.789695.n4.nabble.com/Reshape-cast-error-using-in-formula-expression-tp3584721p3584721.html
Sent from the R help mailing list archive at Nabble.com.
#
Hi:

Short answer: use one dot, not three:
Using painty as value column.  Use the value argument to cast to
override this choice
Aggregation requires fun.aggregate: length used as default
  value  3 10 15 16 19 31 51 52 63 78 79 86
1 (all) 54 60 60 60 60 54 60 60 60 60 54 54

Long answer:
It's the same as using . in a model formula. The ... construct is used
as a formal argument in a function *definition* to allow passage of
needed arguments in a function call that are not part of the list of
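formal arguments. I noticed that the argument list of cast() itself
contains ..., both as a formal argument and in the default formula: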
function (data, formula = ... ~ variable, fun.aggregate = NULL,
    ..., margins = FALSE, subset = TRUE, df = FALSE, fill = NULL,
    add.missing = FALSE, value = guess_value(data))

so I can see where you may have gotten confused. Here's an example
using the same data frame where the ... argument comes into play:

# melt the response variables (the sensory attributes) into
# a factor variable for the attributes themselves and a value
# variable for their corresponding values.

ffm <- melt(french_fries, id = c('subject', 'time', 'treatment', 'rep'))
head(ffm)

# Recast the data so that the average score per subject/treatment
# combination is produced.
# However, there are NAs in the data frame, so we need to pass na.rm = TRUE:

cast(ffm, subject + treatment ~ variable, value = 'value',
        fun.aggregate = 'mean', na.rm = TRUE)

# To average over all subjects, treatments, times and reps:
cast(ffm, . ~ variable, value = 'value', fun.aggregate = 'mean',
         na.rm = TRUE)
  value   potato  buttery    grassy  rancid   painty
1 (all) 6.952518 1.823699 0.6641727 3.85223 2.521758

na.rm  is not part of the formal argument list to cast(), but because
the ... construct is present, we can pass na.rm = TRUE to the mean()
function used to aggregate the data in the actual call. Observe that
the argument list of mean.default() is
function (x, trim = 0, na.rm = FALSE, ...)

so the na.rm = TRUE argument in the call to cast() is actually passed to mean().
[To understand how this works, you need to do some study about
function writing; the R Language Definition manual is one place where
this is described in detail. The formal arguments to mean.default()
are x, trim, na.rm and ...; trim and na.rm have default values 0 and
FALSE that can be overridden in an actual call to that function.]
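
As a rough illustration of that forwarding mechanism, here is a minimal
base-R sketch. The helper aggregate_by() is hypothetical (it is not part
of reshape); the sketch only relies on tapply() handing any extra
arguments on to the aggregation function.

```r
# Hypothetical helper, not part of reshape: 'na.rm' is not one of its
# formal arguments, but '...' lets callers pass it through to fun.aggregate.
aggregate_by <- function(x, groups, fun.aggregate = mean, ...) {
  tapply(x, groups, fun.aggregate, ...)
}

scores <- c(1, NA, 3, 4)
groups <- c("a", "a", "b", "b")

# Without na.rm, the NA in group "a" propagates into that group's mean.
without_na_rm <- aggregate_by(scores, groups)

# na.rm = TRUE travels through '...' to mean(), which then drops the NA.
with_na_rm <- aggregate_by(scores, groups, na.rm = TRUE)

print(without_na_rm)
print(with_na_rm)
```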

You should never need to use ... in an actual function call; in a
formula, use of . on one side of ~ means to use all variables in the
data frame except those used on the other side (the side where one or
more variables are specified). For example, in a linear regression
context,

lm(y ~ ., data = mydata)

would use all variables in mydata except y as covariates in the model.
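
A small runnable sketch of the same idea, assuming the built-in mtcars
data set stands in for mydata and mpg for y (neither is from the
original question):

```r
# mtcars ships with R; it stands in for 'mydata', and mpg for 'y'.
fit <- lm(mpg ~ ., data = mtcars)

# Every column except the response should appear as a covariate.
covariates <- setdiff(names(coef(fit)), "(Intercept)")
print(covariates)
print(setequal(covariates, setdiff(names(mtcars), "mpg")))
```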

HTH,
Dennis
On Thu, Jun 9, 2011 at 12:10 AM, misterbray <misterbray at gmail.com> wrote:
#
Dennis, after doing some more research, it seems you actually can include
the ... term directly in the formula: cf. page 8 of
http://www.had.co.nz/reshape/introduction.pdf (that article also explains
why you might want to do so). It seems that including the ... term only
works, however, when your value column is actually named "value" (e.g.
using the value="my.val" option yields the error). This was the "bug" that
was tripping me up yesterday.

Thank you again Dennis,
Yours,
Rob

2 days later
#
Yes, the basic problem is that you forgot to melt the data before
trying to cast it.
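
A minimal sketch of that melt-then-cast sequence, assuming the reshape
package (not reshape2) is installed and reusing the melt() call from
earlier in the thread. The name wide is only illustrative:

```r
library(reshape)  # assumption: reshape, not reshape2, is installed
data(french_fries)

# Melt first: the id variables stay as columns, and the sensory attributes
# are stacked into a 'variable' column with their scores in 'value'.
ffm <- melt(french_fries, id = c('subject', 'time', 'treatment', 'rep'))

# On the molten data, ... stands for all id variables not named elsewhere
# in the formula.
wide <- cast(ffm, ... ~ variable)
head(wide)
```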

Hadley
On Thursday, June 9, 2011, misterbray <misterbray at gmail.com> wrote: