
Reconstruction of a "valid" expression within a function

3 messages · Pascal Boisson, Peter Dalgaard, Tony Plate

#
"Pascal Boisson" <Pascal.Boisson at scri.ac.uk> writes:
No you haven't... You're passing a string to subset(). BTW, it would
be easier to follow your code if it didn't use "subset" with two
different meanings. At the very least you'd need to parse the subset
expression and either eval() it and pass the result to subset(), or
use substitute() to insert it at the proper place and eval() the whole
enchilada.
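
A minimal sketch of those two routes, in case it helps. The data frame and the condition string are made up for illustration (they are not from the original post), and COND is just a placeholder name used for the substitution:

```r
# Hypothetical example data and condition string (not from the original post).
df <- data.frame(x = 1:10, y = rep(1:5, 2))
cond_string <- "y == 2"

# Parse the string once; [[1]] extracts the unevaluated call y == 2.
cond <- parse(text = cond_string)[[1]]

# Route 1: evaluate the condition with the columns of df in scope,
# then hand the resulting logical vector to subset().
keep <- eval(cond, df)
res1 <- subset(df, keep)

# Route 2: splice the parsed condition into a subset() call with
# substitute(), then evaluate the whole call.
call2 <- substitute(subset(df, COND), list(COND = cond))
res2 <- eval(call2)
```

Both routes should select the same rows; route 2 builds the call subset(df, y == 2), as if it had been typed directly.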

But why? subset() does this stuff internally already:
    Ozone Solar.R Wind Temp Month Day
9       8      19 20.1   61     5   9
11      7      NA  6.9   74     5  11
18      6      78 18.4   57     5  18
21      1       8  9.7   59     5  21
23      4      25  9.7   61     5  23
76      7      48 14.3   80     7  15
94      9      24 13.8   81     8   2
114     9      36 14.3   72     8  22
137     9      24 10.9   71     9  14
147     7      49 10.3   69     9  24

[look inside subset.data.frame for the code that accomplishes this]
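
The command that produced the output above did not survive in this copy of the message. The columns and values look like the built-in airquality data set restricted to low ozone readings, so, purely as an assumption, a call along these lines would give output of that shape:

```r
# Assumption: the output above came from the built-in airquality data set.
# The condition Ozone < 10 is a guess based on the values shown, not
# something stated in the message.
low_ozone <- subset(airquality, Ozone < 10)

# subset() drops rows where the condition is NA, so no NA ozone values remain.
print(low_ozone)
```

Note that the condition is written directly in terms of the column name; no string or parse() is involved.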

#
You are passing just a string to subset().  At the very least you need 
to parse it (but still this does not work easily with subset() -- see 
below).  But are you sure you need to do this?  subset() for dataframes 
already accepts subset expressions involving the columns of the 
dataframe, e.g.:

 > df <- data.frame(x=1:10,y=rep(1:5,2))
 > subset(df, y==2)
  x y
2 2 2
7 7 2
 >

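However, it's tricky to get subset() to work with an expression object, 
such as the result of parse(), as its subset argument.  This is because 
subset() captures that argument unevaluated and then evaluates it inside 
the data frame (look at the code for subset.data.frame()), so what gets 
evaluated is the parse() call itself, which yields an expression rather 
than a logical vector.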

 > subset(df, parse(text="df$y==2"))
Error in subset.data.frame(df, parse(text = "df$y==2")) :
         'subset' must evaluate to logical
 > subset(df, parse(text="y==2"))
Error in subset.data.frame(df, parse(text = "y==2")) :
         'subset' must evaluate to logical
 >
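
To see where that error comes from, here is a simplified stand-in for the core of subset.data.frame(). The function my_subset is hypothetical and leaves out column selection and other details of the real method; it only sketches the substitute()/eval() step:

```r
# Simplified, hypothetical stand-in for the core of subset.data.frame().
my_subset <- function(x, subset) {
  e <- substitute(subset)          # capture the argument unevaluated
  r <- eval(e, x, parent.frame())  # evaluate it with the columns of x in scope
  if (!is.logical(r)) stop("'subset' must evaluate to logical")
  x[r & !is.na(r), , drop = FALSE]
}

df <- data.frame(x = 1:10, y = rep(1:5, 2))

# A bare condition is captured as the call y == 2 and evaluates to logical.
ok <- my_subset(df, y == 2)

# With parse(), the captured code is the parse() call itself. Evaluating it
# returns an expression object, not a logical vector, hence the error.
err <- tryCatch(my_subset(df, parse(text = "y == 2")),
                error = function(e) conditionMessage(e))
```

In other words, the parsed condition never gets evaluated; only the call to parse() does.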

It's a little tricky in general passing R language expressions around, 
because many functions that work with expressions work with the 
unevaluated form of the actual argument, rather than with an R language 
expression as the value of a variable.  E.g.:

 > with(df, y==2)
  [1] FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
 > cond <- parse(text="y==2")
 > cond
expression(y == 2)
 > with(df, cond)
expression(y == 2)

One way to make these types of functions work with R language 
expressions as the value of a variable is to use do.call():

 > do.call("with", list(df, cond))
  [1] FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
 >

So, returning to subset(), you can give it an expression that is stored 
in the value of a variable like this:

 > do.call("subset", list(df, cond))
  x y
2 2 2
7 7 2
 >
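
If the condition really has to arrive as a string inside your own function, one possible sketch is to skip subset() and index the data frame directly. The helper name subset_by_string and the example condition are hypothetical:

```r
# Hypothetical helper: filter a data frame by a condition given as a string.
subset_by_string <- function(data, cond_string) {
  cond <- parse(text = cond_string)[[1]]    # string -> unevaluated call
  keep <- eval(cond, data, parent.frame())  # columns of data are in scope
  if (!is.logical(keep)) stop("condition must evaluate to logical")
  data[keep & !is.na(keep), , drop = FALSE] # treat NA as FALSE, like subset()
}

df <- data.frame(x = 1:10, y = rep(1:5, 2))
res <- subset_by_string(df, "y == 2 & x > 3")
```

Keep in mind that this evaluates whatever code the string contains, so it is only reasonable when the string comes from a trusted source.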

However, if you're a beginner at R, I suspect that you'll get much 
further if you avoid such meta-language constructs and just find a way 
to make subset() work for you without trying to paste together R 
language expressions.

Hope this helps,

-- Tony Plate
Pascal Boisson wrote: