remove all terms with interaction factor in formula

6 messages · Alexander Shenkin, Bert Gunter, David Winsemius +1 more

#
Hi Folks,

I'm trying to find a way to remove all terms in a formula that contain a
particular interaction.

For example, in the formula ~ a*b*c*d, whose terms are listed below, I'd
like to remove all terms that contain the b:c interaction.
 [1] "a"       "b"       "c"       "d"       "a:b"     "a:c"
 [7] "b:c"     "a:d"     "b:d"     "c:d"     "a:b:c"   "a:b:d"
[13] "a:c:d"   "b:c:d"   "a:b:c:d"
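
For reference, a listing like the one above can be produced with terms(). This is a minimal sketch that assumes the full formula is ~ a*b*c*d, the one used later in the thread; the names `full` and `labs` are only illustrative.

```r
# Expand the four-way crossing and list the label of every term.
full <- ~ a*b*c*d
labs <- attr(terms(full), "term.labels")
print(labs)
```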

My eventual use is to fit models with the reduced formulas.

For example:
c=runif(100), d=runif(100))
I can remove particular terms with update(), but I don't see a way to
remove all terms that contain a given combination of factors.
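
As an illustration of the update() route described above, here is a minimal sketch in which every unwanted term has to be named by hand. The data frame `dat`, its response column `y`, and the object names are assumptions made for the example, not taken from the original message.

```r
# Hypothetical data: the name `dat` and the columns y, a, b, c, d are
# assumptions made for this sketch.
set.seed(1)
dat <- data.frame(y = runif(100), a = runif(100), b = runif(100),
                  c = runif(100), d = runif(100))

full <- y ~ a*b*c*d

# update() removes exactly the terms that are named, one by one.
reduced <- update(full, . ~ . - b:c - a:b:c - b:c:d - a:b:c:d)
print(attr(terms(reduced), "term.labels"))

# The reduced formula can then be passed to a model-fitting function.
fit <- lm(reduced, data = dat)
print(length(coef(fit)))
```

Listing the terms by hand works for one interaction but does not scale, which is the motivation for the question.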

Any help would be greatly appreciated.  Thanks!

Allie
#
~ a*b*d + a*c*d
-- Bert
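
One way to check that the formula above contains exactly the terms of ~ a*b*c*d that do not involve both b and c is to compare the sets of variables in each term. This is only a sketch; `canon` is a hypothetical helper, and it sorts the variable names within a term so that labels such as "d:c" and "c:d" compare equal.

```r
# Hypothetical helper: the variables of each term, pasted in sorted order.
canon <- function(f) {
  fm   <- attr(terms(f), "factors")
  vars <- rownames(fm)
  sets <- apply(fm > 0, 2,
                function(in_term) paste(sort(vars[in_term]), collapse = ":"))
  sort(unname(sets))
}

# Terms of the full model that do not involve both b and c.
wanted  <- canon(~ a*b*c*d)
both_bc <- sapply(strsplit(wanted, ":", fixed = TRUE),
                  function(v) all(c("b", "c") %in% v))
wanted  <- wanted[!both_bc]

# Compare with the terms of the suggested formula.
print(setequal(canon(~ a*b*d + a*c*d), wanted))
```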
On Thu, Sep 13, 2012 at 10:49 AM, Alexander Shenkin <ashenkin at ufl.edu> wrote:
#
On Sep 13, 2012, at 11:00 AM, Bert Gunter wrote:

That seemed pretty clear and obvious, but I started wondering how to tell the machine to do it. Here is another idea:
[1] "a"     "b"     "c"     "d"     "a:b"   "a:c"   "a:d"   "b:d"   "c:d"   "a:b:d" "a:c:d"

(Although I realize it's no longer a formula and might need to be reassembled with `paste` and `as.formula`.)
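
The code for this message is not shown in the archive. The sketch below is one grep-based way to obtain the same labels and then reassemble them with `paste` and `as.formula`; it is an assumption about the approach, not a copy of the original code.

```r
labs <- attr(terms(~ a*b*c*d), "term.labels")

# Drop every label that contains the literal string "b:c".
kept <- grep("b:c", labs, fixed = TRUE, invert = TRUE, value = TRUE)
print(kept)

# Reassemble the surviving labels into a one-sided formula.
reduced <- as.formula(paste("~", paste(kept, collapse = " + ")))
print(reduced)
```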
#
Your method would not work for, e.g., "a:d", because terms such as "a:b:d" and "a:c:d"
involve both a and d but do not contain the string "a:d".  You could look at the "factors"
attribute of a terms object and drop those columns that have non-zero entries for every
variable in the interaction of interest.  E.g.,
  a b c d a:b a:c b:c a:d b:d c:d a:b:c a:b:d a:c:d b:c:d a:b:c:d
a 1 0 0 0   1   1   0   1   0   0     1     1     1     0       1
b 0 1 0 0   1   0   1   0   1   0     1     1     0     1       1
c 0 0 1 0   0   1   1   0   0   1     1     0     1     1       1
d 0 0 0 1   0   0   0   1   1   1     0     1     1     1       1
[1] "a"     "b"     "c"     "d"     "a:b"   "a:c"   "a:d"   "b:d"   "c:d"   "a:b:d" "a:c:d" 

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
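
The code that produced the matrix and the result above is not shown in the archive. A minimal sketch of the approach described, assuming the full formula ~ a*b*c*d and using illustrative object names, might look like this:

```r
# Variables (rows) by terms (columns); non-zero means the variable is in the term.
fm <- attr(terms(~ a*b*c*d), "factors")
print(fm)

# A term contains the b:c interaction when both its b and c entries are non-zero.
has_bc <- fm["b", ] > 0 & fm["c", ] > 0
kept   <- colnames(fm)[!has_bc]
print(kept)

# The same test handles a:d, where matching the string "a:d" would miss a:b:d.
has_ad <- fm["a", ] > 0 & fm["d", ] > 0
print(colnames(fm)[!has_ad])
```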
#
On Sep 13, 2012, at 11:53 AM, William Dunlap wrote:

It's probably a black mark against my abilities to do logic manipulations, but it made a lot more sense when I wrote it (admittedly with the same meaning) as:

colnames(fm)[ !(fm["b",]==1 & fm["c",]==1) ]

Here's a grepping method that only requires that the order be a.d in any term:
as.formula( paste("~", paste(
     grep("a.+d", attr(terms(~a*b*c*d), "term.labels" ) ,
           invert=TRUE, value=TRUE), collapse="+") ) )
~a + b + c + d + a:b + a:c + b:c + b:d + c:d + a:b:c + b:c:d

I think that if you are working with a*b*c*d, the order will always be a-before-d.
#
The fm["c",]==1 test does not work correctly, as the value 2 also
means that the variable (the row) is in the term (the column).
The 2 means that you don't need to apply the contrasts function
to that variable in this term.  Testing for non-zero entries
(fm["c",]!=0) covers both cases.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
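
To illustrate the point about the value 2, the sketch below uses a formula in which b and c appear only in an interaction, then wraps the non-zero test in a small function. `drop_terms_with` is a hypothetical helper written for this example, not something from the thread or from R itself, and it assumes that at least one term survives.

```r
# b and c have no main effects here, so both are coded 2 in the b:c column.
fm <- attr(terms(~ a + d + b:c), "factors")
print(fm)

missed <- fm["b", ] == 1 & fm["c", ] == 1   # the ==1 test
found  <- fm["b", ] > 0  & fm["c", ] > 0    # the non-zero test

# Hypothetical helper: drop every term that involves all variables in `vars`.
drop_terms_with <- function(f, vars) {
  tt   <- terms(f)
  fm   <- attr(tt, "factors")
  hit  <- colSums(fm[vars, , drop = FALSE] > 0) == length(vars)
  kept <- colnames(fm)[!hit]
  resp <- if (attr(tt, "response") > 0) f[[2]] else NULL
  out  <- reformulate(kept, response = resp,
                      intercept = attr(tt, "intercept") == 1)
  environment(out) <- environment(f)  # keep the original formula's environment
  out
}

reduced <- drop_terms_with(y ~ a*b*c*d, c("b", "c"))
print(reduced)
```

The result is an ordinary formula, so it can be passed straight to a fitting function such as lm().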