remove all terms with interaction factor in formula

6 messages · Alexander Shenkin, Bert Gunter, David Winsemius +1 more


Alexander Shenkin

Thu, Sep 13, 2012 10:49 AM #

Hi Folks,

I'm trying to find a way to remove all terms in a formula that contain a
particular interaction.

For example, in the formula below, I'd like to remove all terms that
contain the b:c interaction.

attributes(terms( ~ a*b*c*d))$term.labels
 [1] "a"       "b"       "c"       "d"       "a:b"     "a:c"
 [7] "b:c"     "a:d"     "b:d"     "c:d"     "a:b:c"   "a:b:d"
[13] "a:c:d"   "b:c:d"   "a:b:c:d"

My eventual use is to fit models with the reduced formulas.

For example:

my_df = data.frame( iv = runif(100), a=runif(100), b=runif(100),
                    c=runif(100), d=runif(100))
lm(iv ~ a*b*c*d, data=my_df)

I can remove particular terms with update(), but I don't see a way to
remove all terms that contain a given combination of factors.

Any help would be greatly appreciated.  Thanks!

Allie

Bert Gunter

Thu, Sep 13, 2012 11:00 AM #

~ a*b*d + a*c*d
-- Bert

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

David Winsemius

Thu, Sep 13, 2012 11:27 AM #

On Sep 13, 2012, at 11:00 AM, Bert Gunter wrote:

> ~ a*b*d + a*c*d

That seemed pretty clear and obvious, but I started wondering how to tell the machine to do it. Here is another idea:

grep("b:c", attr(terms(~a*b*c*d), "term.labels" ) ,invert=TRUE, value=TRUE)
 [1] "a"     "b"     "c"     "d"     "a:b"   "a:c"   "a:d"   "b:d"   "c:d"   "a:b:d" "a:c:d"

(Although I realize it's no longer a formula and might need to be reassembled with `paste` and  `as.formula`.)
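A minimal sketch of that reassembly step, using `reformulate` from the stats package as an alternative to `paste` plus `as.formula`. The data frame mirrors the one in the original post; `set.seed` and the names `labs`, `keep`, `reduced` and `fit` are only illustrative.

```r
# Illustrative data, same shape as in the original question
set.seed(1)
my_df <- data.frame(iv = runif(100), a = runif(100), b = runif(100),
                    c = runif(100), d = runif(100))

# All term labels of the full model, then drop those containing "b:c"
labs <- attr(terms(~ a*b*c*d), "term.labels")
keep <- grep("b:c", labs, invert = TRUE, value = TRUE)

# Rebuild a formula with the response on the left-hand side and fit it
reduced <- reformulate(keep, response = "iv")
fit <- lm(reduced, data = my_df)
```

Because `reduced` is an ordinary formula object, it can be passed to `lm` or any other modelling function in the usual way.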

David.


David Winsemius, MD
Alameda, CA, USA

William Dunlap

Thu, Sep 13, 2012 11:53 AM #

Your method would not work for, e.g., "a:d".  You could look at the "factors" attribute
of a terms object and select out those columns with non-zero entries for the variables
in the interaction of interest.  E.g.,

fm <- attr(terms(~a*b*c*d), "factors")
fm
  a b c d a:b a:c b:c a:d b:d c:d a:b:c a:b:d a:c:d b:c:d a:b:c:d
a 1 0 0 0   1   1   0   1   0   0     1     1     1     0       1
b 0 1 0 0   1   0   1   0   1   0     1     1     0     1       1
c 0 0 1 0   0   1   1   0   0   1     1     0     1     1       1
d 0 0 0 1   0   0   0   1   1   1     0     1     1     1       1

colnames(fm)[fm["b",]==0 | fm["c",]==0]
 [1] "a"     "b"     "c"     "d"     "a:b"   "a:c"   "a:d"   "b:d"   "c:d"   "a:b:d" "a:c:d"
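The "factors" approach can be wrapped up for any set of variables. The sketch below is not from the thread: `drop_terms_with` is a hypothetical helper name, and it assumes that at least one term survives the filtering. It treats any non-zero entry as "variable is in the term" and rebuilds the formula with `reformulate`.

```r
# Hypothetical helper (not from the thread): drop every term that involves
# all of the variables named in `vars`.
drop_terms_with <- function(formula, vars) {
  tt <- terms(formula)
  fm <- attr(tt, "factors")
  # A term contains the interaction if every variable in `vars`
  # has a non-zero entry in that term's column.
  hit <- colSums(fm[vars, , drop = FALSE] != 0) == length(vars)
  keep <- colnames(fm)[!hit]
  # Carry the response over if the formula has one
  response <- if (attr(tt, "response") > 0) deparse(formula[[2]]) else NULL
  reformulate(keep, response = response)
}

reduced_bc <- drop_terms_with(iv ~ a*b*c*d, c("b", "c"))
reduced_ad <- drop_terms_with(iv ~ a*b*c*d, c("a", "d"))
```

Since the test is on the matrix entries rather than on the label text, it does not depend on the variables being adjacent in the term label, so the "a:d" case is handled the same way as "b:c".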

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


David Winsemius

Thu, Sep 13, 2012 4:14 PM #

On Sep 13, 2012, at 11:53 AM, William Dunlap wrote:

> colnames(fm)[fm["b",]==0 | fm["c",]==0]

It's probably a black mark against my abilities to do logic manipulations, but it made a lot more sense when I wrote it (admittedly the same meaning) as:

colnames(fm)[ !(fm["b",]==1 & fm["c",]==1) ]

Here's a grepping method that only requires that the order be a.d in any term:

as.formula(paste("~", paste(
      grep("a.+d", attr(terms(~a*b*c*d), "term.labels" ) ,
           invert=TRUE, value=TRUE), collapse="+") ) )
~a + b + c + d + a:b + a:c + b:c + b:d + c:d + a:b:c + b:c:d

I think that if you are working with a*b*c*d that the order will always be a-before-d.
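If the variables are written in a different order, the a-before-d assumption no longer holds: as far as I know, `terms` labels each interaction with its variables in the order they first appear in the formula, so `~ d*c*b*a` should give labels like "d:a" rather than "a:d". A sketch of an order-independent check that splits each label on ":" instead of using a regular expression (`has_all`, `labs` and `keep` are illustrative names, not from the thread):

```r
# TRUE if the term label contains every variable in `vars`, in any order
has_all <- function(label, vars) {
  all(vars %in% strsplit(label, ":", fixed = TRUE)[[1]])
}

# Same model as before, but with the variables written in reverse order
labs <- attr(terms(~ d*c*b*a), "term.labels")
hit  <- vapply(labs, has_all, logical(1), vars = c("a", "d"))
keep <- labs[!hit]

as.formula(paste("~", paste(keep, collapse = "+")))
```

Matching whole variable names this way also avoids accidental hits when one variable name is a substring of another.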

David.



David Winsemius, MD
Alameda, CA, USA

William Dunlap

Thu, Sep 13, 2012 4:21 PM #

The fm["c",]==1 test does not work correctly, as the value 2 also
means that the variable (the row) is in the term (the column).
The 2 means that you don't need to apply the contrasts function
to that variable in this term.
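A small sketch of where this matters. The formula `~ a + a:b` is an illustrative choice (not from the thread) in which b has no main effect, so one of the entries in the a:b column of the "factors" matrix is coded 2. Comparing with `== 1` then fails to recognise a:b as containing both variables, while `> 0` does.

```r
# An interaction whose margins are not all in the model
fm <- attr(terms(~ a + a:b), "factors")
fm

# Test with == 1: a:b is wrongly kept, because one of its entries is 2
by_eq1 <- colnames(fm)[!(fm["a", ] == 1 & fm["b", ] == 1)]

# Test with > 0: a:b is correctly identified and dropped
by_gt0 <- colnames(fm)[!(fm["a", ] > 0 & fm["b", ] > 0)]
```

With the full `a*b*c*d` model every margin is present, so all non-zero entries are 1 and both tests agree; the difference only shows up for formulas like the one above.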

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
