Message-ID: <21BCAEF4-1486-4133-8CDC-F086AEDC13A3@comcast.net>
Date: 2009-06-19T14:20:50Z
From: David Winsemius
Subject: meaning of  glm(value ~ .,
In-Reply-To: <1245420504.2578.127.camel@desktop.localhost>

All of your points are accepted, and I also give you credit for
reading the "formula" help page more carefully than I did.


On Jun 19, 2009, at 10:08 AM, Gavin Simpson wrote:

> On Fri, 2009-06-19 at 09:24 -0400, David Winsemius wrote:
>> On Jun 19, 2009, at 9:00 AM, onyourmark wrote:
> <snip />
>>> means and also, I see
>>>
>>> data=crs$dataset[,c(1:59,922)]
>>>
>>> I have read that the data argument is optional here:
>>> "an optional data frame, list or environment (or object coercible by
>>> as.data.frame to a data frame) containing the variables in the model.
>>> If not found in data, the variables are taken from environment(formula),
>>> typically the environment from which glm is called"
>>>
>>> When they say "data", is that meant to include the dependent
>>> variable as well?
>>
>> Yes.
>
> It has to be defined in 'data' or the environment of 'formula', so it
> depends on what the OP meant by "meant to include". You can include it
> in 'data' but don't have to.
>
>>
>>> In other words, in the above statement 'value' is the dependent
>>> variable and it is also column 922 in the data set.
>>> Is this correct?
>>
>> Yes.
>
> No - you can't say that it is column 922, or even that it is any of
> columns 1:59 or 922, for the reasons mentioned above.
>
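> set.seed(123)
> ## three candidate predictors; Y is created outside 'dat'
> dat <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100))
> Y <- rpois(100, 2)
> ## '.' expands to the columns of 'data' (A and C); Y comes from the workspace
> mod <- glm(Y ~ ., data = dat[,c(1,3)], family = poisson)
> mod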
>
> If all you have is this:
>
> mod <- glm(Y ~ ., data = dat[,c(1,3)], family = poisson)
>
> You can't say anything more about Y than that it is either in 'dat' or
> in the environment of 'formula', which in this case is the global
> workspace.
> G
>
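
To make the point concrete, here is a minimal sketch built on Gavin's
example. The names mod1, mod2 and dat2 are introduced here only for
illustration. It fits the same model twice: once with Y found in the
workspace, and once with Y supplied as a column of 'data'. In both cases
the '.' stands for the columns of 'data' that do not otherwise appear in
the formula.

```r
set.seed(123)
dat <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100))
Y <- rpois(100, 2)   # response exists only in the global workspace

## Case 1: Y is not a column of 'data', so it is taken from
## environment(formula); '.' expands to A and C
mod1 <- glm(Y ~ ., data = dat[, c(1, 3)], family = poisson)

## Case 2: Y is a column of 'data' (dat2 is a hypothetical helper);
## '.' still expands to A and C, since Y is already on the left-hand side
dat2 <- cbind(dat[, c(1, 3)], Y = Y)
mod2 <- glm(Y ~ ., data = dat2, family = poisson)

names(coef(mod1))                   # terms that '.' produced in case 1
all.equal(coef(mod1), coef(mod2))   # same fit either way
```

The two calls give the same coefficients, so the call alone does not
tell you whether the response was one of the selected columns.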

David Winsemius, MD
Heritage Laboratories
West Hartford, CT