
meaning of lm( y~., data=mydat ), is it a language feature, is it documented, is it supported?

6 messages · John Sorkin, Duncan Murdoch, Ivan Calandra +3 more

#
The syntax
mydat <- data.frame( y,x )
fit1 <- lm( y~., data=mydat )
appears to perform a multivariable regression of y on every non-y variable in the data frame mydat. I cannot find this syntax (y~.) in the R documentation. Is y~. a supported feature of the R language? Where can I find it documented? I would hate to write code that depends on an unsupported, undocumented language feature.
Thank you,
John
John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 

#
On 23/05/2016 7:26 AM, John Sorkin wrote:
It is documented in the manual An Introduction to R (hidden in section 
11.5, "Updating fitted models"), and in ?formula, which ?lm refers to.
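
To check the dot expansion on a concrete case, here is a minimal sketch. The data frame and the column names x1 and x2 are made up for illustration; only lm(), terms(), coef() and all.equal() from base R are used.

```r
# Hypothetical example data; the column names are assumptions for illustration.
set.seed(1)
mydat <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))

fit_dot  <- lm(y ~ ., data = mydat)        # dot: every column of mydat except y
fit_full <- lm(y ~ x1 + x2, data = mydat)  # the same model written out by hand

# The terms object of the fit shows what the dot was expanded to.
attr(terms(fit_dot), "term.labels")

# Both calls should yield the same coefficients.
all.equal(coef(fit_dot), coef(fit_full))
```

If the two fits agree, the dot is only shorthand for listing the remaining columns.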

Duncan Murdoch
#
Hi John,

This is indeed documented, but you'll have to look at the function 
formula():
?formula

Regarding the dot (.), here is the explanation from the help of formula():
"There are two special interpretations of . in a formula. The usual one 
is in the context of a data argument of model fitting functions and 
means 'all columns not otherwise in the formula': see terms.formula. In 
the context of update.formula, only, it means 'what was previously in 
this part of the formula'."
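
The two interpretations quoted above can be tried side by side. This is only a sketch: the data frame and the names x1, x2, f_old and f_new are assumptions made up for the example.

```r
# Hypothetical example data; names are assumptions for illustration.
set.seed(1)
mydat <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))

# 1. With a data argument, "." stands for all columns not otherwise in the formula.
expanded <- attr(terms(y ~ ., data = mydat), "term.labels")
expanded

# 2. In update(), "." stands for what was previously in that part of the formula.
f_old <- y ~ x1
f_new <- update(f_old, . ~ . + x2)  # keep the response and x1, then add x2
f_new
```

Note that the second use needs no data argument, because update() only rewrites the formula.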

HTH,
Ivan

--
Ivan Calandra, PhD
Scientific Mediator
University of Reims Champagne-Ardenne
GEGENAA - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
ivan.calandra at univ-reims.fr
--
https://www.researchgate.net/profile/Ivan_Calandra
https://publons.com/author/705639/

On 23/05/2016 at 13:26, John Sorkin wrote:
#
John Sorkin <JSorkin <at> grecc.umaryland.edu> writes:
How about section 11.5 of An Introduction to R?
#
It's about formula syntax, so ?formula documents it.

Bert
On Monday, May 23, 2016, John Sorkin <JSorkin at grecc.umaryland.edu> wrote:
#
Actually, it is debatable which of those two interpretations of the dot deserves to be called "usual". Once upon a time, in the heyday of John Tukey, it might have been usual to have data sets of a few hundred rows and, like, a dozen columns, exactly one of which was the response. Not so much these days, I'd say.
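
When a data frame carries columns that are not predictors, the dot can still be used by subtracting the unwanted ones. A minimal sketch follows, with a made-up data frame in which id is assumed to be a row identifier rather than a covariate.

```r
# Hypothetical wide data frame; "id" is assumed to be an identifier, not a predictor.
set.seed(1)
wide <- data.frame(id = 1:20, y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))

# "." expands to id + x1 + x2; "- id" then removes the identifier again.
fit <- lm(y ~ . - id, data = wide)
attr(terms(fit), "term.labels")

# An alternative is to subset the columns first and keep the plain dot.
fit_sub <- lm(y ~ ., data = wide[c("y", "x1", "x2")])
all.equal(coef(fit), coef(fit_sub))
```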

-pd