subsetting a dataframe

4 messages · John Sorkin, Brian Ripley, Dimitris Rizopoulos +1 more

#
Windows XP
R 2.6.0

I am having problems deleting a row from a data frame. I create my dataframe by subsetting a larger dataframe:

ShortLavin<-Lavin[Lavin[,"Site"]=="PP" | Lavin[,"Site"]=="CC" | Lavin[,"Site"]=="FH",]

I then perform a glm using the data frame and plot the results. 

fit1poisson<-glm(NumUniqOpPt~Seq+Site,family=poisson(link = "log"),data=ShortLavin,offset=log(NumUniqPt))
plot(fit1poisson)

On the plots I see a point labeled 127 that is an extreme value. I want to re-run the glm excluding that observation. I have tried several methods to exclude it (shown below), but none has worked.

Minus127<-ShortLavin[-127,]
Minus127<-ShortLavin[-"127",]
Minus127<-ShortLavin[-c(127),]
Minus127<-ShortLavin[-c("127"),]

Suggestions on how I can remove observation 127 would be appreciated.

Thank you,
John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

#
On Tue, 4 Mar 2008, John Sorkin wrote:

            
Assuming this is row name "127" derived from row 127 of the original dataset,

Minus127 <- ShortLavin[-match("127", row.names(ShortLavin)), ]
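The reason a row name and a row position can differ after subsetting can be sketched on made-up data. The toy Lavin below is purely an assumption for illustration (its values are invented, and only the Site and Seq column names are borrowed from the question); it is not the poster's data.

```r
# Toy stand-in for Lavin; the values are made up for illustration only.
Lavin <- data.frame(Site = rep(c("PP", "CC", "FH", "XX"), times = 50),
                    Seq  = seq_len(200))

# Subsetting keeps the original row names, so they are no longer 1..nrow.
ShortLavin <- Lavin[Lavin$Site %in% c("PP", "CC", "FH"), ]
nrow(ShortLavin)                            # 150 rows remain

# The row labelled "127" now sits at a different position.
pos <- match("127", row.names(ShortLavin))
pos                                         # 96 in this toy data

# A negative number indexes by position, so it drops a different row ...
by_position <- ShortLavin[-127, ]
"127" %in% row.names(by_position)           # TRUE: the label is still there

# ... whereas looking the label up first drops the intended row.
Minus127 <- ShortLavin[-pos, ]
"127" %in% row.names(Minus127)              # FALSE
```

In this toy example ShortLavin[-127, ] silently removes the row labelled "169", which is why plain numeric indexing can look as if it "did not work".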

  
    
#
Try this:

Minus127 <- ShortLavin[!row.names(ShortLavin) %in% "127", ]
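One practical difference between the match() form and the %in% form shows up when the label is not present at all. The sketch below uses a small made-up data frame (df and its row names are hypothetical) to compare the two.

```r
# Hypothetical three-row data frame with character row names.
df <- data.frame(x = 1:3, row.names = c("a", "b", "c"))

# Label present: both forms drop row "b".
drop_match <- df[-match("b", row.names(df)), , drop = FALSE]
drop_in    <- df[!row.names(df) %in% "b", , drop = FALSE]
identical(drop_match, drop_in)   # TRUE

# Label absent: match() returns NA, and indexing by NA yields one all-NA row,
# while the logical %in% form simply leaves the data frame unchanged.
miss_match <- df[-match("z", row.names(df)), , drop = FALSE]
miss_in    <- df[!row.names(df) %in% "z", , drop = FALSE]
nrow(miss_match)                 # 1
nrow(miss_in)                    # 3
```

So the %in% form is the more forgiving of the two if there is any doubt that the row name exists.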


I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message ----- 
From: "John Sorkin" <jsorkin at grecc.umaryland.edu>
To: <r-help at r-project.org>
Sent: Tuesday, March 04, 2008 2:41 PM
Subject: [R] subsetting a dataframe
#
On 3/4/2008 8:41 AM, John Sorkin wrote:
I would do that in the following way:

ShortLavin <- subset(Lavin, Site %in% c("PP","CC","FH"))

Of course, you could have done the subsetting within the call to glm:

fit1poisson <- glm(NumUniqOpPt~Seq+Site,family=poisson(link = "log"),
                   data=subset(Lavin, Site %in% c("PP","CC","FH")),
                   offset=log(NumUniqPt))

And to remove observation 127 by its row name:

Minus127 <- subset(ShortLavin, !rownames(ShortLavin) %in% 127)
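Putting the pieces together, a minimal sketch of the refit might look like the following. The data are simulated here purely so the code runs on its own (the values, the extra "XX" site, and the name fit2poisson are assumptions, not part of the original question); with the real Lavin data only the last few statements would be needed.

```r
# Simulated stand-in for Lavin; all values are invented for illustration.
set.seed(1)
Lavin <- data.frame(
  Site      = rep(c("PP", "CC", "FH", "XX"), times = 50),
  Seq       = rep(1:10, each = 20),
  NumUniqPt = sample(50:100, 200, replace = TRUE)
)
Lavin$NumUniqOpPt <- rpois(200, lambda = 0.1 * Lavin$NumUniqPt)

# Subset to the three sites, then drop the row labelled "127".
ShortLavin <- subset(Lavin, Site %in% c("PP", "CC", "FH"))
Minus127   <- subset(ShortLavin, !rownames(ShortLavin) %in% "127")

# Original fit and the refit without the extreme observation.
fit1poisson <- glm(NumUniqOpPt ~ Seq + Site, family = poisson(link = "log"),
                   data = ShortLavin, offset = log(NumUniqPt))
fit2poisson <- glm(NumUniqOpPt ~ Seq + Site, family = poisson(link = "log"),
                   data = Minus127, offset = log(NumUniqPt))

# The refit should use exactly one observation fewer.
nobs(fit1poisson) - nobs(fit2poisson)
```

Comparing coef(fit1poisson) with coef(fit2poisson) then gives a quick sense of how much the single observation influences the estimates.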