Issue replacing dataset values from read data
Hi Emily,
I haven't tested this exhaustively, but it seems to work:
df<-data.frame(id=2001:3300,yrssmoke=sample(1:40,1300,TRUE),
cigsdaytotal=sample(1:60,1300,TRUE),yrsquit=sample(1:20,1300,TRUE))
dfNA<-df$id %in% c(2165,2534,2553,2611,2983,3233)
# create your NA values
df[dfNA,c("yrsquit","packyrs")]<-NA
# since you know the NA id values
df[dfNA,"yrsquit"]<-0
df[dfNA,"packyrs"]<-df[dfNA,"yrssmoke"]*df[dfNA,"cigsdaytotal"]/20
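As for why packyrs stays NA in the original loop, one possible explanation (a guess, since I can't see the real data): temp is copied from df before yrsquit is set to 0, so temp$yrsquit is still NA, and that NA carries through temp.yrssmoke into packyrs. On a second pass yrsquit is already 0 in df, which might be why it sometimes appears to work. Below is a minimal sketch with made-up data. The column names come from your code; the values and the names stale, fixed, fixids and rows are invented for illustration.

```r
# Toy data (invented values); ids 2165 and 2534 have the missing entries
df <- data.frame(id = c(2001, 2165, 2534),
                 age = c(60, 55, 70),
                 agesmoke = c(20, 18, 25),
                 cigsdaytotal = c(20, 10, 40),
                 yrsquit = c(5, NA, NA),
                 yrssmoke = c(35, NA, NA),
                 packyrs = c(35, NA, NA))
fixids <- c(2165, 2534)

# Original pattern: temp is a snapshot taken BEFORE yrsquit is set to 0
stale <- df
for (myid in fixids) {
  temp <- subset(stale, id == myid)
  stale[stale$id == myid, "yrsquit"] <- 0
  # temp$yrsquit is still NA here, so the result is NA
  temp.yrssmoke <- temp$age - (temp$agesmoke + temp$yrsquit)
  stale[stale$id == myid, "yrssmoke"] <- temp.yrssmoke
  stale[stale$id == myid, "packyrs"] <- (temp$cigsdaytotal / 20) * temp.yrssmoke
}

# Alternative without a loop: set yrsquit first, then compute from the
# updated data frame itself rather than from an earlier copy
fixed <- df
rows <- fixed$id %in% fixids
fixed[rows, "yrsquit"] <- 0
fixed[rows, "yrssmoke"] <- fixed[rows, "age"] -
  (fixed[rows, "agesmoke"] + fixed[rows, "yrsquit"])
fixed[rows, "packyrs"] <- fixed[rows, "cigsdaytotal"] / 20 * fixed[rows, "yrssmoke"]
```

Comparing stale[rows, ] with fixed[rows, ] shows yrsquit set to 0 in both, but packyrs filled in only in the second. In the loop version, moving the subset() call below the yrsquit assignment should have the same effect.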
Jim
On Sat, May 7, 2016 at 8:19 AM, Chang, Emily <Emily.Chang2 at ucsf.edu> wrote:
Dear all,
I am reading a modest dataset (2297 x 644) with specific values I want to change. The code is inelegant but looks like this:
df <- read.csv("mydata.csv", header = TRUE, stringsAsFactors = FALSE)

# yrsquit, packyrs missing for following IDs. Manually change.
for(myid in c(2165, 2534, 2553, 2611, 2983, 3233)){
  temp <- subset(df, id == myid)
  df[df$id == myid , "yrsquit"] <- 0
  temp.yrssmoke <- temp$age-(temp$agesmoke+temp$yrsquit)
  df[df$id == myid , "yrssmoke"] <- temp.yrssmoke
  df[df$id == myid , "packyrs"] <- (temp$cigsdaytotal/20)*(temp.yrssmoke)
}
If I run just the first line and then the for loop, it works.
If I run the first line and the for loop together, yrsquit is correctly set to 0, but packyrs is still NA.
Obviously there are many ways around this specific problem, but I was wondering what the issue is here, so that I can look out for it and avoid it in the future.
Apologies for the lack of reproducible code; I haven't yet reproduced the problem with generated data.
Much thanks in advance.
Best regards,
Emily
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.