Hi,
I have two dataframes:
The first, df1, contains some missing data:
  cola colb colc cold cole
1   NA    5    9   NA   17
2   NA    6   NA   14   NA
3    3   NA   11   15   19
4    4    8   12   NA   20
The second, df2, contains the following:
  cola colb colc cold cole
1  1.4  0.8 0.02  1.6  0.6
I want all missing data in df1$cola to be replaced by the value of
df2$cola, then the missing data in df1$colb to be replaced with the
corresponding value in df2$colb, and so on.
I can get this to work column by column with single input lines, but as my
original dataset is a lot larger I want to create a loop and can't work
out how.
The single line command is:
df1$cola[is.na(df1$cola)]<-df2$cola
I've tried a replace function within a loop but get error messages:
list<-colnames(df1)
for (i in list) {
r<-replace(df1$i,df1$i[is.na(df1$i)],df2$i)
}
with error messages of:
Warning messages:
1: In is.na(mymat$snp) :
is.na() applied to non-(list or vector) of type 'NULL'
Can anyone help me with this?
Thanks
--
View this message in context: http://r.789695.n4.nabble.com/Help-with-loop-tp4636140.html
Sent from the R help mailing list archive at Nabble.com.
Help with loop
4 messages · paulalou, Rui Barradas, Charles Stangor +1 more
Hello,
A one-liner could be
df1 <- read.table(text="
cola colb colc cold cole
1 NA 5 9 NA 17
2 NA 6 NA 14 NA
3 3 NA 11 15 19
4 4 8 12 NA 20
", header=TRUE)
df2 <- read.table(text="
cola colb colc cold cole
1 1.4 0.8 0.02 1.6 0.6
", header=TRUE)
sapply(names(df1), function(nm) {
  df1[[nm]][is.na(df1[[nm]])] <- df2[[nm]]
  df1[[nm]]
})
Avoid loops, use *apply.
Hope this helps,
Rui Barradas
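For reference, the loop in the original post most likely fails because `df1$i` looks for a column literally named `i` and returns NULL; `[[` uses the value of the loop variable instead. Note also that the `sapply` call above returns a matrix and leaves `df1` itself unchanged. A minimal sketch of a working loop, assuming the example data from this thread (`filled` and `is_na` are hypothetical names, not from the original posts):

```r
# Rebuild the example data from the thread.
df1 <- read.table(text = "
cola colb colc cold cole
1 NA 5 9 NA 17
2 NA 6 NA 14 NA
3 3 NA 11 15 19
4 4 8 12 NA 20
", header = TRUE)
df2 <- read.table(text = "
cola colb colc cold cole
1 1.4 0.8 0.02 1.6 0.6
", header = TRUE)

# Work on a copy (hypothetical name) so df1 stays untouched.
filled <- df1
for (nm in names(filled)) {
  # [[nm]] uses the value stored in nm; $nm would look for a column called "nm".
  is_na <- is.na(filled[[nm]])
  # df2 has a single row, so its one value is recycled over all missing entries.
  filled[[nm]][is_na] <- df2[[nm]]
}
filled
```

The result stays a data frame with the same column names, which may be more convenient than the matrix that `sapply` produces.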
On 11-07-2012 15:11, paulalou wrote:
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi,
Try this:
func1<-function(x,y,z)
 {ifelse(is.na(y[[x]]),z[[x]],y[[x]])}
dat3<-data.frame(lapply(colnames(df1),function(x) func1(x,df1,df2)))
colnames(dat3)<-colnames(df1)
dat3
  cola colb  colc cold cole
1  1.4  5.0  9.00  1.6 17.0
2  1.4  6.0  0.02 14.0  0.6
3  3.0  0.8 11.00 15.0 19.0
4  4.0  8.0 12.00  1.6 20.0
#or
sapply(colnames(df1),function(x) func1(x,df1,df2))
A.K.
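A possible variant of the same idea, offered only as a sketch: assigning into `dat4[]` keeps the data frame structure and column names, so the separate `colnames<-` step is not needed. The names `fill_na` and `dat4` are hypothetical, and the data are the example values from this thread typed in directly.

```r
# Example data from the thread, entered by hand.
df1 <- data.frame(cola = c(NA, NA, 3, 4),
                  colb = c(5, 6, NA, 8),
                  colc = c(9, NA, 11, 12),
                  cold = c(NA, 14, 15, NA),
                  cole = c(17, NA, 19, 20))
df2 <- data.frame(cola = 1.4, colb = 0.8, colc = 0.02, cold = 1.6, cole = 0.6)

# Hypothetical helper: replace the NAs in one column with a single fill value.
fill_na <- function(col, fill) {
  col[is.na(col)] <- fill
  col
}

# Pair each column of df1 with the df2 column of the same name.
dat4 <- df1
dat4[] <- Map(fill_na, df1, df2[names(df1)])
dat4
```

Indexing `df2` by `names(df1)` matches columns by name rather than by position, which guards against the two data frames listing their columns in a different order.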
----- Original Message -----
From: paulalou <pls28 at medschl.cam.ac.uk>
To: r-help at r-project.org
Cc:
Sent: Wednesday, July 11, 2012 10:11 AM
Subject: [R] Help with loop