An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120927/edcc1f68/attachment.pl>
replace string values with numbers
8 messages · JiangZhengyu, arun, David Winsemius
On Sep 26, 2012, at 12:52 PM, JiangZhengyu wrote:
Hi everyone, I have a data frame Gene with SNPs, e.g.

P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC

I tried to replace all the GG with a value 3:

Gene[Gene=="GG"]<-3

It always gives me:

Warning in `[<-.factor`(`*tmp*`, thisvar, value = 3) :
  invalid factor level, NAs generated

Does anyone know if there is anything wrong with my code?
You are trying to replace a factor value with a level that it doesn't have. Hence that particular very informative error message.
David Winsemius, MD Alameda, CA, USA
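The warning can be reproduced in a few lines. The sketch below uses a small stand-in factor (the names `f`, `g`, `h`, and `s` are illustrative and not from the original post) and shows two possible ways around it: adding the new level before assigning, or dropping the factor in favour of character strings.

```r
# Hypothetical factor standing in for one column of Gene
f <- factor(c("GG", "AC", "CC", "AC"))

# Assigning a value that is not an existing level produces NA.
# suppressWarnings() is used only to keep the script quiet; without it
# R reports "invalid factor level, NA generated".
g <- f
suppressWarnings(g[g == "GG"] <- "3")
is.na(g[1])

# Option 1: add the level first, then assign
h <- f
levels(h) <- c(levels(h), "3")
h[h == "GG"] <- "3"
as.character(h)

# Option 2: convert to character and replace the strings directly
s <- as.character(f)
s[s == "GG"] <- "3"
s
```

Option 2 is usually the simpler choice when the column is about to be recoded anyway.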
On Sep 26, 2012, at 2:27 PM, David Winsemius wrote:
You are trying to replace a factor value with a level that it doesn't have. Hence that particular very informative error message.
You probably could do this:

Gene[] <- lapply(Gene, as.character)
Gene
  P1 P2 P3
1 CG CG GG
2 -- -- AC
3 -- AC CC
4 AC -- AC

The use of the form `object[] <-` preserves the original structure (a data frame with the same dimensions). You would otherwise have needed to use data.frame() around the result.

Gene[Gene=="GG"] <- 3
Gene
  P1 P2 P3
1 CG CG  3
2 -- -- AC
3 -- AC CC
4 AC -- AC
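To illustrate the point about `object[] <-`, here is a minimal sketch comparing the two forms. The data frame is rebuilt by hand with `stringsAsFactors = TRUE` so that the columns are factors regardless of R version; `as_list` is an illustrative name.

```r
# Small stand-in for the Gene data frame, with factor columns forced explicitly
Gene <- data.frame(P1 = c("CG", "--", "--", "AC"),
                   P2 = c("CG", "--", "AC", "--"),
                   P3 = c("GG", "AC", "CC", "AC"),
                   stringsAsFactors = TRUE)

# Without the empty brackets, the result of lapply() is a plain list
as_list <- lapply(Gene, as.character)
class(as_list)

# With the empty brackets, the columns are replaced inside the existing
# data frame, so its class, dimensions, and names are kept
Gene[] <- lapply(Gene, as.character)
class(Gene)
dim(Gene)
sapply(Gene, class)
```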
and provide commented, minimal, self-contained, reproducible code.
Your code was not really self-contained and reproducible, but what I did was this:

Gene <- read.table(text="P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC", header=TRUE)

read.table will by default create factors when the input column contains character values unless you use stringsAsFactors=FALSE.
David Winsemius, MD Alameda, CA, USA
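As a sketch of the `stringsAsFactors=FALSE` route mentioned above: reading the columns as character from the start avoids the factor problem entirely. Note that from R 4.0.0 onward `stringsAsFactors` defaults to `FALSE`, so the explicit argument mainly matters on older versions; it is harmless either way.

```r
# Read the example data as character columns rather than factors
Gene <- read.table(text = "P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC", header = TRUE, stringsAsFactors = FALSE)

str(Gene)   # three chr columns

# Replacement now works directly; the 3 is stored as the string "3"
# because the column is character
Gene[Gene == "GG"] <- 3
Gene
```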
Hi,
You can also try these:
Gene<-read.table(text="
P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC
",header=TRUE,sep="")
Gene<-sapply(Gene,as.character)
Gene<-data.frame(gsub("GG","3",Gene))
Gene
#  P1 P2 P3
#1 CG CG  3
#2 -- -- AC
#3 -- AC CC
#4 AC -- AC
str(Gene)
#'data.frame':    4 obs. of  3 variables:
# $ P1: Factor w/ 3 levels "--","AC","CG": 3 1 1 2
# $ P2: Factor w/ 3 levels "--","AC","CG": 3 1 2 1
# $ P3: Factor w/ 3 levels "3","AC","CC": 1 2 3 2
#2nd way
Gene<-read.table(text="
P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC
",header=TRUE,sep="")
Gene<-within(Gene,{P1<-as.character(P1);P2<-as.character(P2);P3<-as.character(P3)})
Gene[sapply(Gene,function(x) grepl("GG",x))]<-3
Gene
#  P1 P2 P3
#1 CG CG  3
#2 -- -- AC
#3 -- AC CC
#4 AC -- AC
str(Gene)
#'data.frame':    4 obs. of  3 variables:
# $ P1: chr  "CG" "--" "--" "AC"
# $ P2: chr  "CG" "--" "AC" "--"
# $ P3: chr  "3" "AC" "CC" "AC"
A.K.
----- Original Message -----
From: JiangZhengyu <zhyjiang2006 at hotmail.com>
To: r-help at r-project.org
Cc:
Sent: Wednesday, September 26, 2012 3:52 PM
Subject: [R] replace string values with numbers
Hi everyone, I have a data frame Gene with SNPs, e.g.

P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC

I tried to replace all the GG with a value 3:

Gene[Gene=="GG"]<-3

It always gives me:

Warning in `[<-.factor`(`*tmp*`, thisvar, value = 3) :
  invalid factor level, NAs generated

Does anyone know if there is anything wrong with my code?

Thanks,
Zhengyu

    [[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
1 day later
Hi,
Try this:
geno<-read.table(text="
P1 P2 P3 P4 P5
1 2 2 3 2
2 2 2 1 1
1 2 1 2 NA
NA 2 3 4 5
1 1 3 1 3
",sep="",header=TRUE,stringsAsFactors=FALSE)
geno1<-as.matrix(geno)
geno1[is.na(geno1)]<-0
tmp<-apply(geno1,1,function(x) ifelse((sum(x!=2)>3) & (sum(x==1)>=1) & (sum(x==3)>=1), 1,0) )
tmp
#[1] 0 0 0 0 1
A.K.
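For a matrix with more than 1000 rows, a vectorized variant may be worth considering. The following is only a sketch, not part of the reply above: it uses `rowSums()` with `na.rm = TRUE` instead of recoding NA to 0, and the names `n_not2`, `n_one`, and `n_three` are illustrative. Be aware of the difference in meaning: recoding NA to 0 counts a missing value as "not equal to 2", whereas `na.rm = TRUE` leaves it out of every count. Which is appropriate depends on the analysis; for this small example both give the same result.

```r
# Same example matrix as above, entered directly (NA marks missing genotypes)
geno <- matrix(c( 1, 2, 2, 3, 2,
                  2, 2, 2, 1, 1,
                  1, 2, 1, 2, NA,
                 NA, 2, 3, 4, 5,
                  1, 1, 3, 1, 3),
               nrow = 5, byrow = TRUE,
               dimnames = list(NULL, paste0("P", 1:5)))

# Count matching entries per row, skipping NA instead of recoding it to 0
n_not2  <- rowSums(geno != 2, na.rm = TRUE)
n_one   <- rowSums(geno == 1, na.rm = TRUE)
n_three <- rowSums(geno == 3, na.rm = TRUE)

# 1 where all three conditions hold for the row, 0 otherwise
tmp <- as.integer(n_not2 > 3 & n_one >= 1 & n_three >= 1)
tmp
#[1] 0 0 0 0 1
```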
----- Original Message -----
From: JiangZhengyu <zhyjiang2006 at hotmail.com>
To:
Cc: r-help at r-project.org
Sent: Friday, September 28, 2012 4:16 PM
Subject: [R] Errors in if statement
Hi guys, I have many rows (>1000) and columns (>30) of "geno" matrix. I use the following loop and condition statement (adapted from someone else's code). I always get the error below. I was wondering if anyone knows what the problem is and how to fix it.

Thanks,
Zhengyu

########### geno matrix
P1 P2 P3 P4
1 2 2 3 2
2 2 2 1 1
1 2 1 2 NA
NA 2 3 4 5
###########
for(i in 1:4) {
cat(i,"")
if(sum(geno[i,]!=2)>3 && sum(geno[i,]==1)>=1 && sum(geno[i,]==3)>=1){
  tmp = 1
  }
}
###########
1 2 Error in if (sum(geno[i, ] != 2) > 3 && sum(geno[i, ] == 1) >= 1 && sum(geno[i,  :
  missing value where TRUE/FALSE needed
??? [[alternative HTML version deleted]]
On Sep 28, 2012, at 1:16 PM, JiangZhengyu wrote:
Hi guys, I have many rows (>1000) and columns (>30) of "geno" matrix. I use the following loop and condition statement (adapted from someone else's code). I always get the error below. I was wondering if anyone knows what the problem is and how to fix it.
Boy, it surely looks like missing values are the problem. Have you read: ?sum
David.
> Thanks,
> Zhengyu
>
> ########### geno matrix
> P1 P2 P3 P4
> 1 2 2 3 2
> 2 2 2 1 1
> 1 2 1 2 NA
> NA 2 3 4 5
> ###########
> for(i in 1:4) {
> cat(i,"")
> if(sum(geno[i,]!=2)>3 && sum(geno[i,]==1)>=1 && sum(geno[i,]==3)>=1){
> tmp = 1
> }
> }
> ###########
> 1 2 Error in if (sum(geno[i, ] != 2) > 3 && sum(geno[i, ] == 1) >= 1 && sum(geno[i, :
> missing value where TRUE/FALSE needed
>
> [[alternative HTML version deleted]]
David Winsemius, MD
Alameda, CA, USA
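The hint about `?sum` points at the `na.rm` argument. As a minimal sketch (the vector `x` and the variable `flag` are illustrative, standing in for one row of `geno` and for `tmp`), this shows why a row with a missing value breaks the `if` test and how `na.rm = TRUE` avoids it:

```r
# A row with a missing value, like those in the posted matrix
x <- c(1, 2, 1, 2, NA)

sum(x != 2)                 # NA: one unknown comparison makes the sum unknown
sum(x != 2, na.rm = TRUE)   # 2: NA comparisons are dropped before summing

# if (NA) raises "missing value where TRUE/FALSE needed", so a condition
# built from such sums can fail. With na.rm = TRUE every term is TRUE or
# FALSE and the test is safe to evaluate.
flag <- 0
if (sum(x != 2, na.rm = TRUE) > 3 &&
    sum(x == 1, na.rm = TRUE) >= 1 &&
    sum(x == 3, na.rm = TRUE) >= 1) {
  flag <- 1
}
flag
```

Whether missing values should be ignored this way, or treated as a separate category, is a decision about the data rather than about R.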