replace string values with numbers

8 messages · JiangZhengyu, arun, David Winsemius

#
On Sep 26, 2012, at 12:52 PM, JiangZhengyu wrote:

            
You are trying to replace a factor value with a level that it doesn't have. Hence that particular very informative error message.
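A minimal sketch of what triggers that warning, assuming a small hand-built factor (the vector `f` and the copies `g`, `h`, `s` are hypothetical names, not from the original post). Assigning a value that is not among the factor's levels produces NA; adding the level first, or converting to character, avoids it.

```r
# Hypothetical factor standing in for one column of Gene
f <- factor(c("CG", "GG", "AC"))

# Assigning a value that is not an existing level yields NA
# (the "invalid factor level" warning is suppressed here on purpose)
g <- f
suppressWarnings(g[g == "GG"] <- "3")
is.na(g[2])   # TRUE

# Option 1: add the new level first, then assign
h <- f
levels(h) <- c(levels(h), "3")
h[h == "GG"] <- "3"

# Option 2: work with a character vector instead of a factor
s <- as.character(f)
s[s == "GG"] <- "3"
```

Option 1 leaves the unused level "GG" in place; droplevels() can remove it if that matters.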
#
On Sep 26, 2012, at 2:27 PM, David Winsemius wrote:

            
You probably could do this:

Gene[] <- lapply(Gene, as.character)
Gene
  P1 P2 P3
1 CG CG GG
2 -- -- AC
3 -- AC CC
4 AC -- AC

The use of the form `object[] <-` preserves the original dimensions. You would otherwise have needed to use data.frame() around the result.

Gene[Gene == "GG"] <- 3
Gene
  P1 P2 P3
1 CG CG  3
2 -- -- AC
3 -- AC CC
4 AC -- AC

Your code was not really self-contained and reproducible, but what I did was this:

Gene <- read.table(text="P1 P2 P3 
 CG CG GG
 -- --  AC 
 -- AC CC
 AC  --  AC", header=TRUE)

read.table will by default create factors when the input column contains character values unless you use stringsAsFactors=FALSE.
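A short sketch of that alternative, reusing the example data from this thread. Note that the factor default applies to R versions before 4.0.0; from 4.0.0 on, stringsAsFactors defaults to FALSE, so passing it explicitly is harmless either way. The name `as_list` is hypothetical and is only there to contrast plain assignment with the `object[] <-` form.

```r
# Read the example data as character columns rather than factors
Gene <- read.table(text = "P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC", header = TRUE, stringsAsFactors = FALSE)

# With character columns the original replacement works directly
Gene[Gene == "GG"] <- 3
str(Gene)

# `Gene[] <-` keeps the data frame; plain `<-` would leave a list
as_list <- lapply(Gene, as.character)
Gene[] <- lapply(Gene, as.character)
class(as_list)  # "list"
class(Gene)     # "data.frame"
```

The replaced cell is stored as the string "3", since P3 is still a character column.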
#
Hi,

You can also try these:

Gene<-read.table(text="
P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC
",header=TRUE,sep="")
Gene<-sapply(Gene,as.character)
Gene<-data.frame(gsub("GG","3",Gene))
Gene
#  P1 P2 P3
#1 CG CG  3
#2 -- -- AC
#3 -- AC CC
#4 AC -- AC
# str(Gene)
#'data.frame':    4 obs. of  3 variables:
# $ P1: Factor w/ 3 levels "--","AC","CG": 3 1 1 2
# $ P2: Factor w/ 3 levels "--","AC","CG": 3 1 2 1
# $ P3: Factor w/ 3 levels "3","AC","CC": 1 2 3 2


#2nd way
Gene<-read.table(text="
P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC
",header=TRUE,sep="")

Gene<-within(Gene,{P1<-as.character(P1);P2<-as.character(P2);P3<-as.character(P3)})
Gene[sapply(Gene,function(x) grepl("GG",x))]<-3
Gene
#  P1 P2 P3
#1 CG CG  3
#2 -- -- AC
#3 -- AC CC
#4 AC -- AC
str(Gene)
#'data.frame':    4 obs. of  3 variables:
# $ P1: chr  "CG" "--" "--" "AC"
# $ P2: chr  "CG" "--" "AC" "--"
# $ P3: chr  "3" "AC" "CC" "AC"

A.K.

----- Original Message -----
From: JiangZhengyu <zhyjiang2006 at hotmail.com>
To: r-help at r-project.org
Cc: 
Sent: Wednesday, September 26, 2012 3:52 PM
Subject: [R] replace string values with numbers





Hi everyone,

I have a data frame Gene with SNPs, e.g.

P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC

I tried to replace all the GG with a value 3.

Gene[Gene=="GG"]<-3

It always gives me:

Warning in `[<-.factor`(`*tmp*`, thisvar, value = 3) :
  invalid factor level, NAs generated

Does anyone know if there is anything wrong with my code?

Thanks,
Zhengyu

    [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
1 day later
#
Hi,
Try this:
geno<-read.table(text="
P1 P2 P3 P4 P5
1 2 2 3 2
2 2 2 1 1
1 2 1 2 NA
NA 2 3 4 5
1 1 3 1 3
",sep="",header=TRUE,stringsAsFactors=FALSE)
geno1<-as.matrix(geno)
geno1[is.na(geno1)]<-0
tmp<-apply(geno1,1,function(x) ifelse((sum(x!=2)>3) & (sum(x==1)>=1) & (sum(x==3)>=1), 1,0) )
tmp
#[1] 0 0 0 0 1
A.K.




----- Original Message -----
From: JiangZhengyu <zhyjiang2006 at hotmail.com>
To: 
Cc: r-help at r-project.org
Sent: Friday, September 28, 2012 4:16 PM
Subject: [R] Errors in if statement


Hi guys,

I have many rows (>1000) and columns (>30) of "geno" matrix. I use the following loop and condition statement (adapted from someone else's code). I always get the error below. I was wondering if anyone knows what the problem is and how to fix it.

Thanks,
Zhengyu

########### geno matrix
P1 P2 P3 P4
1 2 2 3 2
2 2 2 1 1
1 2 1 2 NA
NA 2 3 4 5
###########
for(i in 1:4) {
cat(i,"")
if(sum(geno[i,]!=2)>3 && sum(geno[i,]==1)>=1 && sum(geno[i,]==3)>=1){
  tmp = 1
  }
}
###########
1 2 Error in if (sum(geno[i, ] != 2) > 3 && sum(geno[i, ] == 1) >= 1 && sum(geno[i,  :
  missing value where TRUE/FALSE needed

#
On Sep 28, 2012, at 1:16 PM, JiangZhengyu wrote:

            
Boy, it surely looks like missing values are the problem. Have you read:

?sum
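To illustrate the hint: a comparison involving NA is itself NA, so sum() over it returns NA unless na.rm = TRUE is given, and if() cannot branch on NA. The sketch below uses the five-column data from the earlier reply in this thread; `flag` is a hypothetical name for the result, and rowSums() is swapped in for the explicit loop.

```r
geno <- as.matrix(read.table(text = "P1 P2 P3 P4 P5
1 2 2 3 2
2 2 2 1 1
1 2 1 2 NA
NA 2 3 4 5
1 1 3 1 3", header = TRUE))

# A comparison with NA is NA, and so is the sum unless NAs are dropped
sum(geno[3, ] != 2)                # NA
sum(geno[3, ] != 2, na.rm = TRUE)  # 2

# Row-wise counts that ignore missing values, then one vectorised test
flag <- as.integer(rowSums(geno != 2, na.rm = TRUE) > 3 &
                   rowSums(geno == 1, na.rm = TRUE) >= 1 &
                   rowSums(geno == 3, na.rm = TRUE) >= 1)
flag
```

Dropping NAs is not the same as recoding them to 0: with na.rm = TRUE a missing cell is left out of every count, whereas a 0 is counted as "not 2". Which is appropriate depends on what a missing genotype should mean in the analysis.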