Your attached file is not a .csv file, since the fields are not separated by commas (just rename mydata.csv to mydata.txt).
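As a rough illustration of why the separator matters, here is a small sketch. The tab delimiter and the toy values are assumptions made for the example, not the contents of the attached file:

```r
# Hypothetical three-line file; the tab separator is an assumption.
tmp <- tempfile(fileext = ".txt")
writeLines(c("marker\tX88\tX9", "m1\t0\t3", "m2\t2\t0"), tmp)

# read.csv() splits on commas, so every line collapses into one column.
as_csv <- read.csv(tmp, header = TRUE)
ncol(as_csv)    # 1

# read.delim() splits on tabs and recovers the three columns.
as_tab <- read.delim(tmp, header = TRUE)
ncol(as_tab)    # 3
names(as_tab)
```

If the real file uses some other separator, read.table() with an explicit sep= argument would be the analogous call.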
The command "genod2 <- as.matrix(genod)" created a character matrix from the data frame genod, because genod still contains the character column marker. When you then force genod2 to numeric, the marker column becomes NA, which is probably not what you want.
The error message arises because you passed genod (a data frame) to the snpgdsCreateGeno() function, not genod2 (the matrix you created from genod).
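A minimal sketch of one way to end up with a numeric matrix, using a toy data frame that mimics the columns in the quoted output. The values and the geno_cols variable are made up for illustration, and the snpgdsCreateGeno() call is left commented out; it only repeats the arguments from the original post with genod2 substituted for genod:

```r
# Toy stand-in for genod: a marker column plus genotype columns (made-up values).
genod <- data.frame(
  marker = c("m1", "m2", "m3"),
  X88 = c(0, 2, 0),
  X9  = c(3, 0, 0),
  stringsAsFactors = FALSE
)

# Keep only the genotype columns so as.matrix() has nothing to coerce to character.
geno_cols <- setdiff(names(genod), "marker")   # hypothetical helper variable
genod2 <- as.matrix(genod[, geno_cols])

is.matrix(genod2)    # TRUE
is.numeric(genod2)   # TRUE
anyNA(genod2)        # FALSE

# Then pass the matrix, not the data frame (arguments as in the original post):
# snpgdsCreateGeno(filn, genmat = genod2,
#                  sample.id = sample.id, snp.id = snp.id,
#                  snp.chromosome = snp.chromosome,
#                  snp.position = snp.position,
#                  snp.allele = snp.allele, snpfirstdim = TRUE)
```

In the real data, sample.id would presumably also need to exclude "marker", and snp.id would need to come from the marker column, so that their lengths match the dimensions of the matrix.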
------------------------------------
David L. Carlson
Department of Anthropology
Texas A&M University
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of N Meriam
Sent: Tuesday, January 8, 2019 1:38 PM
To: Michael Dewey <lists at dewey.myzen.co.uk>
Cc: r-help at r-project.org
Subject: Re: [R] Warning message: NAs introduced by coercion
Here's a portion of what my data looks like (text file format attached).
When running in R, it gives me this:
df4 <- read.csv(file = "mydata.csv", header = TRUE)
require(SNPRelate)
library(gdsfmt)
myd <- df4
myd <- df4
names(myd)[-1]
[1] "marker" "X88" "X9" "X17" "X25"
[1] 3 4 5 6 8 10
# the data must be 0,1,2 with 3 as missing so you have r
sample.id <- names(myd)[-1]
snp.id <- myd[,1]
snp.position <- 1:length(snp.id) # not needed for ibs
snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
# genotype data must have - in 3
genod <- myd[,-1]
genod[is.na(genod)] <- 3
genod[genod=="0"] <- 0
genod[genod=="1"] <- 2
genod2 <- as.matrix(genod)
head(genod2)
marker X88 X9 X17 X25
[1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
[2,] "1043336|F|0-7:A>G-7:A>G" "2" "0" "3" "0"
[3,] "1212218|F|0-49:A>G-49:A>G" "0" "0" "0" "0"
[4,] "1019554|F|0-14:T>C-14:T>C" "0" "0" "3" "0"
[5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
[6,] "1106702|F|0-8:C>A-8:C>A" "0" "0" "0" "0"
class(genod2) <- "numeric"
Warning message: In class(genod2) <- "numeric" : NAs introduced by coercion
head(genod2)
marker X88 X9 X17 X25
[1,] NA 0 3 3 3
[2,] NA 2 0 3 0
[3,] NA 0 0 0 0
[4,] NA 0 0 3 0
[5,] NA 3 3 3 3
[6,] NA 0 0 0 0
class(genod2) <- "numeric"
class(genod2)
[1] "matrix"
filn <-"simTunesian.gds"
snpgdsCreateGeno(filn, genmat = genod,
+ sample.id = sample.id, snp.id = snp.id,
+ snp.chromosome = snp.chromosome,
+ snp.position = snp.position,
+ snp.allele = snp.allele, snpfirstdim=TRUE)
Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id,
: is.matrix(genmat) is not TRUE
I can't find a solution to my problem. My guess is that the problem
comes from converting the factor column 'marker' to numeric.
Best,
Meriam
On Tue, Jan 8, 2019 at 11:28 AM Michael Dewey <lists at dewey.myzen.co.uk> wrote:
Dear Meriam
Your csv file did not come through, as attachments are stripped unless they are of
certain types, and your post is very hard to read since you are posting in
HTML. Try renaming the file to ????.txt and set your mailer to send
plain text; then people may be able to help you better.
Michael
On 08/01/2019 15:35, N Meriam wrote:
I see...
Here's a portion of what my data looks like (csv file attached).
I run again and here are the results:
df4 <- read.csv(file = "mydata.csv", header = TRUE)
require(SNPRelate)
library(gdsfmt)
myd <- df4
myd <- df4
names(myd)[-1]
[1] "marker" "X88" "X9" "X17" "X25"
# the data must be 0,1,2 with 3 as missing so you have r
sample.id <- names(myd)[-1]
snp.id <- myd[,1]
snp.position <- 1:length(snp.id) # not needed for ibs
snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
# genotype data must have - in 3
genod <- myd[,-1]
genod[is.na(genod)] <- 3
genod[genod=="0"] <- 0
genod[genod=="1"] <- 2
genod2 <- as.matrix(genod)
head(genod2)
marker X88 X9 X17 X25
[1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
[2,] "1043336|F|0-7:A>G-7:A>G" "2" "0" "3" "0"
[3,] "1212218|F|0-49:A>G-49:A>G" "0" "0" "0" "0"
[4,] "1019554|F|0-14:T>C-14:T>C" "0" "0" "3" "0"
[5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
[6,] "1106702|F|0-8:C>A-8:C>A" "0" "0" "0" "0"
class(genod2) <- "numeric"
Warning message:
In class(genod2) <- "numeric" : NAs introduced by coercion
head(genod2)
marker X88 X9 X17 X25
[1,] NA 0 3 3 3
[2,] NA 2 0 3 0
[3,] NA 0 0 0 0
[4,] NA 0 0 3 0
[5,] NA 3 3 3 3
[6,] NA 0 0 0 0
class(genod2) <- "numeric"
class(genod2)
[1] "matrix"
# read data
filn <-"simTunesian.gds"
snpgdsCreateGeno(filn, genmat = genod,
+ sample.id = sample.id, snp.id = snp.id,
+ snp.chromosome = snp.chromosome,
+ snp.position = snp.position,
+ snp.allele = snp.allele, snpfirstdim=TRUE)
Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id, :
is.matrix(genmat) is not TRUE
Thanks,
Meriam
On Tue, Jan 8, 2019 at 9:02 AM PIKAL Petr <petr.pikal at precheza.cz> wrote:
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of N Meriam
Sent: Tuesday, January 8, 2019 3:08 PM
To: r-help at r-project.org
Subject: [R] Warning message: NAs introduced by coercion
Dear all,
I have a .csv file called df4 (15752 obs. of 264 variables).
I applied this code but couldn't continue with further analyses because a warning
message keeps coming up. Afterwards, I want to determine the max and min
similarity values, plot a heat map, cluster, etc.
require(SNPRelate)
library(gdsfmt)
myd <- read.csv(file = "df4.csv", header = TRUE)
names(myd)[-1]
# the data must be 0,1,2 with 3 as missing so you have r
sample.id <- names(myd)[-1]
snp.id <- myd[,1]
snp.position <- 1:length(snp.id) # not needed for ibs
snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
# genotype data must have - in 3
genod <- myd[,-1]
genod[is.na(genod)] <- 3
genod[genod=="0"] <- 0
genod[genod=="1"] <- 2
genod[1:10,1:10]
genod <- as.matrix(genod)
A matrix can hold only one type of data, so you probably changed it to
character with this construction.
class(genod) <- "numeric"
This tries to convert all values to numbers; any value that cannot be
converted is set to NA.
Something like this:
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
ir <-head(iris)
irm <- as.matrix(ir)
head(irm)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 "5.1" "3.5" "1.4" "0.2" "setosa"
2 "4.9" "3.0" "1.4" "0.2" "setosa"
3 "4.7" "3.2" "1.3" "0.2" "setosa"
4 "4.6" "3.1" "1.5" "0.2" "setosa"
5 "5.0" "3.6" "1.4" "0.2" "setosa"
6 "5.4" "3.9" "1.7" "0.4" "setosa"
class(irm) <- "numeric"
Warning message:
In class(irm) <- "numeric" : NAs introduced by coercion
head(irm)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 NA
2 4.9 3.0 1.4 0.2 NA
3 4.7 3.2 1.3 0.2 NA
4 4.6 3.1 1.5 0.2 NA
5 5.0 3.6 1.4 0.2 NA
6 5.4 3.9 1.7 0.4 NA
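One possible way around this, sketched on the same iris subset, is to select the numeric columns before calling as.matrix(), so that no column forces the matrix to character. The num_cols name is just an illustrative variable:

```r
ir <- head(iris)

# Flag the numeric columns; Species (a factor) is left out.
num_cols <- vapply(ir, is.numeric, logical(1))
irm <- as.matrix(ir[, num_cols])

storage.mode(irm)   # "double"
anyNA(irm)          # FALSE
```

The same idea should apply to genod: drop the non-numeric column first, then convert.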
Warning message: In class(genod) <- "numeric" : NAs introduced by coercion
Maybe I could illustrate more with details so I can be more specific?
Please, let me know.
I would appreciate your help.
Thanks,
Meriam