Better to use dput(your.data) for sharing data. Anyway I am still confused
but you probably are able to clarify things further.
It is a dataset with genetic marker alleles for single individuals.
The first row is the header, all following rows are individuals. 2 rows
count for 1 individual.
The first column is the individual's number, the second column is the number
of the population the individual comes from, and all following columns are
different genetic markers.
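
A small sketch of what such a layout could look like, and how it could be shared with dput() as suggested above. The column names ind, pop and scm101 and all allele values are invented for illustration; only scm247 appears in the thread.

```r
# Hypothetical toy data in the layout described above: two rows per
# individual, then individual number, population number, marker columns.
# All values are made up for illustration.
toy <- data.frame(
  ind    = c(1, 1, 2, 2, 3, 3),
  pop    = c(1, 1, 1, 1, 2, 2),
  scm247 = c(222, 231, 222, 222, 231, 240),
  scm101 = c(150, 150, 150, 152, 152, 152)
)

# dput() prints R code that recreates the object, so the output
# can be pasted into a mail as a reproducible example.
dput(toy)

# Rows 2*i - 1 and 2*i hold the two alleles of individual i.
i <- 2
toy[c(2 * i - 1, 2 * i), "scm247"]
```

For the real data, something like dput(head(daten, 12)) would share the first six individuals without sending the whole file.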

In those 2 rows for one individual sometimes the genetic marker differs
test[1:2, "scm247"]
[1] 222 231
What do you want to do with them?

What I want to do with this data in R is to compare one individual with
each of the other individuals, allele-wise. There are five possibilities:
the two compared individuals share 4, 3, 2, 1 or 0 alleles of the currently
examined marker (= column). For each shared allele this pair of individuals
shall get 1 scoring point. For each pair of individuals, all scoring
points are summed up over all markers.
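
The scoring rule for a single marker could be sketched as a small function. The name shared_alleles is hypothetical, and treating a missing allele as "not shared" (na.rm = TRUE) is an assumption, not something stated in the thread.

```r
# Count shared alleles between two individuals at one marker.
# a and b are length-2 vectors holding the two alleles of each individual.
# outer() builds the 2 x 2 table of all four allele comparisons;
# summing the TRUE entries gives the score for this marker.
# Assumption: comparisons involving NA count as "not shared".
shared_alleles <- function(a, b) {
  sum(outer(a, b, "=="), na.rm = TRUE)
}

shared_alleles(c(222, 231), c(222, 231))  # 2
shared_alleles(c(222, 222), c(222, 222))  # 4
shared_alleles(c(222, 231), c(240, 250))  # 0
```

With these four comparisons a score of exactly 3 cannot occur: if three of the comparisons are TRUE, all four alleles are equal and the fourth comparison is TRUE as well.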
My code again, modified according to your suggestions:
#1) read in data:
daten <- read.table('K:/Analysen/STRUCTURE/test.txt', header=TRUE, sep="\t")
daten <- as.data.frame(daten)
#2) create empty matrix:
indxind <- matrix(0, nrow=617, ncol=617)
indxind[1:20,1:19]
#3) compare cells to each other, score:
#for the whole dataset: s in 3:34, z1 in 1:617, z2 in 1:617
for (s in 3:6) {  #walks through the matrix column by column, starting at column 3
  for (z1 in 1:6) {  #for each current column, take one row (z1)...
    for (z2 in 1:6) {  #...and compare it to another row (z2) of the current column
      if (z1!=z2) {
        topf <- indxind[z1,z2]
        #actually, 2 rows make up 1 individual,
        #therefore i compare 2 rows with another 2 rows
        if (daten[2*z1-1,s]==daten[2*z2-1,s]) topf <- topf+1
        if (daten[2*z1-1,s]==daten[2*z2,s]) topf <- topf+1
        if (daten[2*z1,s]==daten[2*z2-1,s]) topf <- topf+1
        if (daten[2*z1,s]==daten[2*z2,s]) topf <- topf+1
        indxind[z1,z2] <- topf
        indxind[z2,z1] <- topf
      }
      #print(c(s,z1,z2,indxind[1,2])) ##counts s, z1 and z2 properly, but always gives 8 for indxind[1,2]
    }
    #indxind[1:5,1:5] #empty matrix
  }
  #indxind[1:5,1:5] #empty matrix
}
#4) check:
indxind[1:5,1:5]
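
One thing the loop above does is visit every pair twice, once as (z1, z2) and once as (z2, z1), and write to both cells each time, so every marker's score is added twice. A minimal sketch that visits each pair only once follows. The toy data frame and the column names m1 and m2 are invented for illustration, and the sketch assumes there are no missing values.

```r
# Hypothetical toy data: 3 individuals (2 rows each), 2 marker columns.
daten <- data.frame(
  ind = c(1, 1, 2, 2, 3, 3),
  pop = c(1, 1, 1, 1, 2, 2),
  m1  = c(222, 231, 222, 231, 240, 250),
  m2  = c(150, 150, 150, 152, 150, 150)
)
n <- nrow(daten) / 2                 # number of individuals
indxind <- matrix(0, nrow = n, ncol = n)

for (s in 3:ncol(daten)) {           # marker columns start at column 3
  for (z1 in 1:(n - 1)) {
    for (z2 in (z1 + 1):n) {         # z2 > z1: each pair is visited once
      a <- daten[c(2 * z1 - 1, 2 * z1), s]   # both alleles of individual z1
      b <- daten[c(2 * z2 - 1, 2 * z2), s]   # both alleles of individual z2
      score <- sum(outer(a, b, "=="))        # the four allele comparisons
      indxind[z1, z2] <- indxind[z1, z2] + score
      indxind[z2, z1] <- indxind[z1, z2]     # keep the matrix symmetric
    }
  }
}
indxind
```

On this toy data the pairs (1,2), (1,3) and (2,3) get 4, 4 and 2 points in total over the two markers.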
@ Michael Weylandt: I've done my best with regard to the "big picture" of my
algorithm and the small reproducible example. I hope both are sufficient.
@ Petr Pikal-3: In this case there are only numerical values, but it's a
useful hint for my other code.
@ Petr Pikal-3 and Berend Hasselman: Initializing indxind with 0's instead
of NAs helps; it fills something into indxind now. But it does the calculation
only for the first marker (column 3); afterwards I get an error:
Fehler in if (daten[2 * z1 - 1, s] == daten[2 * z2 - 1, s]) topf <- topf + :
  Fehlender Wert, wo TRUE/FALSE nötig ist
Error in if (daten[2 * z1 - 1, s] == daten[2 * z2 - 1, s]) topf <- topf + :
  missing value where TRUE/FALSE needed
Does this have something to do with the change to
daten<-as.data.frame(daten) in line 3 (instead of as.matrix before)?
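
A minimal sketch of how this error can arise, assuming the marker columns contain missing values (NA); the thread does not confirm that this is the cause. A comparison with NA returns NA for a matrix as well as for a data frame, and if() cannot work with NA.

```r
x <- c(222, NA)

# Comparing with NA yields NA, not TRUE or FALSE.
x[2] == 222   # NA

# if (NA) stops with an error; tryCatch() is used here only to
# capture the error message instead of halting the script.
msg <- tryCatch(
  { if (x[2] == 222) "equal" else "different" },
  error = function(e) conditionMessage(e)
)
msg

# isTRUE() turns NA into FALSE, so the comparison is safe inside if().
if (isTRUE(x[2] == 222)) "equal" else "different"

# Count missing values per column (toy data frame, invented values).
toy <- data.frame(m1 = c(222, NA), m2 = c(150, 150))
colSums(is.na(toy))
```

Running colSums(is.na(daten)) on the real data would show whether, and in which marker columns, missing values occur.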
--
View this message in context: http://r.789695.n4.nabble.com/help-please-matrix-operations-inside-3-nested-loops-tp4639592p4639730.html
Sent from the R help mailing list archive at Nabble.com.