Hi,

I have a matrix (pwdiff in the example below) with ~480000 rows and 780 columns. For each row, I want to get the percentage of columns that have an absolute value above a certain threshold "t". I then want to allocate that percentage to matrix 'perc' in the corresponding row. Below is my attempt at doing this, but it does not work: I get 'replacement has length zero'. Any help would be much appreciated!!

perc<-matrix(c(1:nrow(pwdiff)))
for (x in 1:nrow(pwdiff))
perc[x]<-(((ncol(pwdiff[,abs(pwdiff[x,]>=t)]))/ncol(pwdiff))*100)

I should add that my data has NAs in some rows and not others (but I do not want to just ignore rows that have NAs)

Thanks!
Paul

--
View this message in context: http://r.789695.n4.nabble.com/counting-columns-that-fulfill-specific-criteria-tp3622265p3622265.html
Sent from the R help mailing list archive at Nabble.com.
counting columns that fulfill specific criteria
5 messages · pguilha, PIKAL Petr, Nick Sabbe
Hi
As

matrix(c(1:nrow(pwdiff)))
Error in nrow(pwdiff) : object 'pwdiff' not found

gives an error, we cannot directly check your code.

From what you say it seems to me that you want something like

rowSums(pwdiff>=t, na.rm=T)/ncol(pwdiff)*100

or maybe

rowSums(abs(pwdiff)>=t, na.rm=T)/ncol(pwdiff)*100

but I can be completely wrong.

Regards
Petr
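To illustrate the suggestion above on a small case: comparing a matrix against a threshold gives a logical matrix, and rowSums() on it counts the TRUE entries in each row rather than adding up the original values. The matrix `m` and the threshold `thr` below are made-up stand-ins for pwdiff and t, not data from this thread.

```r
# Hypothetical 3 x 4 stand-in for pwdiff, with one NA
m <- matrix(c( 0.2,   NA,  1.1, -0.9,
              -1.5,  0.1,  0.4,  2.5,
               3.0, -2.2,  0.0,  0.3),
            nrow = 3, byrow = TRUE)
thr <- 1  # assumed threshold; named thr so it does not mask base::t()

hits   <- abs(m) >= thr                  # logical matrix, NA where m is NA
n_hits <- rowSums(hits, na.rm = TRUE)    # number of TRUE cells per row
perc   <- n_hits / ncol(m) * 100         # NA cells count as "not above"
print(perc)
```

Because the division is by ncol(m), a row with NAs is still scored over all columns, which seems to match the request not to ignore rows that contain NAs.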
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hello Paul.
You could try something like
perc<-apply(pwdiff, 1, function(currow){
  mean(abs(currow) > t, na.rm=TRUE)*100
})
I haven't tested this, as you did not provide a sample pwdiff. You should
probably check ?apply for more info.
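A small sketch of how the apply() approach treats NAs, using a made-up matrix `m` and threshold `thr` (neither is from the thread). With na.rm=TRUE, mean() drops the NA cells from the denominator, so a row with missing values is scored over its non-missing columns only. rowMeans() is shown as a vectorised equivalent, and the rowSums()/ncol() form is included for comparison because it divides by all columns instead.

```r
# Hypothetical 2 x 4 stand-in for pwdiff; row 1 contains one NA
m <- matrix(c( 0.2,  NA,  1.1, -0.9,
              -1.5, 0.1,  0.4,  2.5),
            nrow = 2, byrow = TRUE)
thr <- 1  # assumed threshold

# Row-wise proportion via apply(): NA cells are excluded from the denominator
prop_apply <- apply(m, 1, function(currow) mean(abs(currow) > thr, na.rm = TRUE))

# Same result without an explicit per-row function call
prop_rowmeans <- rowMeans(abs(m) > thr, na.rm = TRUE)

# Alternative: NA cells stay in the denominator (divide by all columns)
prop_all_cols <- rowSums(abs(m) > thr, na.rm = TRUE) / ncol(m)

print(rbind(prop_apply, prop_rowmeans, prop_all_cols))
```

Which denominator is right depends on whether a missing value should count as "not above the threshold" or be left out altogether.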
Two suggestions: probably best not to name any variable t, as this is also
the function for transposing a matrix, and could end up being confusing at
the least. Second: for most practical purposes, it's better to leave out the
*100.
Good luck,
Nick Sabbe
--
ping: nick.sabbe at ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36
-- Do Not Disapprove
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of pguilha
Sent: Friday 24 June 2011 13:15
To: r-help at r-project.org
Subject: [R] counting columns that fulfill specific criteria

Hi,

I have a matrix (pwdiff in the example below) with ~480000 rows and 780 columns. For each row, I want to get the percentage of columns that have an absolute value above a certain threshold "t". I then want to allocate that percentage to matrix 'perc' in the corresponding row. Below is my attempt at doing this, but it does not work: I get 'replacement has length zero'. Any help would be much appreciated!!

perc<-matrix(c(1:nrow(pwdiff)))
for (x in 1:nrow(pwdiff))
perc[x]<-(((ncol(pwdiff[,abs(pwdiff[x,]>=t)]))/ncol(pwdiff))*100)

I should add that my data has NAs in some rows and not others (but I do not want to just ignore rows that have NAs)

Thanks!
Paul
Thanks for your reply, but that is not quite what I am looking for... I do not want to add up all the values in the row, I want to get the number of columns in each row that meet the criteria and then get that as a percentage... my understanding is that the rowSums function adds up the values, does it not?

I tried your code anyway and it did not work:

Error in abs(pwdiff) >= t :
  comparison (5) is possible only for atomic and list types

and when specifying the columns (perc[x]<-rowSums(pwdiff[,abs(pwdiff[x,])>=thr], na.rm=T)/ncol(pwdiff)), I get the following error:

Error in rowSums(pwdiff[, abs(pwdiff[x, ]) >= thr], na.rm = T) :
  'x' must be an array of at least two dimensions
In addition: There were 30 warnings (use warnings() to see them)

warnings()
Warning messages:
1: In perc[x] <- rowSums(pwdiff[, abs(pwdiff[x, ]) >= thr], ... :
  number of items to replace is not a multiple of replacement length
...

I've been trying to sort this out for the past three days and cannot get it to work for some reason... I can do it SO easily in Excel with a simple macro, but doing that on a 480000x780 table inevitably crashes the computer...

Any more help you can provide would be great, thanks!

--
View this message in context: http://r.789695.n4.nabble.com/counting-columns-that-fulfill-specific-criteria-tp3622265p3622711.html
Sent from the R help mailing list archive at Nabble.com.
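For reference, a sketch of how the original row-by-row loop could be written so that it runs. One plausible reading of the 'replacement has length zero' error is that subsetting columns returns a plain vector when exactly one column is selected, and ncol() of a vector is NULL. Counting the TRUE values directly avoids the subsetting. `m` and `thr` are hypothetical stand-ins for pwdiff and the threshold.

```r
# Hypothetical 3 x 4 stand-in for pwdiff, with one NA
m <- matrix(c( 0.2,   NA,  1.1, -0.9,
              -1.5,  0.1,  0.4,  2.5,
               3.0, -2.2,  0.0,  0.3),
            nrow = 3, byrow = TRUE)
thr <- 2  # assumed threshold

# In row 2 only one column qualifies, so the subset drops to a vector
one_col <- m[, abs(m[2, ]) >= thr]
print(is.null(ncol(one_col)))   # ncol() has no columns to report for a vector

# Loop that counts qualifying cells per row instead of subsetting columns
perc <- numeric(nrow(m))        # preallocate one slot per row
for (x in seq_len(nrow(m))) {
  perc[x] <- sum(abs(m[x, ]) >= thr, na.rm = TRUE) / ncol(m) * 100
}
print(perc)
```

On a matrix with ~480000 rows the vectorised rowSums() form is likely to be much faster than this loop, but the loop makes the per-row logic explicit.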
2 days later
Hi

r-help-bounces at r-project.org wrote on 24.06.2011 16:51:27:

> Thanks for your reply, but that is not quite what I am looking for... I do not want to add up all the values in the row, I want to get the number of columns in each row that meet the criteria and then get that as a percentage... my understanding is that the rowSums function adds up the values, does it not? I tried your code anyway and it did not work:
>
> Error in abs(pwdiff) >= t :
>   comparison (5) is possible only for atomic and list types
That is why some ***reproducible code*** should be provided from your side.

rowSums(USArrests>50)/ncol(USArrests)*100

gives no error, and the result tells you, for each row, the percentage of columns in which the number is greater than 50. This may be what you want.

rowSums(USArrests>50)

simply tells you how many values in each row are greater than the specified threshold, and dividing it by the total number of columns gives you the percentage.

The error you got means that pwdiff is probably not a data frame or t is probably not one number. Only you can know that. For evaluating your objects you can try

str(pwdiff)
str(t)

Besides, t is a function for transposing data, so you should find a different name for your constant.

Regards
Petr
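A minimal sketch of the diagnosis described above, under the assumption that no variable named t had been created in the session. In that case t resolves to base R's transpose function, and comparing numbers against a function is an error. The matrix `m` and the name `thr` are illustrative only.

```r
# Hypothetical 2 x 2 stand-in for pwdiff (filled column by column)
m <- matrix(c(0.2, -1.5, 3.0, 0.1), nrow = 2)

# With no user-defined t, the name refers to the transpose function
is_fun <- is.function(t)

# Comparing a numeric matrix against a function fails; capture the error
res <- tryCatch(abs(m) >= t, error = function(e) e)

str(m)   # reports the type and dimensions of the data
str(t)   # reports that t is a function

# Using a distinct name for the threshold avoids the clash
thr  <- 1
perc <- rowSums(abs(m) >= thr) / ncol(m) * 100
print(perc)
```

If str(pwdiff) reports something other than a numeric matrix or an all-numeric data frame, the comparison may fail for a different reason, so both objects are worth checking.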