identifying when one element of a row has a positive number
Hello, Thanks to everyone for the multiple answers. Josh, thanks for the function. My 12 datasets have over 500,000 rows, so your answer is greatly appreciated. Cheers, Daisy
On Thu, Jan 27, 2011 at 9:10 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
Hi,
This problem seemed deceptively simple to me. After chasing a
considerable number of dead ends, I came up with fg(). It lacks the
elegance of Dennis' solution, but (particularly for large datasets),
it is substantially faster. I still feel like I'm missing something,
but....
###############################################
## Data
df1 <- data.frame(x = seq(1860,1950,by=10),
  y = seq(-290,-200,by=10), ANN = c(3,0,0,0,1,0,1,1,0,0),
  CTA = c(0,1,0,0,0,0,1,0,0,2), GLM = c(0,0,2,0,0,0,0,1,0,0))
## larger test dataset
dftest <- do.call("rbind", rep(list(df1), 100))
f <- function(x) ifelse(sum(x > 0) == 1L, names(which(x > 0)), NA)
g <- function(x) ifelse(sum(x > 0) == 2L, names(which(x == 0L)), NA)
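fg <- function(dat) {
  cnames <- colnames(dat)
  ## logical matrix: TRUE where a value is positive; z counts positives per row
  dat <- dat > 0; z <- rowSums(dat)
  z1 <- z == 1L; z2 <- z == 2L; rm(z)
  output <- matrix(NA, nrow = nrow(dat), ncol = 2)
  ## drop = FALSE keeps a one-row subset as a matrix so apply() still works
  output[z1, 1] <- apply(dat[z1, , drop = FALSE], 1, function(x) cnames[x])
  output[z2, 2] <- apply(dat[z2, , drop = FALSE], 1, function(x) cnames[!x])
  return(output)
}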
## Compare times on larger dataset
system.time(cbind(apply(dftest[, 3:5], 1, f),
  apply(dftest[, 3:5], 1, g)))
system.time(fg(dftest[, 3:5]))
## compare times under repetitions
system.time(for (i in 1:100) cbind(apply(df1[, 3:5], 1, f),
  apply(df1[, 3:5], 1, g)))
system.time(for (i in 1:100) fg(df1[, 3:5]))
###############################################
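Before relying on the timings, it may be worth confirming that the two approaches return the same values. The following is only a sketch on the small example data: it repeats the definitions from the message so that it runs on its own, adds drop = FALSE inside fg() as a precaution (that argument is an addition here), and the names slow, fast and same are made up for illustration.

```r
## Small example data, same values as df1 in the message
df1 <- data.frame(x = seq(1860, 1950, by = 10),
  y = seq(-290, -200, by = 10), ANN = c(3,0,0,0,1,0,1,1,0,0),
  CTA = c(0,1,0,0,0,0,1,0,0,2), GLM = c(0,0,2,0,0,0,0,1,0,0))

## Row-wise helpers, as posted
f <- function(x) ifelse(sum(x > 0) == 1L, names(which(x > 0)), NA)
g <- function(x) ifelse(sum(x > 0) == 2L, names(which(x == 0L)), NA)

## fg() as posted, with drop = FALSE added (an assumption of this sketch)
fg <- function(dat) {
  cnames <- colnames(dat)
  dat <- dat > 0; z <- rowSums(dat)
  z1 <- z == 1L; z2 <- z == 2L
  output <- matrix(NA, nrow = nrow(dat), ncol = 2)
  output[z1, 1] <- apply(dat[z1, , drop = FALSE], 1, function(x) cnames[x])
  output[z2, 2] <- apply(dat[z2, , drop = FALSE], 1, function(x) cnames[!x])
  output
}

slow <- cbind(apply(df1[, 3:5], 1, f), apply(df1[, 3:5], 1, g))
fast <- fg(df1[, 3:5])

## Compare values only, ignoring any names or dimnames
same <- identical(as.vector(slow), as.vector(fast))
print(same)
```

A check like this only covers the ten example rows; on real data it would be sensible to run it on a sample before switching to the faster function.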
Josh
On Thu, Jan 27, 2011 at 12:36 AM, Dennis Murphy <djmuser at gmail.com> wrote:
Hi:

Try this:

f <- function(x) ifelse(sum(x > 0) == 1L, names(which(x > 0)), NA)
g <- function(x) ifelse(sum(x > 0) == 2L, names(which(x == 0L)), NA)
apply(df1[, 3:5], 1, f)
 [1] "ANN" "CTA" "GLM" NA    "ANN" NA    NA    NA    NA    "CTA"
apply(df1[, 3:5], 1, g)
 [1] NA    NA    NA    NA    NA    NA    "GLM" "CTA" NA    NA

HTH,
Dennis

On Wed, Jan 26, 2011 at 9:36 PM, Daisy Englert Duursma <daisy.duursma at gmail.com> wrote:
Hello,
I am not sure where to begin with this problem or what to search for
in r-help. I just don't know what to call this.
I have 5 columns: the first 2 are the x, y locations and the last
three are variables about those locations.
x<-seq(1860,1950,by=10)
y<-seq(-290,-200,by=10)
ANN<-c(3,0,0,0,1,0,1,1,0,0)
CTA<-c(0,1,0,0,0,0,1,0,0,2)
GLM<-c(0,0,2,0,0,0,0,1,0,0)
df1<-as.data.frame(cbind(x,y,ANN,CTA,GLM))
What I would like to produce is an additional column that tells when
only 1 of the three variables has a value greater than 0. I would like
this new column to give the name of the variable. Likewise, I would
like a column that tells when only one of the three variables for a
given row has a value of 0. For my example the new columns would be:
one_presence<-c("ANN","CTA","GLM","NA","ANN","NA","NA","NA","NA","CTA")
one_absence<-c("NA","NA","NA","NA","NA","NA","GLM","CTA","NA","NA")
The end result should look like
df2<-(cbind(df1,one_presence,one_absence))
I am sure I can do this with a loop or maybe grep but I am out of ideas.
Any help would be appreciated.
Cheers,
Daisy
--
Daisy Englert Duursma
Room E8C156
Dept. Biological Sciences
Macquarie University NSW 2109
Australia
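For reference, the two columns described in the question above can also be built without apply(). This is a minimal sketch using max.col() from base R, which is not the approach taken in the replies in this thread; the intermediate names pres, n_pos and vars are made up for illustration, and real NA values are used rather than the string "NA".

```r
## Example data from the question
df1 <- data.frame(x = seq(1860, 1950, by = 10),
  y = seq(-290, -200, by = 10), ANN = c(3,0,0,0,1,0,1,1,0,0),
  CTA = c(0,1,0,0,0,0,1,0,0,2), GLM = c(0,0,2,0,0,0,0,1,0,0))

pres  <- df1[, c("ANN", "CTA", "GLM")] > 0   # TRUE where a variable is positive
n_pos <- rowSums(pres)                       # number of positive variables per row
vars  <- colnames(pres)

## max.col() gives the column of the row maximum; on a logical matrix that is
## the first TRUE, which is the only TRUE in the rows kept by ifelse()
one_presence <- unname(ifelse(n_pos == 1,
  vars[max.col(pres, ties.method = "first")], NA))
one_absence <- unname(ifelse(n_pos == 2,
  vars[max.col(!pres, ties.method = "first")], NA))

df2 <- cbind(df1, one_presence, one_absence)
print(df2)
```

The two ifelse() conditions assume exactly three variables, so "one absence" is the same as "two presences"; with a different number of columns the second condition would need adjusting.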
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/