Using a function with apply Error: undefined columns selected
Hi John,
First, apply isn't guaranteed to work on data frames, because it
converts its input to a matrix before doing anything else. There are
two easy ways to do something like this, but first we need a data
frame to work with:
guppy<-data.frame(taste=rnorm(10,5),
crunch=rnorm(10,5),satiety=rnorm(10,5))
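As a side illustration of the caveat about apply: because apply passes the data frame through as.matrix, a single non-numeric column is enough to make every column arrive as character. The data frame `mixed` below is made up for this example and is not part of John's data.

```r
# Hypothetical data frame with one numeric and one character column
mixed <- data.frame(num = c(1.5, 2.5, 3.5), txt = c("a", "b", "c"),
                    stringsAsFactors = FALSE)

# apply() coerces the data frame to a character matrix first,
# so both columns are seen as character
apply_classes <- apply(mixed, 2, class)

# sapply() walks the columns as they are stored in the data frame
sapply_classes <- sapply(mixed, class)

print(apply_classes)
print(sapply_classes)
```

With an all-numeric data frame such as guppy the coercion is harmless, which is why the problem is easy to miss.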
If you just want to apply a function to all or a subset of columns of
a data frame, a for loop can be used:
fract2.1<-function(col,data) {
p<-sum(data[,col],na.rm=TRUE)/sum(!is.na(data[,col]))
return(p)
}
for(col in 1:ncol(guppy)) print(fract2.1(col,guppy))
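If you would rather keep the results than print them, a variation on the loop above is to fill a named vector. This is only a sketch: the name `result` and the call to set.seed are my additions, the latter just to make the run repeatable.

```r
set.seed(42)  # assumption: seed added only so the example is repeatable
guppy <- data.frame(taste = rnorm(10, 5),
                    crunch = rnorm(10, 5), satiety = rnorm(10, 5))

# Same calculation as fract2.1 above: sum of the non-missing values
# divided by the number of non-missing values
fract2.1 <- function(col, data) {
  sum(data[, col], na.rm = TRUE) / sum(!is.na(data[, col]))
}

# Pre-allocate a named result vector, then fill it column by column
result <- setNames(numeric(ncol(guppy)), names(guppy))
for (col in seq_len(ncol(guppy))) result[col] <- fract2.1(col, guppy)
print(result)
```

seq_len(ncol(guppy)) is used instead of 1:ncol(guppy) so that the loop does nothing if the data frame happens to have no columns.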
If you really do want to use an "*apply" function, then the function
has to be written to work on a single column (a vector), not on the
entire data frame:
fract2.2<-function(x) return(sum(x,na.rm=TRUE)/sum(!is.na(x)))
sapply(guppy,fract2.2)
and if you want a subset of the columns, you will have to select them
before handing the data frame to sapply.
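Two points above can be sketched together. The function called by sapply receives each column's values, not its name, so deparse(substitute(...)) inside it does not give back a column name, which is consistent with the "undefined columns selected" error. And a subset of columns is taken by indexing the data frame first. The helper `what_arrives` and the vector `wanted` are hypothetical names used only for this illustration.

```r
guppy <- data.frame(taste = rnorm(10, 5),
                    crunch = rnorm(10, 5), satiety = rnorm(10, 5))
fract2.2 <- function(x) sum(x, na.rm = TRUE) / sum(!is.na(x))

# Hypothetical helper: report what a function sees when sapply() calls it
what_arrives <- function(x) {
  c(label = deparse(substitute(x)), class = class(x))
}
seen <- sapply(guppy, what_arrives)
print(seen)  # the "label" row is not a column name of guppy

# Select the columns of interest first, then let sapply() loop over them
wanted <- c("taste", "crunch")
subset_result <- sapply(guppy[wanted], fract2.2)
print(subset_result)
```

If the column name is needed inside the function, one option is to loop over names(guppy) and index the data frame with each name.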
Jim
On Fri, Apr 8, 2016 at 8:39 AM, John Sorkin <jsorkin at grecc.umaryland.edu> wrote:
I am trying to write a function that can be used with apply to process all the columns of a data.frame. If you run the code below, you will get the error message "undefined columns selected". I hope someone will be able to teach me what I am doing wrong.
Thank you,
John
# create data frame.
guppy
fract2 <- function(col,data) {
cat("Prove we have passed the data frame\n")
print(data)
# Get the name of the column being processed.
zz<-deparse(substitute(col))
cat("Column being processed\n")
print(zz)
p<-sum(data[,zz]!="")/length(data[,zz])
return(p)
}
apply(guppy,2,fract2,data=guppy)
John David Sorkin M.D., Ph.D.