expanding a presence only dataset into presence/absence
I am sorry. I forgot to update the code:
dat1<- read.table(text="
Species Site Date
a 1 1
b 1 1
b 1 2
c 1 3
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat1$Present<- 1
dat2<-expand.grid(unique(dat1$Species),unique(dat1$Site),unique(dat1$Date))
colnames(dat2)<- colnames(dat1)[-4] #changed here
res<-merge(dat1,dat2,by=c("Species","Site","Date"),all=TRUE)
res[is.na(res)]<- 0
res<-res[order(res$Date),]
row.names(res)<- 1:nrow(res)
res
#  Species Site Date Present
#1       a    1    1       1
#2       b    1    1       1
#3       c    1    1       0
#4       a    1    2       0
#5       b    1    2       1
#6       c    1    2       0
#7       a    1    3       0
#8       b    1    3       0
#9       c    1    3       1
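With more than one site, expand.grid() over all species, sites and dates also creates rows for site/date combinations that were never sampled. The following is only a sketch of one way around that, under assumptions that are not from the original post: the extra Site 2 rows are made up for illustration, every sampling visit is assumed to have recorded at least one species, and every species is assumed to be a candidate at every site.

```r
# Hypothetical multi-site input; the Site 2 rows are invented for illustration.
dat1 <- read.table(text = "
Species Site Date
a 1 1
b 1 1
b 1 2
c 1 3
a 2 1
c 2 1
", header = TRUE, stringsAsFactors = FALSE)
dat1$Present <- 1

# One row per sampling visit (Site/Date pair) that actually occurs in the data.
visits <- unique(dat1[, c("Site", "Date")])

# Cross every species with every visit; merge() with by = NULL
# returns the Cartesian product of its two arguments.
species <- data.frame(Species = unique(dat1$Species), stringsAsFactors = FALSE)
grid <- merge(species, visits, by = NULL)

# Left-join the presences onto the grid; rows without a match are absences.
res <- merge(grid, dat1, by = c("Species", "Site", "Date"), all.x = TRUE)
res$Present[is.na(res$Present)] <- 0
res <- res[order(res$Site, res$Date, res$Species), ]
row.names(res) <- NULL
res
```

Here Site 2 only gets rows for Date 1, because that is the only date on which it appears in the input. If every site really was sampled on every date, the expand.grid() version above is simpler.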
A.K.
From: Matthew Venesky <mvenesky at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Monday, April 29, 2013 1:58 PM
Subject: Re: [R] expanding a presence only dataset into presence/absence
The output that you prepared (for Site 1) looks good... however, I can't get that code to work. I get the following error:

> dat2<-expand.grid(unique(dat1$Species),unique(dat1$Site),unique(dat1$Date))colnames(dat2)<- colnames(dat1)
Error: unexpected symbol in "dat2<-expand.grid(unique(dat1$Species),unique(dat1$Site),unique(dat1$Date))colnames"

--
Matthew D. Venesky, Ph.D.
Postdoctoral Research Associate,
Department of Integrative Biology,
The University of South Florida,
Tampa, FL 33620
Website: http://mvenesky.myweb.usf.edu/

On Mon, Apr 29, 2013 at 1:44 PM, arun <smartpink111 at yahoo.com> wrote:

>Hi Matthew,
>
>So, do you think the output I gave is different from what you expected?
>Thanks,
>Arun
>
>________________________________
>From: Matthew Venesky <mvenesky at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Monday, April 29, 2013 1:15 PM
>Subject: Re: [R] expanding a presence only dataset into presence/absence
>
>I see what you are confused about.
>
>I'm sorry. I gave extra sites as examples in my table called "Desired Data" such that there are 3 sites in the "Desired Data" and only 1 site in the "My current data". Ignore sites 2 and 3; you should see what I am trying to do using only site 1.
>
>On Mon, Apr 29, 2013 at 1:11 PM, Matthew Venesky <mvenesky at gmail.com> wrote:
>
>>That is part of the difficulty. If Species C was present only on Date 3, we need to have the code manually add Species C as absent (i.e., assign it a value of 0) at that site on the previous sampling dates.
>>
>>Or, is there something else that is confusing you that I am not explaining?
>>
>>On Mon, Apr 29, 2013 at 12:47 PM, arun <smartpink111 at yahoo.com> wrote:
>>
>>>Hi,
>>>
>>>Your output dataset is a bit confusing as it contains Sites that were not in the input.
>>>Using your input dataset, I am getting this:
>>>
>>>dat1<- read.table(text="
>>>Species Site Date
>>>a 1 1
>>>b 1 1
>>>b 1 2
>>>c 1 3
>>>",sep="",header=TRUE,stringsAsFactors=FALSE)
>>>dat1$Present<- 1
>>>dat2<-expand.grid(unique(dat1$Species),unique(dat1$Site),unique(dat1$Date))
>>>colnames(dat2)<- colnames(dat1)
>>>res<-merge(dat1,dat2,by=c("Species","Site","Date"),all=TRUE)
>>>res[is.na(res)]<- 0
>>>res<-res[order(res$Date),]
>>>res
>>>#  Species Site Date Present
>>>#1       a    1    1       1
>>>#4       b    1    1       1
>>>#7       c    1    1       0
>>>#2       a    1    2       0
>>>#5       b    1    2       1
>>>#8       c    1    2       0
>>>#3       a    1    3       0
>>>#6       b    1    3       0
>>>#9       c    1    3       1
>>>A.K.
>>>
>>>----- Original Message -----
>>>From: Matthew Venesky <mvenesky at gmail.com>
>>>To: r-help at r-project.org
>>>Cc:
>>>Sent: Monday, April 29, 2013 11:12 AM
>>>Subject: [R] expanding a presence only dataset into presence/absence
>>>
>>>Hello,
>>>
>>>I'm working with a very large dataset (250,000+ lines in its current form)
>>>that includes presence only data on various species (which is nested within
>>>different sites and sampling dates). I need to convert this into a dataset
>>>with presence/absence for each species. For example, I would like to expand
>>>"My current data" to "Desired data":
>>>
>>>My current data
>>>
>>>Species Site Date
>>>a 1 1
>>>b 1 1
>>>b 1 2
>>>c 1 3
>>>
>>>Desired data
>>>
>>>Species Present Site Date
>>>a 1 1 1
>>>b 1 1 1
>>>c 0 1 1
>>>a 0 2 2
>>>b 1 2 2
>>>c 0 2 2
>>>a 0 3 3
>>>b 0 3 3
>>>c 1 3 3
>>>
>>>I've scoured the web, including Rseek, and haven't found a resolution (and
>>>note that a similar question was asked sometime in 2011 without an answer).
>>>Does anyone have any thoughts? Thank you in advance.
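The "unexpected symbol" error reported in this thread is a parse error: two statements ended up on one line with nothing separating them, so R could not tell where the first one stopped. The small sketch below reproduces that kind of failure with made-up statements (the vector x and its names are illustrative only, not from the thread) and shows that a semicolon or a line break between the statements avoids it.

```r
# Two statements run together on one line, as happened when the code was pasted.
bad <- "x <- c(1, 2, 3)names(x) <- c('a', 'b', 'c')"

# parse() fails before anything is evaluated; capture the message instead of stopping.
msg <- tryCatch({
  parse(text = bad)
  "parsed without error"
}, error = function(e) conditionMessage(e))
msg

# The same statements separated by a semicolon (a newline works equally well).
good <- "x <- c(1, 2, 3); names(x) <- c('a', 'b', 'c')"
eval(parse(text = good))
x
```

When copying code from an email, it is worth checking that each statement starts on its own line before running it.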