VLOOKUP in R - tried everything.
On Jun 3, 2011, at 12:43 PM, peter dalgaard wrote:
On Jun 3, 2011, at 16:59 , bjmjarrett wrote:
I am attempting to emulate the VLOOKUP function from Excel in R. I want to compare one column (coll.minus.release) with another (release.days) to get the number of parasitoid released at that time (TotalParasitoids). for example: coll.minus.release release.days ParasitoidTotal -12 -266 1700 8 -259 1000 8 -225 1000 28 -216 1000 41 -28 1148 77 -12 1144 105 0 1160 105 8 972 125 28 1146 125 41 1004 125 77 1003 125 97 1010 .... 2772 NA NA 2801 NA NA 2834 NA NA vlookup <- function(x) data[data$release.days==x,6] # as I have three other columns that are not of interest vlookup(-12) = 1144, and so on, which is great. However, when I try: unlist(sapply(coll.minus.release,vlookup)) to apply it to the whole coll.minus.release it works up to a point, as it doesn't give me 132 values for the 132 values of coll.minus.release. Is this because the table of release.days and TotalParasitoid has less values than coll.minus.release (108 compared to 132)? To fill the gap I put in 0, and as none of the coll.minus.release values = 0 I think it wouldn't affect it.
I wager that a look at setdiff(coll.minus.release,release.days) and vice versa would be illuminating. Notice that with your definition, vlookup(31415926) or any other number absent from release.days gives a zero-length vector. Presumably, you are looking for match().
If I remember correctly VLOOKUP (and HLOOKUP) require sorted cutpoints and picks the lowest number that the vector element exceeds or equals. If so, then I suspect that an indexing strategy with findInterval should work correctly if the dataframe is sorted with order(). If you post a cut-pasteable example with desited input and output, I sure someone can rig something up.
David. > > >> >> Other things I have tried include findInterval and match. >> >> data[findInterval(x=data$coll.minus.release,vec=data >> $release.days,"ParasitoidTotal")] >> >> didn't work as it said vec must be sorted non-decreasingly and >> didn't work >> when I randomised the release.days and ParasitoidTotal columns as >> it doesn't >> matter which order they are in. >> >> Thanks for reading all the way through - I wanted all the >> information I felt >> you might need to help me in it. >> >> Any help will be greatly appreciated. >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/VLOOKUP-in-R-tried-everything-tp3571107p3571107.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > -- > Peter Dalgaard > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT