Hi Tim,

There's no need to scrape and parse: check out the "Download PLANTS database" link on the left side of plants.usda.gov.

Sarah
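As a rough illustration of the download-then-filter approach, the sketch below matches a species list against a table read from a file. It is only a sketch under stated assumptions: the column names (Symbol, ScientificName, Family, Duration, GrowthHabit) and the placeholder row are hypothetical and may not match the layout of the actual PLANTS download. Only the FEBA row reuses values quoted in the question. An inline string stands in for the downloaded file so the example runs on its own.

```r
# Species to look up (names taken from the original question).
wanted <- c("Poa pratensis", "Festuca baffinensis")

# Stand-in for the downloaded file. Column names are assumptions; the FEBA row
# uses the values quoted in the question, the XXXX row is a made-up placeholder.
plants_txt <- 'Symbol,ScientificName,Family,Duration,GrowthHabit
FEBA,Festuca baffinensis,Poaceae,Perennial,Graminoid
XXXX,Genus species,Familyaceae,Annual,Forb'

# With a real download this would be read.csv() on the saved file instead.
plants <- read.csv(text = plants_txt, stringsAsFactors = FALSE)

# Keep only the rows whose scientific name is on the list.
traits <- plants[plants$ScientificName %in% wanted, ]

# Names that were not matched, e.g. because of synonyms or spelling differences.
missing <- setdiff(wanted, plants$ScientificName)

print(traits)
print(missing)
```

Checking the unmatched names separately is worthwhile, since a silent mismatch in spelling or synonymy would otherwise just drop a species from the result.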
On Wed, Jan 2, 2013 at 5:48 PM, Tim Seipel <t.seipel at env.ethz.ch> wrote:
Dear Listserv,

My aim is to compile plant traits for a list of species from the USDA Plants database. Examples from the list include:

Poa pratensis
Festuca idahoensis
Astragalus miser

In R, I started with this:

library(RCurl)
###############################################
plants<-'http://plants.usda.gov/java/nameSearch?'
###############################################
url<-paste('mode=','sciname','&keywordquery=','Festuca baffinensis',sep='')
sp.url<-paste(plants,url,sep='')
### the link goes to the correct webpage:
### http://plants.usda.gov/java/nameSearch?mode=sciname&keywordquery=Festuca baffinensis
p1<-getURL(sp.url)

I would like to extract the following text from the page:

Symbol: FEBA
Group: Monocot
Family: Poaceae
Duration: Perennial
Growth Habit: Graminoid
Native Status: L48 N AK N CAN N GL N

However, I can't seem to find it after parsing the string. Is this related to JavaScript? Can someone help me extract this information?

Thanks for the help!

Sincerely,
Tim Seipel
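One detail in the query built above is that the species name contains a space, which the archive's own link shows encoded as %20. Whether or not that explains the missing text, encoding the name before the request is a reasonable precaution. A minimal sketch using URLencode() from base R's utils package, with the base URL and parameters taken from the message:

```r
# Base URL and query parameters as given in the original message.
base <- "http://plants.usda.gov/java/nameSearch?"
species <- "Festuca baffinensis"

# Percent-encode the species name so the space becomes %20.
query <- paste0("mode=sciname&keywordquery=", URLencode(species, reserved = TRUE))
sp.url <- paste0(base, query)

print(sp.url)
```

This only builds the URL string; it does not fetch the page, and it does not address content that a page fills in with JavaScript after loading.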
--
Sarah Goslee
http://www.functionaldiversity.org