reading tables from multiple HTML pages
Hi,

I am a beginner to R and am having problems scraping data from HTML tables using the XML package. I have included some code below.

I am trying to loop through a series of HTML pages, each of which contains a single table from which I want to scrape data. However, some of the pages are blank, so htmlParse() throws an error when it reaches them. The loop then stops and I get the error message below:

Error in htmlParse(url) :
  error in creating parser for http://www.szrd.gov.cn/viewcommondbfc.do?id=728

What would be the best way to keep the loop running so I can parse the rest of the pages?

****************************************************
library(XML)

url_root <- "http://www.szrd.gov.cn/viewcommondbfc.do?id="

for (i in 700:750) {
  url = paste(url_root, i, sep="")
  doc = htmlParse(url)
  tableNodes = getNodeSet(doc, "//table")
  tbl = readHTMLTable(tableNodes[[3]])
}
****************************************************

Steve Oliver
Department of Political Science
University of California at San Diego
9500 Gilman Dr.
La Jolla, CA 92092
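
One possible approach, offered only as a minimal sketch and not a tested solution for this site: wrap the htmlParse() call in tryCatch() so that a failed page yields NULL and the loop moves on with next. The helper name scrape_tables, its table_index argument, and the choice to collect results in a list keyed by id are all assumptions made for illustration; they are not part of the original code. The sketch also skips pages that parse but contain fewer tables than expected, which the original loop would otherwise fail on at tableNodes[[3]].

```r
library(XML)

# Hypothetical helper (not from the original post): returns a list of
# tables keyed by page id, skipping any page that cannot be parsed.
scrape_tables <- function(url_root, ids, table_index = 3) {
  results <- list()
  for (i in ids) {
    url <- paste(url_root, i, sep = "")

    # tryCatch() converts a parser error into NULL instead of
    # letting it stop the whole loop.
    doc <- tryCatch(htmlParse(url), error = function(e) NULL)
    if (is.null(doc)) {
      message("Skipping id ", i, ": page could not be parsed")
      next
    }

    tableNodes <- getNodeSet(doc, "//table")

    # Assumption: a page with fewer than table_index tables is also
    # treated as blank and skipped.
    if (length(tableNodes) < table_index) {
      message("Skipping id ", i, ": expected table not found")
      next
    }

    # Store each table under its id so earlier results are not overwritten.
    results[[as.character(i)]] <- readHTMLTable(tableNodes[[table_index]])
  }
  results
}
```

It could then be called as tables <- scrape_tables("http://www.szrd.gov.cn/viewcommondbfc.do?id=", 700:750), after which names(tables) would show which ids actually returned a table.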