how to read a web page and extract an html table?

1 message · Pikounis, Bill

Adrian,
Parsing arbitrary HTML is generally a nontrivial task.  I would recommend
using something like Perl to convert the HTML to delimited ASCII, and then
reading the result with read.table(), for example. There are Perl modules
that can help with the "HTML-2-ASCII" step, if not do it entirely.
I have never used one myself, but I am sure CPAN can be searched for one.
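
As a rough illustration of the "HTML-2-ASCII" step, here is a minimal sketch. It is written in Python rather than Perl, purely as a stand-in for "something like Perl", and uses only the standard library. The names TableExtractor, html_table_to_tsv and the sample page are hypothetical, made up for this example. It assumes the page has reasonably well-formed markup with explicit closing tags, and it only looks at the first table without handling nested tables, so it is not a general solution.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of each row in the first <table> of a page."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self._in_table = False  # True while inside the first table
        self._done = False      # True once the first table has been closed
        self._row = None        # cells of the row being read, or None
        self._cell = None       # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if tag == "table":
            self._in_table = True
        elif self._in_table and tag == "tr":
            self._row = []
        elif self._row is not None and tag in ("td", "th"):
            self._cell = []

    def handle_endtag(self, tag):
        if self._done:
            return
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag == "table" and self._in_table:
            self._in_table = False
            self._done = True

    def handle_data(self, data):
        # Keep text only while inside a cell.
        if self._cell is not None:
            self._cell.append(data)


def html_table_to_tsv(html):
    """Return the first table in `html` as tab-delimited text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample page, for illustration only.
sample = """<html><body><table>
<tr><th>name</th><th>dose</th></tr>
<tr><td>A</td><td>1.5</td></tr>
<tr><td>B</td><td> 2.0 </td></tr>
</table></body></html>"""

if __name__ == "__main__":
    print(html_table_to_tsv(sample), end="")
```

If the output is saved to a file, say table.tsv (a made-up name), it should then be readable in R with something like read.table("table.tsv", header = TRUE, sep = "\t").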

Hope that helps,
Bill


----------------------------------------
Bill Pikounis, Ph.D.
Biometrics Research Department
Merck Research Laboratories
PO Box 2000, MailDrop RY84-16  
126 E. Lincoln Avenue
Rahway, New Jersey 07065-0900
USA

v_bill_pikounis at merck.com

Phone: 732 594 3913
Fax: 732 594 1565