Treatment of xml-stylesheet processing instructions in XML module

2 messages · Adam Cooper, Duncan Temple Lang

#
Hello again,
Another stumble here that is defeating me.

I try:
a <- readLines(url("http://feeds.feedburner.com/grokin"))
t <- XML::xmlTreeParse(a, ignoreBlanks=TRUE, replaceEntities=FALSE,
                       asText=TRUE)
elem <- XML::getNodeSet(XML::xmlRoot(t), "/rss/channel/item")[[1]]

And I get:
Start tag expected, '<' not found
Error: 1: Start tag expected, '<' not found

When I modify the second line in "a" to remove the following (just
leaving the <rss> tag with its attributes), I do not get the error.
I removed:
<?xml-stylesheet type=\"text/xsl\" media=\"screen\" href=
\"/~d/styles/rss2full.xsl\"?><?xml-stylesheet type=\"text/css\" media=
\"screen\" href=\"http://feeds.feedburner.com/~d/styles/itemcontent.css
\"?>

I would have expected the PI to be totally ignored by default.
Have I missed something?

Thanks in advance...

Cheers, Adam
#
Hi Adam

To use XPath and getNodeSet on an XML document,
you will want to use xmlParse() and not xmlTreeParse()
to parse the XML content. xmlParse() builds the internal
(C-level) document that XPath queries operate on. So

# I() marks a as literal XML text rather than a file name;
# passing asText = TRUE does the same thing.
t = xmlParse(I(a))
elem = getNodeSet(t, "/rss/channel/item")[[1]]

works fine.

You don't need to pass the root node to getNodeSet;
pass the document itself.

Also, if you have the package loaded, you don't need the XML::
prefix before the function names.
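
To try the points above without depending on the live feed, here is a minimal sketch. It assumes the XML package is installed; the inline feed text is a made-up stand-in for the real one (not the actual feed content), and it keeps an xml-stylesheet processing instruction ahead of the <rss> tag to show that the parser accepts it.

```r
library(XML)  # once the package is attached, no XML:: prefix is needed

# Hypothetical stand-in for the feed: a small RSS document that begins with
# an xml-stylesheet processing instruction, split into lines the way
# readLines() would return it.
a <- c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?>',
  '<rss version="2.0">',
  '  <channel>',
  '    <title>Example feed</title>',
  '    <item><title>First post</title></item>',
  '    <item><title>Second post</title></item>',
  '  </channel>',
  '</rss>'
)

# Collapse the lines into one string and parse it as text into an
# internal (C-level) document.
doc <- xmlParse(paste(a, collapse = "\n"), asText = TRUE)

# The XPath query is evaluated against the document, not xmlRoot(doc).
items <- getNodeSet(doc, "/rss/channel/item")

# Pull the text of each item's <title> element.
titles <- xpathSApply(doc, "/rss/channel/item/title", xmlValue)
print(titles)
```

Here items is a list of internal nodes, so items[[1]] plays the role of elem in the code above.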

  HTH
    D.
On 4/6/11 11:32 AM, Adam Cooper wrote: