I dug around in the libxml code and the Web to verify that
validation is indeed only possible in libxml when one uses
DOM (i.e. xmlTreeParse()).
Using DOM is not an option for me, so I need to "validate" the XML parts
I'm interested in within my creation mechanism. That's OK, but not the
best solution in terms of design.
BTW, there is a new version of the XML package on the
Omegahat web site.
I'll be using it extensively in the coming days, and unfortunately I
already have a question/problem pending:
Take the following R function:
library(XML)

test <- function() {
  sep <- ""
  # Build the XML document piece by piece as one string
  xmlText <- ""
  xmlText <- paste(xmlText, "<spectrum id=\"3257\">", sep = sep)
  xmlText <- paste(xmlText, "<mzArrayBinary>", sep = sep)
  xmlText <- paste(xmlText, "<data>Monday</data>", sep = sep)
  xmlText <- paste(xmlText, "</mzArrayBinary>", sep = sep)
  xmlText <- paste(xmlText, "<intenArrayBinary>", sep = sep)
  xmlText <- paste(xmlText, "<data>Tuesday</data>", sep = sep)
  xmlText <- paste(xmlText, "</intenArrayBinary>", sep = sep)
  # Uncommenting the next two lines yields two <spectrum> elements
  # xmlText <- paste(xmlText, "</spectrum>", sep = sep)
  # xmlText <- paste(xmlText, "<spectrum id=\"3259\">", sep = sep)
  xmlText <- paste(xmlText, "<mzArrayBinary>", sep = sep)
  xmlText <- paste(xmlText, "<data>Wednesday</data>", sep = sep)
  xmlText <- paste(xmlText, "</mzArrayBinary>", sep = sep)
  xmlText <- paste(xmlText, "<intenArrayBinary>", sep = sep)
  xmlText <- paste(xmlText, "<data>Thursday</data>", sep = sep)
  xmlText <- paste(xmlText, "</intenArrayBinary>", sep = sep)
  xmlText <- paste(xmlText, "</spectrum>", sep = sep)
  # Event-parse the string; the text handler prints length and content
  xmlEventParse(xmlText, asText = TRUE,
                handlers = list(text = function(x, ...) {
                  cat(nchar(x), x, "\n")
                }))
  invisible(NULL)
}
Using this function in the given form works fine. xmlEventParse() with
the simplest handler I can imagine finds all 4 text nodes within the
<spectrum> tag and prints them out. But if one uncomments both lines in
the middle, introducing 2 <spectrum> tags with different ids,
xmlEventParse() fails with an exception. Of course the weekdays within
<data> are arbitrary values used here. Further, using another input
file, I could see that for one and the same <data> node the handler for
"text" nodes was invoked twice, once for the first part of the
content and once for the rest of the content. Both invocations
together gave me exactly the content of the <data> node.
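In case it helps the discussion, here is a minimal sketch of how I could
work around both effects. It rests on two assumptions of mine that I have
not confirmed: that the exception comes from the string having two
top-level <spectrum> elements (XML allows only one root element), and that
the split "text" calls are normal behaviour of an event parser rather than
a bug. The wrapper element <spectrumList> and the helper
makeDataCollector() are names I made up for illustration; only
xmlEventParse() and its startElement/text/endElement handlers come from
the XML package.

```r
# Hypothetical helper: collects the full text of every <data> node,
# even if the parser delivers that text in several chunks.
makeDataCollector <- function() {
  buffer <- character(0)  # chunks of the <data> node currently open
  inData <- FALSE         # TRUE while we are inside a <data> node
  values <- character(0)  # one complete string per finished <data> node

  handlers <- list(
    startElement = function(name, attrs, ...) {
      if (name == "data") {
        inData <<- TRUE
        buffer <<- character(0)
      }
    },
    text = function(x, ...) {
      # May be called more than once per node; just keep every chunk
      if (inData) buffer <<- c(buffer, x)
    },
    endElement = function(name, ...) {
      if (name == "data") {
        # Glue the chunks back together once the node is closed
        values <<- c(values, paste(buffer, collapse = ""))
        inData <<- FALSE
      }
    }
  )
  list(handlers = handlers, values = function() values)
}

# Assumed fix for the exception: wrap both spectra in one root element.
xmlText <- paste0(
  "<spectrumList>",
  "<spectrum id=\"3257\"><mzArrayBinary><data>Monday</data></mzArrayBinary></spectrum>",
  "<spectrum id=\"3259\"><mzArrayBinary><data>Wednesday</data></mzArrayBinary></spectrum>",
  "</spectrumList>"
)

collector <- makeDataCollector()
if (requireNamespace("XML", quietly = TRUE)) {
  invisible(XML::xmlEventParse(xmlText, asText = TRUE,
                               handlers = collector$handlers))
  print(collector$values())
}
```

The text handler only stores chunks; the complete string is assembled in
endElement, so it would not matter how many times the parser calls "text"
for one node.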
So, am I on the wrong track? Or is this buggy behaviour?
I appreciate any help and assistance!
Jan