XML to CSV

5 messages · Ben Tupper, Jeff Newmiller, Andrew Lachance +1 more

#
Hi,

You should keep replies on the list - you never know when someone will swoop in with the right answer to make your life easier.

Below is a simple example that uses xpath syntax to identify (and in this case retrieve) children that match your xpath expression.  xpath expressions are sort of like /a/directory/structure/description, so you can visualize the elements of an XML document as nested folders or subdirectories.

Hopefully this will get you started.  There is a lot more on xpath here: http://www.w3schools.com/xml/xml_xpath.asp  There are other extraction tools in xml2; just type ?xml2 at the command prompt to see more.

Since you have more deeply nested elements you'll need to play with this a bit first.

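library(xml2)

# Read the example document straight from the web
uri = 'http://www.w3schools.com/xml/simple.xml'
x = read_xml(uri)

# "//name" matches every <name> element, at any depth in the tree
name_nodes = xml_find_all(x, "//name")
name = xml_text(name_nodes)

# Prices are kept as text here
price_nodes = xml_find_all(x, "//price")
price = xml_text(price_nodes)

# Calories are converted to numbers
calories_nodes = xml_find_all(x, "//calories")
calories = xml_double(calories_nodes)

# One row per item; this assumes every item has all three children
X = data.frame(name, price, calories, stringsAsFactors = FALSE)
write.csv(X, file = 'foo.csv')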
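
If some records lack a child element, collecting each column with a document-wide "//..." search gives vectors of different lengths, and data.frame() will fail or misalign rows. A minimal sketch of one way around that, assuming a small made-up document (the inline XML below is hypothetical, not the w3schools file): select the record nodes first, then use xml_find_first() relative to each record, which yields NA where the child is missing.

```r
library(xml2)

# Hypothetical document for illustration: the second <food> has no <price>
doc = read_xml('<breakfast_menu>
  <food><name>Waffles</name><price>5.95</price><calories>650</calories></food>
  <food><name>Toast</name><calories>600</calories></food>
</breakfast_menu>')

# Document-wide searches return different lengths (2 names, 1 price)
n_name = length(xml_find_all(doc, "//name"))
n_price = length(xml_find_all(doc, "//price"))

# Select one node per record, then look inside each record.
# xml_find_first() returns one result per input node, missing where absent.
food = xml_find_all(doc, "//food")
name = xml_text(xml_find_first(food, "./name"))
price = as.numeric(xml_text(xml_find_first(food, "./price")))
calories = as.numeric(xml_text(xml_find_first(food, "./calories")))

# Rows stay aligned; the missing price shows up as NA
X = data.frame(name, price, calories, stringsAsFactors = FALSE)
print(X)
```

Passing na = "" to write.csv() would then write those NA values as blank cells.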

Cheers,
Ben
Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org
#
Andrew... you really need to understand the outline/tree nature of your XML schema to understand why blanks might appear in your data when you try to squeeze it into a rectangular layout like CSV. Opening the file in a modern Web browser like Firefox can help you see the forest for the trees, since the browser can collapse/expand the subtrees. Keep in mind that understanding how XML works is really not the purpose of this list, but there are lots of books and tutorials about it, such as the one mentioned by Ben. If your schema has many irregular subtrees, it may be a poor match for fitting into one CSV, and you might need to resort to putting it into a relational group of CSV files... but relational schema design is another off-topic area of study for this list. Once you know what you want to accomplish a little better (enough to make example input and output data sets), we can help you more with the R coding aspect of your problem... but guessing at your needs with no access to data is really not an effective use of anyone's time.
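
As a rough illustration of the "relational group of CSV files" idea, here is a minimal sketch using xml2. Everything about the document is an assumption: the Patients/Patient elements, the PatientID attribute, and the output file names are all made up for the example. One table holds one row per record, a second holds one row per repeated child, and a shared key column links them.

```r
library(xml2)

# Hypothetical input: records with a variable number of nested children
doc = read_xml('<Patients>
  <Patient PatientID="1">
    <Name>A</Name>
    <DischargeMedication>
      <Medication MedAdmin="0" MedID="10"/>
      <Medication MedAdmin="0" MedID="11"/>
    </DischargeMedication>
  </Patient>
  <Patient PatientID="2">
    <Name>B</Name>
  </Patient>
</Patients>')

patients = xml_find_all(doc, "/Patients/Patient")

# Parent table: exactly one row per patient
patient_table = data.frame(
  PatientID = xml_attr(patients, "PatientID"),
  Name = xml_text(xml_find_first(patients, "./Name")),
  stringsAsFactors = FALSE)

# Child table: one row per medication, carrying the patient key
med_rows = lapply(patients, function(p) {
  meds = xml_find_all(p, "./DischargeMedication/Medication")
  if (length(meds) == 0) return(NULL)  # patient without medications
  data.frame(PatientID = xml_attr(p, "PatientID"),
             MedID = xml_attr(meds, "MedID"),
             MedAdmin = xml_attr(meds, "MedAdmin"),
             stringsAsFactors = FALSE)
})
med_table = do.call(rbind, Filter(Negate(is.null), med_rows))

# Two CSV files that can be joined again on PatientID
out_dir = tempdir()
write.csv(patient_table, file.path(out_dir, "patients.csv"), row.names = FALSE)
write.csv(med_table, file.path(out_dir, "medications.csv"), row.names = FALSE)
```

Neither table needs padding with blanks, because each one is rectangular on its own; merge(patient_table, med_table, by = "PatientID", all.x = TRUE) would rebuild a single wide view if one is needed.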
#
Hello Andrew,

Since you are starting from a clean slate in handling XML files anyway, you could take a look at XSLT processing, which is also an off-topic area here.
There are free tools available, and many examples of "XML to CSV XSLT" on StackOverflow.

HTH,
Gabriele

20 days later
#
Hello all,

Thank you for the extremely helpful information. As a follow up, some of
the nested elements are of the form below:
-<DischargeMedication>
    <Medication MedAdmin="0" MedID="10"/>
    <Medication MedAdmin="0" MedID="11"/>

I've been having trouble extracting this information and was wondering if
anyone had any suggestions.

Thank you,
Andrew

#
They are attributes, not child elements, so, if I understood the question, these xpath expressions:

"//DischargeMedication/Medication/@MedAdmin"
"//DischargeMedication/Medication/@MedID"

should do.
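
For use from R, a minimal sketch with xml2 might look like the following. The fragment is the one from the question, with a closing tag added here as an assumption so that it parses on its own; in the real file it would sit somewhere inside a larger document.

```r
library(xml2)

# Fragment from the question; the closing tag is added so it parses alone
doc = read_xml('<DischargeMedication>
    <Medication MedAdmin="0" MedID="10"/>
    <Medication MedAdmin="0" MedID="11"/>
</DischargeMedication>')

# Option 1: select the attribute nodes with xpath and read their values
med_admin = xml_text(xml_find_all(doc, "//DischargeMedication/Medication/@MedAdmin"))
med_id = xml_text(xml_find_all(doc, "//DischargeMedication/Medication/@MedID"))

# Option 2: select the elements, then ask each one for its attribute.
# xml_attr() gives NA when an element lacks the attribute, so rows stay aligned.
med_nodes = xml_find_all(doc, "//DischargeMedication/Medication")
med_id2 = xml_attr(med_nodes, "MedID")

meds = data.frame(MedID = med_id, MedAdmin = med_admin, stringsAsFactors = FALSE)
print(meds)
```

Option 2 is probably the safer choice when some Medication elements may be missing an attribute, since option 1 simply skips them and the columns can end up with different lengths.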
HTH, 
Gabriele

