
parse XML file

2 messages · Kai Serschmarn, Ben Tupper

#
Thank you Barry, that works fine.
Sorry for the stupid questions... however, I couldn't manage to get a
data frame out of this.

That's what I was doing:

library(XML)

doc = xmlRoot(xmlTreeParse("de.dwd.klis.TADM.xml"))
dumpData <- function(doc){
	# loop over the stations
	for(i in 1:length(doc)){
		stns = doc[[i]]
		# loop over the records of station i
		for (j in 1:length(stns)){
			cat(stns$attributes['value'], stns[[j]][[1]]$value,
			    stns[[j]]$attributes['date'], "\n")
		}
	}
}
dumpData(doc)

Thanks for your help
kai
#
Hi,
On Jun 29, 2011, at 6:26 AM, Kai Serschmarn wrote:

            
Perhaps this would work for you.  It generates a list of data frames,  
one for each station.

###### BEGIN

## start with your doc - split it into a list of nodes (one for each child)
stn <-  xmlChildren(doc)


# converts a station node to a data frame
getMyStation <- function(x){

    # get the name of the station
    stationName <- xmlAttrs(x)["value"]

    # a function to extract the date and value
    getMyRecords <- function(x){
       date <- xmlAttrs(x)["date"]
       val <- xmlValue(x)
       y <- c( date, val)
       return(y)
    }

    # for each child, extract the records
    # (xmlChildren makes the iteration over the child nodes explicit)
    r <- lapply(xmlChildren(x), getMyRecords)
    nR <- length(r)

    # bind into one matrix - all characters at this point
    y <- do.call(rbind, r)

    # make a data.frame
    df <- data.frame("Station" = rep(stationName, nR), "date" = y[,1],
       "value" = y[,2], row.names = 1:nR, stringsAsFactors = FALSE)

    return(df)
}


# now loop through the station nodes - extract data into a data frame
x <- lapply(stn, getMyStation)

##### END
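
If a single data frame is what you are after, the list of per-station data frames can probably be stacked with do.call(rbind, ...). A minimal sketch in base R follows. It uses two made-up data frames in place of the real x, so the station names, dates and values are invented for illustration. It also assumes the values are numeric, which the thread does not confirm.

```r
# Hypothetical stand-in for the list 'x' returned by lapply(stn, getMyStation);
# station names, dates and values are invented for illustration only.
x <- list(
    data.frame(Station = "A", date = c("2011-06-01", "2011-06-02"),
               value = c("14.2", "15.1"), stringsAsFactors = FALSE),
    data.frame(Station = "B", date = "2011-06-01",
               value = "12.8", stringsAsFactors = FALSE)
)

# stack the per-station data frames into one data frame
allStations <- do.call(rbind, x)
rownames(allStations) <- NULL

# the values are character at this point; convert them
# (assumption: they are numeric measurements)
allStations$value <- as.numeric(allStations$value)
```

The date column is left as character here, since the date format in the file is not shown in the thread.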


Cheers,
Ben

Ben Tupper
Bigelow Laboratory for Ocean Sciences
180 McKown Point Rd. P.O. Box 475
West Boothbay Harbor, Maine   04575-0475
http://www.bigelow.org/