
newbie questions - looping through hierarchical datafile

Well I haven't seen any replies to this, so I have had a stab at the 
problem of getting the data into a data frame.

The approach I took was to break up the data into a list, and then fill in 
a matrix, row by row, "filling down" a la spreadsheet style when necessary, 
taking advantage of the ordering of the data. The matrix is then coerced to a 
data.frame. Maybe not a very portable/general solution, but it appears to work.
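
To make the "filling down" idea concrete, here is a minimal sketch of it on a single vector. It is a slightly more general variant (last observation carried forward) than the helper used in the function below, and the name locf and the sample values are illustrative assumptions only, not taken from the data or from any package.

```r
# Hypothetical helper: carry the last non-NA value forward over each NA.
locf <- function(x) {
  # index of the most recent non-NA element at each position (0 if none yet)
  idx <- cummax(seq_along(x) * !is.na(x))
  idx[idx == 0] <- NA   # leading NAs have nothing to copy from, so stay NA
  x[idx]
}

locf(c("inv1", NA, NA, "inv2", NA))
# "inv1" "inv1" "inv1" "inv2" "inv2"
```

Unlike the helper below, this version does not overwrite later non-missing values, which may or may not matter depending on how the records are ordered.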

list.to.data.frame <- function () {
filecon <- file(file.choose()) # open a data file
on.exit(close(filecon)) # release the connection when the function exits
# read all the data into a list, 1 line per element; each element is
# a character vector of data (variable length)
dat <- strsplit(readLines(filecon, n=-1), split=" ")
resultvec <- matrix(rep(NA, 16), nrow=1) # results will be stored here

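filldown <- function (x) {
# kludge to simulate fill-down of a vector, spreadsheet style:
# everything from the first NA onwards takes the value just above it
         if(all(is.na(x)) || all(!is.na(x))) x else {
         last <- min(which(is.na(x)))
         # nothing above to copy from if the very first value is NA
         if (last > 1) x[last:length(x)] <- x[last-1]
         x
         }
}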

#loop through the data
for (vec in dat) {
         f <- switch(vec[1], # what kind of field are we dealing with?
                 "A" = c(vec[-1], rep(NA, 15)),
                 "X" = c(NA, vec[-1], rep(NA, 12)),
                 "P" = c(rep(NA,4), vec[-1], rep(NA, 8)),
                 "T" = c(rep(NA, 8), vec[-1], rep(NA, 6)),
                 "L" = c(rep(NA, 10), vec[-1], rep(NA, 3)),
                 "F" = c(rep(NA, 13), vec[-1]))
         if (any(is.na(resultvec[nrow(resultvec), which(!is.na(f))])))
                 # slot the data into the appropriate column
                 resultvec[nrow(resultvec),] <-
                         ifelse(is.na(resultvec[nrow(resultvec),]), f,
                                resultvec[nrow(resultvec),]) else
                 # if the row is full, start a new one
                 resultvec <- rbind(resultvec, f)
         # if we are at the end of a row, fill down and start a new row
         if (vec[1] == "F")
                 resultvec <- rbind(apply(resultvec, 2, filldown), rep(NA, 16))
         }

# coerce to a data frame, and get rid of the last empty row
res <- as.data.frame(resultvec[-nrow(resultvec),], row.names=NULL)
# set column names
names(res) <- c("Inventory", "Stratum_no", "Total", "Ye", "Plot_no", "age",
                "slope", "species", "tree_no", "frequency", "leader",
                "diameter", "height", "start_height", "finish_height",
                "feature")
#return the result
res
}
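
The switch() statement above pads each record out to 16 columns by hand. As a rough sketch of the same step, the padding could also be driven by a lookup table of column offsets. The offsets are read off the switch() arms; the names offsets, ncols and pad_record, and the sample record, are hypothetical and only for illustration.

```r
# Number of leading NAs for each record type, as in the switch() above.
offsets <- c(A = 0, X = 1, P = 4, T = 8, L = 10, F = 13)
ncols <- 16

# Hypothetical helper: place the fields of one record (type code first)
# into a length-16 character vector, NA everywhere else.
pad_record <- function(vec) {
  fields <- vec[-1]
  start <- offsets[[vec[1]]]          # errors on an unknown record type
  stopifnot(start + length(fields) <= ncols)
  out <- rep(NA_character_, ncols)
  out[start + seq_along(fields)] <- fields
  out
}

# A made-up "P" record: its four fields land in columns 5 to 8.
pad_record(strsplit("P 12 35 10 PINE", " ")[[1]])
```

One difference from switch(): an unrecognised record type stops with an error here instead of silently returning NULL.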

Cheers,

Simon.
At 10:36 AM 4/10/2005, you wrote:
Simon Blomberg, B.Sc.(Hons.), Ph.D, M.App.Stat.
Centre for Resource and Environmental Studies
The Australian National University
Canberra ACT 0200
Australia
T: +61 2 6125 7800 email: Simon.Blomberg_at_anu.edu.au
F: +61 2 6125 0757
CRICOS Provider # 00120C