Here's one possibility, if you know the number of species and the numbers of
rows and columns beforehand, and the dimensions for all species are the
same.
readSpeciesMap <- function(fname, nspecies, nr, nc) {
    spcnames <- character(nspecies)
    spcdata <- array(0, c(nc, nr, nspecies))
    ## open the file for reading, and close it upon exit.
    f <- file(fname, open="r")
    on.exit(close(f))
    for (i in seq_along(spcnames)) {
        ## read the name
        spcnames[i] <- readLines(f, 1)[[1]]
        ## read the grid: one character per cell, filled column by column
        spcdata[, , i] <- as.numeric(unlist(strsplit(readLines(f, nr), "")))
        ## pick up the empty line
        readLines(f, 1)
    }
    ## replace the 9s with NAs
    spcdata[spcdata == 9] <- NA
    dimnames(spcdata)[[3]] <- spcnames
    ## "transpose" the array in each species
    aperm(spcdata, c(2, 1, 3))
}
Using the example you supplied (saved in the file "species.txt"):
readSpeciesMap("species.txt", 3, 6, 9)
, , SPECIES1

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA    0    0    1    0   NA   NA
[2,]   NA    0    0    1    1    0    1    0   NA
[3,]    0    1    1    1    0    1    0    0    0
[4,]   NA    0    1    1    0    0    1    0    1
[5,]    1    1    0    1    0    0    0    1   NA
[6,]   NA    0    1    1    1    0    0    1   NA

, , SPECIES2

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA    0    0    0    0   NA   NA
[2,]   NA    0    0    1    1    0    1    1   NA
[3,]    0    1    1    1    0    1    1    0    0
[4,]   NA    0    1    0    1    0    1    0    1
[5,]    1    1    0    0    0    0    0    1   NA
[6,]   NA    0    0    0    0    0    0    1   NA

, , SPECIES3

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA    0    0    1    0   NA   NA
[2,]   NA    0    0    1    0    0    1    0   NA
[3,]    0    1    1    1    0    0    0    1    0
[4,]   NA    0    1    1    0    0    1    0    0
[5,]    1    1    0    1    0    0    0    1   NA
[6,]   NA    0    1    1    1    0    0    1   NA
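
If the loop turns out to be slow on the larger files, one variation worth trying is to read all the lines in a single readLines() call and pick out the name and grid lines by index. The following is only a sketch: the name readSpeciesMapFast is made up for illustration, and it assumes every species block is exactly one name line, nr grid lines and one blank line, with no stray whitespace in the grid lines. The trimws() call, which strips the leading spaces from the names, is also an addition of mine.

```r
## Hypothetical variant (illustrative name): read the whole file at once.
## Assumes each block is: 1 name line, nr grid lines, 1 blank line.
readSpeciesMapFast <- function(fname, nspecies, nr, nc) {
    lines <- readLines(fname)
    blk <- nr + 2                                # lines per species block
    start <- (seq_len(nspecies) - 1) * blk + 1   # position of each name line
    spcnames <- trimws(lines[start])
    ## positions of all grid lines, species by species
    idx <- as.vector(outer(seq_len(nr), start, "+"))
    cells <- as.numeric(unlist(strsplit(lines[idx], "")))
    ## replace the 9s with NAs
    cells[cells == 9] <- NA
    ## cells vary fastest along columns, then rows, then species
    spcdata <- array(cells, c(nc, nr, nspecies),
                     dimnames = list(NULL, NULL, spcnames))
    ## "transpose" the array in each species
    aperm(spcdata, c(2, 1, 3))
}
```

Under those assumptions, readSpeciesMapFast("species.txt", 3, 6, 9) should return the same 6 x 9 x 3 array as readSpeciesMap(). I have not timed it against the loop, so treat any speed gain as something to check on your own data.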
Andy
From: Colin Beale
Hi,
I'm needing some help finding a function to read a large text
file into an array in R. The data are essentially presence /
absence / na data for many species and come as a grid with
each species name (after two spaces) at the beginning of the
matrix defining the map for that species. An excerpt could
therefore be:
SPECIES1
999001099
900110109
011101000
901100101
110100019
901110019
SPECIES2
999000099
900110119
011101100
901010101
110000019
900000019
SPECIES3
999001099
900100109
011100010
901100100
110100019
901110019
where 9 is actually na, 0 is absence and 1 presence. The
final array I want to create should have dimensions that are
the x and y coordinates and the number of species (known in
advance). (In this example dim = c(9,6,3)). It would be sort
of neat if the code could also read the species name into the
appropriate names attribute, but this is a refinement that I
could probably do if someone can help me read the data into R
and into an array in the first place. I'm currently thinking
a line by line approach using readLines might be the best
option, but I've got a very long file - well over 100
species, each a matrix of 70 x 100 datapoints, making this
option rather time consuming, I expect - especially as the
next dataset has 1300 species and a much larger grid...
Any hints would be gratefully received.
Colin Beale
Macaulay Land Use Research Institute