Matrix::spMatrix can help.
Read your data file with lns <- readLines("fileName") to get
something like
lns <- c("1 5:15 7:17 9:19",
"2 2:22 8:28",
"4 6:46")
Then use a function like the following that reformats the
data to the i=row,j=col,x=value vectors that spMatrix can use.
library(Matrix)  # provides spMatrix
f <- function(lns, nrow=NULL, ncol=NULL)
{
# expect lines of the form "rowNum<whiteSpace>colNum:value[<whiteSpace>colNum:value ...]"
# prefix each colNum:value field with its row number, giving "rowNum:colNum:value"
triples <- unlist(lapply(strsplit(lns, "[ \t]+"), function(ln)paste(sep=":",ln[1],ln[-1])))
triples <- strsplit(triples, ":")
if (any(vapply(triples, length, 0) != 3)) stop("formatting error")
ijx <- matrix(as.numeric(unlist(triples)), ncol=3, byrow=TRUE)
# default the dimensions to the largest row and column numbers seen
if (is.null(nrow)) nrow <- max(ijx[,1])
if (is.null(ncol)) ncol <- max(ijx[,2])
spMatrix(nrow=nrow, ncol=ncol, i=ijx[,1], j=ijx[,2], x=ijx[,3])
}
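To see what the reformatting step does, here is a small sketch that traces a single input line through the same split-and-paste logic. The variable names ln, triples and ijx are only for illustration; no packages are needed.

```r
# Split one line on white space: first field is the row number,
# the remaining fields are colNum:value pairs
ln <- strsplit("1 5:15 7:17 9:19", "[ \t]+")[[1]]

# Prefix every colNum:value pair with the row number
triples <- paste(sep=":", ln[1], ln[-1])
# triples is now c("1:5:15", "1:7:17", "1:9:19")

# Split on ":" and reshape to one (row, col, value) triple per matrix row
ijx <- matrix(as.numeric(unlist(strsplit(triples, ":"))), ncol=3, byrow=TRUE)
print(ijx)
```

Each row of ijx then supplies one i, j, x entry for the sparse matrix.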
Use it as
f(lns)
4 x 9 sparse Matrix of class "dgTMatrix"
[1,] . . . . 15 . 17 . 19
[2,] . 22 . . . . . 28 .
[3,] . . . . . . . . .
[4,] . . . . . 46 . . .
or, if you know the number of rows and columns, tell it:
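As a minimal sketch of what explicit dimensions do, the following calls spMatrix directly on the triplets from the example lines above. The dimensions 5 and 10 are hypothetical, chosen only for illustration; a call such as f(lns, nrow=5, ncol=10) should build the same matrix.

```r
library(Matrix)

# (row, column, value) triplets taken from the three example lines
i <- c(1, 1, 1, 2, 2, 4)
j <- c(5, 7, 9, 2, 8, 6)
x <- c(15, 17, 19, 22, 28, 46)

# 5 rows and 10 columns are assumed dimensions, larger than the
# maximum row (4) and column (9) numbers that appear in the data
m <- spMatrix(nrow=5, ncol=10, i=i, j=j, x=x)
dim(m)
```

The extra row and column simply stay empty, which matters when several files must yield matrices of identical shape.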
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
Of Noah Silverman
Sent: Monday, October 08, 2012 9:57 PM
To: r-help
Subject: [R] Convert COLON separated format
I have a bunch of data sets that were created for the libsvm tool. They are in "colon
separated sparse format".
i.e.
1 5:1 27:3 345:10
Is a row with the label of "1" and only has values in columns 5, 27, and 345.
I want to read these into a data.frame in R.
Is there a simple way to do this?
--
Noah Silverman, M.S.
UCLA Department of Statistics
8117 Math Sciences Building
Los Angeles, CA 90095