problem extracting data from a set of list vectors
I think that instead of:

obj = all.comps[[i]];

you should have

obj <- get(i)

since 'i' already holds the name of the object. Test out your programs step by step manually. Use the 'all.comps' object and see what happens with the various indexing modes. This is "debugging 101".

On Thu, Apr 19, 2012 at 1:01 PM, Vining, Kelly <Kelly.Vining at oregonstate.edu> wrote:
Thanks for the help, Don. Lots of good suggestions there. Unfortunately, I'm still not able to access the data object. Still looking for a solution. Here's the error I'm getting when I try your suggestion:
[1] "res.Callus.Explant" "res.Callus.Regen"   "res.Explant.Regen"
all.comps <- ls(pattern="^res")
for(i in all.comps){
+ obj = all.comps[[i]];
+ gene.ids = rownames(obj$counts);
+ x = data.frame(gene.ids = gene.ids, obj$e1, obj$e2, obj$log.fc,
+ obj$p.value, obj$q.value);
+ x = subset(x, x$obj.p.value<0.05 | x$obj.q.value<=0.1);
+ cat("output object name is: ",paste("Diffgenes",i,sep="."),"\n");
+ cat("output object data is: \n");
+ print(tmp);
+ cat("\n");
+ }
Error in all.comps[[i]] : subscript out of bounds
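For reference, this error can be reproduced without the real data. The following is only a minimal sketch, assuming a toy list called res.toy (a made-up stand-in for the res.* objects): ls() returns a character vector of names, so indexing that vector with one of its own strings fails, whereas get() looks up the object that the string names.

```r
# Toy stand-in for one of the res.* lists (hypothetical data).
res.toy <- list(e1 = c(1.5, 2.5), e2 = c(3, 4))

# A character vector of object names, like the result of ls(pattern = "^res").
all.comps <- c("res.toy")

i <- all.comps[1]   # i is the string "res.toy", not the list

# all.comps has no element *named* "res.toy", so [[i]] raises an error;
# tryCatch() captures the message instead of stopping the script.
err <- tryCatch(all.comps[[i]], error = function(e) conditionMessage(e))
print(err)

# get() retrieves the object whose name is stored in i.
obj <- get(i)
print(obj$e1)
```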
In response to another helpful suggestion, here's the structure of this data list:
str(res.Callus.Explant)
List of 18
 $ name         : chr "two group comparison"
 $ group1       : chr "Callus"
 $ group2       : chr "Explant"
 $ alternative  : chr "two.sided"
 $ rows         : int [1:39009] 1 2 3 4 5 6 7 8 9 10 ...
 $ counts       : num [1:39009, 1:6] 0 121 237 6 7 116 6 2 860 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:39009] "POPTR_0018s00200" "POPTR_0008s00200" "POPTR_0004s00200" "POPTR_0019s00200" ...
  .. ..$ : chr [1:6] "Callus_BiolRep1" "Callus_BiolRep2" "Callus_BiolRep3" "Explant_BiolRep1" ...
 $ eff.lib.sizes: Named num [1:6] 3120288 2788297 2425164 3653109 3810261 ...
  ..- attr(*, "names")= chr [1:6] "V3" "V4" "V5" "V6" ...
 $ dispersion   : num [1:39009, 1:6] NA 0.0743 0.0434 0.6423 0.3554 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:39009] "POPTR_0018s00200" "POPTR_0008s00200" "POPTR_0004s00200" "POPTR_0019s00200" ...
  .. ..$ : chr [1:6] "Callus_BiolRep1" "Callus_BiolRep2" "Callus_BiolRep3" "Explant_BiolRep1" ...
 $ x            : num [1:6, 1:2] 1 1 1 1 1 1 1 1 1 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:6] "Callus" "Callus" "Callus" "Explant" ...
  .. ..$ : chr [1:2] "Intercept" "Callus-Explant"
 $ beta0        : num [1:2] NA 0
 $ beta.hat     : num [1:39009, 1:2] NA -10.13 -9.65 -13 -12.2 ...
 $ beta.tilde   : num [1:39009, 1:2] NA -10.26 -9.74 -13.11 -12.33 ...
 $ e            : num [1:39009] NA 35.08 58.82 2.03 4.43 ...
 $ e1           : num [1:39009] NA 30.23 53.77 1.78 3.89 ...
 $ e2           : num [1:39009] NA 39.83 64.46 2.27 5.01 ...
 $ log.fc       : num [1:39009] NA 0.398 0.262 0.353 0.366 ...
 $ p.values     : num [1:39009] NA 0.246 0.33 0.748 0.645 ...
 $ q.values     : num [1:39009] NA 1 1 1 1 1 1 1 1 1 ...
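Given a structure like this, one possible way to do the whole job without a loop is to fetch the lists with mget() and run a small function over them with lapply(), collecting the results in a named list. This is only a sketch under assumptions: the toy objects res.Toy.One and res.Toy.Two and the helper names make.toy and extract.diff are made up for illustration, and only the field names (counts, e1, e2, log.fc, p.values, q.values) come from the str() output.

```r
# Toy stand-ins for the res.* lists (hypothetical values; only the field
# names follow the str() output).
make.toy <- function(p, q) {
  counts <- matrix(1:6, nrow = 3,
                   dimnames = list(c("geneA", "geneB", "geneC"),
                                   c("rep1", "rep2")))
  list(counts = counts, e1 = c(1, 2, 3), e2 = c(2, 3, 4),
       log.fc = c(0.1, 0.2, 0.3), p.values = p, q.values = q)
}
res.Toy.One <- make.toy(p = c(0.01, 0.50, NA),   q = c(0.05, 1.00, NA))
res.Toy.Two <- make.toy(p = c(0.20, 0.04, 0.90), q = c(0.08, 0.50, 1.00))

# Pull the selected fields out of one result list and keep the rows that
# pass either threshold. subset() drops rows where the condition is NA.
extract.diff <- function(obj) {
  x <- data.frame(gene.ids = rownames(obj$counts),
                  e1 = obj$e1, e2 = obj$e2, log.fc = obj$log.fc,
                  p.value = obj$p.values, q.value = obj$q.values)
  subset(x, p.value < 0.05 | q.value <= 0.1)
}

# mget() returns the objects themselves, as a list named by object name.
all.comps <- c("res.Toy.One", "res.Toy.Two")
objs <- mget(all.comps)

diff.genes <- lapply(objs, extract.diff)
names(diff.genes) <- paste("DiffGenes", names(diff.genes), sep = ".")

print(names(diff.genes))
print(diff.genes[["DiffGenes.res.Toy.One"]])
```

Keeping the outputs in one list means each result is reached as diff.genes[["DiffGenes.res.Toy.One"]] and so on, and no new objects are created in the workspace.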
________________________________________
From: MacQueen, Don [macqueen1 at llnl.gov]
Sent: Wednesday, April 18, 2012 2:42 PM
To: Vining, Kelly; r-help at r-project.org
Subject: Re: [R] problem extracting data from a set of list vectors
Try this (NOT tested) or something similar:
all.comps <- ls(pattern="^res")
for(i in all.comps) {
  obj <- all.comps[[i]]
  gene.ids <- rownames(obj$counts)
  x <- data.frame(gene.ids = gene.ids, obj$counts,
                  obj$e1, obj$e2,
                  obj$log.fc, obj$p.value,
                  obj$q.value)
  x <- subset(x, obj.p.value<0.05 | obj.q.value<=0.1)
  assign( paste('DiffGenes',i,sep='.') , x, '.GlobalEnv')
}
Before you try this, make sure you have a copy of everything, or can
reconstruct it. The assign() function is dangerous. With it you can
overwrite other data if you are not careful.
You might test first; instead of using assign() as above, do
  cat('output object name is: ', paste('DiffGenes',i,sep='.'),'\n')
  cat('output object data is:\n')
  print(x)
  cat('\n')
To explain a little:
i is the name of the data structure, not the data structure itself
you extract the data structure from all.comps using [[i]]
The assign() function takes the output object (x in this case)
and writes it to the "global environment" using a name that is
constructed using paste().
The global environment is the first place in your search path;
see search().
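To make the assign() step concrete, here is a minimal sketch with a made-up data frame and a made-up name. As an assumption of mine, it writes into a separate environment (out.env, a hypothetical name) instead of the global environment, and checks with exists() first, so nothing already in the workspace can be overwritten.

```r
# A made-up result and object name, for illustration only.
x <- data.frame(gene.ids = c("geneA", "geneB"), p.value = c(0.01, 0.03))
i <- "res.Toy"

# A dedicated environment to hold the outputs (hypothetical name).
out.env <- new.env()

# Build the output name the same way as in the loop.
out.name <- paste("DiffGenes", i, sep = ".")

# Refuse to overwrite an existing object of the same name.
if (exists(out.name, envir = out.env, inherits = FALSE)) {
  stop("object already exists: ", out.name)
}
assign(out.name, x, envir = out.env)

print(ls(out.env))
print(get(out.name, envir = out.env))
```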
Note the simplification of the subset() statement.
You don't need semi-colons at the end of each line.
When you construct x, you might find it helpful to name the rest of the
columns, not just the first one, instead of letting data.frame() construct
the names.
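A small sketch of the difference, using a toy list (hypothetical values) in place of the real result object: unnamed arguments get column names built from the expressions, while named arguments keep the names you give them.

```r
# Toy stand-in for one result list (hypothetical values).
obj <- list(e1 = c(1, 2), e2 = c(3, 4), p.values = c(0.01, 0.5))

# Unnamed arguments: data.frame() builds names from the expressions.
auto <- data.frame(obj$e1, obj$e2, obj$p.values)
print(names(auto))

# Named arguments: the columns get exactly the names supplied.
named <- data.frame(e1 = obj$e1, e2 = obj$e2, p.value = obj$p.values)
print(names(named))
```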
I re-wrapped the lines in the hopes that my email software will not
re-wrap them for me.
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
On 4/18/12 1:13 PM, "Vining, Kelly" <Kelly.Vining at oregonstate.edu> wrote:
Dear useRs,
A colleague has sent me several batches of output I need to process, and
I'm struggling with the format to the point that I don't even know how to
extract a test set to upload here. My apologies, but I think that my
issue is straightforward enough (for some of you, not for me!) that you
can help in the absence of a test set. Here is the scenario:
# Data sets are lists:
ls()
[1] "res.Callus.Explant" "res.Callus.Regen"   "res.Explant.Regen"
is.list(res.Callus.Explant)
[1] TRUE
# The elements of each list look like this:
names(res.Callus.Explant)
 [1] "name"          "group1"        "group2"        "alternative"   "rows"          "counts"
 [7] "eff.lib.sizes" "dispersion"    "x"             "beta0"         "beta.hat"      "beta.tilde"
[13] "e"             "e1"            "e2"            "log.fc"        "p.values"      "q.values"
I want to 1) extract specific fields from this data structure into a data
frame, 2) subset from this data frame into a new data frame based on
selection criteria. What I've done is this:
all.comps <- ls(pattern="^res")
for(i in all.comps){
obj = i;
gene.ids = rownames(obj$counts);
x = data.frame(gene.ids = gene.ids, obj$counts, obj$e1, obj$e2,
obj$log.fc,
obj$p.value, obj$q.value);
DiffGenes.i = subset(x, x$obj.p.value<0.05 | x$obj.q.value<=0.1)
}
Obviously, this doesn't work because pattern searching in the first line
is not feeding the entire data structure into the all.comps variable. But
how can I accomplish feeding the whole data structure for each one of
these lists into the loop? Should I be able to use sapply here? If so,
how? Also, I suspect that "DiffGenes.i" is not going to give me the data
frame I want, which in the example I'm showing would be
"DiffGenes.res.Callus.Explant." How should I name output data frames from
a loop like this (if a loop is even the best way to do this)?
Any help with this will be greatly appreciated.
--Kelly V.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.