[Rcpp-devel] add new components to list without specifying list size initially
Ok, I started with smaller examples. I understand more or less how to
manipulate IntegerVectors, but not StringVectors (see below), so I
can't even start manipulating a simple list of StringVectors, even though
I looked at the mailing lists, StackOverflow, the package PDF, and the
source code on R-Forge.
The following code tells me "warning: cannot pass objects of non-POD type
'struct Rcpp::internal::string_proxy<16>' through '...'; call will abort at
runtime": why does it complain about printing the string in vec_s[i]?
fn <- cxxfunction(signature(l_in="list"),
                  body='
  using namespace Rcpp;
  List l = List(l_in);
  Rprintf("list size: %d\\n", l.size());
  IntegerVector vec_i = IntegerVector(2);
  vec_i[0] = 1;
  vec_i[1] = 2;
  List l2 = List::create(_["vec"] = vec_i);
  Rprintf("vec_i size: %d\\n", vec_i.size());
  for(int i=0; i<vec_i.size(); ++i)
    Rprintf("vec_i[%d]=%d\\n", i, vec_i[i]);
  StringVector vec_s = StringVector::create("toto");
  vec_s[0] = "toto";
  Rprintf("vec_s size: %d\\n", vec_s.size());
  for(int i=0; i<vec_s.size(); ++i)
    Rprintf("vec_s[%d]=%s\\n", i, vec_s[i]);
  return l2;
',
                  plugin="Rcpp", verbose=TRUE)
print(fn(list(a=c(1,2,3), b=c("a","b","c"))))
Moreover, how can I access the components of a list given as input, such as
"l_in" above? Should I use l.begin(), l[1], or l["a"]? None of them seems
to compile successfully.
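A note on the warning: printf-style functions such as Rprintf take their
extra arguments through "...", which only accepts plain C types. vec_s[i]
is a proxy object, not a const char*, so it has to be converted first. The
sketch below shows the same rule in plain C++ with std::string and
snprintf; the names format_entry and print_all are made up for
illustration. In the Rcpp code above the analogous step would presumably be
to turn vec_s[i] into a std::string (for instance with
Rcpp::as<std::string>) and pass its c_str(), but that part is an assumption
and is not compiled here.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Format "vec_s[i]=value" for one element (hypothetical helper).
// The std::string is converted with c_str() because arguments passed
// through "..." must be plain C types such as int or const char*.
// Output longer than the buffer is truncated by snprintf.
std::string format_entry(int i, const std::string& value) {
    assert(i >= 0);  // an index is never negative
    char buf[128];
    std::snprintf(buf, sizeof buf, "vec_s[%d]=%s", i, value.c_str());
    return std::string(buf);
}

// Print every element of a vector of strings, one per line.
void print_all(const std::vector<std::string>& vec_s) {
    for (std::size_t i = 0; i < vec_s.size(); ++i)
        std::printf("%s\n",
                    format_entry(static_cast<int>(i), vec_s[i]).c_str());
}
```

Passing value itself instead of value.c_str() gives the same kind of
non-POD warning as in the post.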
On Thu, Aug 11, 2011 at 8:54 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
Howdy,
On 11 August 2011 at 20:44, Walrus Foolhill wrote:
| Ok, thanks for your answer, but I wasn't clear enough. So here are more
| details of what I want to do.
|
| I have one list named "probes":
| probes <- list(chr1=data.frame(name=c("p1","p2"),
| start=c(81,95),
| end=c(85,100),
| stringsAsFactors=FALSE))
|
| I also have one list named "genes":
| genes <- list(chr1=data.frame(name=c("g1","g2"), start=c(11,111),
|                               end=c(90,190)),
|               chr2=data.frame(name="g3", start=11, end=90))
|
| I need to compare those two lists in order to obtain the following list,
| which contains, for each gene, the names of the probes included in it:
| links <- list(chr1=list(g1=c("p1")))
|
| Here is my R function (assuming that the probes are sorted based on their
| start and end coordinates):
|
| fun.l <- function(genes, probes){
|   links <- lapply(names(genes), function(chr.name){
|     if(! chr.name %in% names(probes))
|       return(NULL)
|
|     res <- list()
|
|     genes.c <- genes[[chr.name]]
|     probes.c <- probes[[chr.name]]
|
|     for(gene.name in genes.c$name){
|       gene <- genes.c[genes.c$name == gene.name,]
|       res[[gene.name]] <- vector()
|       for(probe.name in probes.c$name){
|         probe <- probes.c[probes.c$name == probe.name,]
|         if(probe$start >= gene$start && probe$end <= gene$end)
|           res[[gene.name]] <- append(res[[gene.name]], probe.name)
|         else if(probe$start > gene$end)
|           break
|       }
|       if(length(res[[gene.name]]) == 0)
|         res[[gene.name]] <- NULL
|     }
|
|     if(length(res) == 0)
|       res <- NA
|     return(res)
|   })
|   names(links) <- names(genes)
|   links <- Filter(function(links.c){!is.null(links.c)}, links)
|   return(links)
| }
|
| And here is the beginning of my attempt using Rcpp:
|
| src <- '
|   using namespace Rcpp;
|
|   List genes = List(genes_in);
|   int genes_nb_chr = genes.length();
|   std::vector<std::string> genes_chr = genes.names();
|
|   List probes = List(probes_in);
|   int probes_nb_chr = probes.length();
|
|   std::vector< std::vector<std::string> > links;
|
|   // the main task is performed in this loop
|   for(int chrnum=0; chrnum<genes_nb_chr; ++chrnum){
|     DataFrame genes_c = DataFrame(genes[chrnum]);
|     // ... add code to map probes on genes, that is fill "links" ...
|   }
|
|   return wrap(links);
| '
|
| funC <- cxxfunction(signature(genes_in="list",
| probes_in="list"),
| body=src, plugin="Rcpp")
|
| The problem starts quite early: when I compile this piece of code, I get
| "error: call of overloaded 'DataFrame(Rcpp::internal::generic_proxy<19>)'
| is ambiguous".
Try a simpler mock-up. I don't have it in me to work through this now.
DataFrames are a little different from C++ -- start by trying to summarize
in just a vector, or collection of vectors.
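As a rough illustration of the "collection of vectors" idea, here is a
sketch in plain C++ (no Rcpp) of the probe-to-gene matching for a single
chromosome. The struct Features and the function map_probes are
hypothetical names, and the columns of each data.frame are assumed to have
been copied into parallel std::vectors beforehand. The loop follows the R
function quoted above, including its assumption that probes are sorted by
start.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One chromosome's features as parallel vectors, mirroring the name,
// start and end columns of the data.frames (hypothetical layout).
struct Features {
    std::vector<std::string> name;
    std::vector<int> start;
    std::vector<int> end;
};

// For each gene, collect the names of the probes lying entirely inside it.
// The outer vector has one slot per gene, so its size is known up front;
// only the inner vectors grow. Assumes probes are sorted by start.
std::vector<std::vector<std::string>> map_probes(const Features& genes,
                                                 const Features& probes) {
    assert(genes.name.size() == genes.start.size() &&
           genes.name.size() == genes.end.size());
    std::vector<std::vector<std::string>> links(genes.name.size());
    for (std::size_t g = 0; g < genes.name.size(); ++g) {
        for (std::size_t p = 0; p < probes.name.size(); ++p) {
            if (probes.start[p] > genes.end[g])
                break;  // every later probe starts after this gene ends
            if (probes.start[p] >= genes.start[g] &&
                probes.end[p] <= genes.end[g])
                links[g].push_back(probes.name[p]);
        }
    }
    return links;
}
```

With the chr1 data from the post, the slot for g1 would hold "p1" and the
slot for g2 would stay empty. Dropping empty slots and attaching gene names
is left to the caller.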
| What should I do to go through the "probes" and "genes" lists given as
| input? Maybe more generically, how can we go through a list of lists (of
| lists...) with Rcpp?
|
| 2nd (small) question: I can't get Rprintf to work when using inline. For
| instance, Rprintf("%d\n", i); complains about the quotes. What should I
| do to print a statement from within the for loop?
The backslashes need escaping as in
R> printing <- cxxfunction(, plugin="Rcpp", body=' Rprintf("foo\\n"); ')
R> printing()
foo
NULL
R>
| Thanks in advance. As my question is very long, I won't mind if you tell
| me to find another way by myself. But maybe one of you can put me on the
| right track.
You are doing well, but you have a decent-sized problem. Try breaking it
into smaller pieces and getting a handle on each problem in turn.
Dirk
|
| On Thu, Aug 11, 2011 at 7:00 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
|
|
| On 11 August 2011 at 03:06, Walrus Foolhill wrote:
| | Hello,
| | I need to create a list and then fill it sequentially by adding
| | components in a for loop. Here is an example that works:
| |
| | library(inline)
| | src <- '
| | Rcpp::List mylist(2);
| | for(int i=0; i<2; ++i)
| | mylist[i] = i;
| | mylist.names() = CharacterVector::create("a","b");
| | return mylist;
| | '
| | fun <- cxxfunction(body=src, plugin="Rcpp")
| | print(fun())
| |
| | But what I really want is to create an empty list and then fill it,
| | that is, without specifying its number of components beforehand... This
| | is because I don't know in advance at which step of the for loop I will
| | need to create a new component. Here is an example that obviously
| | doesn't work, but that should show what I am looking for:
| |
| | Rcpp::List mylist;
| | CharacterVector names = CharacterVector::create("a", "b");
|
| If you know how long names is, you know how long mylist is going to be
| ....
|
| | for(int i=0; i<2; ++i){
| | mylist.add(names[i], IntegerVector::create());
| | mylist[names[i]].push_back(i);
|
| I don't understand what that is trying to do.
|
| | }
| | return mylist;
| |
| | Do you know how I could achieve this? Thanks.
|
| Rcpp::List is an alias for Rcpp::GenericVector, and derives from Vector.
| You can look at the public member functions -- there are things like
|
| push_back()
| push_front()
| insert()
|
| etc that behave like STL functions __but are inefficient as we (almost
| always) need to copy the whole object__ so they are not recommended.
|
| When I had to deal with 'unknown quantities of data' returning I was
| mostly able to either turn it into a 'fixed or known columns, unknown
| rows' problem (easy, just grow row-wise) or I 'cached' in a C++ data
| structure first before returning to R via Rcpp structures -- and then I
| knew the dimensions for the to-be-created object too.
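The "cache in a C++ data structure first" approach could look roughly like
the following plain C++ sketch. NamedList and collect are hypothetical
names invented for this example, and the input is assumed to arrive as
(name, value) pairs. Components are created on first use in a std::map, and
the fixed-size result is built only once the number of components is known.
In an Rcpp function the last step would presumably become an Rcpp::List of
that size, or a call to wrap(), which is not shown or tested here.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Result with known dimensions, built once after all data is collected
// (hypothetical stand-in for a named list).
struct NamedList {
    std::vector<std::string> names;
    std::vector<std::vector<int>> values;
};

// Collect (name, value) pairs without knowing the number of distinct
// names in advance, then freeze the cache into a fixed-size result.
NamedList collect(const std::vector<std::pair<std::string, int>>& items) {
    std::map<std::string, std::vector<int>> cache;
    for (const auto& item : items)
        cache[item.first].push_back(item.second);  // created on first use

    NamedList out;
    out.names.reserve(cache.size());   // the size is known at this point
    out.values.reserve(cache.size());
    for (const auto& kv : cache) {     // std::map iterates in key order
        out.names.push_back(kv.first);
        out.values.push_back(kv.second);
    }
    assert(out.names.size() == out.values.size());
    return out;
}
```

Growing a std::map or std::vector is cheap compared with growing an R
object element by element, which is the copying cost mentioned above.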
|
| Dirk
|
|
| --
| Two new Rcpp master classes for R and C++ integration scheduled for
| New York (Sep 24) and San Francisco (Oct 8), more details are at
| http://dirk.eddelbuettel.com/blog/2011/08/04#rcpp_classes_2011-09_and_2011-10
|
|