Hello,
I need to create a list and then fill it sequentially by adding components
in a for loop. Here is an example that works:
library(inline)
src <- '
Rcpp::List mylist(2);
for(int i=0; i<2; ++i)
  mylist[i] = i;
mylist.names() = CharacterVector::create("a","b");
return mylist;
'
fun <- cxxfunction(body=src, plugin="Rcpp")
print(fun())
But what I really want is to create an empty list and then fill it, that is,
without specifying its number of components beforehand. This is because I
don't know in advance at which step of the for loop I will need to create a
new component. Here is an example that obviously doesn't work, but that
should show what I am looking for:
Rcpp::List mylist;
CharacterVector names = CharacterVector::create("a", "b");
for(int i=0; i<2; ++i){
  mylist.add(names[i], IntegerVector::create());
  mylist[names[i]].push_back(i);
}
return mylist;
Do you know how I could achieve this? Thanks.
[Rcpp-devel] add new components to list without specifying list size initially
10 messages · Walrus Foolhill, Steve Lianoglou, Dirk Eddelbuettel
On 11 August 2011 at 03:06, Walrus Foolhill wrote:
| Hello,
| I need to create a list and then fill it sequentially by adding components in a
| for loop. Here is an example that works:
|
| library(inline)
| src <- '
| Rcpp::List mylist(2);
| for(int i=0; i<2; ++i)
|   mylist[i] = i;
| mylist.names() = CharacterVector::create("a","b");
| return mylist;
| '
| fun <- cxxfunction(body=src, plugin="Rcpp")
| print(fun())
|
| But what I really want is to create an empty list and then fill it, that is
| without specifying its number of components before hand... This is because I
| don't know in advance at which step of the for loop I will need to create a new
| component. Here is an example, that obviously doesn't work, but that should
| show what I am looking for:
|
| Rcpp::List mylist;
| CharacterVector names = CharacterVector::create("a", "b");
If you know how long names is, you know how long mylist is going to be ....
| for(int i=0; i<2; ++i){
|   mylist.add(names[i], IntegerVector::create());
|   mylist[names[i]].push_back(i);
I don't understand what that is trying to do.
| }
| return mylist;
|
| Do you know how I could achieve this? Thanks.
Rcpp::List is an alias for Rcpp::GenericVector, and derives from Vector. You
can look at the public member functions -- there are things like
push_back()
push_front()
insert()
etc that behave like STL functions __but are inefficient as we (almost
always) need to copy the whole object__ so they are not recommended.
When I had to deal with returning 'unknown quantities of data', I was mostly
able to either turn it into a 'fixed or known columns, unknown rows' problem
(easy, just grow row-wise) or I 'cached' in a C++ data structure first before
returning to R via Rcpp structures -- and then I knew the dimensions for the
to-be-created object too.
Dirk
Two new Rcpp master classes for R and C++ integration scheduled for New York (Sep 24) and San Francisco (Oct 8), more details are at http://dirk.eddelbuettel.com/blog/2011/08/04#rcpp_classes_2011-09_and_2011-10
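A minimal sketch of the "cache in a C++ data structure first" idea from the reply above, in plain standard C++ so that it compiles without Rcpp. The function name collect_by_name and its arguments are hypothetical, chosen only for illustration; they do not come from the thread or from Rcpp.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): collect values under names
// whose number is not known in advance, using ordinary C++ containers.
std::map<std::string, std::vector<int> >
collect_by_name(const std::vector<std::string>& names,
                const std::vector<int>& values) {
    std::map<std::string, std::vector<int> > cache;
    for (std::size_t i = 0; i < names.size() && i < values.size(); ++i) {
        // operator[] creates an empty vector the first time a name is seen,
        // so no component has to be declared beforehand.
        cache[names[i]].push_back(values[i]);
    }
    return cache;
}
```

Once the loop is done, cache.size() gives the number of components, so a list of the right length can be allocated in one go. Inside an Rcpp function it should also be possible to hand the map to Rcpp::wrap() and get a named list back, but treat that as an assumption to check against the Rcpp documentation.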
Ok, thanks for your answer, but I wasn't clear enough. So here are more
details of what I want to do.
I have one list named "probes":
probes <- list(chr1=data.frame(name=c("p1","p2"),
                               start=c(81,95),
                               end=c(85,100),
                               stringsAsFactors=FALSE))
I also have one list named "genes":
genes <- list(chr1=data.frame(name=c("g1","g2"), start=c(11,111),
                              end=c(90,190)),
              chr2=data.frame(name="g3", start=11, end=90))
I need to compare those two lists in order to obtain the following list
which contains, for each gene, the name of the probes included in it:
links <- list(chr1=list(g1=c("p1")))
Here is my R function (assuming that the probes are sorted based on their
start and end coordinates):
fun.l <- function(genes, probes){
  links <- lapply(names(genes), function(chr.name){
    if(! chr.name %in% names(probes))
      return(NULL)
    res <- list()
    genes.c <- genes[[chr.name]]
    probes.c <- probes[[chr.name]]
    for(gene.name in genes.c$name){
      gene <- genes.c[genes.c$name == gene.name,]
      res[[gene.name]] <- vector()
      for(probe.name in probes.c$name){
        probe <- probes.c[probes.c$name == probe.name,]
        if(probe$start >= gene$start && probe$end <= gene$end)
          res[[gene.name]] <- append(res[[gene.name]], probe.name)
        else if(probe$start > gene$end)
          break
      }
      if(length(res[[gene.name]]) == 0)
        res[[gene.name]] <- NULL
    }
    if(length(res) == 0)
      res <- NA
    return(res)
  })
  names(links) <- names(genes)
  links <- Filter(function(links.c){!is.null(links.c)}, links)
  return(links)
}
And here is the beginning of my attempt using Rcpp:
src <- '
using namespace Rcpp;
List genes = List(genes_in);
int genes_nb_chr = genes.length();
std::vector<std::string> genes_chr = genes.names();
List probes = List(probes_in);
int probes_nb_chr = probes.length();
std::vector< std::vector<std::string> > links;
// the main task is performed in this loop
for(int chrnum=0; chrnum<genes_nb_chr; ++chrnum){
  DataFrame genes_c = DataFrame(genes[chrnum]);
  // ... add code to map probes on genes, that is fill "links" ...
}
return wrap(links);
'
funC <- cxxfunction(signature(genes_in="list",
                              probes_in="list"),
                    body=src, plugin="Rcpp")
The problem starts quite early: when I compile this piece of code, I get
"error: call of overloaded 'DataFrame(Rcpp::internal::generic_proxy<19>)' is
ambiguous".
What should I do to go through the "probes" and "genes" lists given as
input? Maybe more generically, how can we go through a list of lists (of
lists...) with Rcpp?
2nd (small) question: I can't get Rprintf to work when using inline. For
instance, with Rprintf("%d\n", i); it complains about the quotes. What should
I do to print a statement from within the for loop?
Thanks in advance. As my question is very long, I won't mind if you tell me
to find another way by myself. But maybe one of you can put me on the good
track.
Howdy,
On 11 August 2011 at 20:44, Walrus Foolhill wrote:
| Ok, thanks for your answer, but I wasn't clear enough. So here are more details
| of what I want to do.
|
| [...]
|
| The problem starts quite early: when I compile this piece of code, I get
| "error: call of overloaded 'DataFrame(Rcpp::internal::generic_proxy<19>)' is
| ambiguous".
Try a simpler mock-up. I don't have it in me to work through this now.
DataFrames are a little different from C++ -- start by trying to summarize in
just a vector, or collection of vectors.
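One way to follow the "collection of vectors" suggestion above is to get the mapping logic right in plain C++ first and only then worry about the R side. The sketch below mirrors the inner loops of the R function fun.l for a single chromosome. The Interval struct and the name map_probes_to_genes are hypothetical, and it assumes the probes are sorted by start coordinate, as the R version does.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Illustrative type, not part of Rcpp or of the original post.
struct Interval {
    std::string name;
    int start;
    int end;
};

// For one chromosome: for each gene, list the probes fully contained in it.
// Probes are assumed sorted by start, so scanning can stop at the first
// probe that starts after the gene ends.
std::map<std::string, std::vector<std::string> >
map_probes_to_genes(const std::vector<Interval>& genes,
                    const std::vector<Interval>& probes) {
    std::map<std::string, std::vector<std::string> > links;
    for (std::size_t g = 0; g < genes.size(); ++g) {
        for (std::size_t p = 0; p < probes.size(); ++p) {
            if (probes[p].start > genes[g].end)
                break;
            if (probes[p].start >= genes[g].start &&
                probes[p].end <= genes[g].end) {
                // The entry for this gene is created on first use only, so
                // genes without any probe never appear in the result.
                links[genes[g].name].push_back(probes[p].name);
            }
        }
    }
    return links;
}
```

With the chr1 data from the example (g1 = 11..90, g2 = 111..190, p1 = 81..85, p2 = 95..100) this yields a single entry, g1 -> p1, which matches the expected links list for that chromosome.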
| What should I do to go through the "probes" and "genes" lists given as input?
| Maybe more generically, how can we go through a list of lists (of lists...)
| with Rcpp?
|
| 2nd (small) question, I don't manage to use Rprintf when using inline, for
| instance Rprintf("%d\n", i);, it complains about the quotes. What should I do
| to print statement from within the for loop?
The backslashes need escaping as in
R> printing <- cxxfunction(, plugin="Rcpp", body=' Rprintf("foo\\n"); ')
R> printing()
foo
NULL
R>
| Thanks in advance. As my question is very long, I won't mind if you tell me to
| find another way by myself. But maybe one of you can put me on the good track.
You are doing well, but you have a decent-sized problem. Try breaking it into
smaller pieces and getting a handle on each problem in turn.
Dirk
Ok, I started with smaller examples. I understand more or less how to
manipulate IntegerVectors, but not StringVectors (see below), and thus I
can't even start manipulating a simple list of StringVectors, even though I
looked at mailing lists, StackOverflow, the package pdf, and the source code
on R-Forge...
The following code tells me "warning: cannot pass objects of non-POD type
'struct Rcpp::internal::string_proxy<16>' through '...'; call will abort at
runtime": why does it complain about printing the string in vec_s[i]?
fn <- cxxfunction(signature(l_in="list"),
                  body='
using namespace Rcpp;
List l = List(l_in);
Rprintf("list size: %d\\n", l.size());
IntegerVector vec_i= IntegerVector(2);
vec_i[0] = 1;
vec_i[1] = 2;
List l2 = List::create(_["vec"] = vec_i);
Rprintf("vec_i size: %d\\n", vec_i.size());
for(int i=0; i<vec_i.size(); ++i)
  Rprintf("vec_i[%d]=%d\\n", i, vec_i[i]);
StringVector vec_s = StringVector::create("toto");
vec_s[0] = "toto";
Rprintf("vec_s size: %d\\n", vec_s.size());
for(int i=0; i<vec_s.size(); ++i)
  Rprintf("vec_s[%d]=%s\\n", i, vec_s[i]);
return l2;
',
                  plugin="Rcpp", verbose=TRUE)
print(fn(list(a=c(1,2,3), b=c("a","b","c"))))
Moreover, how can I access the component of a list given as input, as "l_in"
above? Should I use l.begin()? or l[1]? or l["a"]? none of them seems to
compile successfully.
On 12 August 2011 at 01:22, Walrus Foolhill wrote:
| Ok, I started with smaller examples. I understand more or less how to
| manipulate IntegerVectors, but not StringVectors (see below), and thus I can't
| even start manipulating a simple list of StringVectors. Even so I looked at
| mailing lists, StackOverflow, package pdf, source code on R-Forge...
|
| The following code tells me "warning: cannot pass objects of non-POD type
| 'struct Rcpp::internal::string_proxy<16>' through '...'; call will abort at
| runtime": why does it complain about printing the string in vec_s[i]?
Again, simpler helps. That is the standard C / C++ error message you get from
std::string foo = "bar";
printf("String is %s \n", foo);
where you need foo.c_str() to pass a const char* to printf.
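To make the point above concrete, here is a small standard C++ sketch (no Rcpp involved). The helper name describe is made up for illustration; it formats into a buffer with std::snprintf instead of printing, which makes the result easy to check.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: build "String is <value>" with a printf-style call.
std::string describe(const std::string& value) {
    char buffer[64];
    // "%s" expects a const char*, so pass value.c_str() and never the
    // std::string object itself. snprintf truncates output that is too long.
    std::snprintf(buffer, sizeof(buffer), "String is %s", value.c_str());
    return std::string(buffer);
}
```

Passing the std::string object directly through the "..." of a printf-style function is what triggers the "non-POD type" warning; the explicit conversion to a C string avoids it.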
| fn <- cxxfunction(signature(l_in="list"),
|                   body='
| using namespace Rcpp;
| List l = List(l_in);
| Rprintf("list size: %d\\n", l.size());
|
| IntegerVector vec_i= IntegerVector(2);
| vec_i[0] = 1;
| vec_i[1] = 2;
| List l2 = List::create(_["vec"] = vec_i);
| Rprintf("vec_i size: %d\\n", vec_i.size());
| for(int i=0; i<vec_i.size(); ++i)
|   Rprintf("vec_i[%d]=%d\\n", i, vec_i[i]);
|
| StringVector vec_s = StringVector::create("toto");
| vec_s[0] = "toto";
| Rprintf("vec_s size: %d\\n", vec_s.size());
| for(int i=0; i<vec_s.size(); ++i)
|   Rprintf("vec_s[%d]=%s\\n", i, vec_s[i]);
Try vec_s[i].c_str() instead.
Dirk
| return l2;
| ',
|                   plugin="Rcpp", verbose=TRUE)
| print(fn(list(a=c(1,2,3), b=c("a","b","c"))))
|
| Moreover, how can I access the component of a list given as input, as "l_in"
| above? Should I use l.begin()? or l[1]? or l["a"]? none of them seems to
| compile successfully.
|
Hi Walrus,
While I'm a huge fan of Rcpp, I think you'll be a bit better served
(for the time being) by reading up on some of the Bioconductor packages
that are suited for these types of things.
In particular I am thinking about the IRanges and GenomicRanges
packages. They are R wrappers around what is basically an IntervalTree that
you can annotate, and then use to perform fast overlap/intersection
queries.
For instance:
R> library(GenomicRanges)
R> probes <- GRanges('chr1', IRanges(c(81,85), c(85,100)), strand='*')
R> genes <- GRanges(c('chr1', 'chr1', 'chr2'),
                    IRanges(c(11, 111, 11), c(90, 190, 90)),
                    strand='*',
                    name=c('g1', 'g2', 'g3'))
R> genes
GRanges with 3 ranges and 1 elementMetadata value
    seqnames     ranges strand |        name
       <Rle>  <IRanges>  <Rle> | <character>
[1]     chr1 [ 11,  90]      * |          g1
[2]     chr1 [111, 190]      * |          g2
[3]     chr2 [ 11,  90]      * |          g3
## How many probes land in each gene?
R> countOverlaps(genes, probes)
[1] 2 0 0
## Which probes are these?
R> subsetByOverlaps(probes, genes)
GRanges with 2 ranges and 0 elementMetadata values
    seqnames    ranges strand |
       <Rle> <IRanges>  <Rle> |
[1]     chr1 [81,  85]      * |
[2]     chr1 [85, 100]      * |
## and much more stuff
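For intuition only, the counting step in the session above can be sketched in plain C++. This is a naive quadratic scan, not how IRanges or GenomicRanges implement it, and the `Range` type and function names are hypothetical. It assumes closed intervals and "any overlap" semantics, which is what the `2 0 0` result above is consistent with.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A closed interval [start, end] on a named sequence (hypothetical type).
struct Range {
    std::string seq;
    int start;
    int end;
};

// Two closed intervals overlap when they are on the same sequence
// and share at least one position.
bool overlaps(const Range& a, const Range& b) {
    return a.seq == b.seq && a.start <= b.end && b.start <= a.end;
}

// For each query range, count the subject ranges that overlap it.
// Naive O(n * m) scan; an interval tree avoids the inner loop.
std::vector<int> count_overlaps(const std::vector<Range>& query,
                                const std::vector<Range>& subject) {
    std::vector<int> counts(query.size(), 0);
    for (std::size_t i = 0; i < query.size(); ++i) {
        for (std::size_t j = 0; j < subject.size(); ++j) {
            if (overlaps(query[i], subject[j])) {
                ++counts[i];
            }
        }
    }
    return counts;
}
```

Calling `count_overlaps` with the three genes as the query and the two probes as the subject mirrors the `countOverlaps(genes, probes)` call.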
There's a load of functionality in the IRanges, GenomicRanges, and
Biostrings packages that you'll likely find very useful, and efficient
(much of the core of these packages is written in C) if you're doing
a lot of bioinformatics/genomics work. So, taking some time to get
familiar with those will be useful -- you'll find that you'll also
need to drop into Rcpp for other stuff (as I do, too), so it will
still be useful for you in the future.
That's just my 2 cents.
-steve
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Thanks for your advice, I now understand how to manipulate one-level lists:
fn <- cxxfunction(signature(l_in="list"),
                  body='
using namespace Rcpp;
List l(l_in);
IntegerVector lf = l["foo"];
CharacterVector lb = l["bar"];
for(int i=0; i<lf.size(); ++i)
  Rprintf("l[%s][%i] %i\\n", "foo", i, lf[i]);
for(int i=0; i<lb.size(); ++i)
  Rprintf("l[%s][%i] %s\\n", "bar", i, std::string(lb[i]).c_str());
', plugin="Rcpp", verbose=TRUE)
z <- fn(list(foo=c(1,2,3,4),bar=c("bar1","bar2")))
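The `std::string(lb[i]).c_str()` step above matters because a printf-style `%s` expects a `const char*`, not a string object. A minimal plain C++ sketch of the same formatting, using `std::snprintf` as a stand-in for `Rprintf` so that it runs without R (the function name `format_entry` is hypothetical):

```cpp
#include <cstdio>
#include <string>

// Format one labelled element in the style of the Rprintf calls above.
// std::snprintf stands in for Rprintf so the sketch does not need R.
std::string format_entry(const std::string& name, int index,
                         const std::string& value) {
    char buffer[128];
    // %s expects a const char*, so each std::string goes through c_str().
    std::snprintf(buffer, sizeof(buffer), "l[%s][%d] %s",
                  name.c_str(), index, value.c_str());
    return std::string(buffer);
}
```

Passing the `std::string` itself instead of `c_str()` would hand a non-POD object to a variadic function, which is the kind of mistake compilers warn about.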
But what about 2-level lists? Why doesn't the following code compile?
fn <- cxxfunction(signature(l_in="list"),
                  body='
using namespace Rcpp;
List l(l_in);
List lf(l["foo"]);
', plugin="Rcpp", verbose=TRUE)
z <- fn(list(foo=list(bar=1)))
And what does the following message mean? "error: call of overloaded
'Vector(Rcpp::internal::generic_name_proxy<19>)' is ambiguous"
I had a look at "runit.Vector.R" on r-forge, but couldn't find any test
involving 2-level (or more) lists, although on SO in June 2010 (
http://stackoverflow.com/questions/3088650/how-do-i-create-a-list-of-vectors-in-rcpp/3088744#3088744),
you said that it should work.
I checked that I can create a 2-level list, but the code below doesn't
compile if I uncomment the last Rprintf line:
fn <- cxxfunction(signature(),
                  body='
using namespace Rcpp;
IntegerVector vi(2);
vi[0] = 2;
vi[1] = 8;
List ll = List::create(Named("bar")=vi);
Rprintf("ll.size %i\\n", ll.size());
List l = List::create(Named("foo")=ll);
Rprintf("l.size %i\\n", l.size());
//Rprintf("l.ll.size %i\\n", l["foo"].size());
return l;
', plugin="Rcpp", verbose=TRUE)
print(fn())
Thus once again I'm stuck, but if I know how to access 2-level lists, I
think I will be able to go back to my original problem, and stop sending
emails on this mailing list ;)
On 12 August 2011 at 16:26, Walrus Foolhill wrote:
| I checked that I can create a 2-level list, but the code below doesn't compile
| if I uncomment the last Rprintf line:
There can be times when the C++ templating gets in the way, so if this
doesn't work in a single statement, decompose it into two (one to assign to a
temp, another to print the temp) and move on.
I have done two-level lists in the past; one key is that a list ... is just
another SEXP, or can be wrap()'ed to a SEXP, and you can hence assign a list
to be a component of another. And then another, and so on...
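The sketch below uses a toy model in plain C++, not Rcpp's actual classes, to show why the two-statement form tends to help. It assumes the element proxy converts to whatever type is asked for, and that the target class has more than one usable single-argument constructor. `ToyList`, `ElementProxy` and `inner_size` are hypothetical names, and the real Rcpp proxies are more involved.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a list class that can be built from an int
// and, implicitly, copied from another ToyList.
struct ToyList {
    std::vector<int> items;
    explicit ToyList(int n) : items(static_cast<std::size_t>(n), 0) {}
    std::size_t size() const { return items.size(); }
};

// Toy stand-in for an element proxy: it has no size() of its own
// and converts to any type that is asked for.
struct ElementProxy {
    int payload;
    template <typename T>
    operator T() const { return T(payload); }
};

std::size_t inner_size(const ElementProxy& element) {
    // ToyList direct(element);  // direct-initialization: ambiguous, since the
    //                           // proxy could feed ToyList(int) or a copy
    // element.size();           // the proxy itself has no size() member
    ToyList temp = element;      // step 1: copy-initialize a temporary, which
                                 // only uses the proxy's conversion to ToyList
    return temp.size();          // step 2: work with the temporary
}
```

In Rcpp terms the analogous two-step spelling would presumably be an assignment such as `List lf = l["foo"];` followed by `lf.size()`, in the same style as the `IntegerVector lf = l["foo"];` line that already compiles in the one-level example.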
Dirk
Steve and Dirk, thanks again. I will look into the GenomicRanges package for my immediate usage, but for my future needs I will also keep trying to manipulate nested lists with Rcpp, as all my data structures are like that.
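For nested results whose size is not known up front, the suggestion made earlier in this thread was to cache them in a C++ data structure first and only build the R object once the dimensions are known. A minimal sketch of that caching step for one chromosome, in plain C++ so that it runs without R (the `Interval` type, the `GeneToProbes` typedef and `link_probes` are hypothetical names):

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A named closed interval [start, end] (hypothetical type).
struct Interval {
    std::string name;
    int start;
    int end;
};

// Gene name -> names of the probes fully contained in that gene.
typedef std::map<std::string, std::vector<std::string> > GeneToProbes;

// Entries are created only when a gene gets its first probe, so nothing
// has to be sized in advance; links.size() is known once the loops end.
GeneToProbes link_probes(const std::vector<Interval>& genes,
                         const std::vector<Interval>& probes) {
    GeneToProbes links;
    for (std::size_t g = 0; g < genes.size(); ++g) {
        for (std::size_t p = 0; p < probes.size(); ++p) {
            if (probes[p].start >= genes[g].start &&
                probes[p].end <= genes[g].end) {
                links[genes[g].name].push_back(probes[p].name);
            }
        }
    }
    return links;
}
```

With the chr1 genes and probes from this thread, only `g1` receives an entry, holding `p1`. After the loops the container's size is fixed, so the result can be copied into a pre-sized list instead of growing an R object element by element.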