[Rcpp-devel] fun(Times) with STL

15 messages · Silkworth,David J., Douglas Bates, Wray, Christopher +1 more

#
I ran into this yesterday when I needed the position of the smallest
time value in a small queue of times that I had extracted from a more
complex structure.  I ended up with this problem during a separate
work-around, which is a story I'd love to tell also.

But, to the point.  I explored using the STL in areas where I had not
previously seen any examples.

Nothing in the Rcpp sources suggested that min_element() or
max_element() functions were supported, but I tried anyway.

Only by taking note of the many compiler errors I generated was I able
to come up with this stable line of code:
double* myIterator = std::min_element (TimeQ.begin(), TimeQ.end());

Now, in C++ training I flunked pointers and I skipped class for
templates.  But now I need to know what the heck is myIterator.
So, I managed to put it into a single element NumericVector which I
could return to R and examine.

Okay, so this mysterious "iterator" thing is quite intuitively the
expected result of a min_element function (with some pointer witchcraft
included).

To get the position of this item in the TimeQ I ended up building a
small loop.

for (int col = 0; col < TimeQ.size(); col++) {
    if (TimeQ[col] == *myIterator) {
        show_position[0] = col;
        break;
    }
}

My question is: is there a more elegant way to get the position value
that I need?  This code will be traversed tens of thousands of times
in the function I am developing.

Here is the full example as I have distilled it down:
library(inline)  # provides cxxfunction()

src <- '
Rcpp::NumericVector TimeQ(arg1);
Rcpp::NumericVector show_iterator(1);
Rcpp::IntegerVector show_position(1);

double* myIterator = std::min_element(TimeQ.begin(), TimeQ.end());

show_iterator[0] = *myIterator;

for (int col = 0; col < TimeQ.size(); col++) {
    if (TimeQ[col] == *myIterator) {
        show_position[0] = col;
        break;
    }
}

return show_position;  // alternatively: return show_iterator;
'
fun <- cxxfunction(signature(arg1="numeric"), src, plugin="Rcpp")

Times <- c(1944.285, 2920.969, 1720.230, 1264.438, 3607.507, 1720.230, 25176.020)
fun_test <- fun(Times)

#
On Sat, Jun 4, 2011 at 11:08 AM, Silkworth,David J.
<SILKWODJ at airproducts.com> wrote:
The usual idiom is

double TQmin = *std::min_element(TimeQ.begin(), TimeQ.end());
#
On Sat, Jun 4, 2011 at 11:26 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
Sorry.  I didn't read through to the end of your message.  If you want
the index of the minimum element you can use

int min_el_ind = *std::min_element(TimeQ.begin(), TimeQ.end()) - TimeQ.begin();

The beauty of iterators is that they have more flexibility than simple
pointers and they also carry more information, so that the difference
between two iterators is the index of an element in an array-like
structure.

Dirk and Romain recommended the freely-available book "C++ Annotations"
to me (just google the title to find out where to download it) and
that contains several chapters explaining iterators, STL storage
classes (on which much of the Rcpp class structures are patterned) and
the STL algorithms.  Definitely worth reading if you have the time.
#
On 4 June 2011 at 11:37, Douglas Bates wrote:
| Dirk and Romain recommended the freely-available book "C++ Annotations"
| to me (just google the title to find out where to download it) and

Also:

edd at max:~$ apt-cache search "c\+\+-annotations"
c++-annotations - Extensive tutorial and documentation about C++
c++-annotations-contrib - Extensive tutorial and documentation about C++ - contributed files
c++-annotations-dvi - Extensive tutorial and documentation about C++ - DVI output
c++-annotations-html - Extensive tutorial and documentation about C++ - html output
c++-annotations-latex - Extensive tutorial and documentation about C++ - LaTeX output
c++-annotations-pdf - Extensive tutorial and documentation about C++ - PDF output
c++-annotations-ps - Extensive tutorial and documentation about C++ - Postscript output
c++-annotations-txt - Extensive tutorial and documentation about C++ - text output
edd at max:~$ 

so if you have a suitable Linux distribution then an electronic copy is just
an apt-get command away.

Dirk
#
Many thanks, Doug, I feel so-o-o close, but the revised example does not compile. 
I have taken the liberty to condense our discussion a little here.
Here is the revised example as I have distilled it down and edited (does NOT compile):

src <- '
Rcpp::NumericVector TimeQ(arg1);
Rcpp::IntegerVector show_position(1);
int min_el_ind = *std::min_element(TimeQ.begin(), TimeQ.end()) - TimeQ.begin();
show_position[0] = min_el_ind;
return show_position;
'
fun <- cxxfunction(signature(arg1="numeric"),src,plugin="Rcpp")
Times<-c(1944.285,2920.969,1720.230,1264.438,3607.507,1720.230,25176.020);
fun_test<- fun(Times)
#
Thanks, Dirk, the rest of us can just access this URL:

http://cppannotations.sourceforge.net/cppannotations/html/

I was also able to get a used copy of "STL Tutorial and Reference Guide"
for 12 cents (plus $3.99 shipping) through Amazon.

Examples are still most precious, however.

#
try:
int min_el_ind = std::min_element(TimeQ.begin(), TimeQ.end()) - TimeQ.begin();
#
A thinko on my part.  Remove the '*' in front of the std::min_element.
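
For readers following along, the corrected idiom can be tried outside R. The sketch below is a minimal illustration that assumes a plain std::vector<double> as a stand-in for Rcpp::NumericVector (the thread only relies on begin() and end()); the function name index_of_min is hypothetical and is not part of Rcpp or the STL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: zero-based position of the smallest element.
// std::min_element returns an iterator to the first minimum; the distance
// from begin() to that iterator is its index.
std::ptrdiff_t index_of_min(const std::vector<double>& times) {
    assert(!times.empty());  // on an empty range min_element returns end()
    auto it = std::min_element(times.begin(), times.end());
    return std::distance(times.begin(), it);
}
```

With the Times values from the original post this gives 3, the zero-based position of 1264.438 (position 4 in R's one-based indexing). Dereferencing the iterator first, as in the version that failed to compile, yields a double, and a double minus an iterator is not a defined operation.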


2 days later
#
I want you guys to know that I appreciate all the effort that you put
into the Rcpp package and this list.  I am obviously one of the early
people to hurdle the new "lowered bar" for C++ development in R.

I have developed a function that builds a series of vectors and a matrix
of undetermined size.  Rather than attempt dynamic objects (which I
wouldn't know how to do anyway) I have been able to initialize with row
dimensions that cannot be exceeded on these objects.  I expect that code
that merely assigns values to these addresses will run faster than code
that must allocate space for each row entry anyway.  But, at the end of
the process, my code realizes the true extent of these objects.  It
would be really nice to clean these up before return to R.

I am aware of a Dimension class, but really don't know how to go about
using this in this case.

For the vectors it was relatively simple to execute a loop of erase()
methods:

// this works perfectly
for(int t=newTimes.size()-1; t>row;t--)  {
newTimes.erase(t);
}

Alas, for the Rcpp::IntegerMatrix this was not so easy.

Rcpp::IntegerMatrix opd (exists and has been populated)

for(int v=0; v<num_opl; v++)  {
for(int t=newTimes.size()-1; t>row;t--)  {
opd(_,v).erase(t);
}
}

This results in the compiler error: 'struct Rcpp::Matrix<13>::Column'
has no member named 'erase'

I'm sorry, but I am stuck.
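
On the vector side, the element-by-element erase loop above can be collapsed into a single operation. This is a minimal sketch under assumptions: a std::vector<int> stands in for the Rcpp vector, and keep_first is a hypothetical helper name. resize() is a standard std::vector member; whether the Rcpp vector classes offer an equivalent is not established in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: drop everything after the first n elements.
void keep_first(std::vector<int>& v, std::size_t n) {
    assert(n <= v.size());  // this sketch only shrinks, it never grows
    v.resize(n);            // one call instead of one erase() per element
}
```

For the loop in the message, which keeps indices 0 through row, the matching call would be keep_first(newTimes, row + 1).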
#
On Tue, Jun 7, 2011 at 8:01 AM, Silkworth,David J.
<SILKWODJ at airproducts.com> wrote:
I think we are stuck too in that we can't tell what you are trying to
do.  (Well, at least I can't.)

Why do you bother erasing elements beyond the size() of the vector?
I'm pretty sure that will have no effect.

Perhaps if you were a bit more explicit about what you are trying to
do we could help.  As it stands you are telling us that you have tried
to use a method that isn't defined and all we can say is, "Yep, it
isn't defined."
#
Further to our personal exchange I have reviewed Romain's recent post in reply to Fatemeh Riahi
http://lists.r-forge.r-project.org/pipermail/rcpp-devel/2011-May/002256.html

I've attempted to re-write my simple example in the list of vectors format as shown by Romain.

src<-'
int s = 7;  // result of original oversize estimate before process runs
int c = 3;  // known column count established from a list argument (variable to function)
int r = 4;  // number of rows that more complex process found necessary to fill
int i = 0;
Rcpp::List Lmv(c);  // list of matrix column vectors
for (i=0; i<c; i++) {
    Rcpp::IntegerVector m(s);
    // insertion of next three lines partially fills the matrix column vectors
    // the same way as previous sample code
    for (int j=0; j<r; j++) {
        m[j] = (i+1)*(j+1);
    }
    Lmv[i] = m;
}
// this code does not access the vector elements as I wish
//for(int j=0; j<r;j++)  {
//for( i=0;i<r;i++)  {
//Lmv[i][j]= (i+1)*(j+1);} }

return Lmv;
'

fun <- cxxfunction(signature(),src, plugin = "Rcpp")

I can create the partially filled matrix as a list of vectors.

But I don't know how to access the individual vector elements outside of the initialization loop.

Then, when I tried to perform the vector erasures I had more trouble:

Adding these lines before the return call does NOT work as I wanted either.
for(i=0; i<c; i++)  {
for(int e=s-1; e>r-1;e--)  {
 Lmv[i].erase(e); }  // redimension the vector
}

I'm trying, but still stuck.



-----Original Message-----
From: Silkworth,David J. 
Sent: Tuesday, June 07, 2011 3:42 PM
To: 'Douglas Bates'
Subject: RE: [Rcpp-devel] redimension help

My apologies, Doug.

I've tried to distill the issue to a "simple", but complete example.

src <- '
int s = 7;  // result of original oversize estimate before process runs
int c = 3;  // known column count established from a list argument (variable to function)
Rcpp::IntegerVector v(s);
Rcpp::IntegerMatrix m(s,c);

int r = 4;  // number of rows that more complex process found necessary to fill
for(int x=1; x<r+1; x++)  { v[x-1]=x; }  // just partial fill as process would
// fill r rows in each of the c columns (the column loop must stop at c, not r)
for(int j=0; j<c; j++)  { for(int i=0; i<r; i++)  { m(i,j) = (i+1)*(j+1); } }

for(int e=s-1; e>r-1; e--)  { v.erase(e); }  // redimension the vector

Rcpp::List L=Rcpp::List::create(v,m);
return L;
'

fun <- cxxfunction(signature(),src, plugin = "Rcpp")

The erase loop on the vector performs a redimension "clean-up" so to speak from my over-estimate of required dimension.  The estimate is made at run time, just that it is made before the matrix is dimensioned and the rest of the function has executed.  The reason that there is shrinkage is that there are duplicate entries that the process finds and adjusts its matrix fill operation for.

I cannot duplicate such an erase operation on the matrix.  But I think that there is probably a different approach that would work on both the vector and matrix, if I was only smart enough.

I needed the matrix, because its number of columns is only determined at run time, so I need a way to have indexed labels for the column vectors that it creates.



#
On 7 June 2011 at 22:09, Silkworth,David J. wrote:
| Further to our personal exchange I have reviewed Romain's recent post in reply to Fatemeh Riahi
| http://lists.r-forge.r-project.org/pipermail/rcpp-devel/2011-May/002256.html
| 
| I've attempted to re-write my simple example in the list of vectors format as shown by Romain.
| 
| src<-'
| int s = 7;  // result of original oversize estimate before process runs
| int c=3;  //known column count established from a list argument (variable to function)
| int r = 4;  // number of rows that more complex process found necessary to fill
| int i=0;
|      Rcpp::List Lmv(c) ;  // list of matrix column vectors
|      for(  i=0; i<c; i++)  {
|          Rcpp::IntegerVector m(s) ;
| // insertion of next three lines partially fills the matrix column vectors
| // the same way as previous sample code
| for( int j=0; j<r; j++){
| m[j]= (i+1)*(j+1);
| }
| Lmv[i]=m;
| }
| // this code does not access the vector elements as I wish
| //for(int j=0; j<r;j++)  { 
| //for( i=0;i<r;i++)  {
| //Lmv[i][j]= (i+1)*(j+1);} }

Try this:

require(inline)

src <- '
  int s = 7;  // result of original oversize estimate before process runs
  int c = 3;  //known column count established from a list argument (variable to function)
  int r = 4;  // number of rows that more complex process found necessary to fill
  int i = 0;
  Rcpp::List Lmv(c) ;  // list of matrix column vectors
  for(  i=0; i<c; i++)  {
     Rcpp::IntegerVector m(s) ;
     // insertion of next three lines partially fills the matrix column vectors
     // the same way as previous sample code
     for( int j=0; j<r; j++){
       m(j)=  (i+1)*(j+1);
     }
     Lmv[i] = m;
  }
  // elements can be reached after initialization by first binding the
  // list entry to an IntegerVector
  for(int j=0; j<c; j++)  {
     for( i=0; i<r; i++)  {
        Rcpp::IntegerVector v = Lmv(j);
        v(i) = (i+1)*(j+1);
     }
  }
  return Lmv;
'

fun <- cxxfunction(signature(), src, plugin = "Rcpp")
print(fun())


And please do search the list archives. I explained a few times already why
using [][] cannot work.

| 
| return Lmv;
| 
| '
| 
|  fun <- cxxfunction(signature(),src, plugin = "Rcpp")
| 
| I can create the partially filled matrix as a list of vectors.
| 
| But I don't know how to access the individual vector elements outside of the initialization loop.
| 
| Then, when I tried to perform the vector erasures I had more trouble:
| 
| Adding these lines before the return call does NOT work as I wanted either.
| for(i=0; i<c; i++)  {
| for(int e=s-1; e>r-1;e--)  {
|  Lmv[i].erase(e); }  // redimension the vector
| }
| 
| I'm trying, but still stuck.


Not all STL methods exist in all Rcpp classes.

Dirk



#
I have not been slack on reviewing the archives, but they ARE hard to
search.

So far, I am led to believe that a list of vectors can only be accessed
during initialization.
This would seem to be a real problem for entering elements in looped
increments tens of thousands of times.

The original matrix scheme DOES work, but I think I have to improve the
estimated number of rows, so that I don't create an excessive memory
footprint.  Once dimensioned, a matrix apparently is not going to be resized.

I am aware that as long as my individual vectors and my matrix row
lengths are the same size that these will combine into an
Rcpp::DataFrame.

Once in R I would have to either find a way to ignore the unfilled
elements or hopefully the following code would not be too oppressive:

RcppDF<-RcppDF[1:r,]

I just thought it would be possible to do in C++
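
The trimming can also be done before returning, by copying the filled rows into a right-sized object. The sketch below is only an illustration under assumptions: it uses a plain std::vector<int> in column-major order (the layout R uses for matrices) as a stand-in for Rcpp::IntegerMatrix, and trim_rows is a hypothetical name, not an Rcpp function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy the first r rows of an s-by-c column-major
// matrix into a new r-by-c column-major buffer.
std::vector<int> trim_rows(const std::vector<int>& m,
                           std::size_t s, std::size_t c, std::size_t r) {
    assert(m.size() == s * c && r <= s);
    std::vector<int> out(r * c);
    for (std::size_t j = 0; j < c; ++j) {      // each column
        for (std::size_t i = 0; i < r; ++i) {  // only the filled rows
            out[i + j * r] = m[i + j * s];
        }
    }
    return out;
}
```

The copy costs one pass over the r-by-c filled region and happens once at the end, so it should be small next to a fill loop that runs tens of thousands of times. The same loop shape would presumably carry over to copying into a freshly allocated r-by-c Rcpp matrix, though that is not tested here.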






#
On 7 June 2011 at 23:18, Silkworth,David J. wrote:
| I have not been slack on reviewing the archives, but they ARE hard to
| search.

A Google query such as

    site:gmane.org gmane.comp.lang.r.rcpp Rcpp::List

is not that hard. YMMV.
 
| So far, I am led to believe that a list of vectors can only be accessed
| during initialization.

Wrong.

| This would seem to be a real problem for entering elements in looped
| increments tens of thousands of times.
| 
| The original matrix scheme DOES work, but I think I have to improve the
| estimated number of rows, so that I don't create an excessive memory
| footprint.  Once dimensioned, a matrix apparently is not going to be resized.
| 
| I am aware that as long as my individual vectors and my matrix row
| lengths are the same size that these will combine into an
| Rcpp::DataFrame.
| 
| Once in R I would have to either find a way to ignore the unfilled
| elements or hopefully the following code would not be too oppressive:
| 
| RcppDF<-RcppDF[1:r,]

What's your obsession with [] ?  They _cannot take multiple arguments_ as
you'd need for i,j coordinates.
 
| I just thought it would be possible to do in C++

Wishing alone does not make it so.  

Dirk
 
 
#
David,

One last thing: you are trying something difficult with large
multi-dimensional objects.

I really recommend that you try to become more familiar with a more STL-ish
way of doing things. Try something simpler on std::vector<> et al -- how you
can change dimension, expand, remove, ... without _ever_ having to worry
about manual memory allocation via new / delete (or, worse, malloc /
free). That is a good thing.  If you really know the size of objects, try
reserve() or resize().

Our Rcpp objects are pretty similar in some aspects, but because they really
shadow the underlying R objects (those SEXPs) they are still different.  It
takes some getting used to, and I have no better recommendation than to read
more documentation and working code -- there are 20+ packages on CRAN using
Rcpp.  You may find something close to your needs for closer study.

Hope this helps, Dirk
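
To make the std::vector suggestion concrete, here is a minimal sketch of the reserve-then-append pattern. It is an illustration only: collect_unique and its duplicate-skipping rule are hypothetical stand-ins for the "more complex process" described earlier in the thread, not code from any of the posters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the process in the thread: keep each value the
// first time it appears and skip later duplicates.
std::vector<double> collect_unique(const std::vector<double>& input) {
    std::vector<double> out;
    out.reserve(input.size());  // capacity for the worst case; size() stays 0
    for (std::size_t k = 0; k < input.size(); ++k) {
        if (std::find(out.begin(), out.end(), input[k]) == out.end()) {
            out.push_back(input[k]);  // size() grows only for kept values
        }
    }
    assert(out.size() <= input.size());
    return out;  // size() is already the true extent, so nothing to trim
}
```

Because reserve() sets capacity without changing size(), the upper-bound estimate costs a single allocation and there are no unfilled trailing elements to remove afterwards. The linear std::find is only there to keep the sketch short.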