Hi Folks,
R has known speed issues for recursive definitions.
There was a thread "Extremely slow recursion in R?" in
August 2006 (24 Aug, from Jason Liao), with some
interesting comparisons between programming languages.
I'm re-opening the topic, with an ulterior motive
(stated at the end).
The starting point was the question "How many distinct
increasing sequences of length n can be made from k
integers (1:k)?", i.e. what is the size of the sample
space of
sort(sample((1:k),n,replace=TRUE))
Let this number be Nnk(n,k). I aptly observed that
Nnk(1,k) = k
Nnk(n,k) = Nnk(n-1,k) + sum[r= 1 to k](Nnk(n-1,r))
So I slickly wrote a recursive definition:
Nnk<-function(n,k){
if(n==1) {return(k)} else {
R<-0;
for(r in (1:k)) R<-(R+Nnk(n-1,k-r+1,depth))
}
return(R)
}
I then ran a few test cases, to check timing etc.:
Nnk(21, 7) 34 seconds
Nnk(21, 8) 120 seconds
Nnk(21, 9) 384 seconds
Nnk(21,10) 1141 seconds
I then complacently set it to work out the number I happened
to really want:
Nnk(21,20)
24 hours later, no result; and an extra "test line" (not
shown above) at the start of the function had prodced no
output, so the function had not even returned to the top
level for the first time.
Then, with heavy thunderstorms rolling in, and my mains
power beginning to flicker, I closed down the computer.
A quick extrapolation based on the above results, and a
few others, suggested that the time for completion of
Nnk(21,20) would be about 2 years.
Then I thought about it.
The recursion relation in fact shows that Nnk(n,k) is
the maximum value in cumsum(Nnk(n,r)) over r = 1:k.
Hence (for n>=1 and r>=1):
Nnk<-function(n,k){
K<-rep(1,k)
for(i in (1:n)){
K<-cumsum(K)
}
return(max(K))
}
This definition then gave me, in so brief a blink of the
eye that I could make no estimate of it, the result:
Nnk(21,20) = 131282408400
In fact, I had to go up to things like
Nnk(1000,200) = 3.335367e+232
before I could sensibly perceive the delay (about 0.5 seconds
in this case).
On the old recursive definition, the computation might have
outlasted the Universe.
ON THAT BASIS: I hereby claim the all-time record for inefficient
programming in R.
Challengers are invited to strut their stuff ...
Best wishes to all, and happy end of week,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 06-Jul-07 Time: 17:53:55
------------------------------ XFMail ------------------------------
Recursion in R ...
5 messages · (Ted Harding), Alberto Monteiro, Uwe Ligges +1 more
Ted Harding wrote:
So I slickly wrote a recursive definition:
Nnk<-function(n,k){
if(n==1) {return(k)} else {
R<-0;
for(r in (1:k)) R<-(R+Nnk(n-1,k-r+1)) # ,depth))
}
return(R)
}
You are aware that this is equivalent to:
Nnk1 <- function(n, k) { prod(1:(n+k-1)) / prod(1:n) / prod(1:(k-1)) }
aren't you?
ON THAT BASIS: I hereby claim the all-time record for inefficient programming in R. Challengers are invited to strut their stuff ...
No, I don't think I can bet you unintentionally, even though my second computer program that I ever wrote in life had to be aborted, because it consumed all the resources of the computer. Alberto Monteiro
Alberto Monteiro wrote:
Ted Harding wrote:
So I slickly wrote a recursive definition:
Nnk<-function(n,k){
if(n==1) {return(k)} else {
R<-0;
for(r in (1:k)) R<-(R+Nnk(n-1,k-r+1)) # ,depth))
}
return(R)
}
You are aware that this is equivalent to:
Nnk1 <- function(n, k) { prod(1:(n+k-1)) / prod(1:n) / prod(1:(k-1)) }
or
Nnk2 <- function(n, k) { gamma(n+k) / gamma(n+1) / gamma(k) }
or most easily:
Nnk3 <- function(n, k) choose(n+k-1, n)
Uwe Ligges
aren't you?
ON THAT BASIS: I hereby claim the all-time record for inefficient programming in R. Challengers are invited to strut their stuff ...
No, I don't think I can bet you unintentionally, even though my second computer program that I ever wrote in life had to be aborted, because it consumed all the resources of the computer. Alberto Monteiro
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On 07-Jul-07 10:34:03, Uwe Ligges wrote:
Alberto Monteiro wrote:
Ted Harding wrote:
So I slickly wrote a recursive definition:
Nnk<-function(n,k){
if(n==1) {return(k)} else {
R<-0;
for(r in (1:k)) R<-(R+Nnk(n-1,k-r+1)) # ,depth))
}
return(R)
}
You are aware that this is equivalent to:
Nnk1 <- function(n, k) { prod(1:(n+k-1)) / prod(1:n) / prod(1:(k-1)) }
or
Nnk2 <- function(n, k) { gamma(n+k) / gamma(n+1) / gamma(k) }
or most easily:
Nnk3 <- function(n, k) choose(n+k-1, n)
Uwe Ligges
OK, I can recognise a negative binomial coefficient when I see one! I'm wondering, though, what is the "natural" connection between choose(n+k-1, n) and the statement of the original question: What is the number of distinct non-decreasing sequences of length n which can be made using integers chosen from (1:k)? (repetition allowed, of course) (The fact that this leads to a recurrence relationship which is satisfied by choose(n+k-1,n) is not what I mean by "natural" -- I'm looking for a correspondence between such a sequence, and a choice of n out of (n+k-1) somethings). Ted.
aren't you?
ON THAT BASIS: I hereby claim the all-time record for inefficient programming in R. Challengers are invited to strut their stuff ...
No, I don't think I can bet you unintentionally, even though my second computer program that I ever wrote in life had to be aborted, because it consumed all the resources of the computer. Alberto Monteiro
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-------------------------------------------------------------------- E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk> Fax-to-email: +44 (0)870 094 0861 Date: 07-Jul-07 Time: 12:15:21 ------------------------------ XFMail ------------------------------
On 07/07/2007 7:15 AM, (Ted Harding) wrote:
On 07-Jul-07 10:34:03, Uwe Ligges wrote:
Alberto Monteiro wrote:
Ted Harding wrote:
So I slickly wrote a recursive definition:
Nnk<-function(n,k){
if(n==1) {return(k)} else {
R<-0;
for(r in (1:k)) R<-(R+Nnk(n-1,k-r+1)) # ,depth))
}
return(R)
}
You are aware that this is equivalent to:
Nnk1 <- function(n, k) { prod(1:(n+k-1)) / prod(1:n) / prod(1:(k-1)) }
or
Nnk2 <- function(n, k) { gamma(n+k) / gamma(n+1) / gamma(k) }
or most easily:
Nnk3 <- function(n, k) choose(n+k-1, n)
Uwe Ligges
OK, I can recognise a negative binomial coefficient when I see one! I'm wondering, though, what is the "natural" connection between choose(n+k-1, n) and the statement of the original question: What is the number of distinct non-decreasing sequences of length n which can be made using integers chosen from (1:k)? (repetition allowed, of course) (The fact that this leads to a recurrence relationship which is satisfied by choose(n+k-1,n) is not what I mean by "natural" -- I'm looking for a correspondence between such a sequence, and a choice of n out of (n+k-1) somethings).
Colour the integers 1:k red and the integers 1:(n-1) black, and throw them in a hat. Select n things out of the hat. Put the red things in order: that's the strictly increasing subsequence. Put the black things in order. Proceeding from smallest to largest, if you see a black i, duplicate the i'th element in the current version of the sequence. For example, if k=5, n=4, you might draw red 2, 3 and black 1, 2, so you'd build your sequence as 2 3 2 2 3 2 2 2 3 or you might draw red 1, 4, 5 and black 2, so you'd output 1 4 4 5 Duncan Murdoch