[R-pkg-devel] install.R running out of memory

On 11/3/19 1:05 PM, Viktor Gal wrote:
R is not optimized for these cases (generated code, a source file with 
 >100,000 lines of code), but R has bindings for a large number of 
external libraries, so it should be possible to make the bindings several 
orders of magnitude smaller, and then they'd likely work well.

Making R work well on files like shogun.R would probably require a large 
amount of non-trivial work on R internals. I would be surprised if it 
were just, say, a memory leak we could fix to solve the issue quickly. It 
may well be that some data structures and algorithms simply won't scale 
to this extent. If you want to find out, you can debug using the usual R 
means (the R profiler, see ?Rprof; running the script using ?source, 
perhaps with source references disabled), but to interpret the results you 
may have to go deep into the implementation of R and, in this case, of S4.
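
A minimal sketch of that kind of debugging session, assuming the generated 
file can be sourced on its own outside the installation process. The 
function name profile_source and its arguments are hypothetical and only 
serve the illustration; Rprof and source are the standard R functions 
mentioned above.

```r
# Sketch: profile the sourcing of a large generated R file.
# `profile_source` is a hypothetical helper, not part of R.
profile_source <- function(path,
                           envir = new.env(parent = globalenv()),
                           out = tempfile(fileext = ".out")) {
  Rprof(out)                        # start the sampling profiler
  on.exit(Rprof(NULL), add = TRUE)  # stop profiling, also on error
  # keep.source = FALSE parses the file without source references
  source(path, local = envir, keep.source = FALSE)
  out                               # path of the profiler output file
}
```

The returned file can then be passed to summaryRprof() to see where the 
time goes, for example summaryRprof(profile_source("shogun.R"))$by.total. 
For a very short script the profiler may not record any samples.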

Preparation for lazy loading starts by sourcing the file; you can find 
the details in the source code of package installation and in the 
documentation. I tried it quickly and saw a lot of time spent in S4, which 
is not surprising, as the generated file stresses S4 well beyond what is 
normally the case with R. But I would not be surprised if other 
bottlenecks showed up later, and even if you managed to prepare 
the package for lazy loading, there would probably be significant 
overheads at runtime. Still, you could experiment with modifying the code 
generator to avoid the bottlenecks you identify.
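
One way to get a feeling for how S4 behaves under this kind of load is to 
time the definition of many trivial classes. The following is only a rough 
sketch: the class names are hypothetical, the classes are far simpler than 
what a generator would emit, and it is not meant to reproduce the 
structure of shogun.R.

```r
library(methods)

# Sketch: time the definition of n trivial S4 classes in the global
# environment, then remove them again. All names are hypothetical.
time_s4_classes <- function(n, prefix = "HypotheticalWrapper") {
  classes <- paste0(prefix, seq_len(n))
  elapsed <- system.time(
    for (cls in classes) {
      # one slot holding an external pointer, as a wrapper class might
      setClass(cls, slots = c(ref = "externalptr"), where = globalenv())
    }
  )[["elapsed"]]
  # count how many of the classes are actually defined
  defined <- sum(vapply(classes,
                        function(cls) isClass(cls, where = globalenv()),
                        logical(1)))
  # clean up so repeated runs start from the same state
  for (cls in classes) removeClass(cls, where = globalenv())
  list(n = n, defined = defined, elapsed = elapsed)
}
```

Comparing, say, time_s4_classes(100)$elapsed with 
time_s4_classes(1000)$elapsed gives a first impression of how the cost 
grows with the number of definitions.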

If your primary goal is to create R bindings for an external library, 
I'd recommend having a look at how other packages do it to see what is 
scalable (there should be a way to make the code much smaller, and in 
most cases easy to write by hand, even though some interfaces are 
generated, too).

Best
Tomas