Request: Increasing MAX_NUM_DLLS in Rdynload.c
12 messages · Henrik Bengtsson, Steve Bronder, Jeroen Ooms +4 more

This is a request to increase MAX_NUM_DLLS in Rdynload.c from 100 to 500. On line 131 of Rdynload.c, this changes

#define MAX_NUM_DLLS 100

to

#define MAX_NUM_DLLS 500

In development of the mlr package, there have been several episodes in the past where we have had to break up unit tests because of the "maximum number of DLLs reached" error. This error has been an inconvenience that is going to keep happening as the package continues to grow. Is there more than meets the eye with this error, or would everything be okay if the above line changed? Would that have a larger effect on other parts of R?

As R grows, we are likely to see more 'meta-packages' such as the Hadley-verse, caret, mlr, etc. that need an increasing number of DLLs loaded at any point in time to conduct effective unit tests. If MAX_NUM_DLLS is set to 100 for a very particular reason then I apologize, but if it is possible to increase MAX_NUM_DLLS it would at least make the testing at mlr much easier.

I understand you are all very busy, and thank you for your time.

Regards,

Steve Bronder
Website: stevebronder.com
Phone: 412-719-1282
Email: sbronder at stevebronder.com
One reason for hitting the MAX_NUM_DLLS (= 100) limit is that some packages don't unload their DLLs when they are themselves unloaded. In other words, there may be left-over DLLs just sitting there doing nothing but occupying space. You can remove these using:

R.utils::gcDLLs()

Maybe that will help you get through your tests (as long as you're unloading packages). gcDLLs() will look at base::getLoadedDLLs() and its content, compare it to loadedNamespaces(), and unregister any "stray" DLLs that remain after the corresponding packages have been unloaded.

I think it would be useful if R CMD check would also check that DLLs are unregistered when a package is unloaded (https://github.com/HenrikBengtsson/Wishlist-for-R/issues/29), but of course, someone needs to write the code / a patch for this to happen.

/Henrik
Thanks Henrik, this is very helpful! I will try this out on our tests and see if gcDLLs() has a positive effect.

mlr currently has tests broken down by learner type, such as classification, regression, forecasting, clustering, etc. There are 83 classifiers alone, so even when loading and unloading across learner types we can still hit the MAX_NUM_DLLS error, meaning we'll have to break them down further (or maybe we can be clever with gcDLLs()?). I'm CC'ing Lars Kotthoff and Bernd Bischl to make sure I am representing the issue well.

Regards,

Steve Bronder
Website: stevebronder.com
Phone: 412-719-1282
Email: sbronder at stevebronder.com
On Tue, Dec 20, 2016 at 7:04 AM, Henrik Bengtsson
<henrik.bengtsson at gmail.com> wrote:
> One reason for hitting the MAX_NUM_DLLS (= 100) limit is that some
> packages don't unload their DLLs when they are themselves unloaded.

I am surprised by this. Why does R not do this automatically? What is the case for keeping the DLL loaded after the package has been unloaded? What happens if you reload another version of the same package from a different library after unloading?
It's not always clear when it's safe to remove the DLL. The main problem that I'm aware of is that native objects with finalizers might still exist (created by R_RegisterCFinalizer etc). Even if there are no live references to such objects (which would be hard to verify), it still wouldn't be safe to unload the DLL until a full garbage collection has been done. If the DLL is unloaded, then the function pointer that was registered becomes a pointer into the memory where the DLL was, leading to an almost certain crash when such objects get garbage collected.

A better approach would be to just remove the limit on the number of DLLs, dynamically expanding the array if/when needed.
Steve Bronder <sbronder at stevebronder.com>
on Tue, 20 Dec 2016 01:34:31 -0500 writes:
> Thanks Henrik this is very helpful! I will try this out on our tests and
> see if gcDLLs() has a positive effect.
> mlr currently has tests broken down by learner type such as classification,
> regression, forecasting, clustering, etc.. There are 83 classifiers alone
> so even when loading and unloading across learner types we can still hit
> the MAX_NUM_DLLS error, meaning we'll have to break them down further (or
> maybe we can be clever with gcDLLs()?). I'm CC'ing Lars Kotthoff and Bernd
> Bischl to make sure I am representing the issue well.
This came up *here* in May 2015 and then May 2016 ... did you not find it when googling?

Hint: Use

    site:stat.ethz.ch MAX_NUM_DLLS

as the search string in Google, so it will basically only search the R mailing list archives.

Here's the start of that thread:

https://stat.ethz.ch/pipermail/r-devel/2016-May/072637.html

There was not a clear conclusion back then, notably as Prof Brian Ripley noted that 100 had already been an increase and that a large number of loaded DLLs decreases lookup speed.

OTOH (I think others have noted this) a large number of DLLs only penalizes those who *do* load many, and we should probably increase it.

Your use case of "hyper packages" which load many others simultaneously is somewhat convincing to me, insofar as the general feeling is that memory should be cheap and limits should not be low.

(In spite of Brian Ripley's good reasons against it, I'd still aim for a *dynamic*, i.e. automatically increased, list here.)

Martin Maechler
On 20 December 2016 at 17:40, Martin Maechler wrote:
| (In spite of Brian Ripley's good reasons against it, I'd still
| aim for a *dynamic*, i.e. automatically increased, list here.)

Yes. Start with 10 or 20, add 10 as needed. Still fast in the 'small N' case and no longer a road block for the 'big N' case required by mlr et al.

As a C++ programmer, I am now going to hug my std::vector and quietly retreat.

Dirk
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
Hi, Dirk:
On 12/20/2016 10:56 AM, Dirk Eddelbuettel wrote:
> Yes. Start with 10 or 20, add 10 as needed. Still fast in the 'small N'
> case and no longer a road block for the 'big N' case required by mlr et
> al. As a C++ programmer, I am now going to hug my std::vector and
> quietly retreat.
May I humbly request a translation of "std::vector" for people like me
who are not familiar with C++?
I got the following:
> install.packages('std')
Warning in install.packages :
package 'std' is not available (for R version 3.3.2)
Thanks,
Spencer Graves
See inline.

On Tue, Dec 20, 2016 at 12:14 PM, Spencer Graves
<spencer.graves at prodsyse.com> wrote:
| Hint: Use
|     site:stat.ethz.ch MAX_NUM_DLLS
| as search string in Google, so it will basically only search the
| R mailing list archives
I did not know this and apologize. I starred this email so I can use it next time I have a question or request. I did find (and left a comment on) the stackoverflow question in which you left an answer to this question. http://stackoverflow.com/a/37021455/2269255
| There was not a clear conclusion back then, notably as
| Prof Brian Ripley noted that 100 had already been an increase
| and that a large number of loaded DLLs decreases lookup speed.
|
| OTOH (I think others have noted this) a large number of DLLs
| only penalizes those who *do* load many, and we should probably
| increase it.
Am I correct in understanding that the decrease in lookup speed only happens when a large number of DLLs are loaded? If so, this is an expected cost of having many DLLs, and one that I, and I would guess other developers, would be willing to pay to have more DLLs available. If increasing MAX_NUM_DLLS would increase R's fixed memory footprint by a significant amount, then I think that's a reasonable argument against the increase in MAX_NUM_DLLS.
| Your use case of "hyper packages" which load many others
| simultaneously is somewhat convincing to me, insofar as the
| general feeling is that memory should be cheap and limits should
| not be low.
It should also be pointed out that even in the case of "hyper packages" like mlr, this is only an issue during unit testing. I wonder if there is some middle ground here? Would it be difficult to have a compile flag that would change MAX_NUM_DLLS when compiling R from source? I believe this would allow us to increase MAX_NUM_DLLS when testing in Travis and Jenkins while keeping the same footprint for regular users.
| (In spite of Brian Ripley's good reasons against it, I'd still
| aim for a *dynamic*, i.e. automatically increased, list here.)

> Yes. Start with 10 or 20, add 10 as needed. Still fast in the 'small N'
> case and no longer a road block for the 'big N' case required by mlr et al.
This would be nice! Though my concern is the R-core team's time. This is the best answer, but I don't feel comfortable requesting it because I can't help with this and do not want to take up R-core's time without a very significant reason.

Unit testing for a meta-package is a particular case, though I think an important one which will impact R over the long term. The answers from least to most complex are something like:

1. Do nothing
2. Increase MAX_NUM_DLLS
3. Compiler flag for MAX_NUM_DLLS (I actually have no reference for how difficult this would be)
4. Change to dynamic loading

I'm requesting (2) because I think it's a simple short-term answer until someone has time to sit down and work out (4).
> As a C++ programmer, I am now going to hug my std::vector and quietly retreat.
- Steve Bronder
On Tue, Dec 20, 2016 at 7:39 AM, Karl Millar <kmillar at google.com> wrote:
> It's not always clear when it's safe to remove the DLL. [...] Even if
> there are no live references to such objects (which would be hard to
> verify), it still wouldn't be safe to unload the DLL until a full
> garbage collection has been done.
Very good point. Does base::gc() perform such a *full* garbage collection and thereby trigger all remaining finalizers to be called? In other words, do you think an explicit call to base::gc() prior to cleaning out left-over DLLs (e.g. R.utils::gcDLLs()) would be sufficient?

/Henrik
A better approach would be to just remove the limit on the number of DLLs, dynamically expanding the array if/when needed. On Tue, Dec 20, 2016 at 3:40 AM, Jeroen Ooms <jeroen.ooms at stat.ucla.edu> wrote:
On Tue, Dec 20, 2016 at 7:04 AM, Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:
One reason for hitting the MAX_NUM_DLLS (= 100) limit is that some packages don't unload their DLLs when they are being unloaded themselves.
I am surprised by this. Why does R not do this automatically? What is the case for keeping the DLL loaded after the package has been unloaded? What happens if you reload another version of the same package from a different library after unloading?
It does, but you'd still be relying on the R code ensuring that all of these objects are dead prior to unloading the DLL, otherwise they'll survive the GC. Maybe if the package counted how many such objects exist, it could work out when it's safe to remove the DLL. I'm not sure that it can be done automatically.

What could be done is to keep the DLL loaded, but remove it from R's table of loaded DLLs. That way, there's no risk of dangling function pointers, and a new DLL of the same name could be loaded. You could still run into issues though, as some DLLs assume that the associated namespace exists.

Currently what I do is to never unload DLLs. If I need to replace one, then I just restart R. It's less convenient, but it's always correct.
On 21 December 2016 at 09:42, Karl Millar via R-devel wrote:
> Currently what I do is to never unload DLLs. If I need to replace
> one, then I just restart R. It's less convenient, but it's always
> correct.

Same here. Ever since we built littler in 2006 (!!) I have been doing tests at the command line with fresh 'r' processes. No surprises, no side effects.

Dirk

PS Spencer, if you are still reading, std::vector is described inter alia here: http://en.cppreference.com/w/cpp/container/vector  My point in bringing it up was a deeper one, because that (really widely used) data structure grows as needed. No pointers, no malloc, no horror stories you may have heard from C.
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org