Kevin,
I was glad to see your list. Some of the items were reasons for creating
some of the functions in Hmisc. summarize and mApply in conjunction with
llist handle labeling of output - this is actually quite tricky and the
Hmisc solution isn't perfect. Dropping unused factor levels by default
(with easy override) is an old battle and I agree with you completely that
for everyday data analysis I almost always want to do this. But I haven't
been able to convince anyone else about that, despite repeated attempts.
[.factor in Hmisc drops unused levels by default. To be honest, the one
place I've gotten into trouble with this default occasionally is in
multiple panels in lattice related to consistent assignment of line styles
and symbols across strata when the "groups" variable has missing cells in
some panels.
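As a minimal sketch of the behaviour discussed above, using base R only (this is not the Hmisc implementation, and the data are invented for illustration): subsetting a factor keeps unused levels by default, and either `drop = TRUE` or a second call to `factor()` removes them.

```r
# Illustrative data only: a factor with three levels.
g <- factor(c("a", "b", "c", "a", "b"))

# Default subsetting keeps the now-unused level "c".
kept <- g[g != "c"]
levels(kept)

# Either of these drops levels that no longer occur in the data.
dropped1 <- g[g != "c", drop = TRUE]
dropped2 <- factor(kept)
levels(dropped1)
```

The lattice caveat follows from the same mechanism: if levels are dropped separately within each subset, a given stratum can receive a different integer code from one subset to the next, and plotting styles assigned by code no longer line up.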
I also share your views about namespaces. These have caused numerous
problems for me. It would be nice to have more of a mechanism to put
"feelers" out to the R user community when major changes are planned.
Namespaces seemed to appear on the scene quite quickly. I do see some
advantages for them though. By contrast, I have been very relieved that
S4 classes have not posed a problem for my code that relies on "old"
classes (totally unlike my experience with S-Plus) but any time changes
are made that involve some incompatibilities with old code there should be
some pause.
In Hmisc and Design I reference several functions that were not exported
from packages that now use namespaces. There is an elegant solution with
the package:::function notation, but I have been unable to use this
solution because I use one code base for all versions of R and S-Plus.
This notation generates syntax errors in all but late versions of R.
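One possible workaround, sketched here under assumptions rather than taken from Hmisc or Design: look the object up by name at run time. Because the lookup is an ordinary function call, the source file still parses where `:::` is not valid syntax. The helper name `get_internal` is hypothetical, `print.lm` is used only as an example of an object that stats does not export, and the fallback branch for systems without namespaces is untested on old R or S-Plus.

```r
# Hypothetical helper: fetch an object from a package by name, whether or
# not the package exports it. No special syntax is involved.
get_internal <- function(pkg, name) {
  if (exists("asNamespace", mode = "function")) {
    # Namespace-aware R: look inside the package's namespace.
    get(name, envir = asNamespace(pkg), inherits = FALSE)
  } else {
    # Assumed fallback for systems without namespaces, where package
    # objects are visible on the search path.
    get(name)
  }
}

# Example: print.lm is a registered S3 method in stats but is not exported.
print_lm <- get_internal("stats", "print.lm")
is.function(print_lm)
```

This only sidesteps the syntax problem. The maintenance risk of depending on unexported objects stays the same.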
Let me add to the wish list the creation of some mechanism to better track
improvements and bug fixes in packages, such as a change log link by each
package's area in CRAN, or easy access to CVS information from there.
When I report bugs (e.g., in read.xport in foreign [due somewhat to
problems inherent with SAS's format] or ace or avas in acepack) it would
be nice to see some announcement when the bugs are resolved, or to easily
track this. Even a checkbox that the package maintainer has seen the bug
report even if she/he currently does not have time to work on it would be
very helpful, as would a notation that the bug report was found to be
"buggy".
Frank
---
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University
Wish list
Frank E Harrell Jr writes:
Let me add to the wish list the creation of some mechanism to better track improvements and bug fixes in packages, such as a change log link by each package's area in CRAN, or easy access to CVS information from there. When I report bugs (e.g., in read.xport in foreign [due somewhat to problems inherent with SAS's format] or ace or avas in acepack) it would be nice to see some announcement when the bugs are resolved, or to easily track this. Even a checkbox that the package maintainer has seen the bug report even if she/he currently does not have time to work on it would be very helpful, as would a notation that the bug report was found to be "buggy".
I highly agree with you on this. It would be very nice to have a fully featured bug reporting system, where you could upload patches, discuss improvements to existing packages or to the R core itself, request features and so on. I think that Bugzilla (www.bugzilla.org) would suit these expectations very well. It is the bug tracking system used by huge projects like Mozilla (bugzilla.mozilla.org), Gentoo (bugs.gentoo.org), and Redhat (bugzilla.redhat.com), and based on my own experience I'd say it addresses most of the things you pointed out. It works (at least in Gentoo, which is the one I'm more used to working with) like this: someone files a bug report. In the bug report itself one specifies the type of the report (a bug, a feature improvement, a request to the developer), the severity, and any other relevant information. It is also possible to upload attachments (like proposed patches) or additional information on the report. The bug report is then assigned to a given group or, in the case of packages, to the person in charge of maintaining it. Anyone can then read the bug report and make suggestions or propose fixes (see for example: http://bugs.gentoo.org/show_bug.cgi?id=30784 ). [As opposed to the current system, where the bug report can't even be linked to a website, and all the discussion has to be done via the mailing lists]. The maintainer or any other authorized developer can then accept or reject the proposed suggestions, close the bug as a duplicate or as invalid, or at least indicate that he is aware of the problem and will work on it some time later. Just my two cents, -- []'s Fernando Henrique Ferraz P. da Rosa
Bugzilla is a pain-in-the-arse to maintain, unless they've improved it in the last 9 months. Just my two cents... best, -tony
______________________________________________ R-devel@stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
rossini@u.washington.edu http://www.analytics.washington.edu/ Biomedical and Health Informatics University of Washington Biostatistics, SCHARP/HVTN Fred Hutchinson Cancer Research Center UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email CONFIDENTIALITY NOTICE: This e-mail message and any attachme...{{dropped}}
rossini@blindglobe.net (A.J. Rossini) writes:
Bugzilla is a pain-in-the-arse to maintain, unless they've improved it in the last 9 months. Just my two cents...
It might have (and I wouldn't mind replacing Jitterbug with something that is actually maintained itself!), but there's another rear-end problem... Who's going to do the actual work? This means both fixing bugs and keeping the bug repository current. This is not easy, even for base R. We have the Jitterbug r-bugs site, which at least helps us not to forget bugs that have been reported, but we often forget to close bugs as they are fixed and it often takes a while before someone gets around to sorting the incoming directory. This work is not going away with a more advanced system, and getting the - hmmm - "varied" group of package maintainers to participate sounds like a can of worms to me.
--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907 (p.dalgaard@biostat.ku.dk)
On Sun, 18 Jan 2004, Fernando Henrique Ferraz wrote:
Frank E Harrell Jr writes:
Let me add to the wish list the creation of some mechanism to better track improvements and bug fixes in packages, such as a change log link by each package's area in CRAN, or easy access to CVS information from there. When I report bugs (e.g., in read.xport in foreign [due somewhat to problems inherent with SAS's format] or ace or avas in acepack) it would be nice to see some announcement when the bugs are resolved, or to easily track this.
A lot of wishlist suggestions need at least cooperation from R-core, who may not agree that a change is desirable even if someone else were to write the code. A bug-tracking system for contributed packages is one of the exceptions. There's nothing to stop some package developer(s) from creating a bug tracking system and making accounts available to other people (except the time, resources, security issues, etc). Keeping track of changes is harder. The CVS commit logs for foreign and survival are with the log for R itself on http://developer.r-project.org. It's not even that hard to write R code to read the page and extract entries relevant to that package. For CRAN to list changes to other packages would require cooperation from all the package developers. If the maintainer of acepack isn't sufficiently together to reply to your messages, he probably won't be keeping up with other aspects of change tracking. Even trying to extract a NEWS or Changelog file might not work -- e.g. for survival the Changelog file is Terry Therneau's change log, not my log for changes to the R port. -thomas Thomas Lumley Assoc. Professor, Biostatistics tlumley@u.washington.edu University of Washington, Seattle
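A rough sketch of the kind of R code meant here, assuming the commit log has already been read into a character vector (for instance with `readLines()` on a saved copy of the page). The function name `entries_for` and the sample lines are hypothetical and invented for this example. The real page format is not known here, so a plain fixed-string match stands in for any real parsing.

```r
# Return the log lines that mention a given package name.
# `lines` is a character vector with one log line per element.
entries_for <- function(lines, pkg) {
  grep(pkg, lines, fixed = TRUE, value = TRUE)
}

# Hypothetical log lines, made up for illustration only.
log_lines <- c(
  "entry A (foreign): placeholder text",
  "entry B (base): placeholder text",
  "entry C (survival): placeholder text"
)

entries_for(log_lines, "survival")
```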
Peter Dalgaard <p.dalgaard@biostat.ku.dk> writes:
rossini@blindglobe.net (A.J. Rossini) writes:
Bugzilla is a pain-in-the-arse to maintain, unless they've improved it in the last 9 months. Just my two cents...
It might have (and I wouldn't mind replacing Jitterbug with something that is actually maintained itself!), but there's another rear-end problem... Who's going to do the actual work?
That's my point. Maintenance is critical, and any time spent on systems administration (systems in the generic sense, here it's bug-tracking) is less time spent on other more useful, or interesting, or high-payoff items. I'm probably not the only one who wished Peter had more time for other things... best, -tony
A.J. Rossini writes:
That's my point. Maintenance is critical, and any time spent on systems administration (systems in the generic sense, here it's bug-tracking) is less time spent on other more useful, or interesting, or high-payoff items. I'm probably not the only one who wished Peter had more time for other things...
I agree that it would probably take some time to set up the new system, and maintenance at the beginning would take time too. On the other hand, I believe that the productivity gain fostered by the new system in the long term could perhaps counterbalance the initial effort. The main effort would be setting up the new system and handing out accounts to the package maintainers; after that, most of it would be left to the users of the system. I don't have much experience administering Bugzilla yet, but I'm installing it on my box and will post my experiences to the list. -- []'s Fernando Henrique Ferraz P. da Rosa
Fernando Henrique Ferraz <feferraz@ime.usp.br> writes:
A.J. Rossini writes:
That's my point. Maintenance is critical, and any time spent on systems administration (systems in the generic sense, here it's bug-tracking) is less time spent on other more useful, or interesting, or high-payoff items. I'm probably not the only one who wished Peter had more time for other things...
I agree that it would probably take some time to set up the new system,
and maintenance at the beginning would take time too. On the other hand, I believe
that the productivity gain fostered by the new system in the long term could perhaps
counterbalance the initial effort. The main effort would be setting up the new
system and handing out accounts to the package maintainers; after that, most of it
would be left to the users of the system. I don't have much experience
administering Bugzilla yet, but I'm installing it on my box and will post
my experiences to the list.
Excellent! It would be great to have a place to track bugs for packages. Best of luck! best, -tony
On Sat, 17 Jan 2004 09:33:10 -0500, you wrote:
I also share your views about namespaces. These have caused numerous problems for me. It would be nice to have more of a mechanism to put "feelers" out to the R user community when major changes are planned.
Changes always show up in r-devel (the main CVS branch, not the mailing list) first. Package developers should be keeping a relatively up to date copy of it around if they're doing things that are likely to break.
In Hmisc and Design I reference several functions that were not exported from packages that now use namespaces. There is an elegant solution with the package:::function notation,
I'd recommend avoiding that as much as you can. If things aren't exported from a package, then the package writer is likely to feel free to change them without warning. It's much better to convince the package writer that they missed something in their export list.
but I have been unable to use this solution because I use one code base for all versions of R and S-Plus. This notation generates syntax errors in all but late versions of R.
I think it's reasonable to restrict the availability of updates to your packages to the currently released R version. There are reasons why people might not be up to date (e.g. only doing upgrades at a specific time of year), but they'll still have access via CRAN to older versions of your package. Compatibility with S-PLUS is a lot harder, of course. Duncan Murdoch
On Sun, 18 Jan 2004 18:47:52 -0500
Duncan Murdoch <dmurdoch@pair.com> wrote:
On Sat, 17 Jan 2004 09:33:10 -0500, you wrote:
I also share your views about namespaces. These have caused numerous problems for me. It would be nice to have more of a mechanism to put "feelers" out to the R user community when major changes are planned.
Changes always show up in r-devel (the main CVS branch, not the mailing list) first. Package developers should be keeping a relatively up to date copy of it around if they're doing things that are likely to break.
I need to do that more often. But sometimes it's hard to know what things I do that are likely to break. That's where there needs to be some other mechanism for user communications.
In Hmisc and Design I reference several functions that were not exported from packages that now use namespaces. There is an elegant solution with the package:::function notation,
I'd recommend avoiding that as much as you can. If things aren't exported from a package, then the package writer is likely to feel free to change them without warning. It's much better to convince the package writer that they missed something in their export list.
That's a good solution in general, but I could see legitimate disagreements about what should be exported, so this will not always solve the problem.
but I have been unable to use this solution because I use one code base for all versions of R and S-Plus. This notation generates syntax errors in all but late versions of R.
I think it's reasonable to restrict the availability of updates to your packages to the currently released R version. There are reasons why people might not be up to date (e.g. only doing upgrades at a specific time of year), but they'll still have access via CRAN to older versions of your package. Compatibility with S-PLUS is a lot harder, of course.
Yes that's the real problem. Thanks Duncan -Frank
Duncan Murdoch
On Mon, 19 Jan 2004, Frank E Harrell Jr wrote:
On Sun, 18 Jan 2004 18:47:52 -0500 Duncan Murdoch <dmurdoch@pair.com> wrote:
On Sat, 17 Jan 2004 09:33:10 -0500, you wrote:
I also share your views about namespaces. These have caused numerous problems for me. It would be nice to have more of a mechanism to put "feelers" out to the R user community when major changes are planned.
Changes always show up in r-devel (the main CVS branch, not the mailing list) first. Package developers should be keeping a relatively up to date copy of it around if they're doing things that are likely to break.
I need to do that more often. But sometimes it's hard to know what things I do that are likely to break. That's where there needs to be some other mechanism for user communications.
Well, there is a NEWS file that is worth consulting, and we (Kurt in particular) run all the CRAN packages after every major change and daily. See http://cran.r-project.org/src/contrib/checkSummary.html. We do also tend to tell package authors directly if their packages break, at least if they were previously warning-free. It seems the sort of thing you do is to call methods directly where you could equally well call the generic, since that is what is currently failing in Design and Hmisc (if survfit.km is a survfit method).
In Hmisc and Design I reference several functions that were not exported from packages that now use namespaces. There is an elegant
That's not showing up in failures on the tests under R-patched. Of the listed dependencies only grid, lattice and survival have namespaces, and only survival has been added since 1.8.1. (I suspect the R-patched tests are against the 1.8.1 versions of the recommended packages, not the current versions.)
solution with the package:::function notation,
I'd recommend avoiding that as much as you can. If things aren't exported from a package, then the package writer is likely to feel free to change them without warning. It's much better to convince the package writer that they missed something in their export list.
That's a good solution in general, but I could see legitimate disagreements about what should be exported, so this will not always solve the problem.
I think it does. If the package writer wants a function to be private, would-be users should respect that decision. Most of the cases we have encountered have been calling methods directly rather than coercing objects to the right class and calling the generic. In extremis, copy (with permission) the function you want from the package sources and rename it. Unless I made a mistake there are no current uses of ::: in CRAN packages, and there are very few in base R (and quite a lot of the methods::: should probably better be methods::). Brian
Brian D. Ripley, ripley@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
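To illustrate the advice to coerce to the right class and call the generic, here is a minimal sketch with a toy S3 generic. The names `area` and `circle` are invented for the example and are unrelated to Hmisc, Design or survival. The point is only that the caller never names the method, so it makes no difference whether the method is exported.

```r
# Toy generic and one method (both hypothetical).
area <- function(shape, ...) UseMethod("area")
area.circle <- function(shape, ...) pi * shape$r^2

# Start from a plain list, give it the right class, then call the generic.
x <- list(r = 2)
class(x) <- "circle"
a <- area(x)   # dispatches to area.circle without naming it
```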
On Mon, 19 Jan 2004 07:51:34 -0500, Frank E Harrell Jr <feh3k@spamcop.net> wrote:
On Sun, 18 Jan 2004 18:47:52 -0500 Duncan Murdoch <dmurdoch@pair.com> wrote:
Changes always show up in r-devel (the main CVS branch, not the mailing list) first. Package developers should be keeping a relatively up to date copy of it around if they're doing things that are likely to break.
I need to do that more often. But sometimes it's hard to know what things I do that are likely to break. That's where there needs to be some other mechanism for user communications.
Generally when people know something is likely to cause trouble, there's a posting to this mailing list --- but it's easy to overlook it among all the other traffic. To make it a bit easier for Windows package developers to test against r-devel, I'm going to keep a reasonably up-to-date Windows build online. For now it's on CRAN, but there are concerns about the impact on the mirrors of the extra file size and download traffic. However, if I have to move it the links on CRAN will be updated, so it's safe to say you should start looking at <http://cran.r-project.org/bin/windows/base>. Duncan Murdoch
Duncan Murdoch wrote:
but they'll still have access via CRAN to older versions of your package.
Are you sure about that? I can't find old contributed packages, but it wouldn't be the first time I've missed something obvious. (src/contrib/Old is hardly a complete archive.) Paul
"PaulG" == Paul Gilbert <pgilbert@bank-banque-canada.ca>
on Mon, 19 Jan 2004 11:18:47 -0500 writes:
PaulG> Duncan Murdoch wrote:
>> but they'll still have access via CRAN to
>> older versions of your package.
PaulG> Are you sure about that? I can't find old contributed packages, but it
PaulG> wouldn't be the first time I've missed something obvious.
yes. You've missed src/contrib/Archive/
PaulG> (src/contrib/Old is hardly a complete archive.)
Martin
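A small sketch of how an archived source version might be fetched. It assumes the present-day CRAN layout, where each package has its own subdirectory under src/contrib/Archive; the layout at the time of this thread may have differed. The helper `archive_url`, the package name `somepkg` and the version `1.0-0` are all hypothetical.

```r
# Build the URL of an archived source tarball (layout is an assumption).
archive_url <- function(pkg, version, cran = "https://cran.r-project.org") {
  sprintf("%s/src/contrib/Archive/%s/%s_%s.tar.gz", cran, pkg, pkg, version)
}

u <- archive_url("somepkg", "1.0-0")   # hypothetical package and version

# Installing from that URL needs network access and build tools, so it is
# left commented out here:
# install.packages(u, repos = NULL, type = "source")
```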
On Mon, 19 Jan 2004 14:09:58 +0000 (GMT)
Prof Brian Ripley <ripley@stats.ox.ac.uk> wrote:
On Mon, 19 Jan 2004, Frank E Harrell Jr wrote:
On Sun, 18 Jan 2004 18:47:52 -0500 Duncan Murdoch <dmurdoch@pair.com> wrote:
On Sat, 17 Jan 2004 09:33:10 -0500, you wrote:
I also share your views about namespaces. These have caused numerous problems for me. It would be nice to have more of a mechanism to put "feelers" out to the R user community when major changes are planned.
Changes always show up in r-devel (the main CVS branch, not the mailing list) first. Package developers should be keeping a relatively up to date copy of it around if they're doing things that are likely to break.
I need to do that more often. But sometimes it's hard to know what things I do that are likely to break. That's where there needs to be some other mechanism for user communications.
Well, there is a NEWS file that is worth consulting, and we (Kurt in particular) run all the CRAN packages after every major change and daily. See http://cran.r-project.org/src/contrib/checkSummary.html. We do also tend to tell package authors directly if their packages break, at least if they were previously warning-free.
I will start checking NEWS. The kind of news I need though is more about bugs that do not cause the package to break.
It seems the sort of thing you do is to call methods directly where you could equally well call the generic, since that is what is currently failing in Design and Hmisc (if survfit.km is a survfit method).
The point of calling methods directly is efficiency, otherwise I would not use this dirty practice. When bootstrapping or otherwise calling methods repeatedly, I seek the lowest level functions for speed. This conflicts with the namespace idea. I think this should have been taken into consideration when designing namespaces.
In Hmisc and Design I reference several functions that were not exported from packages that now use namespaces. There is an elegant
That's not showing up in failures on the tests under R-patched. Of the listed dependencies only grid, lattice and survival have namespaces, and only survival has been added since 1.8.1. (I suspect the R-patched tests are against the 1.8.1 versions of the recommended packages, not the current versions.)
Yes, those are the ones.
solution with the package:::function notation,
I'd recommend avoiding that as much as you can. If things aren't exported from a package, then the package writer is likely to feel free to change them without warning. It's much better to convince the package writer that they missed something in their export list.
Right
That's a good solution in general, but I could see legitimate disagreements about what should be exported, so this will not always solve the problem.
I think it does. If the package writer wants a function to be private, would-be users should respect that decision. Most of the cases we have encountered have been calling methods directly rather than coercing objects to the right class and calling the generic. In extremis, copy (with permission) the function you want from the package sources and rename it.
That is a possibility. None of the approaches we've named are without maintenance problems. Frank
Unless I made a mistake there are no current uses of ::: in CRAN packages, and there are very few in base R (and quite a lot of the methods::: should probably better be methods::). Brian
On Mon, 19 Jan 2004 11:18:47 -0500, Paul Gilbert <pgilbert@bank-banque-canada.ca> wrote:
Duncan Murdoch wrote:
but they'll still have access via CRAN to older versions of your package.
Are you sure about that? I can't find old contributed packages, but it wouldn't be the first time I've missed something obvious. (src/contrib/Old is hardly a complete archive.)
Not sure about source versions. Windows binaries are available from 1.6 on, in <http://www.cran.mirrors.pair.com/bin/windows/contrib>. We have talked about when to drop 1.6; I'd like to keep it online for a year after the next version comes out (which would mean 1.6 goes away this spring, 1.7 in the fall, etc.). This means that someone has at least a year to upgrade their R installation. Duncan Murdoch
On Mon, 19 Jan 2004, Prof Brian Ripley wrote:
It seems the sort of thing you do is to call methods directly where you could equally well call the generic, since that is what is currently failing in Design and Hmisc (if survfit.km is a survfit method).
Technically survfit.km() isn't a survival method, since it's called directly by survfit() rather than by UseMethod, but it works like one. It's an internal function that has never been documented, which is why it wasn't exported. If people want it, I can export it. I hadn't heard from anyone wanting it exported. -thomas Thomas Lumley Assoc. Professor, Biostatistics tlumley@u.washington.edu University of Washington, Seattle
Are you sure there is a measurable difference in calling methods directly? The dispatch overhead on formula (one of your uses) appears to be about 10 microseconds. (Note, negligible even for 10,000 bootstraps.) I believe we took the real performance penalties into account (and namespaces had performance pluses as well as minuses).
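The arithmetic behind "negligible" is 10 microseconds times 10,000 calls, which is 0.1 second in total. The sketch below shows one way to measure dispatch overhead on a given machine. The generic `f` is a toy invented for the example, not `formula`, and the timings vary by machine and R version, so no expected numbers are given.

```r
# Figure quoted above: about 10 microseconds per dispatch, 10,000 bootstraps.
total_overhead <- 10e-6 * 10000   # seconds

# Toy generic with a trivial default method (hypothetical).
f <- function(x, ...) UseMethod("f")
f.default <- function(x, ...) x

n <- 1e5
t_generic <- system.time(for (i in seq_len(n)) f(1))[["elapsed"]]
t_direct  <- system.time(for (i in seq_len(n)) f.default(1))[["elapsed"]]

# Rough per-call cost of dispatch in seconds; noisy, and it can even come
# out negative on a busy machine.
per_call <- (t_generic - t_direct) / n
```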
On Mon, 19 Jan 2004, Frank E Harrell Jr wrote:
On Mon, 19 Jan 2004 14:09:58 +0000 (GMT) Prof Brian Ripley <ripley@stats.ox.ac.uk> wrote:
On Mon, 19 Jan 2004, Frank E Harrell Jr wrote:
On Sun, 18 Jan 2004 18:47:52 -0500 Duncan Murdoch <dmurdoch@pair.com> wrote:
On Sat, 17 Jan 2004 09:33:10 -0500, you wrote:
I also share your views about namespaces. These have caused numerous problems for me. It would be nice to have more of a mechanism to put "feelers" out to the R user community when major changes are planned.
Changes always show up in r-devel (the main CVS branch, not the mailing list) first. Package developers should be keeping a relatively up to date copy of it around if they're doing things that are likely to break.
I need to do that more often. But sometimes it's hard to know what things I do that are likely to break. That's where there needs to be some other mechanism for user communications.
Well, there is a NEWS file that is worth consulting, and we (Kurt in particular) run all the CRAN packages after every major change and daily. See http://cran.r-project.org/src/contrib/checkSummary.html. We do also tend to tell package authors directly if their packages break, at least if they were previously warning-free.
I will start checking NEWS. The kind of news I need though is more about bugs that do not cause the package to break.
It seems the sort of thing you do is to call methods directly where you could equally well call the generic, since that is what is currently failing in Design and Hmisc (if survfit.km is a survfit method).
The point of calling methods directly is efficiency, otherwise I would not use this dirty practice. When bootstrapping or otherwise calling methods repeatedly, I seek the lowest level functions for speed. This conflicts with the namespace idea. I think this should have been taken into consideration when designing namespaces.
In Hmisc and Design I reference several functions that were not exported from packages that now use namespaces. There is an elegant
That's not showing up in failures on the tests under R-patched. Of the listed dependencies only grid, lattice and survival have namespaces, and only survival has been added since 1.8.1. (I suspect the R-patched tests are against the 1.8.1 versions of the recommended packages, not the current versions.)
Yes, those are the ones.
solution with the package:::function notation,
I'd recommend avoiding that as much as you can. If things aren't exported from a package, then the package writer is likely to feel free to change them without warning. It's much better to convince the package writer that they missed something in their export list.
Right
That's a good solution in general, but I could see legitimate disagreements about what should be exported, so this will not always solve the problem.
I think it does. If the package writer wants a function to be private, would-be users should respect that decision. Most of the cases we have encountered have been calling methods directly rather than coercing objects to the right class and calling the generic. In extremis, copy (with permission) the function you want from the package sources and rename it.
That is a possibility. None of the approaches we've named are without maintenance problems.
What is the problem with coercing to the right class and calling the generic?
On Mon, 19 Jan 2004 17:17:39 +0000 (GMT)
Prof Brian Ripley <ripley@stats.ox.ac.uk> wrote:
Are you sure there is a measurable difference in calling methods directly? The dispatch overhead on formula (one of your uses) appears to be about 10 microseconds. (Note, negligible even for 10,000 bootstraps.) I believe we took the real performance penalties into account (and namespaces had performance pluses as well as minuses).
Brian, I don't worry about dispatch overhead. I do worry about overhead of assembling model matrices, removing rows with NAs, etc. -Frank
On Mon, 19 Jan 2004 09:10:36 -0800 (PST)
Thomas Lumley <tlumley@u.washington.edu> wrote:
On Mon, 19 Jan 2004, Prof Brian Ripley wrote:
It seems the sort of thing you do is to call methods directly where you could equally well call the generic, since that is what is currently failing in Design and Hmisc (if survfit.km is a survfit method).
Technically survfit.km() isn't a survival method, since it's called directly by survfit() rather than by UseMethod, but it works like one. It's an internal function that has never been documented, which is why it wasn't exported. If people want it, I can export it. I hadn't heard from anyone wanting it exported.
Thomas - I would appreciate getting survfit.km and survreg.fit exported. Thanks, Frank
On Mon, 19 Jan 2004, Frank E Harrell Jr wrote:
Brian, I don't worry about dispatch overhead. I do worry about overhead of assembling model matrices, removing rows with NAs, etc. -Frank
Here is what you said:
The point of calling methods directly is efficiency, otherwise I would not use this dirty practice. When bootstrapping or otherwise calling methods
and your code is failing in R-devel because you are calling formula.default. So, *why* are you calling formula.default? Calling e.g. glm.fit not glm is not to do with methods, and apparently survfit.km is not a method.
On Mon, 19 Jan 2004 18:13:42 +0000 (GMT)
Prof Brian Ripley <ripley@stats.ox.ac.uk> wrote:
and your code is failing in R-devel because you are calling formula.default. So, *why* are you calling formula.default?
That was fixed 16Dec03 for the next version to be submitted to CRAN. I no longer call formula.default.
Calling e.g. glm.fit not glm is not to do with methods, and apparently survfit.km is not a method.
You're right, it just has to do with survfit.km needing to be exported.
Frank
---
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University
Peter Dalgaard wrote:
It might have (and I wouldn't mind replacing Jitterbug with something that is actually maintained itself!),
Being somewhat overwhelmed at the moment, I read this quickly as "maintains itself," and thought that would be pretty neat.
Back on earth, and speaking with no inside knowledge, I have the impression someone has to spend a lot of time removing R bug reports that really never should have been admitted into the repository. Since this is a wish list, it would be nice if there were a system to allow reports to sit in a temporary "unconfirmed bug reports" area where all users could add comments like "confirmed in R-patched on zzz," "works in Linux," "fixed in R-devel," "here is a workaround," "here is a patch," "this is a FAQ," or "read the documentation." Then some volunteer or the package maintainer would occasionally need to sort these, but a lot more of the work would be done by the larger R community.
Paul