An update method for lists?

12 messages · Martin Maechler, Hadley Wickham, Gabor Grothendieck +3 more

#
Hi,

since lattice uses nested lists in various situations, it has had an
unexported function called updateList for a while, which looks like
function (x, val)
{
    if (is.null(x))
        x <- list()
    if (!is.list(x))
        stop("x must be NULL or a list")
    if (!is.list(val))
        stop("val must be a list")
    xnames <- names(x)
    for (v in names(val)) {
        existing <- v %in% xnames
        if (existing && is.list(x[[v]]) && is.list(val[[v]]))
            x[[v]] <- updateList(x[[v]], val[[v]])
        else x[[v]] <- val[[v]]
    }
    x
}

Basically, it recursively replaces elements that have been specified
in val, leaving the other components alone. I'm not aware of any other
actual situation where this is useful, but it certainly can be, so I
want to export this functionality. At least one other person (Gabor)
has also asked for that.
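
To make the behaviour concrete, here is a small illustrative sketch. It assumes the function above has been assigned to the name updateList (so that the recursive call resolves); the 'settings' list and its element names are made up for the example and are not lattice's actual settings.

```r
# The function shown above, assigned to a name so the recursive call works
updateList <- function(x, val)
{
    if (is.null(x))
        x <- list()
    if (!is.list(x))
        stop("x must be NULL or a list")
    if (!is.list(val))
        stop("val must be a list")
    xnames <- names(x)
    for (v in names(val)) {
        existing <- v %in% xnames
        if (existing && is.list(x[[v]]) && is.list(val[[v]]))
            x[[v]] <- updateList(x[[v]], val[[v]])
        else x[[v]] <- val[[v]]
    }
    x
}

# Hypothetical nested settings, for illustration only
settings <- list(axis = list(col = "black", cex = 1), main = "Title")

# Only axis$col is named in 'val', so axis$cex and main are left alone
new <- updateList(settings, list(axis = list(col = "red")))
str(new)
```

A plain assignment such as settings$axis <- list(col = "red") would instead discard axis$cex, which is the case the recursion is meant to avoid.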

Now, as the name suggests, I think it might be reasonable to export
this as an update method for "list" objects. Depending on what others
(in particular r-core) think, one of these things might happen:

(1) I export it as updateList (or some other name) in lattice
(2) I export it as an S3 method update.list in lattice
(3) It gets added as an S3 method update.list in one of the base packages

The default option is (1), and I guess Sept 19 is the deadline for any
of these to be included in R 2.4.0.

Comments?

Deepayan
#
    DeepS> Hi, since lattice uses nested lists in various
    DeepS> situations, it has had an unexported function called
    DeepS> updateList for a while, which looks like

 >>     > lattice:::updateList
 >>     function (x, val)
 >>     {
 >> 	if (is.null(x))
 >> 	    x <- list()
 >> 	if (!is.list(x))
 >> 	    stop("x must be NULL or a list")
 >> 	if (!is.list(val))
 >> 	    stop("val must be a list")
 >> 	xnames <- names(x)
 >> 	for (v in names(val)) {
 >> 	    existing <- v %in% xnames
 >> 	    if (existing && is.list(x[[v]]) && is.list(val[[v]]))
 >> 		x[[v]] <- updateList(x[[v]], val[[v]])
 >> 	    else x[[v]] <- val[[v]]
 >> 	}
 >> 	x
 >>     }
 
[I'm not sure I'd allow NULL for 'x';  typing list() instead of
 NULL is not much more, but when the function name even includes  'list'
 I'd really require a list for 'x']

You could hence collapse the first 6 lines to the single

   stopifnot(is.list(x), is.list(val))
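
For illustration, a sketch of what the function might look like with that change applied. This is only one reading of the suggestion, not code from lattice; the example calls at the end are made up.

```r
# Sketch of the stricter variant: 'x' must already be a list (NULL is rejected)
updateList <- function(x, val)
{
    stopifnot(is.list(x), is.list(val))
    xnames <- names(x)
    for (v in names(val)) {
        if (v %in% xnames && is.list(x[[v]]) && is.list(val[[v]]))
            x[[v]] <- updateList(x[[v]], val[[v]])
        else x[[v]] <- val[[v]]
    }
    x
}

updateList(list(), list(a = 1))      # fine: an empty list is still a list
try(updateList(NULL, list(a = 1)))   # now an error, since is.list(NULL) is FALSE
```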


    DeepS> Basically, it recursively replaces elements that have
    DeepS> been specified in val, leaving the other components
    DeepS> alone. I'm not aware of any other actual situation
    DeepS> where this is useful, but it certainly can be, so I
    DeepS> want to export this functionality. At least one other
    DeepS> person (Gabor) has also asked for that.

I've had a similar need only recently:
If a list is used to store "defaults" and you want a safe way to
change only a few of the values...
I presume you use this for manipulating the settings of lattice
parts ?

    DeepS> Now, as the name suggests, I think it might be
    DeepS> reasonable to export this as an update method for
    DeepS> "list" objects. Depending on what others (in
    DeepS> particular r-core) think, one of these things might
    DeepS> happen:

    DeepS> (1) I export it as updateList (or some other name) in lattice
    DeepS> (2) I export it as an S3 method update.list in lattice
    DeepS> (3) It gets added as an S3 method update.list in one of the base packages
or
    (4) it gets added as utility function updateList() to
        'utils' {= one of the base packages}

which I'd favor at the moment.
- update() is typically for updating *models* 
- it's not clear that this is *the* method for update()ing a list

I'm also a bit wondering if it wouldn't make sense to change the name to 
something like assignInList().

    DeepS> The default option is (1), and I guess Sept 19 is the deadline for any
    DeepS> of these to be included in R 2.4.0.

Yes, and the chances for (3) & (4) are higher if you provide a patch
to R-devel (not R-alpha) which includes a man page ...  [but
don't hurry, I'd like to see other comments]

Martin
#
I use something similar in ggplot (except not recursive):

defaults <- function(x, y)  {
	c(x, y[setdiff(names(y), names(x))])
}
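
A short usage sketch of the function above, with made-up lists, to show how it differs from the recursive version: here 'x' takes priority and 'y' only supplies elements that 'x' lacks, and nested lists are not merged.

```r
defaults <- function(x, y) {
    c(x, y[setdiff(names(y), names(x))])
}

# Hypothetical inputs, for illustration only
x <- list(colour = "red", size = list(min = 1))
y <- list(colour = "black", size = list(min = 0, max = 5), shape = 19)

res <- defaults(x, y)
str(res)
# 'shape' is filled in from y; 'colour' and 'size' come from x unchanged,
# so size$max is not picked up because 'size' already exists in x
```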

Hadley
#
On 9/15/06, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
Makes sense.
I'll check if lattice needs some fixes with this.
Yes, it's primarily used inside trellis.par.set, but many other places as well.
Yes, that's a good option too (certainly better than (1))
I agree. Part of the reason I brought this up is because it is not
clear to me what justifies a new method for an existing generic. An
argument for is that one doesn't introduce yet another function, which
(I thought) might be enough if the other choice is to not have any
method at all.
I'm open to suggestions for the name. I didn't think too much about it
since it was unexported anyway.
-Deepayan
3 days later
#
On 9/15/06, Deepayan Sarkar <deepayan.sarkar at gmail.com> wrote:
Actually, I do need to allow NULL, because update.trellis does things like

update.trellis <- function(object, ..., par.strip.text)  # other arguments elided
{
    ## ...
    object$par.strip.text <- updateList(object$par.strip.text, par.strip.text)
    ## ...
}

where object$par.strip.text may be initially NULL. But I'll do that
inside a lattice wrapper.
I have checked in

https://svn.r-project.org/R-packages/trunk/lattice/R/modifyList.R
https://svn.r-project.org/R-packages/trunk/lattice/man/modifyList.Rd

which I'm happy to offer for inclusion in utils or wherever might seem
appropriate. I'll upload a version of lattice which includes these
late tomorrow if I don't see any more comments by then.

I've changed the name because I wasn't sure if assignInList might be
confusing, as the semantics are different from those of assign (assign
is like 'fix', while this is more like 'edit'). However, any name is
fine with me.
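
As a rough sketch of the wrapper idea mentioned above (allowing NULL inside lattice while the exported function requires a list): this assumes modifyList() is available from utils, as in current R; the 'object' below is a stand-in list, not a real trellis object.

```r
# Hypothetical wrapper: accept NULL for 'x', then defer to modifyList()
updateList <- function(x, val)
{
    if (is.null(x))
        x <- list()
    utils::modifyList(x, val)
}

# Stand-in for a trellis object whose par.strip.text has not been set yet
object <- list()
object$par.strip.text <- updateList(object$par.strip.text, list(cex = 0.8))
str(object)
```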

Deepayan
#
It would be nice if the .Rd file had one or more examples
since it's not that easy to otherwise understand what it
does.  Regards.
On 9/18/06, Deepayan Sarkar <deepayan.sarkar at gmail.com> wrote:
#
On 9/18/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
Added now, although it's not very realistic.

Deepayan
#
Perhaps it should have another name, be made generic and extended
to functions in which case it would work on the formal arguments.  e.g.

   read.table.comma <- modify(read.table, list(sep = ","))

would return a function that is the same as read.table but
has "," as the default for sep.
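
A minimal sketch of how such a function method could work, using formals(). The name modify() and its signature are hypothetical (taken from the example above); no such function exists in base R, and this version does not handle primitive functions.

```r
# Hypothetical helper: return a copy of 'f' with some default arguments replaced
modify <- function(f, val)
{
    stopifnot(is.function(f), !is.primitive(f), is.list(val))
    fm <- as.list(formals(f))
    unknown <- setdiff(names(val), names(fm))
    if (length(unknown))
        stop("not formal arguments of 'f': ", paste(unknown, collapse = ", "))
    fm[names(val)] <- val      # replace only the named defaults
    formals(f) <- fm           # body and environment of 'f' are kept
    f
}

read.table.comma <- modify(read.table, list(sep = ","))
formals(read.table.comma)$sep
```

Unlike Python's partial, this only changes the default; callers can still pass a different sep explicitly.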

Python has something like this called partial:
   http://docs.python.org/dev/whatsnew/pep-309.html
On 9/18/06, Deepayan Sarkar <deepayan.sarkar at gmail.com> wrote:
#
On 9/19/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
I have had similar thoughts, and even requested a variant a while
back, with a possible implementation:

https://stat.ethz.ch/pipermail/r-devel/2006-March/036696.html

Again, the main question is whether it makes sense to introduce this
in `one of the base packages'.

-Deepayan
1 day later
#

        
    DeepS> On 9/19/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
    >> Perhaps it should have another name, be made generic and extended
    >> to functions in which case it would work on the formal arguments.  e.g.
    >> 
    >> read.table.comma <- modify(read.table, list(sep = ","))
    >> 
    >> would return a function that is the same as read.table but
    >> has "," as the default for sep.
    >> 
    >> Python has something like this called partial:
    >> http://docs.python.org/dev/whatsnew/pep-309.html

    DeepS> I have had similar thoughts, and even requested a variant a while
    DeepS> back, with a possible implementation:

    DeepS> https://stat.ethz.ch/pipermail/r-devel/2006-March/036696.html

yes, indeed.

    DeepS> Again, the main question is whether it makes sense to introduce this
    DeepS> in `one of the base packages'.

For some reason I hastened to port your modifyList() not only
to R-devel, but also to R-alpha (now "beta").

[So now you, Deepayan, need to upgrade lattice before release in
 order to make the warning disappear].

For the mid to longer term I agree a modify() generic might be
nicer, and the method for "function" may even be more useful
than the one for list  [[and modifyList() would eventually be deprecated]].

OTOH, for new generics, I'd tend to argue we should be nice R-izens
and use S4 rather than S3.  For the time being that
would mean it had to go into the methods package.
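
Purely as an illustration of what an S4 version might look like: the generic name modify() and its argument names are assumptions based on the suggestion quoted above, not an existing API. The list method simply defers to modifyList().

```r
library(methods)

# Hypothetical S4 generic; the name 'modify' follows the suggestion in this thread
setGeneric("modify", function(x, val, ...) standardGeneric("modify"))

# List method: defer to modifyList()
setMethod("modify", "list", function(x, val, ...) utils::modifyList(x, val))

# Function method: replace the defaults of the named formal arguments
setMethod("modify", "function", function(x, val, ...) {
    fm <- as.list(formals(x))
    fm[names(val)] <- val
    formals(x) <- fm
    x
})

# Made-up examples of both methods
lst <- modify(list(a = 1, b = list(c = 2, d = 3)), list(b = list(c = 9)))
glue <- modify(function(v, sep = "") paste(v, collapse = sep), list(sep = "-"))
glue(c("x", "y"))
```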

Or is now {with the dramatic S4 improvements in 2.4.0} a good
time to start thinking about making "utils" depend on "methods"
or even "better" [ ;-) I know, not all agree here ]
think about a dependency tree
   base -> methods -> [everything else]  for the base packages ?
so we could merge 'stats4' into 'stats' ?

   [[yes, I'm now going into deep-sea position, not putting my head
     out to be shot easily ... ]]

Martin
#
On Thu, 21 Sep 2006, Martin Maechler wrote:
[...]
Please no: that would have a 'dramatic' effect on startup times for the 
'lean and mean' R used as a scripting engine when building R (that often 
uses utils).

Try

gannet% cat > test.R
proc.time()
library(methods)
proc.time()
q()
gannet% env R_DEFAULT_PACKAGES='utils' Rbeta --slave < test.R 
[1] 0.124 0.016 0.125 0.000 0.000
[1] 0.632 0.040 0.657 0.000 0.000

so methods would increase the startup time ca 5x.  There has been no 
discernable progress on the cost of startup of methods when there are no 
S4 generics in the other packages: it is slower than all the other default 
packages put together.  Similarly, there is a considerable cost to S3 
dispatch of having an S4 take-over of S3 generics (especially internal 
generics).

Good 'R-izens' need to consider the environmental impact of their use of 
S4 code.  One thing we may be getting closer to is not making 'methods' a 
default package at all, but loading it only when needed.  (Now that S4 objects 
can be identified easily, that could be checked when one is 
created/loaded.) It is like the on-going debate on 4x4 vehicles (SUVs to 
Americans), which carry a lot of extra weight for off-road features that 
are almost never used.