expand.grid and the first level of a factor

4 messages · Giovanni Marchetti, John Fox, Uwe Ligges +1 more

#
I do not understand this behaviour of expand.grid:

> expand.grid(x = c("b", "a"), y = c(1, 2))$x
[1] b a b a
Levels: b a

> expand.grid(x = c("b", "a"))$x
[1] b a
Levels: a b

Why does the first level of the factor x depend on the number
of arguments of expand.grid? Apparently, I can set 
the order of the levels only when the number of 
arguments is > 1. In the second example, the order 
is lexicographic.

-- Giovanni
#
Dear Giovanni,
At 12:53 PM 5/3/2003 +0000, Giovanni Marchetti wrote:
As the help for expand.grid states, expand.grid generates all combinations 
of values of its arguments. Take a look at the entirety of the result:

 > expand.grid(x = c("b", "a"), y = c(1, 2))
   x y
1 b 1
2 a 1
3 b 2
4 a 2


Compare, for example, to

 > expand.grid(x = c("b", "a"), y = c(1, 2), z = c(TRUE, FALSE))

which generates 8 rows.

I hope that this helps,
  John


-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
#
Giovanni Marchetti wrote:
It depends on the number of arguments, because of the implementation 
(look into the code):

In principle, expand.grid(x = c("b", "a")) does the following:

  x <- c("b", "a")
  factor(x)

whereas for expand.grid(x = c("b", "a"), y = c(1, 2)), the levels will 
be specified as in:

  factor(x, levels = unique(x))

Hence the difference.
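
The two factor() calls above can be compared directly at the console. This is a minimal sketch using only base R's factor() and unique(); it illustrates the two level orderings and is not the actual expand.grid source.

```r
# Same input as in the thread
x <- c("b", "a")

# Default: levels are the sorted unique values
f_sorted <- factor(x)
levels(f_sorted)        # "a" "b"

# Explicit levels: order of first appearance is kept
f_appear <- factor(x, levels = unique(x))
levels(f_appear)        # "b" "a"

# The values are the same; only the level order
# (and so the underlying integer codes) differs
as.integer(f_sorted)    # 2 1
as.integer(f_appear)    # 1 2
```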


Uwe Ligges
#

        
    UweL> Giovanni Marchetti wrote:
    >> I do not understand this behaviour of expand.grid:
    >> 
    >> 
    >>> expand.grid(x = c("b", "a"), y = c(1, 2))$x
    >> 
    >> [1] b a b a
    >> Levels: b a
    >> 
    >>> expand.grid(x = c("b", "a"))$x
    >> 
    >> [1] b a
    >> Levels: a b
    >> 
    >> Why does the first level of the factor x depend on the number
    >> of arguments of expand.grid? Apparently, I can set 
    >> the order of the levels only when the number of 
    >> arguments is > 1. In the second example, the order 
    >> is lexicographic.
    >> 
    >> -- Giovanni


    UweL> It depends on the number of arguments, because of the implementation 
    UweL> (look into the code):

    UweL> In principle, expand.grid(x = c("b", "a")) does the following:

    UweL> x <- c("b", "a")
    UweL> factor(x)

    UweL> whereas for expand.grid(x = c("b", "a"), y = c(1, 2)), the levels will 
    UweL> be specified as in:

    UweL>    factor(x, levels = unique(x))

    UweL> Hence the difference.

which seems not perfect to me.
factor() itself,
  > str(factor)
  function (x, levels = sort(unique.default(x), na.last = TRUE), 
      labels = levels, exclude = NA, ordered = is.ordered(x))  

does sort the levels by default, and that's what happens in the
one argument case via data.frame().
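
The data.frame() route can be checked on its own. A minimal sketch: stringsAsFactors = TRUE is spelled out here as an assumption, because converting strings to factors was the default when this thread was written but is no longer the default in R 4.0.0 and later.

```r
# Character column converted to a factor by data.frame();
# stringsAsFactors = TRUE is set explicitly (assumption: needed on R >= 4.0.0)
d <- data.frame(x = c("b", "a"), stringsAsFactors = TRUE)

d$x            # values stay in input order: b a
levels(d$x)    # levels follow factor()'s default, so they are sorted: "a" "b"
```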

S-plus 6.1 does the same for factor() but it doesn't sort the
levels of expand.grid() arguments in any case.

I'm just now testing a patch to our expand.grid() which no longer
treats the one-argument case specially, as it does now, and which
seems to cure the whole "infelicity"...
I cannot imagine that anyone's code relies on the current
behavior as opposed to the more consistent one.

Martin
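
One way to avoid depending on how expand.grid() builds its factors is to pass it a factor whose levels are already set. A minimal sketch, assuming that a factor argument keeps its levels in the result (this holds in current R; it has not been checked against the 2003 release discussed in this thread). The final comparison is shown without an expected value because it depends on whether the installed version treats the one-argument case specially.

```r
x <- c("b", "a")

# Fix the level order up front instead of relying on expand.grid()
fx <- factor(x, levels = c("b", "a"))

one <- expand.grid(x = fx)
two <- expand.grid(x = fx, y = c(1, 2))

levels(one$x)    # "b" "a"
levels(two$x)    # "b" "a"

# With a plain character vector, compare the two cases on your own installation
identical(levels(expand.grid(x = x)$x),
          levels(expand.grid(x = x, y = c(1, 2))$x))
```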