
converting a list of loglin terms to a model formula

4 messages · Henrique Dallazuanna, Michael Friendly, William Dunlap

#
I'm developing some functions to create symbolic specifications for
loglinear models of different types.
I don't really know how to 'compute' with model formulas, so I've done
this in the notation for stats::loglin(), which is a list of high-order
terms in the model.

What I'd like is a function to turn the results of these into a model
formula suitable for MASS::loglm.  That's the reverse of what loglm does.

For example, the simplest versions of models for 3-way tables for joint,
conditional, and mutual independence can be computed as follows.
After each, I indicate the WANTED model formula I'd like from the result.

 > joint(3)
$term1
[1] 1 2

$term2
[1] 3

WANTED:  ~ 1:2 + 3

 > condit(3)
$term1
[1] 1 3

$term2
[1] 2 3

WANTED: ~ 1:3 + 2:3

 > mutual(3)
$term1
[1] 1

$term2
[1] 2

$term3
[1] 3

WANTED: ~ 1 + 2 + 3

In case anyone wants to play with the code, here are the current, not too
elegant definitions of the functions, and some further test cases.

# models of joint independence
   joint <- function(nf, factors=1:nf, with=nf) {
     if (nf == 1) return (list(term1=factors[1]))
     if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
     others <- setdiff(1:nf, with)
     result <- list(term1=factors[others], term2=factors[with])
     result
   }
# conditional independence
   condit <- function(nf, factors=1:nf, with=nf) {
     if (nf == 1) return (list(term1=factors[1]))
     if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
     main <- setdiff(1:nf, with)
     others <- matrix(factors[with], length(with), length(main))
     result <- rbind(factors[main], others)
     result <- as.list(as.data.frame(result, stringsAsFactors=FALSE))
     names(result) <- paste('term', 1:length(result), sep='')
     result
   }
# mutual independence
   mutual <- function(nf, factors=1:nf) {
     result <- sapply(factors[1:nf], list)
     names(result) <- paste('term', 1:length(result), sep='')
     result
   }

### some comparisons
library(MASS)  # provides loglm()

loglin(HairEyeColor, list(c(1, 2), c(1, 3), c(2, 3)))$lrt
loglm(~1:2 + 1:3 +2:3, HairEyeColor)

# use factor names
joint(3, factors=names(dimnames(HairEyeColor)))
condit(3, factors=names(dimnames(HairEyeColor)))

loglin(HairEyeColor, joint(3))$lrt
loglm(~1:2 + 3, HairEyeColor)

loglin(HairEyeColor, condit(3))$lrt
loglm(~1:3 + 2:3, HairEyeColor)
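
The conversion asked for above, from a loglin()-style list of terms to a one-sided formula, could be sketched roughly as follows. This is only an illustration: the name margins_to_formula is hypothetical (it is not a function from stats or MASS), and it assumes each term is a vector of factor indices or of syntactically valid factor names.

```r
# Hypothetical helper (not part of stats or MASS): build a one-sided
# formula from a loglin()-style list of margins.
margins_to_formula <- function(margins) {
  # variables within a term are joined by ":", terms are joined by "+"
  terms <- vapply(margins, function(v) paste(v, collapse = ":"), character(1))
  as.formula(paste("~", paste(terms, collapse = " + ")))
}

# The same lists that joint(3) and condit(3) return, written out by hand
f_joint  <- margins_to_formula(list(term1 = c(1, 2), term2 = 3))
f_condit <- margins_to_formula(list(term1 = c(1, 3), term2 = c(2, 3)))

# Factor names work the same way as indices
f_names  <- margins_to_formula(list(term1 = c("Hair", "Eye"), term2 = "Sex"))
```

The resulting formulas could then be passed to loglm() in place of the hand-written ones in the comparisons above.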
1 day later
#
Since your function made the call
   loglm(form, data=x)
the 'call' component of output is going to show 'form', not '~ Hair:Eye + Sex'.
You can use bquote to pre-evaluate the formula=form argument to get the call
to look nicer, as in:
  form <- mpg ~ wt + hp
  eval(bquote(lm(.(form), data=mtcars)))
instead of
   lm(form, data=mtcars)

In your loglin2formula, I would make the environment of the generated formula
the environment of the caller of loglin2formula (or somewhere else if the user wishes)
by adding the argument
   env = parent.frame()
and replacing
   as.formula( sprintf(...) )
with
   formula( sprintf(...), env=env)
(I don't think you've run into any problems related to having an irrelevant
environment attached to the formula, but they will happen if the formula
involves any variable names that happen to be in loglin2formula.)
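
The effect of the env argument can be checked directly. In this sketch, make_formula_local and make_formula_caller are hypothetical stand-ins for the two variants of loglin2formula (whose real body is not reproduced here), and user_code stands in for whatever function calls them.

```r
# Variant 1: as.formula() attaches the helper's own frame to the formula
make_formula_local <- function(text) {
  as.formula(text)
}

# Variant 2: the formula's environment defaults to the caller's frame
make_formula_caller <- function(text, env = parent.frame()) {
  formula(text, env = env)
}

# Stand-in for user code that calls the helpers from inside a function
user_code <- function() {
  here <- environment()
  c(local  = identical(environment(make_formula_local("~ a + b")), here),
    caller = identical(environment(make_formula_caller("~ a + b")), here))
}

# Named logical vector: does each formula carry user_code's environment?
env_check <- user_code()
print(env_check)
```

Only the second variant gives a formula whose variables are looked up where the user wrote the call, which is the behaviour suggested above.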

I would also change loglin2formula so that it works with non-syntactic names.
Wrapping them in backquotes would probably do it, but I may have missed
something in the back and forth between character strings and formulas.
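
One way the backquoting might look, as a rough sketch only: quote_if_needed is a hypothetical helper, the factor name "Hair colour" is made up for the example, and names that themselves contain a backquote are not handled.

```r
# Hypothetical helper: backquote names that are not syntactically valid,
# leaving numeric indices such as "1" untouched.
quote_if_needed <- function(x) {
  x <- as.character(x)
  is_index     <- grepl("^[0-9]+$", x)
  is_syntactic <- x == make.names(x)
  ifelse(is_index | is_syntactic, x, paste0("`", x, "`"))
}

# A margin list with one non-syntactic (made-up) factor name
margins <- list(term1 = c("Hair colour", "Eye"), term2 = "Sex")

terms <- vapply(margins,
                function(v) paste(quote_if_needed(v), collapse = ":"),
                character(1))
f <- as.formula(paste("~", paste(terms, collapse = " + ")))

# The variable names survive the round trip through the formula
vars <- all.vars(f)
```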

As for loglin2string, you complain that it works when given a list of character
vectors but not when it is given a formula.  That is not surprising.  Did you mean
for test_loglm to pass it 'margins' instead of 'form'?

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com