converting a list of loglin terms to a model formula

Call:
loglm(formula = form, data = x)       ### I want formula = ~Hair:Eye + Sex here
Since your function made the call
   loglm(form, data=x)
the 'call' component of output is going to show 'form', not '~ Hair:Eye + Sex'.
You can use bquote to pre-evaluate the formula=form argument to get the call
to look nicer, as in:
  form <- mpg ~ wt + hp
  eval(bquote(lm(.(form), data=mtcars)))
instead of
   lm(form, data=mtcars)

In your loglin2formula, I would make by environment of the generated formula
the environment of the caller of loglin2formula (or somewhere else if the user wishes)
by adding the argument
   env = parent.frame()
and replacing
   as.formula( sprintf(...) )
with
   formula( sprintf(...), env=env)
(I don't think you've run into any problems related to having an irrelevant
environment attached to the formula, but they will happen if the formula
involves any variable names that happen to be in loglin2formula.)

I would also change the loglin2formula so it worked with non-syntactic names.
Wrapping them with backquotes would probably do it, but I may have missed
something in the back and forth between character strings and formula.

As for loglin2string, you complain that it works when given a list of character
vectors but not when is given a formula.  That is not surprising.  Did you mean
for test_loglm to pass it 'margins' instead of 'form'?

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
Of Michael Friendly
Sent: Friday, July 05, 2013 12:21 PM
To: Henrique Dallazuanna
Cc: R-help
Subject: Re: [R] converting a list of loglin terms to a model formula

On 7/3/2013 6:38 PM, Henrique Dallazuanna wrote:
Try this:

 as.formula(sprintf(" ~ %s", do.call(paste, c(lapply(mutual(3), paste,
collapse = ":"), sep = "+"))))

Thanks for this.  I encapsulated this as a function, loglin2formula()
and a related function,
loglin2string() to give a character string representation.
They seem to work when used from the command line, but don't give
what I need when I use it in another function.  I think it has something
to do with the environment
for the formula or how I pass it to MASS::loglm in my function
test_loglm (below).

I'll demonstrate the problem first, then give the functions & test cases
I'm using as
runnable code.

 > joint(3, table=HairEyeColor)
$term1
[1] "Hair" "Eye"

$term2
[1] "Sex"

 > loglin2formula(joint(3, table=HairEyeColor))
~Hair:Eye + Sex
<environment: 0x0884a640>
 > loglin2string(joint(3, table=HairEyeColor))
[1] "[Hair,Eye] [Sex]"

These look like what I want, more or less; but, when I use these in the
function test_loglm
(also below) , the formula doesn't get resolved when I call loglm (it
appears as
formula = form), and the result of loglin2string gets garbled. The
numerical results are
correct; I'm concerned about the labeling in the computed loglm object.

 > test_loglm(HairEyeColor, type='joint')
formula:
~Hair:Eye + Sex
<environment: 0x08842de0>
model.string:
[1] "[~] [+,Hair:Eye,Sex]"                   ### I want "[Hair,Eye]
[Sex]" here
model:
Call:
loglm(formula = form, data = x)       ### I want formula = ~Hair:Eye +
Sex here

Statistics:
                       X^2 df  P(> X^2)
Likelihood Ratio 19.85656 15 0.1775045
Pearson          19.56712 15 0.1891745
 >
## --- functions and test cases, should be runnable (mod line folding)
--- ###

#' convert a loglin model to a model formula for loglm

#' @param  x a list of terms in a loglinear model, such as returned by
\code{joint}, \code{conditional}, \dots
#' @source Code from Henrique Dallazuanna, <wwwhsd at gmail.com>, R-help
7-4-2013

loglin2formula <- function(x) {
     terms <- lapply(x, paste, collapse = ":")
     as.formula(sprintf(" ~ %s", do.call(paste, c(terms, sep = "+"))))
}

#' convert a loglin model to a string, using bracket notation for the
high-order terms

#' @param x a list of terms in a loglinear model, such as returned by
\code{joint}, \code{conditional}, \dots
#' @param brackets characters to use to surround model terms.
#' @param sep characters used to separate factor names within a term
#' @param collapse  characters used to separate terms
#' @param abbrev

loglin2string <- function(x, brackets = c('[', ']'), sep=',', collapse='
', abbrev) {
     if (length(brackets)==1 && (nchar(brackets)>1)) brackets <-
unlist(strsplit(brackets, ""))
     terms <- lapply(x, paste, collapse=sep)
     terms <- paste(brackets[1], terms, brackets[2], sep='')
     paste(terms, collapse= ' ')
}

#' models of joint independence, of some factors wrt one or more other
factors

#' @param nf number of factors for which to generate model
#' @param table a contingency table used for factor names, typically the
output from \code{\link[base]{table}}
#' @param factors names of factors used in the model when \code{table}
is not specified
#' @param with    indices of the factors against which others are
considered jointly independent
#' @export

joint <- function(nf, table=NULL, factors=1:nf, with=nf) {
     if (!is.null(table)) factors <- names(dimnames(table))
     if (nf == 1) return (list(term1=factors[1]))
     if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
     others <- setdiff(1:nf, with)
     result <- list(term1=factors[others], term2=factors[with])
     result
   }

### Test using these in another function

test_loglm <- function(
     x, type = c("joint", "conditional"),
     ...)
{
     nf <- length(dim(x))
     factors <- names(dimnames(x))
     type <- match.arg(type)
     margins <- switch(type,
       "joint" = joint(nf, factors=factors),
       "conditional" = conditional(nf, factors=factors)
       )

   form <- loglin2formula(margins);
   cat("formula:\n"); print(form)
   model.string <- loglin2string(form)
   cat("model.string:\n"); print(model.string)
   mod <- loglm(formula=form, data=x)
   cat("model:\n"); print(mod)
   invisible(list(form=form, model.string=model.string, mod=mod))
}

## tests

loglin2formula(joint(3, table=HairEyeColor))
loglin2string(joint(3, table=HairEyeColor))
test_loglm(HairEyeColor, type='joint')

On Wed, Jul 3, 2013 at 6:55 PM, Michael Friendly <friendly at yorku.ca
<mailto:friendly at yorku.ca>> wrote:

    I'm developing some functions to create symbolic specifications
    for loglinear models of different types.
    I don't really know how to 'compute' with model formulas, so I've
    done this in the notation
    for stats::loglin(), which is a list of high-order terms in the model.

    What I'd like is a function to turn the results of these into a
    model formula, suitable for
    MASS::loglm.  That's the reverse of what loglm does.

    For example, the simplest versions of models for 3-way tables for
    joint,
     conditional, and marginal independence can be computed as
    follows. After each, I indicated
    the WANTED model formula I'd like from the result

    > joint(3)
    $term1
    [1] 1 2

    $term2
    [1] 3

    WANTED:  ~ 1:2 + 3

    > condit(3)
    $term1
    [1] 1 3

    $term2
    [1] 2 3

    WANTED: ~ 1:2 + 2:3

    > mutual(3)
    $term1
    [1] 1

    $term2
    [1] 2

    $term3
    [1] 3

    WANTED: ~ 1 + 2 + 3

    In case anyone want to play with the code, here are the current,
    not too elegant definitions
    of the functions, and some further test cases,

    # models of joint independence
      joint <- function(nf, factors=1:nf, with=nf) {
        if (nf == 1) return (list(term1=factors[1]))
        if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
        others <- setdiff(1:nf, with)
        result <- list(term1=factors[others], term2=factors[with])
        result
      }
    # conditional independence
      condit <- function(nf, factors=1:nf, with=nf) {
        if (nf == 1) return (list(term1=factors[1]))
        if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
        main <- setdiff(1:nf, with)
        others <- matrix(factors[with], length(with), length(main))
        result <- rbind(factors[main], others)
        result <- as.list(as.data.frame(result, stringsAsFactors=FALSE))
        names(result) <- paste('term', 1:length(result), sep='')
        result
      }
    # mutual independence
      mutual <- function(nf, factors=1:nf) {
        result <- sapply(factors[1:nf], list)
        names(result) <- paste('term', 1:length(result), sep='')
        result
      }

    ### some comparisons

    loglin(HairEyeColor, list(c(1, 2), c(1, 3), c(2, 3)))$lrt
    loglm(~1:2 + 1:3 +2:3, HairEyeColor)

    # use factor names
    joint(3, factors=names(dimnames(HairEyeColor)))
    condit(3, factors=names(dimnames(HairEyeColor)))

    loglin(HairEyeColor, joint(3))$lrt
    loglm(~1:2 + 3, HairEyeColor)

    loglin(HairEyeColor, condit(3))$lrt
    loglm(~1:3 + 2:3, HairEyeColor)

    --
    Michael Friendly     Email: friendly AT yorku DOT ca
    Professor, Psychology Dept. & Chair, Quantitative Methods
    York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
    4700 Keele Street    Web: http://www.datavis.ca
    Toronto, ONT  M3J 1P3 CANADA

    ______________________________________________
    R-help at r-project.org <mailto:R-help at r-project.org> mailing list
    https://stat.ethz.ch/mailman/listinfo/r-help
    PLEASE do read the posting guide
    http://www.R-project.org/posting-guide.html
    and provide commented, minimal, self-contained, reproducible code.

--
Henrique Dallazuanna
Curitiba-Paran?-Brasil
25? 25' 40" S 49? 16' 22" O

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

	[[alternative HTML version deleted]]

converting a list of loglin terms to a model formula

Thread (4 messages)