-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
Of Michael Friendly
Sent: Friday, July 05, 2013 12:21 PM
To: Henrique Dallazuanna
Cc: R-help
Subject: Re: [R] converting a list of loglin terms to a model formula
On 7/3/2013 6:38 PM, Henrique Dallazuanna wrote:
Try this:
as.formula(sprintf(" ~ %s", do.call(paste, c(lapply(mutual(3), paste,
collapse = ":"), sep = "+"))))
Thanks for this. I encapsulated this as a function, loglin2formula()
and a related function,
loglin2string() to give a character string representation.
They seem to work when used from the command line, but don't give
what I need when I use it in another function. I think it has something
to do with the environment
for the formula or how I pass it to MASS::loglm in my function
test_loglm (below).
I'll demonstrate the problem first, then give the functions & test cases
I'm using as
runnable code.
> joint(3, table=HairEyeColor)
$term1
[1] "Hair" "Eye"
$term2
[1] "Sex"
> loglin2formula(joint(3, table=HairEyeColor))
~Hair:Eye + Sex
<environment: 0x0884a640>
> loglin2string(joint(3, table=HairEyeColor))
[1] "[Hair,Eye] [Sex]"
These look like what I want, more or less; but, when I use these in the
function test_loglm
(also below) , the formula doesn't get resolved when I call loglm (it
appears as
formula = form), and the result of loglin2string gets garbled. The
numerical results are
correct; I'm concerned about the labeling in the computed loglm object.
> test_loglm(HairEyeColor, type='joint')
formula:
~Hair:Eye + Sex
<environment: 0x08842de0>
model.string:
[1] "[~] [+,Hair:Eye,Sex]" ### I want "[Hair,Eye]
[Sex]" here
model:
Call:
loglm(formula = form, data = x) ### I want formula = ~Hair:Eye +
Sex here
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 19.85656 15 0.1775045
Pearson 19.56712 15 0.1891745
## --- functions and test cases, should be runnable (mod line folding)
--- ###
#' convert a loglin model to a model formula for loglm
#' @param x a list of terms in a loglinear model, such as returned by
\code{joint}, \code{conditional}, \dots
#' @source Code from Henrique Dallazuanna, <wwwhsd at gmail.com>, R-help
7-4-2013
loglin2formula <- function(x) {
terms <- lapply(x, paste, collapse = ":")
as.formula(sprintf(" ~ %s", do.call(paste, c(terms, sep = "+"))))
}
#' convert a loglin model to a string, using bracket notation for the
high-order terms
#' @param x a list of terms in a loglinear model, such as returned by
\code{joint}, \code{conditional}, \dots
#' @param brackets characters to use to surround model terms.
#' @param sep characters used to separate factor names within a term
#' @param collapse characters used to separate terms
#' @param abbrev
loglin2string <- function(x, brackets = c('[', ']'), sep=',', collapse='
', abbrev) {
if (length(brackets)==1 && (nchar(brackets)>1)) brackets <-
unlist(strsplit(brackets, ""))
terms <- lapply(x, paste, collapse=sep)
terms <- paste(brackets[1], terms, brackets[2], sep='')
paste(terms, collapse= ' ')
}
#' models of joint independence, of some factors wrt one or more other
factors
#' @param nf number of factors for which to generate model
#' @param table a contingency table used for factor names, typically the
output from \code{\link[base]{table}}
#' @param factors names of factors used in the model when \code{table}
is not specified
#' @param with indices of the factors against which others are
considered jointly independent
#' @export
joint <- function(nf, table=NULL, factors=1:nf, with=nf) {
if (!is.null(table)) factors <- names(dimnames(table))
if (nf == 1) return (list(term1=factors[1]))
if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
others <- setdiff(1:nf, with)
result <- list(term1=factors[others], term2=factors[with])
result
}
### Test using these in another function
test_loglm <- function(
x, type = c("joint", "conditional"),
...)
{
nf <- length(dim(x))
factors <- names(dimnames(x))
type <- match.arg(type)
margins <- switch(type,
"joint" = joint(nf, factors=factors),
"conditional" = conditional(nf, factors=factors)
)
form <- loglin2formula(margins);
cat("formula:\n"); print(form)
model.string <- loglin2string(form)
cat("model.string:\n"); print(model.string)
mod <- loglm(formula=form, data=x)
cat("model:\n"); print(mod)
invisible(list(form=form, model.string=model.string, mod=mod))
}
## tests
loglin2formula(joint(3, table=HairEyeColor))
loglin2string(joint(3, table=HairEyeColor))
test_loglm(HairEyeColor, type='joint')
On Wed, Jul 3, 2013 at 6:55 PM, Michael Friendly <friendly at yorku.ca
<mailto:friendly at yorku.ca>> wrote:
I'm developing some functions to create symbolic specifications
for loglinear models of different types.
I don't really know how to 'compute' with model formulas, so I've
done this in the notation
for stats::loglin(), which is a list of high-order terms in the model.
What I'd like is a function to turn the results of these into a
model formula, suitable for
MASS::loglm. That's the reverse of what loglm does.
For example, the simplest versions of models for 3-way tables for
joint,
conditional, and marginal independence can be computed as
follows. After each, I indicated
the WANTED model formula I'd like from the result
$term1
[1] 1 2
$term2
[1] 3
WANTED: ~ 1:2 + 3
$term1
[1] 1 3
$term2
[1] 2 3
WANTED: ~ 1:2 + 2:3
$term1
[1] 1
$term2
[1] 2
$term3
[1] 3
WANTED: ~ 1 + 2 + 3
In case anyone want to play with the code, here are the current,
not too elegant definitions
of the functions, and some further test cases,
# models of joint independence
joint <- function(nf, factors=1:nf, with=nf) {
if (nf == 1) return (list(term1=factors[1]))
if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
others <- setdiff(1:nf, with)
result <- list(term1=factors[others], term2=factors[with])
result
}
# conditional independence
condit <- function(nf, factors=1:nf, with=nf) {
if (nf == 1) return (list(term1=factors[1]))
if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
main <- setdiff(1:nf, with)
others <- matrix(factors[with], length(with), length(main))
result <- rbind(factors[main], others)
result <- as.list(as.data.frame(result, stringsAsFactors=FALSE))
names(result) <- paste('term', 1:length(result), sep='')
result
}
# mutual independence
mutual <- function(nf, factors=1:nf) {
result <- sapply(factors[1:nf], list)
names(result) <- paste('term', 1:length(result), sep='')
result
}
### some comparisons
loglin(HairEyeColor, list(c(1, 2), c(1, 3), c(2, 3)))$lrt
loglm(~1:2 + 1:3 +2:3, HairEyeColor)
# use factor names
joint(3, factors=names(dimnames(HairEyeColor)))
condit(3, factors=names(dimnames(HairEyeColor)))
loglin(HairEyeColor, joint(3))$lrt
loglm(~1:2 + 3, HairEyeColor)
loglin(HairEyeColor, condit(3))$lrt
loglm(~1:3 + 2:3, HairEyeColor)
--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA