Tables Package Grouping Factors - R-help

Sat, Nov 9, 2013 10:23 AM #

Visually, the elimination of duplicates in hierarchical tables in the 
tabular function from the tables package is very nice. I would like to do 
the same thing with non-crossed factors, but am perhaps missing some 
conceptual element of how this package is used. The following code 
illustrates my goal (I hope):

library(tables)
sampledf <- data.frame( Sex=rep(c("M","F"),each=6)
            , Name=rep(c("John","Joe","Mark","Alice","Beth","Jane"),each=2)
            , When=rep(c("Before","After"),times=6)
            , Weight=c(180,190,190,180,200,200,140,145,150,140,135,135)
            # R >= 4.0 no longer converts strings to factors by default
            , stringsAsFactors=TRUE
            )
sampledf$SexName <- factor( paste( sampledf$Sex, sampledf$Name ) )

# logically, this is the layout
tabular( Name ~ Heading()* When * Weight * Heading()*identity, 
data=sampledf )

# but I want to augment the Name with the Sex but visually group the
# Sex like
#   tabular( Sex*Name ~ Heading()*When * Weight * Heading()*identity,
#            data=sampledf )
# would except that there really is no crossing between sexes.
tabular( SexName ~ Heading()*When * Weight * Heading()*identity, 
data=sampledf )
# this repeats the Sex category excessively.


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k

Duncan Murdoch

Sat, Nov 9, 2013 11:03 AM #

On 13-11-09 1:23 PM, Jeff Newmiller wrote:

I don't think it's easy to get what you want.  The basic assumption is 
that factors are crossed.

One hack that would get you what you want in this case is to make up a 
new variable representing person within sex (running from 1 to 3), then 
treating the Name as a statistic.  Of course, this won't work if you 
don't have equal numbers of each sex.
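
As a rough illustration of the person-within-sex variable (not the tabular call itself), the index could be built in base R along the following lines. The column name Person and the helper name_id are made up for this sketch. The final table() call only checks that Sex and Person are fully crossed, which is what the hack relies on.

```r
sampledf <- data.frame(
  Sex    = rep(c("M", "F"), each = 6),
  Name   = rep(c("John", "Joe", "Mark", "Alice", "Beth", "Jane"), each = 2),
  When   = rep(c("Before", "After"), times = 6),
  Weight = c(180, 190, 190, 180, 200, 200, 140, 145, 150, 140, 135, 135)
)

# Number the names in order of first appearance (1..6 here).
name_id <- match(sampledf$Name, unique(sampledf$Name))

# Person (hypothetical name): position of each name within its own Sex.
sampledf$Person <- ave(name_id, sampledf$Sex,
                       FUN = function(x) match(x, unique(x)))

# Every Sex x Person cell should be populated if the design is balanced.
crossing <- table(sampledf$Sex, sampledf$Person)
crossing
```

With unequal group sizes some cells of this table would be zero, which is why the hack breaks down in that case.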

A better solution is more cumbersome, and only works in LaTeX (and maybe 
HTML).  Draw two tables, first for the female subset, then for the male 
subset.  Put out the headers only on the first one and the footer only 
on the second, and it will be typeset as one big table.
You'll have to fight with the fact that the factors Sex and Name 
remember their levels whether they are present or not, but it should 
work.  For example,

sampledf$Sex <- as.character(sampledf$Sex)
sampledf$Name <- as.character(sampledf$Name)
females <- subset(sampledf, Sex == "F")
males <- subset(sampledf, Sex == "M")

latex( tabular( Factor(Sex)*Factor(Name) ~ Heading()*When * Weight * 
Heading()*identity, data=females),
options = list(doFooter=FALSE, doEnd=FALSE) )

latex( tabular( Factor(Sex)*Factor(Name) ~ Heading()*When * Weight * 
Heading()*identity, data=males),
options = list(doBegin=FALSE, doHeader=FALSE) )

It would probably make sense to support nested factor notation using 
%in% to make this easier, but currently tables doesn't do that.

Duncan Murdoch

Duncan Murdoch

Sat, Nov 9, 2013 11:30 AM #

On 13-11-09 1:23 PM, Jeff Newmiller wrote:

I forgot, there's a simpler way to do this.  Build the full table with 
the junk values, then take a subset:

full <- tabular( Sex*Name ~ Heading()*When * Weight * 
Heading()*identity, data=sampledf )

full[c(1:3, 10:12), ]

Figuring out which rows you want to keep can be a little tricky, but 
doing something like this might be good:

counts <- tabular( Sex*Name ~ 1, data=sampledf )
full[ as.logical(counts), ]
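
The same row selection can be sketched in base R without tables, to show where the indices c(1:3, 10:12) come from. This assumes the rows of the full table are laid out with Sex outermost and Name varying fastest, both in alphabetical level order, which is consistent with the hard-coded subset above. The names counts and keep are only illustrative.

```r
sampledf <- data.frame(
  Sex  = rep(c("M", "F"), each = 6),
  Name = rep(c("John", "Joe", "Mark", "Alice", "Beth", "Jane"), each = 2)
)

# Rows of counts are Sex (F, M); columns are Name in alphabetical order.
counts <- table(sampledf$Sex, sampledf$Name)

# Transpose before flattening so that Name varies fastest within each Sex.
keep <- as.vector(t(counts)) > 0

# Positions of the Sex x Name combinations that actually occur in the data.
which(keep)
```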

Duncan Murdoch

Jeff Newmiller

Sat, Nov 9, 2013 2:56 PM #

The problem that prompted this question involved manufacturers and their model numbers, so I think the cross-everything-and-throw-away-most-of-it approach will get out of hand quickly. The number of models per manufacturer definitely varies. I think I will work on the approach of printing segments of the table successively. Thanks for the ideas.

Duncan Murdoch

Sat, Nov 9, 2013 5:29 PM #

On 13-11-09 5:56 PM, Jeff Newmiller wrote:

I've just added cbind() and rbind() methods for tabular objects, so that 
approach will be a lot easier.  Just do the table of the first subset, 
then rbind on the subsets for the rest.  Will commit to R-forge after a 
bit more testing and documentation.
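
Once those methods are available, the segment-by-segment approach might look roughly like the sketch below. It assumes a version of tables that includes rbind() for tabular objects, and it uses droplevels() so that each subset forgets the names belonging to the other sex. The names pieces and combined are only illustrative, and this has not been checked against any particular release.

```r
library(tables)

sampledf <- data.frame(
  Sex    = rep(c("M", "F"), each = 6),
  Name   = rep(c("John", "Joe", "Mark", "Alice", "Beth", "Jane"), each = 2),
  When   = rep(c("Before", "After"), times = 6),
  Weight = c(180, 190, 190, 180, 200, 200, 140, 145, 150, 140, 135, 135),
  stringsAsFactors = TRUE
)

# One table per Sex, each built only from the levels present in that subset.
pieces <- lapply(split(sampledf, sampledf$Sex), function(d) {
  d <- droplevels(d)
  tabular(Sex * Name ~ Heading() * When * Weight * Heading() * identity,
          data = d)
})

# Stack the per-Sex tables; assumes rbind() dispatches to the tabular method.
combined <- Reduce(rbind, pieces)
combined
```

Because each piece is built from its own subset, the number of names per group can differ, which is the situation with manufacturers and model numbers.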

Duncan Murdoch
