Message-ID: <BANLkTikrSydOH-fLcLJkJ7O+ZxxrR2vP-Q@mail.gmail.com>
Date: 2011-05-02T14:03:45Z
From: Mathias Walter
Subject: 3-way contingency table
In-Reply-To: <1CD70FCD-B971-48CE-B357-8B2EAF58A301@comcast.net>
Hi David,
thanks for your quick response. It was really helpful.
--
Kind regards,
Mathias
2011/4/29 David Winsemius <dwinsemius at comcast.net>:
>
> On Apr 29, 2011, at 6:47 AM, Mathias Walter wrote:
>
>> Hi,
>>
>> I have a large data frame with many columns. A short example is given below:
>>
>>> dataH
>>
>>      host ms01 ms31 ms33 ms34
>> 1  cattle    4   20    9    6
>> 2   sheep    4    3    4    5
>> 3  cattle    4    3    4    5
>> 4  cattle    4    3    4    5
>> 5   sheep    4    3    5    5
>> 6    goat    4    3    4    5
>> 7   sheep    4    3    5    5
>> 8    goat    4    3    4    5
>> 9    goat    4    3    4    5
>> 10 cattle    4    3    4    5
>>
>> Now I want to determine the frequencies of every unique value in
>> every column depending on the host column.
>>
>> It is quite easy to determine the frequencies in total with the
>> following command:
>>
>>> dataH2 <- dataH[,c(2,3,4,5)]
>>> table(as.matrix(dataH2), colnames(dataH2)[col(dataH2)], useNA="ifany")
>>
>>      ms01 ms31 ms33 ms34
>>   3     0    9    0    0
>>   4    10    0    7    0
>>   5     0    0    2    9
>>   6     0    0    0    1
>>   9     0    0    1    0
>>   20    0    1    0    0
>>
>> But I cannot manage to get it dependent on the host.
>>
>> I tried
>>
>>> xtabs(cbind(ms01, ms31, ms33, ms34) ~ ., dataH)
>>
>> and many other ways, but I was not successful.
>>
>> I can get it for each column individually with
>>
>>> with(dataH, table(host, ms33))
>>
>>         ms33
>> host     4 5 9
>>   cattle 3 0 1
>>   deer   0 0 0
>>   goat   3 0 0
>>   human  0 0 0
>>   sheep  1 2 0
>>   tick   0 0 0
>>
>> But I do not want to repeat the command for every column. I need a
>> single table which can be plotted as a balloon plot, for instance.
>
> You have obviously not given us the full data from which your "correct
> answer" was drawn, but see if this is going in the right direction:
>
> require(reshape)
>> dataHm <- melt(dataH)
> Using host as id variables
>> xtabs(~host+value+variable, dataHm)
> , , variable = ms01
>
>         value
> host     3 4 5 6 9 20
>   cattle 0 4 0 0 0  0
>   goat   0 3 0 0 0  0
>   sheep  0 3 0 0 0  0
>
> , , variable = ms31
>
>         value
> host     3 4 5 6 9 20
>   cattle 3 0 0 0 0  1
>   goat   3 0 0 0 0  0
>   sheep  3 0 0 0 0  0
>
> , , variable = ms33
>
>         value
> host     3 4 5 6 9 20
>   cattle 0 3 0 0 1  0
>   goat   0 3 0 0 0  0
>   sheep  0 1 2 0 0  0
>
> , , variable = ms34
>
>         value
> host     3 4 5 6 9 20
>   cattle 0 0 3 1 0  0
>   goat   0 0 3 0 0  0
>   sheep  0 0 3 0 0  0
>
>>
>> Does anybody know how to achieve this?
>>
>> --
>> Kind regards,
>> Mathias
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
>
>
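
For readers without the reshape package, the melt-then-xtabs idea in the quoted reply can probably be approximated in base R. What follows is only a sketch, not code from the thread: it rebuilds the ten example rows by hand, uses stack() in place of melt(), and the names long, tab3, flat and counts are made up for illustration. Because only the ten-row excerpt is used, the unused host levels (deer, human, tick) do not appear.

```r
# Rebuild the ten example rows from the quoted message (typed in by hand).
dataH <- data.frame(
  host = c("cattle", "sheep", "cattle", "cattle", "sheep",
           "goat", "sheep", "goat", "goat", "cattle"),
  ms01 = rep(4, 10),
  ms31 = c(20, rep(3, 9)),
  ms33 = c(9, 4, 4, 4, 5, 4, 5, 4, 4, 4),
  ms34 = c(6, rep(5, 9))
)

# stack() plays the role of melt() here: one row per (value, marker) pair.
long <- stack(dataH[, -1])
names(long) <- c("value", "variable")
# stack() goes column by column, so the host vector repeats once per marker.
long$host <- rep(dataH$host, times = ncol(dataH) - 1)

# Three-way contingency table: host x value x marker.
tab3 <- xtabs(~ host + value + variable, data = long)

# Two ways to get "a single table" out of the 3-way array:
# a flat 2-D layout with hosts as rows ...
flat <- ftable(tab3, row.vars = "host", col.vars = c("variable", "value"))
# ... and a long data frame of counts (columns host, value, variable, Freq).
counts <- as.data.frame(tab3)

print(flat)
```

The long counts data frame is likely the more convenient input for a plotting function, since it has one row per host, value and marker combination; which form a given balloon plot routine expects is something to check in its own documentation.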