
Multivariate quasi-binomial analysis of proportion data?

4 messages · Amanda Greer, Bob OHara, Philippi, Tom +1 more

#
Hi All,

I am trying to find the best way to analyse a set of foraging ecology data
with >10 behaviour categories (DVs) and 3 IVs (season, sex, age). The time
which an animal spent engaged in a behaviour was recorded and then divided
by the total time spent in sight of the observer, so my data are
proportions. As is typical, not all animals engaged in all behaviours, so
there are a large number of zeros in my dataset, which is severely
over-dispersed. I had initially analysed all the data using the glm
function (family = quasibinomial), followed by anova. The intention was then
to use the false discovery rate alpha to account for the large number of
analyses. However, it was pointed out to me that a multivariate approach
might be better, so I have been trying to figure out (a) whether it's
possible to run a quasi-binomial multivariate analysis of proportion data
and (b) how to go about it.

In the R Documentation quasi-binomial family function page (
http://artax.karlin.mff.cuni.cz/r-help/library/VGAM/html/quasibinomialff.html
) it is stated that if multivariate response = TRUE the response matrix
should be binary. This seems a pretty straightforward indictment of my idea
to run this analysis on my proportion data, but I am wondering why - is
this just not possible, or is there a particular package that could help?
If anyone could provide me with an answer or some much needed guidance on
this topic I would be very grateful.

Thanks,

Amanda
#
On 08/02/15 12:27, Amanda Greer wrote:
Ignoring the zeroes problem for the moment, I think (quasi-)binomial 
distributions are a distraction: binomials are based on counts of things 
(see Petr Keil's post: http://www.petrkeil.com/?p=603). If you're 
looking at proportions of times, then it might be better to think in 
terms of gamma distributions, which lead to a beta distribution for the 
proportion of times spent doing one thing, and a Dirichlet distribution 
if you have several items (as you do here).
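The gamma-to-Dirichlet link mentioned above can be sketched as follows. This is only an illustration, not an analysis of the data in this thread: the shape values are hypothetical, Python is used purely for convenience, and the sketch assumes the times spent in each behaviour are independent gamma variables with a common scale. Under that assumption the normalised times follow a Dirichlet distribution, and each single proportion follows a beta distribution.

```python
import random

def simulate_time_budget(shapes, scale=1.0):
    """Draw one gamma-distributed duration per behaviour, then normalise.

    With independent gammas sharing one scale, the resulting vector of
    proportions is a draw from Dirichlet(shapes).
    """
    times = [random.gammavariate(a, scale) for a in shapes]
    total = sum(times)
    return [t / total for t in times]

random.seed(1)
shapes = [2.0, 1.0, 0.5]  # hypothetical shape parameters, one per behaviour

draws = [simulate_time_budget(shapes) for _ in range(20000)]

# Empirical mean proportion per behaviour across simulated bouts
mean_props = [sum(d[i] for d in draws) / len(draws) for i in range(len(shapes))]

# Theoretical Dirichlet mean: shape_i / sum(shapes)
expected = [a / sum(shapes) for a in shapes]

for m, e in zip(mean_props, expected):
    print(f"simulated mean {m:.3f}   Dirichlet mean {e:.3f}")
```

Note that a Dirichlet draw never contains an exact zero, which is why the zeroes need separate treatment.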

Once you have to worry about the zeroes, you need to do something more, 
for example see this paper:
<http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12122/abstract>

Bob

  
    
#
Amanda--
I'm not sure I would be convinced by your analyses, as I don't think your
statistical model corresponds to your sampling or data generating process.
But, I'd need to know more information about the response design (data
collection) to make any suggestions.

For binomial or quasi-binomial models, you aren't analyzing the ratio of
time observed (DV) to total time observed; presumably you're using the
number of minutes or seconds?  If so, note that you get very different
answers depending on the units, because the binomial response treats each
point observation as independent.  Depending on the animal and the
behaviors, in my experience not even minute or 10 minute observations are
independent.
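The dependence on units can be illustrated with a small worked example. It is a sketch under stated assumptions: the bout (150 s of one behaviour in a 600 s observation) is hypothetical, and only the plain binomial standard error sqrt(p(1-p)/n) is computed, without any quasi-binomial dispersion estimate. The proportion is identical in both cases; only the number of "trials" the model believes it has seen changes.

```python
import math

def binomial_se(successes, trials):
    """Standard error of a proportion under a plain binomial model."""
    p = successes / trials
    return math.sqrt(p * (1 - p) / trials)

# Hypothetical bout: 150 s of a behaviour during a 600 s observation.
se_seconds = binomial_se(150, 600)   # data entered in seconds: 600 "trials"
se_minutes = binomial_se(2.5, 10)    # same bout in minutes: 10 "trials"

# Both encodings give p = 0.25, but the claimed precision differs
# by a factor of sqrt(600 / 10).
ratio = se_minutes / se_seconds

print(f"SE with seconds as units: {se_seconds:.4f}")
print(f"SE with minutes as units: {se_minutes:.4f}")
print(f"ratio: {ratio:.2f}")
```

Recording in seconds makes the estimate look far more precise than recording in minutes, even though the animal did exactly the same thing.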

How long is an individual animal observed in a given bout (period of
consecutive recording)?  Are individuals monitored for more than 1 bout?
How many behaviors does it perform (on average) in one observation bout?
How many times does it switch behavior in a bout?  Even if it only does
behaviors A & B, if it is doing A when you start observing, at some point
it switches to B, and is still doing B when you stop recording, that is
very different than it switching back & forth A B A B A B A B A B in a
single bout.

If you have lots of switching by individual animals in individual bouts,
then there may be a reasonable mixed-model binomial-based approach,
treating individual animals as random subjects.  If not, there are some
approaches to proportional data that might be a better approximation to
your data and components of variation.  But I've already stuck my neck out
far enough guessing about how you might have collected your data, so I'll
stop here unless you provide more information.

I hope that this helps...

Tom 2
On Sun, Feb 8, 2015 at 3:27 AM, Amanda Greer <manda.greer at gmail.com> wrote:

            

  
    
#
Thank you, Bob and Tom, for your assistance. I was unaware of the
distributions you referred to, Bob.

Some more information: We filmed ~80 foraging bouts of varying length (all >
1 min) with behaviour categories such as eating flowers, eating roots,
eating seeds, walking, digging... Each bout describes the behaviour of 1
focal individual. Individuals switched between behaviours a lot during each
bout: eat, walk to next plant, eat, walk to next, eat, preen, eat, walk,
eat, in a short space of time (< 1 min) would be typical, although I don't
have the data on how many switches to hand. All data were recorded in
seconds. The same individual was occasionally recorded in a second bout but
only if longer than 15 minutes had elapsed from the end of the previous
bout. All of our DVs are foraging or searching behaviours. As these are not
the only behaviours the animals engaged in, they do not necessarily total
100% of the bout recorded.

We are interested in the effects of season, sex and age on each DV. Our
original analysis consisted of one-way ANOVAs for season and 3 x 2
factorial ANOVAs for age by sex. We ran all age by sex ANOVAs exclusively
on bouts recorded in summer, as this was the only season with an even
spread of age and sex categories. We used the Benjamini-Hochberg procedure
to adjust p values.
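For reference, the Benjamini-Hochberg adjustment mentioned above can be sketched as below. The raw p values are hypothetical and are not results from this study; Python is used only for illustration. In R the same adjustment is available through p.adjust with method = "BH".

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values, in the input order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.001, 0.008, 0.039, 0.041, 0.27]  # hypothetical raw p values
adj = benjamini_hochberg(raw)

for p, q in zip(raw, adj):
    print(f"raw p = {p:.3f}   BH-adjusted p = {q:.5f}")
```

An adjusted value below the chosen false discovery rate (for example 0.05) would be declared significant; here the third and fourth tests fall just above that threshold after adjustment.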

Any further advice you have would be greatly appreciated, please let me know
if I can provide any more info. 

Thanks,

Amanda



--
View this message in context: http://r-sig-ecology.471788.n2.nabble.com/Multivariate-quasi-bionomial-analysis-of-proportion-data-tp7579294p7579300.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.