
anova test for variables with different lengths

3 messages · Sachinthaka Abeywardana, Jose Iparraguirre, S Ellison

#
Hi all,

I want to test whether the MEANs of two different variables (with
different numbers of observations) are the same. I am trying to use the
anova test, but it doesn't seem to like that the numbers of observations
differ:

a=c(1:5)
b=c(1:3)
aov_test=aov(a~b)
variable lengths differ (found for 'b')

Any ideas as to how I would go about doing this test?

Thanks,
Sachin
#
Hi Sachin,

You may find this tutorial useful: http://goanna.cs.rmit.edu.au/~fscholer/anova.php
And you'll need the car package, but familiarise yourself with Type I, II and III sums of squares before running Anova; the tutorial explains these in detail.
Hope it helps.

Jose


Jose Iparraguirre
Chief Economist
Age UK



#
Sadly, I doubt that it will, though it would be good advice if the OP had got as far as formulating the model correctly. 

But they haven't. The OP has tried to model a variable of length 5 using a predictor of length 3. (In fact, what they've attempted is a simple linear regression on variables of different lengths.) This will not work, no matter what the OP does about types of SS.
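
As a small sketch of the failure described above: the formula a ~ b asks R to pair each element of a with the corresponding element of b, so the two vectors must have the same length. The tryCatch wrapper and the name msg are only there for illustration.

```r
# The OP's two samples
a <- 1:5
b <- 1:3

# a ~ b treats b as a predictor of a, observation by observation,
# so building the model frame fails when the lengths differ.
msg <- tryCatch({
  aov(a ~ b)
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```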

First, a t test would do this job, assuming normality, though incidentally the variances differ, so the default t.test (Welch, which does not assume equal variances) will return a somewhat different result from anova, which effectively assumes equal variance.
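
A minimal sketch of the two t tests mentioned above, using the OP's data. The object names welch and pooled are just illustrative; var.equal is a standard argument of t.test.

```r
# The OP's two samples
a <- 1:5
b <- 1:3

# Default: Welch test, does not assume equal variances
welch <- t.test(a, b)

# Classical two-sample test with a pooled variance estimate
pooled <- t.test(a, b, var.equal = TRUE)

# The two tests use different degrees of freedom and standard errors,
# so their statistics and p-values are not identical.
print(c(welch = unname(welch$parameter), pooled = unname(pooled$parameter)))
print(c(welch = welch$p.value, pooled = pooled$p.value))
```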

Second, to use aov correctly, read ?formula and look at the examples there and in ?lm.

Then, if you want to get the same result as an equal variance t test using ANOVA, you'd have to concatenate the two groups and then model with a predictor indicating the groups. In this instance

y <- c(a, b)
g <- factor(rep(letters[1:2], c(length(a), length(b))))
summary(aov(y ~ g))
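
To check the claimed equivalence, the following sketch compares the one-way ANOVA with the equal-variance t test on the OP's data. With two groups, the F statistic should equal the squared t statistic and the p-values should agree. The names tab, f_val, p_aov and tt are illustrative only.

```r
# The OP's two samples
a <- 1:5
b <- 1:3

# Stack the responses and build a grouping factor
y <- c(a, b)
g <- factor(rep(c("a", "b"), c(length(a), length(b))))

# One-way ANOVA table; the first row belongs to the grouping factor
tab   <- summary(aov(y ~ g))[[1]]
f_val <- tab[["F value"]][1]
p_aov <- tab[["Pr(>F)"]][1]

# Equal-variance two-sample t test on the same data
tt <- t.test(a, b, var.equal = TRUE)

# For two groups, F = t^2 and the p-values coincide
print(all.equal(f_val, unname(tt$statistic)^2))
print(all.equal(p_aov, tt$p.value))
```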

Since this is a one-way problem the type of SS won't matter, but in other cases it would be crucial to at least understand why - and to what extent - anova can be unsafe* on unbalanced data.
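
A rough illustration of the unbalanced-data point, using made-up data (the factors A and B and the response y below are hypothetical, not from the thread). Base R's anova() on an lm fit reports sequential (Type I) sums of squares, so with unequal cell counts the SS attributed to a factor depends on the order in which the terms enter the model.

```r
# Hypothetical unbalanced two-factor layout (cell counts 3, 1, 1, 1)
A <- factor(c("a1", "a1", "a1", "a1", "a2", "a2"))
B <- factor(c("b1", "b1", "b1", "b2", "b1", "b2"))
y <- c(1, 2, 3, 6, 4, 9)

# Sequential SS: each term is adjusted only for the terms before it
ss_A_first <- anova(lm(y ~ A + B))["A", "Sum Sq"]  # A ignoring B
ss_A_last  <- anova(lm(y ~ B + A))["A", "Sum Sq"]  # A adjusted for B

# With unbalanced data these two values differ
print(c(A_first = ss_A_first, A_last = ss_A_last))
```

In a balanced design the two values would agree, which is why the question of SS type only becomes pressing once the cell counts are unequal.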


S Ellison

*"unsafe" reads as "actively dangerous" in this context.
