Message-ID: <13D3A3DD-2BE1-498F-BE8A-C7C783556321@comcast.net>
Date: 2011-12-08T21:17:04Z
From: David Winsemius
Subject: anova analysis on factors...
In-Reply-To: <CAPNjSFZ-SY_hj7-JirWH81E6LxMFkBaMgFEc7V63TUjD-bBUGA@mail.gmail.com>

On Dec 8, 2011, at 3:28 PM, Michael wrote:

> Hi all,
>
> If we wanted to study the effect on the mean of the hourly data based on
> the hours within a day...
>
> and we wanted to do Anova analysis...
>
> We have two choices:

Who is "we" and how were these constraints imposed?

>
> Please see below:
>
> Why are these two approaches giving very different p-values?

They are markedly different statistical models.
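
To illustrate the difference, here is a minimal sketch on simulated data. The values are made up and the choice of hours 9 to 18 is an assumption; only the variable names follow the quoted post, and the 10 levels are inferred from the 9 degrees of freedom it reports.

```r
# Simulated data only: names mimic the quoted post, values are invented.
set.seed(1)
hours_factors <- factor(rep(9:18, each = 50))   # 10 hypothetical hours
hourlydata    <- rnorm(length(hours_factors), sd = 0.01)

# Model 1: as.double() on a factor returns the integer level codes (1..10),
# so this fits one straight-line slope across those codes: 1 df.
fit_linear <- lm(hourlydata ~ as.double(hours_factors))

# Model 2: a separate mean for each hour: 10 levels - 1 = 9 df.
fit_factor <- lm(hourlydata ~ hours_factors)

anova(fit_linear)
anova(fit_factor)

# The straight-line model is nested within the factor model, so the two
# fits can be compared directly; this tests departure from linearity (8 df).
anova(fit_linear, fit_factor)
```

The first F test asks whether the mean changes linearly with the level code; the second asks whether the hourly means differ in any way at all. These are different hypotheses, so there is no reason to expect the p-values to agree.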

> And which one
> shall I use?
>

Without knowing your situation better and the eventual purposes of  
this analysis, it would be difficult to give sensible advice. I  
suspect the answer is "neither".

-- 
David.

> Thanks a lot!
>
> 1. treating the hours as double/floating numbers:
>
>
> anova(lm(hourlydata~as.double(hours_factors)))
>
>                             Df Sum Sq    Mean Sq F value Pr(>F)
> as.double(hours_factors)     1 0.0002 0.00019876  1.3425 0.2466
> Residuals                14868 2.2013 0.00014806
>
> 2. treating the hours as factors:
>
>
>
> anova(lm(hourlydata~hours_factors))
>
>                  Df  Sum Sq    Mean Sq F value Pr(>F)
> hours_factors     9 0.00077 8.5979e-05  0.5806 0.8142
> Residuals     14860 2.20072 1.4810e-04
>



David Winsemius, MD
West Hartford, CT