Help with normal distributions
Hi Michael,
Secondly, and perhaps more difficult, is a second data set. This, when plotted as a histogram, has two clear peaks, perhaps even three, all of which look as though they are normally distributed. So the theory is that my data set is actually made up of two, possibly three, underlying sub-sets of data which are normally distributed, but with different means and standard deviations. So 1) how do I test for this? And 2) how can I estimate the parameters (mean and SD) for the underlying distributions?
The answer to 2), as already pointed out, is to use EMclust in the package mclust.

Testing for the presence of a mixture is difficult from a theoretical point of view and, as far as I know, nothing for this is implemented in R yet. What you can do is:

a) Let EMclust estimate the number of mixture components by BIC (it may also settle on a single component).
b) Use a standard normality test such as shapiro.test to rule out a single homogeneous normal distribution. A rejection tells you that you have to fit something more complex than a single normal, but it does not tell you what.

Christian
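A minimal sketch of steps a) and b), under some assumptions: the data are simulated here (the vector x and its two components are made up for illustration), and the fit uses Mclust, the entry point in current mclust releases, which as far as I know selects the number of components by BIC in the same way EMclust did in older versions.

```r
# Sketch only: simulated data stand in for the real measurements.
library(mclust)  # assumed to be installed

set.seed(1)
# Hypothetical sample drawn from two normal components
x <- c(rnorm(200, mean = 0, sd = 1), rnorm(200, mean = 6, sd = 1.5))

# b) Test the data against a single normal distribution
sw <- shapiro.test(x)
sw$p.value  # a small p-value speaks against one homogeneous normal

# a) Fit univariate normal mixtures with 1 to 3 components;
#    the number of components is chosen by BIC
fit <- Mclust(x, G = 1:3)

fit$G                                   # number of components selected
fit$parameters$pro                      # estimated mixing proportions
fit$parameters$mean                     # estimated component means
sqrt(fit$parameters$variance$sigmasq)   # estimated component SDs
```

For one-dimensional data the variance may be estimated as common to all components, in which case the last line returns a single SD rather than one per component.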
Thanks in advance for your help Mick
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
I recommend www.boag-online.de