question about result of loglinear analysis
Date: Wed, 19 Jan 2011 01:20:06 -0800
From: djmuser at gmail.com
To: laomeng.3 at gmail.com
CC: r-help at r-project.org
Subject: Re: [R] question about result of loglinear analysis

Hi:

Well, you fit a saturated model. How many degrees of freedom do you have left for error? The fact that the standard errors are so huge relative to the estimates is a clue.

Taking a look at your data, it's pretty clear that nation 3 is an outstanding outlier on its own. It is clearly - nay, blatantly - different from the other nations in the sample. Look at

boxplot(fre ~ nation, data = data_Analysis)
boxplot(sqrt(fre) ~ nation, data = data_Analysis)
I'm scrolling back through my cygwin window; last night I used this (reading the data into "x", not data_Analysis):
x<-read.table("area_nation.txt",header=TRUE)
str(x)
'data.frame':   77 obs. of  3 variables:
 $ area  : int  1 1 1 1 1 1 1 1 1 1 ...
 $ nation: int  1 2 3 4 5 6 7 8 9 10 ...
 $ fre   : int  0 0 85 2 0 0 0 0 1 0 ...
library(scatterplot3d)
library(rgl)
scatterplot3d(x$area, x$nation, x$fre, type = "h")
scatterplot3d(x$area, x$nation, log(x$fre + 1), type = "h")
There is always a discussion here on "looking at pictures" and post hoc analysis, or on what is legitimate to do with outliers, that may be confusing to some readers, but you always need to keep your overall objectives in mind. It often helps to forget for a minute that you are doing something intellectual or pompous and just stare at the pictures (someone else quoted a statistician talking about getting rat droppings under your fingernails, presumably meaning getting more familiar with the details of your data acquisition system, LOL).
the latter to deal with the huge outlier near 1200 in the original data. Even on the square root scale, nation 3 sticks out like a sore thumb.

43/77 of your responses have zero frequency, so you should probably be looking into zero-inflated Poisson models and some of their relatives. Here is one citation to get you started:

http://www.jstatsoft.org/v27/i08/paper

Package VGAM also has functionality to fit these types of models. Using package sos, I typed

# Install package sos first if you don't have it:
library(sos)
findFn('zero Poisson')

which found 255 matches; you should find several packages that pertain to zero-inflated/zero-altered Poisson models.

In the absence of the scientific background behind the data, the dominance of nation 3 may well mask more subtle effects among the other nations, so you might want to consider analyses with and without nation 3.

HTH,
Dennis

On Tue, Jan 18, 2011 at 5:45 PM, Lao Meng wrote:
Hi all:

Here's a question about the result of a loglinear analysis. There are 2 factors: area and nation. The raw data is in the attachment. I fit the saturated loglinear model with the command:

glm_sat <- glm(fre ~ area * nation, family = poisson, data = data_Analysis)

After that, I extract the coefficients:

result_sat <- summary(glm_sat)
result_coe <- result_sat$coefficients

I find that all the coefficients are 1 or very near to 1. How does this happen? Why are all the coefficients 1 or very near to 1?

Thanks!
My best
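A minimal sketch of the degrees-of-freedom point raised in the reply, using synthetic data because the attachment is not reproduced here. The 7 x 11 layout, the Poisson rates, and the name `toy` are assumptions made only for illustration. `area` and `nation` are stored as factors so that `area * nation` really does give one parameter per cell.

```r
# Synthetic stand-in for the attached data (an assumption, not the real file):
# 7 areas x 11 nations = 77 cells, mostly zeros, with nation 3 dominating.
set.seed(1)
toy <- expand.grid(area = factor(1:7), nation = factor(1:11))
toy$fre <- rpois(nrow(toy), lambda = ifelse(toy$nation == "3", 100, 0.3))

# Saturated loglinear model: as many parameters as observations.
# glm() may warn about fitted rates that are numerically zero.
glm_sat <- glm(fre ~ area * nation, family = poisson, data = toy)

n_par   <- length(coef(glm_sat))   # number of parameters in the model
df_res  <- df.residual(glm_sat)    # degrees of freedom left for error
p_zero  <- mean(toy$fre == 0)      # share of cells with zero frequency

# Estimates next to their standard errors and p-values
coe <- summary(glm_sat)$coefficients
print(c(parameters = n_par, residual_df = df_res, prop_zero = p_zero))
print(head(coe[, c("Estimate", "Std. Error", "Pr(>|z|)")]))
```

With one parameter per cell there are no residual degrees of freedom, and cells with zero counts push their estimates toward minus infinity on the log scale, which is where very large standard errors come from.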
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.