(1) In short, I can find essentially no difference in the results of the samm and lmer models (the most complex one that both functions could handle): lmer: math~ gr + sx + eth + cltype + (1+yrs|id) + (1|sch) samm: math ~ gr + sx + eth + cltype, random=~ us(link(~1+yrs)):id + sch After several hours searching through the return values of the two functions (slots, S3 methods, S4 methods, extractors and all that) and identifying unique approaches (lmer uses polynomial contrasts for ordered factors and samm uses treatment contrasts; missing values in the data are handled slightly differently in the way the random effects return values are presented), I find the two functions have nearly identical variance components and essentially identical fixed effects and random effects. An R transcript is attached for reference. (2) Yes, SAMM is proprietary. It is available on Windows and Linux, S-Plus and R. The developer has told me that version 2 is very near completion. If you ever want to try it out, there is a 30-day free trial before it stops working. I use lme4 because it is open-source and has a good community of users, published examples, etc. I use samm to analyze data from plant breeding experiments (current literature methods use large data sets, crossed random effects, heteroskedasticity, spatial correlation, etc.). SAMM also has convenient reporting of linear predictions of BLUEs/BLUPs (Welham , Cullis, Gogel, Gilmour, & Thompson 2004, Stroup & Mulitze 1991) for decision-making. I also use both because fitting the same model using two different software packages (when the software capabilities allow for it) really helps me think carefully and hard about what I'm asking the software to do, what it actually does, and what the results actually mean. Thanks for the challenging question...I learned more because of it. Kevin (thread edited below for brevity)
Doug Bates wrote:
I'm not sure if I have access to SAMM to check the timings. Is SAMM
proprietary? My recollection is that it is proprietary and that it is Windows-only. Either one of those characteristics is the kiss of death for my wanting to use it. Are you sure that SAMM is fitting a model with partially-crossed random effects? If you are only considering students and schools as grouping factors for the random effects there will be little difference between assuming nested random effects and correcting for the partial crossing. This is because there are very few students in this study who attend more than one school. If you considered teachers as well as students and schools as grouping factors then you could more easily discern the difference in model fits taking into account the crossing or assuming that each student:teacher:school combination is unique.
-------------- next part -------------- An HTML attachment was scrubbed... URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20070202/36df8a8d/attachment.html> -------------- next part -------------- R> library(mlmRev) Loading required package: lme4 Loading required package: Matrix R> library(samm) Attaching package: 'samm' R> library(kw) # For ranef.samm, fixef.samm, VarCorr.samm # samm and lmer treat missings slightly differently in the return value, # so level the playing field by removing all observations with missing values R> star2 <- subset(star, !(is.na(math) | is.na(gr) | is.na(sx) | is.na(eth) | + is.na(cltype) | is.na(yrs) | is.na(id) | is.na(sch))) # In the 'star' data, gr is an "ordered factor" # lmer uses polynomial contrasts for ordered factors by default # samm uses treatment contrasts for ordered factors # Make 'gr' into an unordered factor R> star2$gr <- as.factor(as.character(star2$gr)) # Fit the models. lmer: 10.4 seconds lmer2: 11.3 samm: 6.0 R> system.time(f.lmer <- lmer(math~ gr + sx + eth + cltype + + (1+yrs|id) + (1|sch), data=star2, + control=list(gradient = FALSE, niterEM = 0))) [1] 16.13 0.40 18.22 NA NA R> system.time(f.lmer2 <- lmer2(math~ gr + sx + eth + cltype + + (1+yrs|id) + (1|sch), data=star2)) [1] 17.00 0.53 18.50 NA NA R> system.time(f.samm <- samm(math ~ gr + sx + eth + cltype, + random=~ us(link(~1+yrs)):id + sch , + data=star2)) +-----------------------------------------------------------------+ Spatial Analysis <> Mixed Models Version: VSN 1.10 AIsamm license expires: 65535 days SAMM Convergence Monitoring: Thu Feb 01 17:08:08 2007 +-----------------------------------------------------------------+ Seq Component 1 link("~1 + yrs"):id!Intercept:Intercept 2 link("~1 + yrs"):id!yrs:Intercept 3 link("~1 + yrs"):id!yrs:yrs 4 sch 5 R!variance Equations: 21560 (17 dense) Initial update shrinkage factor: 0.316 Singularities: 4 (more) LogLik S2 DF 1 2 3 4 5 -101615.5560 898.1042 24566 0.150 0.100 0.150 0.100 1.000 17:08:09 -100808.5510 796.0875 24566 0.435 0.071 0.118 0.125 1.000 17:08:10 -100144.8784 689.3688 24566 0.966 -0.005 0.101 0.167 1.000 17:08:10 -99855.8696 603.4995 24566 1.786 -0.155 0.113 0.214 1.000 17:08:11 -99838.3681 584.1043 24566 2.086 -0.228 0.128 0.212 1.000 17:08:11 -99838.0609 583.0869 24566 2.119 -0.240 0.130 0.209 1.000 17:08:12 -99838.0569 583.1879 24566 2.121 -0.241 0.130 0.209 1.000 17:08:12 FINAL parameter values: 2.121 -0.241 0.130 0.209 1.000 Constraint codes: U U U P P Exit status: 0 - LogLikelihood Converged Finished on: Thu Feb 01 17:08:12 2007 Convergence monitoring: LogLikelihood Converged [1] 5.70 0.02 6.38 NA NA # Use results of lmer, since extractor functions seem easier to use # Variance components for samm R> kw::VarCorr(f.samm) Response: math Component Std Err Z Ratio Constraint link("~1 + yrs"):id!Intercept:Intercept 1236.8585 30.3504 40.753 Unconstrained link("~1 + yrs"):id!yrs:Intercept -140.5416 10.4943 -13.392 Unconstrained link("~1 + yrs"):id!yrs:yrs 75.8448 5.0346 15.065 Unconstrained sch 121.6656 21.1143 5.762 Positive R!variance 583.1879 8.6846 67.152 Positive # Variance components for lmer R> VarCorr(f.lmer)$id 2 x 2 Matrix of class "dpoMatrix" (Intercept) yrs (Intercept) 1236.9322 -140.56470 yrs -140.5647 75.85085 R> VarCorr(f.lmer)$sch 1 x 1 Matrix of class "dpoMatrix" (Intercept) (Intercept) 121.6565 R> attr(VarCorr(f.lmer),"sc")^2 scale 583.2067 # Fixed effects for samm R> kw::fixef(f.samm) cltype_small cltype_reg cltype_reg+A eth_W eth_B eth_A 0.000000 -7.340255 -6.054712 0.000000 -22.934383 2.554495 eth_H eth_I eth_O sx_M sx_F gr_1 1.410943 -34.301117 3.402846 0.000000 2.797615 0.000000 gr_2 gr_3 gr_K (Intercept) 47.993067 83.587681 -44.421802 539.051085 # Fixed effects for lmer R> fixef(f.lmer) (Intercept) gr2 gr3 grK sxF ethB 539.051209 47.993546 83.588789 -44.421994 2.797348 -22.934191 ethA ethH ethI ethO cltypereg cltypereg+A 2.556208 1.411219 -34.300729 3.405428 -7.340299 -6.054774 # Correlation of fixed effects for samm/lmer R> fs <- kw::fixef(f.samm) R> fs <- fs[abs(fs)>.001] R> cor(sort(fs), sort(fixef(f.lmer))) [1] 1 R> # samm random intercept for each school R> as.vector(tail(kw::ranef(f.samm),80)) [1] 1.3143717 -20.5348662 3.2968950 -1.0183895 -21.1174158 -17.1111596 [7] 8.1185468 -3.5960209 1.9353613 21.5313533 19.7123450 13.1381558 [13] 5.1417748 6.4959617 -4.9180225 -9.1269814 -8.5555231 -8.0622654 [19] -12.9893362 7.3534878 -8.7966736 0.1812109 20.5266225 11.7676528 [25] -8.8642947 -19.4369518 11.1248498 -4.7711922 17.2450765 8.9582622 [31] -2.9094668 -13.8162709 -9.0721915 -7.1982139 -10.0674440 4.3567975 [37] 2.9597060 -7.9531537 3.9690027 -2.6906443 25.1882796 -6.8958274 [43] 3.0637880 -9.1939666 5.5059775 -1.9412242 -7.6311463 -11.8170561 [49] -7.0021395 -5.6523660 4.6034859 6.9618590 -9.6490649 2.2870641 [55] -12.6429433 -16.4871068 -10.5297346 -0.6011866 -0.3508403 -13.0738001 [61] 9.2144515 -3.0903109 9.7637953 0.4114071 9.0939491 1.9779132 [67] -3.2119299 12.8595955 17.8323498 0.8572007 -8.1456880 15.8826757 [73] 5.1027719 23.8611109 1.9011668 -6.3425222 5.6704674 -2.2273366 [79] 9.7215416 -1.7956171 # lmer random intercept for each school R> lme4:::ranef(f.lmer)[[2]][,1] [1] 1.3133874 -20.5335295 3.2956710 -1.0160289 -21.1169777 -17.1095972 [7] 8.1168346 -3.5945118 1.9352882 21.5310053 19.7115966 13.1401330 [13] 5.1383739 6.4961299 -4.9184228 -9.1255942 -8.5557981 -8.0617211 [19] -12.9903605 7.3527759 -8.7991525 0.1799393 20.5243880 11.7679624 [25] -8.8640390 -19.4362049 11.1219601 -4.7703302 17.2445306 8.9532094 [31] -2.9119062 -13.8155380 -9.0719066 -7.1963657 -10.0661642 4.3582431 [37] 2.9599896 -7.9522175 3.9701887 -2.6930150 25.1871108 -6.8951564 [43] 3.0654065 -9.1979970 5.5086618 -1.9394761 -7.6314905 -11.8180674 [49] -7.0025942 -5.6537153 4.6030342 6.9599747 -9.6486151 2.2865630 [55] -12.6427040 -16.4839846 -10.5288129 -0.6033224 -0.3515562 -13.0727761 [61] 9.2167562 -3.0882666 9.7630371 0.4125758 9.0944709 1.9818962 [67] -3.2120202 12.8576685 17.8320933 0.8574842 -8.1447511 15.8819324 [73] 5.1012380 23.8613070 1.9013746 -6.3435578 5.6695532 -2.2235653 [79] 9.7241380 -1.7960726 # Correlation of samm/lmer random intercepts for school R> cor(lme4:::ranef(f.lmer)[[2]][,1], as.vector(tail(kw::ranef(f.samm),80))) [1] 1 # lmer Random slope/intercept for each 'id' R> head(lme4:::ranef(f.lmer)[[1]]) (Intercept) yrs 100017 89.893617 -10.21549112 100028 -36.049228 4.09662647 100045 6.318275 -0.00364208 100064 -9.404819 1.06876159 100070 62.249604 -6.72732399 100096 26.410811 0.76421193 # samm R> data.frame(int=kw::ranef(f.samm)[1:6], + yrs=kw::ranef(f.samm)[10733:10738]) int yrs id_100017 89.890923 -10.209949210 id_100028 -36.049490 4.094556472 id_100045 6.317053 -0.003151222 id_100064 -9.403919 1.068111606 id_100070 62.244643 -6.723240707 id_100096 26.408558 0.766162668 # The tail of the random effects from lme R> tail(lme4:::ranef(f.lmer)[[1]]) (Intercept) yrs 99912 5.695326 -3.6083994 99937 13.972907 4.4756514 99942 13.718499 -1.5589673 99967 7.085396 -3.8997280 99979 5.132715 -0.5832806 99992 -4.265711 -5.6205076 # The tail of the random effects from samm R> data.frame(int=kw::ranef(f.samm)[10727:10732], + yrs=kw::ranef(f.samm)[21459:21464]) int yrs id_99912 5.693616 -3.6080847 id_99937 13.973628 4.4767916 id_99942 13.716419 -1.5579320 id_99967 7.082693 -3.8993951 id_99979 5.131698 -0.5828662 id_99992 -4.265344 -5.6211783 # Correlation of samm/lmer 'id' random effects R> cor(lme4:::ranef(f.lmer)[[1]][,1], kw::ranef(f.samm)[1:10732]) [1] 1 R> cor(lme4:::ranef(f.lmer)[[1]][,2], kw::ranef(f.samm)[10733:21464]) [1] 1