help with the logistic formula using nlme/nlmer
See for example the gamm4 package (which hooks into and extends the lme4 formula syntax). Instead of I(...), you would have s(...).

Best,
Phillip Alday
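A minimal sketch of what such a call might look like, assuming the gamm4 package is installed. The data frame `sim`, the 0/1 response `in.school`, and all simulated coefficients are hypothetical stand-ins rather than the real `cl.df`; only the model structure (a smooth of total.hours.worked with a small basis, plus a random intercept for country) follows the thread.

```r
library(gamm4)  # also loads mgcv and lme4

# Hypothetical simulated data standing in for cl.df (not the real data).
set.seed(1)
n <- 2000
sim <- data.frame(
  country = factor(sample(paste0("c", 1:10), n, replace = TRUE)),
  age = sample(7:14, n, replace = TRUE),
  total.hours.worked = round(rexp(n, rate = 1 / 10))
)
u <- rnorm(10, sd = 0.5)  # country-level random intercepts
eta <- 2 + u[as.integer(sim$country)] - 0.05 * sim$total.hours.worked
sim$in.school <- rbinom(n, 1, plogis(eta))  # 0/1 response

# s(..., k = 4) replaces I(total.hours.worked^2); random effects go in `random`.
# A second grouping factor could be added as ~ (1 | country) + (1 | areaID).
fit <- gamm4(in.school ~ age + s(total.hours.worked, k = 4),
             random = ~ (1 | country),
             family = binomial, data = sim)

summary(fit$gam)  # smooth and parametric terms
summary(fit$mer)  # underlying lme4 fit with the variance components
```

The returned object is a list: the `gam` component can be passed to `plot()` to inspect the estimated shape of the hours effect, and the `mer` component holds the mixed-model fit.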
On Mon, 2015-06-08 at 15:37 +0200, Thierry Onkelinx wrote:
Dear Hans,

I'd rather use a gamm with a penalized regression spline for total.hours.worked, with a small basis for the smoother (k = 3 or k = 4).

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

2015-06-08 15:07 GMT+02:00 Hans Ekbrand <hans.ekbrand at gmail.com>:
Dear list,

I model the effect of child labour on a child's probability of being in school. The data comes from 22 countries. Countries have different means on the outcome variable, i.e. the probability of being in school is to a large part determined by the country in which the child resides. The sample includes only children aged 7-14 years. Child labour is a numerical covariate measured in hours, being in school is a binary variable, and age is a numerical covariate measured in years. Data is available here: http://hansekbrand.se/code/cl.df.RData
str(cl.df)
'data.frame': 345321 obs. of 8 variables:
 $ country           : Factor w/ 23 levels "Armenia","Burkina Faso",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ areaID            : Factor w/ 14584 levels "Armenia.1","Armenia.10",..: 3 3 3 3 3 3 2 2 2 2 ...
 $ school            : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
 $ age               : num 9 10 8 11 14 10 12 10 14 12 ...
 $ total.hours.worked: num 1 1 1 0 1 0 0 0 0 0 ...
 $ Chorehours        : num 1 1 1 0 1 0 0 0 0 0 ...
 $ Chwkhours         : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Chothwkhours      : num 0 0 0 0 0 0 0 0 0 0 ...

The distribution of child labour, which is indicated by the variable total.hours.worked, has a positive skew.
quantile(cl.df$total.hours.worked, probs = seq(from = 0, to = 1, by = 0.1))
  0%  10%  20%  30%  40%  50%  60%  70%  80%  90% 100%
   0    0    0    0    3    5    7   10   14   28  133
I have manually defined classes for this variable,
library(car)
cl.df$hours.class <- recode(cl.df$total.hours.worked, recodes = ("lo:7=1;
7:14=2; 14:21=3; 21:28=4; 28:35=5; 35:42=6; 42:49=7; 49:56=8; 56:63=9;
63:hi='more than ten'"), as.factor.result=TRUE)
and used them like this:
fm1 <- glmer(school ~ age + hours.class + (1|country) + (1|areaID), data =
cl.df, family = binomial)
This works, but I would prefer to fit a non-linear regression with a
polynomial form instead. I think a simple power function would
work.
E.g.
fm1 <- glmer(school ~ age + I(total.hours.worked^2) + (1|country) +
(1|areaID), data = cl.df, family = binomial)
However, I *think* nlmer() could be used to find the optimal exponent
instead of "2" here. But I don't know how to do that. I have searched
the archive, but found rather few posts concerning nlmer(), so any
help is much appreciated.
If you can solve the problem with nlme() or anything else for that
matter, that's perfectly fine, I'm used to lme4, but I'm happy to
learn new stuff.
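One simple way to look for an exponent other than 2 is a grid search over candidate values, keeping the one with the highest log-likelihood (a profile-likelihood search). This is a swapped-in technique, not an nlmer() solution. The sketch below is only an illustration under stated assumptions: it uses simulated data with made-up coefficients (`hours`, `age`, `school`, and `true.p` are hypothetical) and plain glm() so that it runs quickly without random effects.

```r
# Hypothetical simulated data; coefficients and the "true" exponent are made up.
set.seed(42)
n <- 5000
hours <- round(rexp(n, rate = 1 / 10))
age <- sample(7:14, n, replace = TRUE)
true.p <- 1.5
eta <- 3 - 0.1 * age - 0.02 * hours^true.p
school <- rbinom(n, 1, plogis(eta))

# Candidate exponents; for each one, fit the model and record the log-likelihood.
p.grid <- seq(0.5, 3, by = 0.1)
loglik <- sapply(p.grid, function(p) {
  fit <- glm(school ~ age + I(hours^p), family = binomial)
  as.numeric(logLik(fit))
})

# Exponent with the highest profile log-likelihood.
p.hat <- p.grid[which.max(loglik)]
p.hat
plot(p.grid, loglik, type = "b", xlab = "exponent p", ylab = "log-likelihood")
```

For the mixed model, the glm() call inside the loop could be replaced by the corresponding glmer() call with (1|country) + (1|areaID), at the cost of one full fit per grid point. Standard errors from the final fit are then conditional on the chosen exponent.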
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models