Specifying models nested crossed random effects

Tue, Apr 25, 2017 7:28 AM

Evan - thank you very much for your advice. I've basically specified the
model as you suggested, and it seems to be a reasonable approach.

thanks again,
Josh

On Sun, Apr 9, 2017 at 2:56 PM, Evan Palmer-Young <ecp52 at cornell.edu> wrote:

Thanks for those details, Josh. Interesting design!

I'm not experienced in interpreting random effects on their own, so others
will have better advice on that.

For your model structure, it sounds like there are three random effects:

"program_ID"
"participant_ID"
"sample_ID"

From my reading of lme4 documentation, I think that you have coded
sample_ID correctly and do not need to explicitly nest it within program_ID.
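A quick way to check this for yourself, sketched in base R on made-up toy data (the data frame `signals` and its columns `date` and `signal` are hypothetical names, not from your data): if every sample_ID value occurs in exactly one program, the labels already carry the nesting. In that case, as far as I understand lme4, (1|program_ID) + (1|sample_ID) should describe the same model as the explicit nested form (1|program_ID/sample_ID).

```r
# Hypothetical toy data: 2 programs, 2 dates, 2 signals per date.
signals <- expand.grid(
  program_ID = c("P1", "P2"),
  date       = c("2017-03-01", "2017-03-02"),
  signal     = 1:2,
  stringsAsFactors = FALSE
)

# Build a globally unique sample_ID from program, date, and signal number,
# mirroring how the real sample_ID is described as having been constructed.
signals$sample_ID <- with(signals, paste(program_ID, date, signal, sep = "_"))

# Count how many distinct programs each sample_ID appears in.
programs_per_sample <- tapply(signals$program_ID, signals$sample_ID,
                              function(x) length(unique(x)))

# TRUE means every sample_ID belongs to exactly one program, so the
# nesting is already implied by the labels themselves.
is_nested <- all(programs_per_sample == 1)
print(is_nested)
```

If the same check came back FALSE on the real data (for example, if sample_ID were just 1 to 20 reused in every program), the explicit nesting syntax would be needed instead.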

In general, I think it may be better form to include both fixed and random
predictors in your model, rather than having separate models to assess only
the random effects.

So your model might be something like,

interest_model <- lmer(interest ~ Instruction_type + time_of_day +
Working_alone + (1|program_ID) + (1|participant_ID) + (1|sample_ID),
data = df)

Where Instruction_type, time_of_day, and Working_alone are fabricated
variables that might resemble variables you recorded.

As a disclaimer, this is my second time answering to the list-- welcome!

Best wishes, Evan





On Sat, Apr 8, 2017 at 4:26 PM, Joshua Rosenberg <
jmichaelrosenberg at gmail.com> wrote:

Thank you Evan for your response and thank you for clarifying.

Responses are in-line below.

Thank you for considering this!

Josh


On Sat, Apr 8, 2017 at 3:28 PM, Evan Palmer-Young <ecp52 at cornell.edu>
wrote:

Josh,
Thanks for the questions.
Can you provide a little bit more description about the variables?

First, sorry, I had changed some of the variable names in the data and
realize I used different names (and a different outcome) in the examples at
the bottom.

"interest" (one outcome we're measuring) is a variable of participants'
self-reported interest using a 1-4 scale.

"overall_engagement" is one other (different) outcome: One that was a
composite of variables of students' interest, how hard they were
concentrating, and how challenging they reported what they were learning was.

We asked participants (youth) about how interested they were in what they
were learning at random intervals using what is called an experience
sampling method. In our method, youth had phones on which they were asked
about what they were thinking / feeling - every youth in the same program
(more on the programs in just a moment) was notified to answer our
questions at the same time, although both the instance in time and the
interval between these questions differed between programs.

"site" = "program" (ID) and program is an indicator for membership in one
of the 10 programs.

Because youth were repeatedly sampled, "participant_ID" is an indicator
for one of about 200 participants.

"sample_ID" is an indicator unique to each program (it was made from the
program_ID, the date, and which one of the four samples it was for that
date). There are about 20 unique values for it for each program, for
around 200 values total.

Does "site" = "program"?
Are participants queried at multiple timepoints? If pre- and
post-program, could this be included as a factor with levels "before" and
"afte

Yes, the sampling consisted of repeated measures within participant
(around 15-20 responses per participant). It's a bit tricky for me to
describe, but as I mentioned above every youth in the same program was
notified to answer questions at the same time, though both the instance in
time and the interval between these questions differed between the 10
programs.

Do you have any particular hypotheses or questions you want to answer
with your model?

We're interested in, for lack of a better word, time-point or
situation-specific ("sample_ID") variables' relationships with engagement.
We coded video of the programs, including the periods before and when youth
were notified to respond, for variables such as the type of activity youth
were participating in (e.g., working in groups or individually; doing
hands-on activities or listening to the activity leaders). We imagine
treating these as categorical variables.

Similarly, we're interested in relationships between youth's
characteristics (such as pre-program interest and demographic
characteristics, such as gender) and our outcomes and to a bit of a lesser
extent relationships between some program factors and outcomes (though with
only 10 programs, we do not imagine we will have statistical power to
detect any / many effects at that level).

We're interested in sources of variance as a substantive question (how
much of students' engagement is explained by time-point ("sample_ID"),
youth ("participant_ID"), and program ("program_ID") effects?). Though this
is a bit secondary to our questions about the specific variables at
time-point, youth, and program levels.
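One way this variance question might be approached, sketched here on simulated data rather than our real data: fit an intercept-only model with the three random intercepts and express each variance component as a share of the total. Every number below (group sizes, standard deviations, the intercept of 2.5) is invented for illustration, and treating the 1-4 interest score as continuous is an assumption of the sketch.

```r
library(lme4)

set.seed(1)

# Simulated stand-in for the real data (all sizes and SDs are made up):
# 10 programs, 20 participants and 20 sampling occasions per program.
n_program <- 10; n_part <- 20; n_samp <- 20
design <- expand.grid(part = seq_len(n_part), samp = seq_len(n_samp),
                      program = seq_len(n_program))
design$program_ID     <- factor(design$program)
design$participant_ID <- factor(paste(design$program, design$part, sep = "_"))
design$sample_ID      <- factor(paste(design$program, design$samp, sep = "_"))

# Random intercepts for each grouping factor plus residual noise.
u_prog <- rnorm(nlevels(design$program_ID), sd = 0.3)
u_part <- rnorm(nlevels(design$participant_ID), sd = 0.5)
u_samp <- rnorm(nlevels(design$sample_ID), sd = 0.4)
design$interest <- 2.5 +
  u_prog[as.integer(design$program_ID)] +
  u_part[as.integer(design$participant_ID)] +
  u_samp[as.integer(design$sample_ID)] +
  rnorm(nrow(design), sd = 0.7)

# Intercept-only ("null") model with the three random intercepts.
null_model <- lmer(interest ~ 1 + (1 | program_ID) + (1 | participant_ID) +
                     (1 | sample_ID), data = design)

# Variance components (including the residual) and each one's share
# of the total variance.
vc <- as.data.frame(VarCorr(null_model))
vc$share <- vc$vcov / sum(vc$vcov)
print(vc[, c("grp", "vcov", "share")])
```

With only 10 programs, the program-level share is likely to be estimated imprecisely, so it should probably be read as a rough indication rather than a firm number.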

Best wishes, Evan




--
Joshua Rosenberg
jmichaelrosenberg at gmail.com
http://joshuamrosenberg.com



--
Evan Palmer-Young
PhD candidate
Department of Biology
221 Morrill Science Center
611 North Pleasant St
Amherst MA 01003
https://sites.google.com/a/cornell.edu/evan-palmer-young/
epalmery at cns.umass.edu
ecp52 at cornell.edu

