Hi all,

I'm running a generalized linear mixed model in R (4.0.3) and while most findings are in line with what could be expected, I'm getting one that's off. Either I'm mixing something up in my equation or there's a reasonable explanation for my results that I'm not seeing. I'm hoping someone here might be able to diagnose the issue.

*Background*

I am researching speech perception in second languages (n = 53, 48 items). Specifically, I am investigating how well a person's ability to accurately perceive words spoken in isolation predicts their ability to perceive words spoken in sentences. Different language groups have different language transfer issues which complicate things. Also impacting perception is association--whether you associate the word with its sentence context.

*Variables of interest*

Outcome variable (Y): perception of a word in a sentence

Predictor variables:
- iso1: the participant's estimated ability to identify a word in isolation (a performance score. I've used raw and Rasch standardised scores here to see if results would change. No dice).
- iso2: the participant's estimated ability to discriminate between isolated words in sequences (a performance score, as described in iso1).
- language: what language group the participant is from. Languages include English, Mandarin, and Spanish (note: Mandarin consistently outperforms Spanish in raw and standard scores).
- association: whether the participant associates the target word in the sentence with the sentence. Association levels include same, different, and neutral (it's a little more nuanced, but this communicates what's necessary).

Random variables: participant and item.

Equation: Y ~ iso1 + iso2 + language + association + (1|participant) + (1|item)

*Outputs* (via sjPlot)

Predictor                  Odds ratio   CI
(Intercept)                58.45        18.47-184.90
Iso1                       1.02         1.00-1.03
Iso2                       1.03         1.01-1.04
Association [same]         2.44         1.64-3.61
Association [different]    0.34         0.23-0.49
Language [Mandarin]        0.04         0.01-0.12
Language [Spanish]         0.05         0.01-0.18

The good from the output:
Association works out. Participants have greater log odds of obtaining a correct answer when they associate the word with its sentential context. Not associating the word with the context tends to lead to misperception. There is a pretty large effect here.

The bad from the output:
Language is yielding the opposite of the expected results. The Spanish group has an odds ratio of 0.05 while the Mandarin group has an odds ratio of 0.04. This is irregular as Mandarin outperforms Spanish across all tasks (evidenced by raw scores and Rasch analysis).

If the equation looks right, how can it be that a lower performing group (by every other task or metric) has a better odds ratio than a higher performing group when predicting performance?

Any ideas as to what I might try to resolve the language variable issue or possible interpretations of what I see as a wonky result would be very much appreciated.

Thank you!

John Jones
E: johnathan.jones at gmail.com
SM: linkedin.com/in/johnathanjones
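For readers who want something concrete to run, here is a minimal sketch of how the equation above could be fit. It assumes lme4::glmer with a binomial family, which the post does not confirm, and uses simulated data only: every effect size, score distribution and variable value below is made up. English and neutral are set as reference levels because the posted table lists only the other levels.

```r
set.seed(42)
n_part <- 53
n_item <- 48
lang_levels  <- c("English", "Mandarin", "Spanish")   # first level = reference
assoc_levels <- c("neutral", "same", "different")     # first level = reference

# Participant-level covariates (simulated, not the poster's data).
part <- data.frame(
  participant = factor(seq_len(n_part)),
  language    = factor(sample(lang_levels, n_part, replace = TRUE),
                       levels = lang_levels),
  iso1        = rnorm(n_part, mean = 50, sd = 10),
  iso2        = rnorm(n_part, mean = 50, sd = 10)
)

# Every participant hears every item: Cartesian product of the two tables.
d <- merge(part, data.frame(item = factor(seq_len(n_item))), by = NULL)
d$association <- factor(sample(assoc_levels, nrow(d), replace = TRUE),
                        levels = assoc_levels)

# Hypothetical effects on the log-odds scale, plus crossed random intercepts.
u_part    <- rnorm(n_part, sd = 0.8)
u_item    <- rnorm(n_item, sd = 0.5)
lang_eff  <- c(English = 0, Mandarin = -1.0, Spanish = -1.3)
assoc_eff <- c(neutral = 0, same = 0.9, different = -1.1)

eta <- 1 + 0.02 * (d$iso1 - 50) + 0.03 * (d$iso2 - 50) +
  unname(lang_eff[as.character(d$language)]) +
  unname(assoc_eff[as.character(d$association)]) +
  u_part[as.integer(d$participant)] + u_item[as.integer(d$item)]
d$Y <- rbinom(nrow(d), size = 1, prob = plogis(eta))

model_formula <- Y ~ iso1 + iso2 + language + association +
  (1 | participant) + (1 | item)

# Fit only if lme4 is installed; exponentiated fixed effects are odds ratios
# relative to the reference levels (English, neutral).
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(model_formula, data = d, family = binomial)
  print(round(exp(lme4::fixef(fit)), 2))
}
```

In a fit like this, both language odds ratios are comparisons against English, so neither one on its own is a Mandarin-versus-Spanish comparison.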
contradictory odds ratios--a problem with the equation or the interpretation?
4 messages · Johnathan Jones, Mitchell Maltenfort, Ben Bolker +1 more
Look at the confidence intervals. Mandarin and Spanish overlap. On Mon, May 10, 2021 at 1:16 PM Johnathan Jones <johnathan.jones at gmail.com> wrote:
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
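The point about overlapping intervals can be put into rough numbers. The sketch below uses only the rounded figures from the posted table and assumes the intervals are 95% Wald intervals, symmetric on the log-odds scale; that is an assumption, and the lower limits of 0.01 are heavily rounded, so the standard errors are crude. The helper se_from_ci is hypothetical, not part of any package.

```r
# Odds ratios versus the English reference group, as posted.
or_mandarin <- 0.04
or_spanish  <- 0.05

# Direct Spanish-vs-Mandarin comparison implied by the two estimates.
or_spanish_vs_mandarin <- or_spanish / or_mandarin        # about 1.25
diff_log_odds <- log(or_spanish) - log(or_mandarin)       # about 0.22

# Back out an approximate standard error (log-odds scale) from a reported CI,
# assuming a symmetric Wald interval on the log scale.
se_from_ci <- function(lower, upper, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  (log(upper) - log(lower)) / (2 * z)
}
se_mandarin <- se_from_ci(0.01, 0.12)
se_spanish  <- se_from_ci(0.01, 0.18)

print(round(c(ratio = or_spanish_vs_mandarin,
              diff_log_odds = diff_log_odds,
              se_mandarin = se_mandarin,
              se_spanish = se_spanish), 2))
```

The difference between the two groups on the log-odds scale is well under either approximate standard error. To get the Mandarin-Spanish contrast with its own interval, one option is to refit with the reference level changed, for example relevel(language, ref = "Mandarin").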
I don't know, but ... these are very small differences both absolutely (odds ratio of 0.04 vs 0.05) and relative to the confidence intervals on each parameter (0.01-0.12 for Mandarin, even wider for Spanish).

A lot of the variation among language groups will also be included in the 'participant' random effect (since participants are effectively nested within language groups). If you look at the participant-by-participant predictions (i.e. including both the language group and the participant-level random effect in the prediction), do the results make more sense?

Tangentially, I'm a little worried about your very high odds ratio for the intercept. At the baseline level your subjects have a probability of approximately 1-4e-26 (from plogis(58.45, lower.tail=FALSE)) of correct association? Do you have a continuous predictor whose values are far from zero, so that the model baseline doesn't make sense? This should be independent of the other issues, but it makes me wonder if you have complete separation and/or other sources of numerical instability lurking.
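As a rough check on the intercept question, the sketch below converts 58.45 to a baseline probability under two readings. The thread does not say which scale the sjPlot table uses; the "Odds ratio" column header suggests exponentiated coefficients, in which case 58.45 would be the baseline odds, whereas the plogis(58.45, lower.tail=FALSE) figure treats it as a log-odds. The iso1 scores at the end are hypothetical and only illustrate centering.

```r
intercept_reported <- 58.45

# Reading 1: 58.45 is already an odds (an exponentiated coefficient).
p_if_odds <- intercept_reported / (1 + intercept_reported)  # equals plogis(log(58.45))

# Reading 2: 58.45 is a log-odds; this is the probability of an incorrect
# answer at baseline, i.e. the 4e-26 in "1-4e-26".
q_if_logodds <- plogis(intercept_reported, lower.tail = FALSE)

print(c(p_if_odds = p_if_odds, q_if_logodds = q_if_logodds))

# Either way, the baseline is the prediction at iso1 = 0 and iso2 = 0.
# Centering a score moves the baseline to an average participant instead.
iso1 <- c(31, 44, 52, 58, 67)          # hypothetical raw scores
iso1_centered <- iso1 - mean(iso1)
print(iso1_centered)
```

Under the odds reading the baseline probability is about 0.98, which is high but not extreme; under the log-odds reading it is indistinguishable from 1, which is where the separation concern comes from.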
How were the data obtained? Are they from a designed experiment? Are the data balanced, i.e., equal numbers in each factor level and interaction? Is there an interaction between association and language?

Is it possible that an important explanatory variable has been omitted? Omission of a key variable or interaction can reverse the apparent direction of an effect.

Also, check the matrix of correlations between model parameters. Remember that the regression coefficients are telling you how a variable affects the outcome when all other variables are held constant. If there is a strongish correlation between two variables, this has implications for the individual coefficients. Re-parameterization can sometimes help, e.g., in another context (time to complete a hill race), working with distance and gradient (height/distance) rather than distance and height has the effect of reducing the correlation to close to 0.

John Maindonald
email: john.maindonald at anu.edu.au
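The hill-race re-parameterization can be sketched with simulated data. Everything below is hypothetical: the distances, gradients and time model are invented for illustration and an ordinary linear model stands in for the mixed model. The point is only to show how replacing height with gradient (height/distance) changes the correlation between predictors and between their estimated coefficients.

```r
set.seed(1)
n <- 200

# Simulated races: distance and gradient drawn independently, height derived.
dist     <- runif(n, min = 2, max = 30)        # distance, arbitrary units
gradient <- runif(n, min = 0.02, max = 0.10)   # height gained per unit distance
height   <- dist * gradient
time     <- 5 * dist + 40 * height + rnorm(n, sd = 5)

# Correlation between the predictors under each parameterization.
cor_dist_height   <- cor(dist, height)
cor_dist_gradient <- cor(dist, gradient)

# Correlation between the estimated coefficients under each parameterization.
fit_height   <- lm(time ~ dist + height)
fit_gradient <- lm(time ~ dist + gradient)
parcor_height   <- cov2cor(vcov(fit_height))["dist", "height"]
parcor_gradient <- cov2cor(vcov(fit_gradient))["dist", "gradient"]

print(round(c(cor_dist_height = cor_dist_height,
              cor_dist_gradient = cor_dist_gradient,
              parcor_height = parcor_height,
              parcor_gradient = parcor_gradient), 2))
```

For a model fit with lme4, the corresponding matrix appears in the summary() output under "Correlation of Fixed Effects"; large entries involving the language terms and iso1 or iso2 would be worth a look.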