
Does changing the reference level cause any difference in results?

Hi all,
I am analysing a dataset 'qaaf' (attached) using logistic regression.
The dataset includes:
1. speaker: participants in my study
2. item: words as used by my participants
3. gender: independent variable (2 levels: 'female' and 'male')
4. age.group: independent variable (3 levels:  'middle-aged',  'old' and
'young')
5. education: independent variable (3 levels:  'postgraduate', 'secondary
or below' and 'university')
6. residence: independent variable (3 levels: 'migrant', 'urbanite' and
'villager')
7. convergence: the dependent variable (whether a speaker uses a CA or MA
form).  Here, I am testing whether my participants use the CA form or not.
This is the form of the prestigious dialect in Egypt. If they use MA, this
means that they use their traditional dialect. I am trying to find out
which factor (independent variable) is responsible, or most responsible,
for the use of the CA form.
As the target is CA, and R orders factor levels alphabetically so that CA
comes first and takes the 0 value, I re-levelled the dependent variable
(convergence) to change the value of CA from 0 to 1, as follows:
(a) attach(qaaf)
(b) qaaf$convergence = factor(convergence, levels=c('MA', 'CA'))
I also re-levelled these variables:
(c) qaaf$education=factor(education, levels=c("secondary or below",
"university",  "postgraduate"))
(d) qaaf$residence = factor(residence, levels=c('villager', 'migrant',
'urbanite'))
(e) qaaf$age.group = factor(age.group, levels=c('young', 'middle-aged',
'old'))
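
For anyone who wants to try the re-levelling without the attachment, here is a minimal sketch of the same idea as (b) and (e). It uses a small made-up data frame called 'toy' (a hypothetical stand-in, not the qaaf data) and refers to columns through the data frame rather than through attach(). The last step uses relevel(), which is an alternative I did not use above.

```r
# Toy data frame standing in for 'qaaf' (values are made up for illustration)
toy <- data.frame(
  convergence = c("CA", "MA", "MA", "CA", "MA", "CA"),
  age.group   = c("old", "young", "middle-aged", "young", "old", "middle-aged"),
  stringsAsFactors = TRUE
)

# Default: levels are sorted alphabetically, so the first one is the reference
levels(toy$convergence)  # "CA" "MA"
levels(toy$age.group)    # "middle-aged" "old" "young"

# Re-level by listing the levels in the desired order
toy$convergence <- factor(toy$convergence, levels = c("MA", "CA"))
toy$age.group   <- factor(toy$age.group,
                          levels = c("young", "middle-aged", "old"))

levels(toy$convergence)  # "MA" "CA"
levels(toy$age.group)    # "young" "middle-aged" "old"

# relevel() moves a single level to the front and keeps the others in order
toy$age.alt <- relevel(toy$age.group, ref = "old")
levels(toy$age.alt)      # "old" "young" "middle-aged"
```

With a factor response in a binomial glm(), the first level is treated as "failure" and the remaining level as "success", which is why MA is listed first in (b).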

I re-levelled the variables in (c), (d) and (e) because these are ordinal
variables (e.g. old people were middle-aged one day and before that had
been young). My question may be general:
Q: Does changing the reference level cause any difference in results?
or
Q: Is leaving the variable levels alphabetically arranged good or bad? Put
another way, when should levels be left alphabetically arranged and when
should they be re-levelled?
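
To make the question concrete, here is a minimal sketch with simulated data (not the qaaf data; the probabilities, the sample size and the column name 'residence2' are made up for illustration). It fits the same logistic regression twice, once with the alphabetical reference level and once with 'villager' as the reference, so that the two sets of output can be compared side by side.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Simulated stand-in data (NOT the qaaf data; probabilities are made up)
n <- 300
sim <- data.frame(
  residence = factor(sample(c("migrant", "urbanite", "villager"),
                            n, replace = TRUE))
)
p <- c(migrant = 0.5, urbanite = 0.7, villager = 0.3)[as.character(sim$residence)]
sim$convergence <- factor(ifelse(runif(n) < p, "CA", "MA"),
                          levels = c("MA", "CA"))

# Fit 1: alphabetical levels, so 'migrant' is the reference
m1 <- glm(convergence ~ residence, data = sim, family = binomial)

# Fit 2: same data, with 'villager' as the reference
sim$residence2 <- factor(sim$residence,
                         levels = c("villager", "migrant", "urbanite"))
m2 <- glm(convergence ~ residence2, data = sim, family = binomial)

coef(m1)  # each coefficient is a contrast against 'migrant'
coef(m2)  # each coefficient is a contrast against 'villager'

# Compare the overall fit of the two models
all.equal(fitted(m1), fitted(m2))
all.equal(deviance(m1), deviance(m2))
```

In this sketch the two fits are the same model written in two ways: the fitted probabilities and the deviance agree, while the individual coefficients and their p-values differ because each one is a comparison with a different baseline level.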

Best