Results of CFA with Lavaan

11 messages · R Help, Jeremy Miles, John Fox +1 more

#
I've just found the lavaan package, and I really appreciate it, as it
seems to succeed with models that were failing in sem::sem.  I need
some clarification, however, in the output, and I was hoping the list
could help me.

I'll go with the standard example from the help documentation, as my
problem is much larger but no more complicated than that.

My question is, why is there one latent estimate for each factor that
is set to 1 with no standard error?  Is that normal?  When I've managed
to get sem::sem to fit a model this has not been the case.

Thanks,
Sam Stewart

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit <- sem(HS.model, data=HolzingerSwineford1939)
summary(fit, fit.measures=TRUE)
Lavaan (0.4-8) converged normally after 35 iterations

  Number of observations                           301

  Estimator                                         ML
  Minimum Function Chi-square                   85.306
  Degrees of freedom                                24
  P-value                                        0.000

Chi-square test baseline model:

  Minimum Function Chi-square                  918.852
  Degrees of freedom                                36
  P-value                                        0.000

Full model versus baseline model:

  Comparative Fit Index (CFI)                    0.931
  Tucker-Lewis Index (TLI)                       0.896

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -3737.745
  Loglikelihood unrestricted model (H1)      -3695.092

  Number of free parameters                         21
  Akaike (AIC)                                7517.490
  Bayesian (BIC)                              7595.339
  Sample-size adjusted Bayesian (BIC)         7528.739

Root Mean Square Error of Approximation:

  RMSEA                                          0.092
  90 Percent Confidence Interval          0.071  0.114
  P-value RMSEA <= 0.05                          0.001

Standardized Root Mean Square Residual:

  SRMR                                           0.065

Parameter estimates:

  Information                                 Expected
  Standard Errors                             Standard


                   Estimate  Std.err  Z-value  P(>|z|)
Latent variables:
  visual =~
    x1                1.000
    x2                0.554    0.100    5.554    0.000
    x3                0.729    0.109    6.685    0.000
  textual =~
    x4                1.000
    x5                1.113    0.065   17.014    0.000
    x6                0.926    0.055   16.703    0.000
  speed =~
    x7                1.000
    x8                1.180    0.165    7.152    0.000
    x9                1.082    0.151    7.155    0.000

Covariances:
  visual ~~
    textual           0.408    0.074    5.552    0.000
    speed             0.262    0.056    4.660    0.000
  textual ~~
    speed             0.173    0.049    3.518    0.000

Variances:
    x1                0.549    0.114    4.833    0.000
    x2                1.134    0.102   11.146    0.000
    x3                0.844    0.091    9.317    0.000
    x4                0.371    0.048    7.778    0.000
    x5                0.446    0.058    7.642    0.000
    x6                0.356    0.043    8.277    0.000
    x7                0.799    0.081    9.823    0.000
    x8                0.488    0.074    6.573    0.000
    x9                0.566    0.071    8.003    0.000
    visual            0.809    0.145    5.564    0.000
    textual           0.979    0.112    8.737    0.000
    speed             0.384    0.086    4.451    0.000
#
What do you mean by latent estimate?

The table of variances has a variance for each factor.

Is there something different in the sem output that you don't see here?

Yes, this looks normal.

Jeremy
On 8 June 2011 13:14, R Help <rhelp.stats at gmail.com> wrote:
#
Dear Sam,

In each case, the first observed variable is treated as a "reference
indicator" with its coefficient fixed to 1 to establish the metric of the
corresponding factor and therefore to identify the model. If you didn't do
the same thing (or something equivalent, such as fixing the factor variances
to 1) in specifying the model to sem::sem(), that might account for the
problems you encountered.

Best,
 John

--------------------------------
John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox
#
Yes, that is the difference.  For the last SEM I built I fixed the
factor variances to 1, and I think that's what I want to do for the
CFA I'm doing now.  Does that make sense for a CFA?

I'll try figuring out how to do that with lavaan later, but my model
takes so long to fit that I can't try it right now.

Thanks,
Sam
On Wed, Jun 8, 2011 at 5:58 PM, John Fox <jfox at mcmaster.ca> wrote:
#
Dear Sam,
Sure -- then the factor covariances are correlations. The point is that you
have to do something to fix the metrics of the factors and identify the
model.
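
As a rough illustration of that point in lavaan syntax (a sketch only, reusing the HolzingerSwineford1939 example from the first message; the names model.fixvar and fit.fixvar are made up for the illustration), the first loading of each factor can be freed with NA* and the factor variance fixed with 1*. The factor covariance matrix then has a unit diagonal, so its off-diagonal entries are correlations:

```r
library(lavaan)

# Free the first loading of each factor (NA*) and fix each factor variance to 1
model.fixvar <- '
  visual  =~ NA*x1 + x2 + x3
  textual =~ NA*x4 + x5 + x6
  speed   =~ NA*x7 + x8 + x9
  visual  ~~ 1*visual
  textual ~~ 1*textual
  speed   ~~ 1*speed
'
fit.fixvar <- cfa(model.fixvar, data = HolzingerSwineford1939)

# Model-implied covariance and correlation matrices of the factors
cov.f <- unclass(lavInspect(fit.fixvar, "cov.lv"))
cor.f <- unclass(lavInspect(fit.fixvar, "cor.lv"))

print(round(cov.f, 3))
print(round(cor.f, 3))
```
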
Maybe that should tell you something about the conditioning of the problem.

Best,
 John
#
On 06/08/2011 11:56 PM, R Help wrote:
If you have a latent variable in your model (like a factor in CFA), you 
need to define its metric/scale. There are typically two ways to do 
this: 1) fix the variance of the latent variable to a constant 
(typically 1.0), or 2) fix the factor loading of one of the indicators 
of the factor (again to 1.0). For CFA with a single group, it should not 
matter which method you choose. The fit measures will be identical.

Lavaan by default uses the second option. If you prefer the first 
(fixing the variances), you can simply add the 'std.lv=TRUE' option to 
the cfa() call, and lavaan will take care of the rest.
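
A minimal sketch of the two scalings side by side, assuming the HS.model example from the first message (the object names fit.marker and fit.stdlv are only illustrative):

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Default scaling: the first loading of each factor is fixed to 1
fit.marker <- cfa(HS.model, data = HolzingerSwineford1939)

# Alternative scaling: factor variances fixed to 1, all loadings estimated
fit.stdlv <- cfa(HS.model, data = HolzingerSwineford1939, std.lv = TRUE)

# Overall fit should agree between the two parameterizations
fm.marker <- fitMeasures(fit.marker, c("chisq", "df", "cfi", "rmsea"))
fm.stdlv  <- fitMeasures(fit.stdlv,  c("chisq", "df", "cfi", "rmsea"))
print(round(rbind(marker = fm.marker, std.lv = fm.stdlv), 3))

# Loadings under std.lv = TRUE: every indicator now has a standard error
pe <- parameterEstimates(fit.stdlv)
print(pe[pe$op == "=~", c("lhs", "rhs", "est", "se")])
```
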
You can use the 'verbose=TRUE' argument to monitor progress. You may 
also use the options se="none" (no standard errors) and test="none" (no 
test statistic) to speed things up, if you are still constructing your 
model. Or perhaps the model does not converge, but I would need to see
both the model and the data to determine the possible cause.
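
For instance, a quick exploratory fit along those lines might look like this (a sketch on the small HS.model example rather than the large model in question; fit.quick is an illustrative name):

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Skip standard errors and the test statistic while the model is still in flux
fit.quick <- cfa(HS.model, data = HolzingerSwineford1939,
                 se = "none", test = "none")

# Point estimates are still available
est.quick <- coef(fit.quick)
print(round(est.quick, 3))
```

Once the specification has settled, refit without se="none" and test="none" to get standard errors and fit statistics.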

Hope this helps,

Yves Rosseel
http://lavaan.org
#
Thanks for the help, the std.lv=TRUE command is exactly what I was
looking for.  As you stated, it doesn't matter in terms of overall
model fit, but my client is more interested in the loadings than the
factor variances.

In terms of speed, it's just a very large model (7 factors, 90
observed variables, only ~560 subjects) with missing values, so I don't
expect much in terms of speed.  I think the overall conclusion for the
project is that the model is poorly specified, but whether that's the
model itself or the lack of samples is difficult to determine at this
point.

Thanks for your help, and I'll certainly be using lavaan in the future,
Sam
On Thu, Jun 9, 2011 at 6:19 AM, yrosseel <yrosseel at gmail.com> wrote:
#
Ok, I think this is the last question I have.  My model is producing
an estimate of intercepts for my variables along with my loadings.
I looked at the meanstructure option in cfa.  It says that setting it
to TRUE includes the intercepts, and setting it to "default" means that
the value is set based on the user-specified model, and/or the values
of other arguments.  I've included my model specification below, and I
would prefer not to fit intercepts, but setting it to FALSE does not
seem to achieve this.

Thanks,
Sam

F1 =~ reFDE + ReFUIDGreg + reFDRwithDDRV + reparD + reparDR +
reparRisk + reWDD + reWDH + reWSP + reWDIS + reWCell + reWFAT + reAanx
+ reDanx + reDstress + reAstress
F2 =~ reSI1 + reSI2 + reSI3 + reSI4 + reSimDE + reSimDD + reSimDrug + reSimDRD
F3 =~ RENOINTEND + RETRYNOTD + RENOSTARTD + REUSEDD + REWILLD1 + REDU1
+ REDA1 + RERIDE1 + REAFTER1 + REUSEC1 + REUSESP1 + REUM1 + REABUSE1 +
RESB1 + REMIGHT1
F4 =~ retrydrink + RetryDope + reNoD + reLeaveD + reDeDR + reDopeNo +
reDopeleave + reDopeDD + reP3D
F5 =~ reP3DA + reP3DD + reP3DRD + reP3Equip + reP3UC + reP3SP + reP3UM
+ reP3Abuse + reP3SB + reP3helmet + reP1DADR + reP1DRUG + reP1SP
F6 =~ reinjwhileDU + reinjwhileWDUDRV + reinjwhileDA +
reinjwhileDRafterD + reinjwhileUcrack + reinjwhileUM +
reinjwhileabusePRDG + reinjwhilenoSB + reinjwhilenohelmet
F7 =~ relikeDR + relikeSP + relikeDIS + relikeCELL + relikeDROW +
relikeDRUG + restupid + reimmature + takerisksFthinkcool +
takeriskFthinkIMP + takeriskFthinkbrave + takeriskFthinkexciting +
reSELF + reNORISK + reNOPERSON + reNOCONSE + reWRONG + reGEAR +
reCONSEQ + reSUCES
On Thu, Jun 9, 2011 at 6:19 AM, yrosseel <yrosseel at gmail.com> wrote:
#
On 06/09/2011 05:21 PM, R Help wrote:
Several arguments of the cfa() function force meanstructure=TRUE,
silently overriding the meanstructure=FALSE option if the user specified
it (perhaps lavaan should issue a warning when this happens).

The following argument choices force meanstructure to be TRUE (if there 
is only a single group):

- estimator = "mlm" or "mlf" or "mlr"
- missing = "ml" or "fiml"

Did you use any one of those arguments?

But why would you prefer not to fit the intercepts? If there are no 
restrictions on the intercepts/means, fitting them should have no effect 
on your model fit whatsoever.
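
A small check of both points on the HS.model example from earlier in the thread (a sketch; the object names are illustrative, and the data here are complete, so missing="fiml" is used only to show its effect on the mean structure):

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Same model without and with unrestricted intercepts
fit.nomean <- cfa(HS.model, data = HolzingerSwineford1939)
fit.mean   <- cfa(HS.model, data = HolzingerSwineford1939, meanstructure = TRUE)

# Chi-square and degrees of freedom should not change
fm.nomean <- fitMeasures(fit.nomean, c("chisq", "df"))
fm.mean   <- fitMeasures(fit.mean,   c("chisq", "df"))
print(round(rbind(no.intercepts = fm.nomean, intercepts = fm.mean), 3))

# Neither should the loadings
pe.nomean <- parameterEstimates(fit.nomean)
pe.mean   <- parameterEstimates(fit.mean)
load.nomean <- pe.nomean$est[pe.nomean$op == "=~"]
load.mean   <- pe.mean$est[pe.mean$op == "=~"]
print(round(cbind(load.nomean, load.mean), 3))

# missing = "fiml" brings the intercepts ("~1" rows) into the model
fit.fiml <- cfa(HS.model, data = HolzingerSwineford1939, missing = "fiml")
pe.fiml  <- parameterEstimates(fit.fiml)
has.intercepts <- any(pe.fiml$op == "~1")
print(has.intercepts)
```
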

Yves Rosseel
http://lavaan.org
#
I am using missing = 'fiml', which would require estimating intercepts.

I figured they would affect my overall model fit, but can I still
estimate my loading coefficients the same way?

The warning would be helpful, but if I had looked closer into the
'fiml' option I might have been able to figure it out myself.

Thanks,
Sam
On Thu, Jun 9, 2011 at 1:02 PM, yrosseel <yrosseel at gmail.com> wrote:
#
On 06/09/2011 06:06 PM, R Help wrote:
Yes, no problem.

Yves Rosseel
http://lavaan.org