-----Original Message-----
From: Justin Delahunty [mailto:ACU at genius.net.au]
Sent: Friday, July 26, 2013 2:22 PM
To: PIKAL Petr; 'Justin Delahunty'; r-help at r-project.org
Subject: RE: [R] Maintaining data order in factanal with missing data
Hi Petr,
Thanks for the quick response. Unfortunately I cannot share the data I
am working with; however, please find attached a suitable R workspace
with generated data. It has the appropriate variable names; only the
data have been changed.
I call the last function in the list, init.dfs(), to subset the overall
data set into the three waves and then run the factor analysis on each
(1-factor CFA); it is only wrapped in a function to save re-typing in a
new workspace.
Thanks,
Justin
-----Original Message-----
From: PIKAL Petr [mailto:petr.pikal at precheza.cz]
Sent: Friday, 26 July 2013 7:35 PM
To: Justin Delahunty; r-help at r-project.org
Subject: RE: [R] Maintaining data order in factanal with missing data
Hi
You provided the functions, so far so good. But without data it is
quite difficult to understand what the functions do and where the
issue could be.
I suspect a combination of complete-cases selection together with
subset and factor behaviour, but I may be completely off target.
Petr
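
To illustrate the complete-cases behaviour suspected above, here is a
minimal sketch on made-up data (the item1/item2 columns and the dmid
values are assumptions, not the poster's variables). It shows that
subsetting with complete.cases() drops rows but keeps the original row
labels and IDs of the rows that remain:

```r
# Toy data: hypothetical items, not the poster's real variables
df <- data.frame(dmid  = 101:105,
                 item1 = c(1, NA, 3, 4, 5),
                 item2 = c(2, 3, NA, 5, 1))

# One TRUE/FALSE per original row, computed on the items only
keep <- complete.cases(df[-1])

# Rows 2 and 3 are dropped; the survivors keep their labels and IDs
complete <- df[keep, ]
rownames(complete)
complete$dmid
```

Because keep has one entry per original row, it can be reused later to
put any per-row result back in its original position.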
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of s00123776 at myacu.edu.au
Sent: Friday, July 26, 2013 9:35 AM
To: r-help at r-project.org
Subject: [R] Maintaining data order in factanal with missing data
Hi,
I'm new to R, so sorry if this is a simple question. I'm currently
trying to collapse some ordinal variables into a composite; the
program should ideally take a data frame as input, perform a factor
analysis, compute factor scores, sds, etc., and return the rescaled
scores and loadings. The difficulty I'm having is that my data set
contains a number of NAs, which I am excluding from the analysis using
complete.cases(), and thus the incomplete cases are "skipped". These
functions are for a longitudinal data set with repeated waves of
so the final rescaled scores from each wave need to be saved as
variables grouped by a unique ID (DMID). The functions I'm trying to
implement are as follows:
# Weighted standard deviation of x with weights w
weighted.sd<-function(x,w){
  sum.w<-sum(w)
  sum.w2<-sum(w^2)
  mean.w<-sum(x*w)/sum.w
  x.sd.w<-sqrt((sum.w/(sum.w^2-sum.w2))*sum(w*(x-mean.w)^2))
  return(x.sd.w)
}
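re.scale<-function(f.scores, raw.data, loadings){
  # Standardise the factor scores (subtract the mean, divide by the sd)
  fz.scores<-(f.scores-mean(f.scores))/(sd(f.scores))
  # Row-wise weighted mean and sd of the raw items, weighted by loadings
  means<-apply(raw.data,1,weighted.mean,w=loadings)
  sds<-apply(raw.data,1,weighted.sd,w=loadings)
  grand.mean<-mean(means)
  grand.sd<-mean(sds)
  # Put the standardised scores back on the scale of the raw items
  final.scores<-((fz.scores*grand.sd)+grand.mean)
  return(final.scores)
}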
get.scores<-function(data){
  # Keep only rows with no missing values; incomplete cases are dropped
  complete<-data[complete.cases(data),]
  fact<-factanal(complete,factors=1,scores="regression")
  f.scores<-fact$scores[,1]
  f.loads<-fact$loadings[,1]
  rescaled.scores<-re.scale(f.scores,complete,f.loads)
  output.list<-list(rescaled.scores=rescaled.scores,
                    factor.loadings=f.loads)
  return(output.list)
}
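init.dfs<-function(){
  # Split the overall data set (ab.df) into the three waves
  ab.1.df<-subset(ab.df,,select=c(dmid,g5oab2:g5ovb1))
  ab.2.df<-subset(ab.df,,select=c(dmid,w2oab3:w2ovb1))
  ab.3.df<-subset(ab.df,,select=c(dmid,
                                  w3oab3, w3oab4, w3oab7, w3oab8, w3ovb1))
  # Drop the ID column (dmid) before the factor analysis
  ab.1.fa<-get.scores(ab.1.df[-1])
  ab.2.fa<-get.scores(ab.2.df[-1])
  ab.3.fa<-get.scores(ab.3.df[-1])
  # Return all three results; otherwise only the last one is returned
  return(list(ab.1.fa=ab.1.fa, ab.2.fa=ab.2.fa, ab.3.fa=ab.3.fa))
}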
Thanks for your help,
Justin
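
One possible way to keep the scores lined up with dmid is sketched
below. It is only an illustration under stated assumptions: the toy
data frame, its columns a to d and the helper score_wave() are all
hypothetical, and the rescaling step from re.scale() is left out for
brevity. The idea is to remember which rows were complete, fit
factanal() on those rows only, and then write the scores back into an
NA-filled vector that has one slot per original row:

```r
# Simulated one-factor data; all names here are made up for illustration
set.seed(1)
n <- 200
latent <- rnorm(n)
toy <- data.frame(dmid = seq_len(n),
                  a = latent + rnorm(n),
                  b = latent + rnorm(n),
                  c = latent + rnorm(n),
                  d = latent + rnorm(n))
toy$a[c(3, 50)] <- NA
toy$c[120] <- NA

# Hypothetical helper: returns one row per original case, NA where skipped
score_wave <- function(wave, id = "dmid") {
  items <- wave[setdiff(names(wave), id)]
  ok <- complete.cases(items)               # TRUE for usable rows
  fact <- factanal(items[ok, ], factors = 1, scores = "regression")
  scores <- rep(NA_real_, nrow(wave))       # one slot per original row
  scores[ok] <- fact$scores[, 1]            # fill only the complete cases
  out <- data.frame(wave[[id]], scores)
  names(out) <- c(id, "score")
  out
}

res <- score_wave(toy)
head(res)
```

Since every wave then yields a data frame keyed by dmid, the waves
could be combined afterwards with merge(..., by = "dmid").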