How can I use IPF function correctly?
2 messages · Miao Zhang, David L Carlson
It is not clear what you are trying to do. The ipf() function you are using seems to be the one in package cat, which imputes missing values for categorical variables. You have not read its instructions carefully: you have entered the marginal values rather than their dimensions, and you have given ipf() a two-way table but misspecified a three-way model. No wonder it is confused. Function loglin(), which is part of the stats package included with R, also does iterative proportional fitting.

Iterative proportional fitting (ipf) is used for fitting models for categorical data when there are three or more variables. There is no need for ipf on a table with two variables, since the values can be calculated directly.

Your example data does not include the raw counts (as it should), but percentages for each of the 3 x 2 cells (I assume, since they sum to 100). The marginal values you list (again percentages) are for a model assuming equal margins. That is easily computed as 1/3*1/2*100 (one third in each row by one half in each column, times 100), so each cell should be 16.667 percent of the total. Using loglin() that would be specified as follows:
loglin(raw, margin=list(0), fit=TRUE)
0 iterations: deviation
$lrt
[1] 25.87661
$pearson
[1] 23.80933
$df
[1] 5
$margin
[1] 0
$fit
         [,1]     [,2]
[1,] 16.66667 16.66667
[2,] 16.66667 16.66667
[3,] 16.66667 16.66667
The lrt and pearson statistics are not valid because you are not using
original counts. Note that the number of iterations is 0 because in a 2 way
model the values are directly computed.
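The direct computation mentioned above can be checked in a few lines of base R. This is only a minimal sketch: the names n, equal_fit and indep_fit are illustrative and do not come from the thread, and the independence fit is added for comparison because it is the usual two-way model whose fitted values have a closed form.

```r
# Cell percentages from the original question (they sum to 100)
raw <- matrix(c(28.571, 14.286,
                23.809,  4.762,
                 9.523, 19.049), nrow = 3, ncol = 2, byrow = TRUE)

n <- sum(raw)  # grand total

# Equal-cell model: every cell receives the same share of the total,
# i.e. 1/3 * 1/2 * total for a 3 x 2 table
equal_fit <- matrix(n / length(raw), nrow = nrow(raw), ncol = ncol(raw))

# Independence model (shown for comparison, not part of the reply):
# fitted value = row total * column total / grand total
indep_fit <- outer(rowSums(raw), colSums(raw)) / n

print(equal_fit)
print(indep_fit)
```

Neither fit needs any iteration, which is consistent with loglin() having nothing to iterate on here.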
----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Miao Zhang
Sent: Friday, July 27, 2012 6:52 AM
To: r-help at r-project.org
Subject: [R] How can I use IPF function correctly?
Hi All,
I am trying to create a simple example using the ipf function in R, but
I could not get it to work. I am very new to R; could anyone help
instruct me about this ipf function?
Actually, this is what I mean
     |  50   |  50
----------------------
33.4 | 28.57 | 14.29
33.3 | 23.81 | 4.762
33.3 | 9.523 | 19.05
----------------------
A 3*2 matrix
raw<-matrix(c(28.571,14.286,23.809,4.762,9.523,19.049),3, 2,byrow=TRUE)
the sum of margin (the value I am setting as the target)
m<-c(33.4,50,0,33.3,50,0,33.3,50)
then call ipf function:
fit1<-ipf(table, margins=m,start=raw,eps = 1e-04, maxits = 50, showits = TRUE)
I could calculate it by hand in 7 iterations, but in the end I am hoping
to get R's built-in ipf function to do it. What should I put for "table"
here?
Thanks in advance!
Mandy
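The hand calculation described in the question (alternately rescaling rows and columns until both sets of margins match their targets) can be sketched in base R without ipf(). This is an illustrative sketch, not the cat package's method: the function rake() and the names row_target and col_target are hypothetical, and it assumes the targets are row margins of 33.4, 33.3, 33.3 and column margins of 50, 50, as in the table above.

```r
# Starting table from the question
raw <- matrix(c(28.571, 14.286,
                23.809,  4.762,
                 9.523, 19.049), nrow = 3, ncol = 2, byrow = TRUE)

# Assumed target margins, read off the table in the question
row_target <- c(33.4, 33.3, 33.3)
col_target <- c(50, 50)

# Hypothetical helper: two-way iterative proportional fitting (raking)
rake <- function(start, row_target, col_target, eps = 1e-4, maxits = 50) {
  fit <- start
  for (it in seq_len(maxits)) {
    # Scale each row so the row sums match the row targets
    fit <- fit * (row_target / rowSums(fit))
    # Scale each column so the column sums match the column targets
    fit <- sweep(fit, 2, col_target / colSums(fit), `*`)
    # Columns are exact after the second step, so only rows need checking
    if (max(abs(rowSums(fit) - row_target)) < eps) break
  }
  list(fit = fit, iterations = it)
}

res <- rake(raw, row_target, col_target)
print(res$fit)
print(res$iterations)
```

The number of iterations reported depends on the stopping rule and on eps, so it need not equal the 7 steps of the hand calculation.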