Hi all, I've been trying to get a large (12mb) Stata
survey database into R. I managed that, but when I
attach survey weights, something goes wrong. The error
message is: object dchina not found. Here's the
script:
library(car)
library(foreign)
library(survey)
China <- read.dta("C:/final07c2.dta")
attach(China)
data(China)
dchina<-svydesign(id=~psu,strata=~strata,weights=~weight0x,data=China,nest=TRUE)
summary(dchina)
Any thoughts?
-Bobby
survey weights
7 messages · A Das, Thomas Lumley
On Sat, 3 Sep 2005, A Das wrote:
> Hi all, I've been trying to get a large (12mb) Stata survey database into R. I managed that, but when I attach survey weights, something goes wrong. The error message is: object dchina not found. Here's the script:
If that is the *first* message then something extremely strange is happening.
> library(car)
> library(foreign)
> library(survey)
> China <- read.dta("C:/final07c2.dta")
> attach(China)
This attach() isn't necessary or helpful.
> data(China)
You should get a warning here:
Warning message: data set 'China' not found in: data(China)
since China isn't one of the built-in data sets. If you don't get this message it suggests that you do have a built-in dataset called China, which will have overwritten your file.
> dchina<-svydesign(id=~psu,strata=~strata,weights=~weight0x,
> data=China,nest=TRUE)
If this line doesn't produce an error message then a variable called "dchina" must have been produced, in which case you shouldn't get an error message saying it wasn't found in the next line.
> summary(dchina)
Are you sure there wasn't an earlier error message from the call to svydesign()?
-thomas
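The point about data() can be checked with a small sketch in base R. The object name China_demo is made up for this illustration and is assumed not to match any dataset shipped with a loaded package; the sketch captures the warning text and confirms that the existing object is left alone.

```r
# Hypothetical object name; assumed not to be a packaged dataset.
China_demo <- data.frame(x = 1:3)

# data() looks for a packaged dataset of that name and warns when none exists.
msg <- tryCatch(data(China_demo), warning = function(w) conditionMessage(w))
msg

# The data frame created above is still intact.
China_demo$x
```

If a packaged dataset of the same name did exist, data() would load it into the workspace and replace the imported object, which is the overwriting risk described above.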
Thanks, Thomas.
Yes, that's exactly what happened: the warnings
came first after "data(China)", and then after
"dchina<-svydesign..." So the design object isn't
being produced? The dataset is very large, and the
weights were already set in Stata before importing.
Would either of those cause problems?
-Bobby
On Sun, 4 Sep 2005, A Das wrote:
> Thanks, Thomas. Yes, that's exactly what happened: the warnings came first after "data(China)", and then after "dchina<-svydesign..." So the design object isn't being produced? The dataset is very large, and the weights were already set in Stata before importing. Would either of those cause problems?
Probably not. What was the error message from svydesign()? That is what will say what went wrong.
-thomas
Just: "missing values in object". That would imply the
object was created. But then I write "dchina", and it
says "object dchina not found".
-Bobby
On Sun, 4 Sep 2005, A Das wrote:
> Just: "missing values in object". That would imply the object was created. But then I write "dchina", and it says "object dchina not found".
No, it would not imply the object was created. If it was an error message
(rather than a warning) the object would not have been created.
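A minimal sketch of the difference, in base R only. The functions fails() and warns() and the object names design_err and design_warn are hypothetical, made up for this illustration; they stand in for a call that stops with an error and a call that only warns.

```r
# Hypothetical helpers, for illustration only.
fails <- function() stop("missing values in object")
warns <- function() {
  warning("something looks odd")
  42
}

# An error aborts the assignment, so design_err is never created.
try(design_err <- fails(), silent = TRUE)
exists("design_err", inherits = FALSE)

# A warning lets the call finish, so design_warn is created.
design_warn <- suppressWarnings(warns())
exists("design_warn", inherits = FALSE)
```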
I presume the full message was
Error in na.fail.default(object) : missing values in object
If so, it sounds as though you have missing values in the id, weights, or
strata variable.
summary(China[, c("psu", "strata", "weight0x")])
will verify this.
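A direct count of missing values per design variable is another way to check. This is a sketch on a toy data frame: the column names follow the thread, but the values are invented and the names toy, design_vars, na_counts and n_incomplete are assumptions for the example.

```r
# Toy stand-in for the imported data; the values are made up.
toy <- data.frame(
  psu      = c(1, 1, 2, NA, 3),
  strata   = c(1, 1, 1, 2, NA),
  weight0x = c(10.5, NA, 8.2, 9.1, 7.7)
)
design_vars <- c("psu", "strata", "weight0x")

# Number of missing values in each design variable.
na_counts <- colSums(is.na(toy[, design_vars]))
na_counts

# Number of rows with at least one design variable missing.
n_incomplete <- sum(!complete.cases(toy[, design_vars]))
n_incomplete
```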
Stata will just have dropped these observations (use -svydes- to verify
this). If you want to drop the observations in R you need to do this
explicitly. Having missing data may be unavoidable, but if you have
observations in a sample it seems that you should know how they were
sampled.
To drop these observations you could use
obsChina <- subset(China, !is.na(psu) & !is.na(strata) & !is.na(weight0x))
and then use obsChina rather than China in the svydesign() function.
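The same steps, sketched end to end on toy data. The data frame toy, its values, the outcome column y and the names obs_toy and d_toy are all invented for illustration. The svydesign() call mirrors the one in the original script and only runs if the survey package is installed.

```r
# Toy stand-in for China; values are made up, two PSUs per stratum.
toy <- data.frame(
  psu      = c(1, 1, 2, 2, 1, 1, 2, NA),
  strata   = c(1, 1, 1, 1, 2, 2, 2, 2),
  weight0x = c(10, 12, NA, 9, 11, 8, 10, 9),
  y        = c(3, 5, 4, 6, 2, 7, 5, 4)
)

# Keep only rows where every design variable is observed.
obs_toy <- subset(toy, !is.na(psu) & !is.na(strata) & !is.na(weight0x))
nrow(obs_toy)

# Build the design from the complete rows (requires the survey package).
if (requireNamespace("survey", quietly = TRUE)) {
  d_toy <- survey::svydesign(id = ~psu, strata = ~strata, weights = ~weight0x,
                             data = obs_toy, nest = TRUE)
  print(survey::svymean(~y, d_toy))
}
```

Comparing nrow(toy) with nrow(obs_toy) shows how many observations were dropped, which is worth checking against what Stata reports.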
-thomas
Thomas Lumley Assoc. Professor, Biostatistics tlumley at u.washington.edu University of Washington, Seattle
That worked. Many thanks, Thomas.
-Bobby