
Logistic regression problem: propensity score matching

Thanks for your reply.

I am using logistic regression because my response variable is 
categorical - and this seems to be recommended in the literature (by 
Heckman, Smith and others).

The response variable is named sample because that is what it is - I am 
new to R so haven't quite got into habits of naming using Title Case.

Having randomly selected a sample from the action area, the aim is to 
find people to survey who, had they been in the action area, would 
have had odds of being in the sample as close as possible to those of 
the people actually selected.

Therefore I select a sample using sample() and write it back to the 
Access database as a new table. Then (from R) I run a query which creates 
a dataset comprising all those in the action area who could have been 
selected, together with the control area group, and read this back into R 
(using as many characteristics as possible except area) before undertaking 
the logistic regression. sample can take the values 0 (not in sample) or 1 
(in sample).
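
As a rough illustration of the selection step only, here is a minimal sketch in R. The data frame action, its columns and the sample size of 50 are all made up for the example (none of them come from my real data), and the round trip through the database is left out.

```r
# Hypothetical stand-in for the eligible action-area population;
# the column names and sizes are assumptions for illustration only.
set.seed(1)
action <- data.frame(id     = 1:200,
                     age    = round(runif(200, 18, 80)),
                     female = rbinom(200, 1, 0.5))

# Draw 50 ids at random, without replacement
chosen <- sample(action$id, size = 50)

# Indicator: 1 = in sample, 0 = not in sample
action$sample <- as.integer(action$id %in% chosen)

table(action$sample)
```

A data frame column called sample does not interfere with the function sample(), although a less ambiguous name might be easier to read.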

The aim is to find the odds of being in the sample (by characteristics) 
which is the Propensity Score, and then match action to control using 
Propensity Score Matching.
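
To make the idea concrete, here is a sketch of one way this could look in R, on simulated data. The variable names (age, income), the simulated distributions and the choice of nearest-neighbour matching with replacement are assumptions for illustration, not necessarily the method recommended in the literature. Note that the fitted values from a logistic regression are probabilities p; the corresponding odds are p / (1 - p).

```r
set.seed(42)

# Simulated stand-ins; all names and distributions are assumptions
n_action <- 300
action <- data.frame(age    = rnorm(n_action, 45, 12),
                     income = rnorm(n_action, 30, 8))
action$sample <- rbinom(n_action, 1,
                        plogis(-1 + 0.02 * (action$age - 45)))
control <- data.frame(age    = rnorm(500, 47, 12),
                      income = rnorm(500, 28, 8))

# Logistic regression of sample membership on the characteristics
fit <- glm(sample ~ age + income, family = binomial, data = action)

# Fitted probabilities of being in the sample
action$pscore  <- predict(fit, type = "response")
control$pscore <- predict(fit, newdata = control, type = "response")

# Nearest-neighbour matching (with replacement) on the score:
# for each sampled action-area case, pick the closest control case
sampled   <- action[action$sample == 1, ]
match_idx <- sapply(sampled$pscore,
                    function(p) which.min(abs(control$pscore - p)))
matched   <- control[match_idx, ]
```

Matching with replacement means a control case can be picked more than once; matching without replacement, or within a caliper, would need extra bookkeeping.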

I have MASS but was unable to locate logistic regression, which I was 
advised was the standard method for my problem.
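
For reference, a logistic regression can be fitted in R with glm() and a binomial family; glm() is in the stats package, which is attached by default, so MASS is not needed for the fit itself. A minimal sketch, using the built-in mtcars data purely as a convenient example with a 0/1 response (it has nothing to do with my problem):

```r
# glm() with family = binomial fits a logistic regression;
# the logit link is the default for the binomial family.
fit <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)

# Coefficients on the log-odds scale, with standard errors
summary(fit)$coefficients

# Exponentiated coefficients: multiplicative effect on the odds
exp(coef(fit))
```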

Thanks again.
Prof Brian Ripley wrote: