
within-subject stdized regression w missing data

5 messages · Jonathan Baron, Thomas Lumley, M. Edward (Ed) Borasky +1 more

#
I am looking for an elegant solution to the following problem.  I
have one that works, but it is ugly.

In a questionnaire, each of 80 subjects answered 8 questions
about each of 30 different behaviors.  My main method of analysis
is within-subject regression, in which I predict the answer to
one of the 8 questions from answers to some of the other
questions - different subsets for different analyses.  The 30
behaviors are the units of analysis.  (That is, there is an 8x30
matrix for each subject.  One of the 8 variables is the dependent
variable, and some of the others are predictors.)

There are lots of missing data, sometimes so much as to make a
given subject's regression impossible for one analysis.

I also want standardized regression weights.  The standardization
should include just the items used in each regression.  I've been
using scale() on all the variables in order to get standardized
weights, but it is not a simple matter to arrange this so as to
get just the right subset of the 30 behaviors for each
regression, because the behaviors that are missing for one
analysis are not necessarily missing for another (even for the
same subject).

I have been using a loop to do the regressions, e.g.,

for (i in 1:numberofsubjects) 
 r.v1[i,] <- lm(v1[i,] ~ v2[i,] + v3[i,] + v4[i,])$coef[2:4]

where v1 etc. are already selected and rescaled appropriately for
the variables included.  This gives me a matrix of the regression
coefficients, which I can then test across subjects.

One thing I've got to do is make sure that each subject's
regression will actually run, or omit the ones that won't, or
else the whole thing bombs.  I have been unable to get lm() to
proceed without doing this with several extra steps.  (I did read
the help page.)

I've thought about modifying the lm() function - this would be my
first attempt at such a thing - or else putting the whole thing
in some kind of wrapper that uses scale() on a matrix.  (I've
been using it on vectors.)  But the matrix would have to have
just the right entries in the end.

I'm writing just in case there is some really obvious solution to
this that I'm missing.
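
A minimal sketch of the wrapper idea described above, not a tested solution for the actual data. The helper name std_coef and the simulated y and X are hypothetical. It assumes one subject's dependent variable arrives as a vector over the 30 behaviors and the chosen predictors as a matrix with one column per predictor, so that the standardization uses exactly the behaviors that enter that regression.

```r
# Hypothetical helper: standardized slopes for one subject.
# y: numeric vector, one value per behavior
# X: numeric matrix, behaviors in rows, predictors in columns
std_coef <- function(y, X) {
  X   <- as.matrix(X)
  out <- rep(NA_real_, ncol(X))

  # Keep only behaviors with no missing value in y or in any predictor
  ok <- complete.cases(y, X)

  # Too few usable behaviors to leave residual degrees of freedom
  if (sum(ok) <= ncol(X) + 1) return(out)

  # Standardize using just the rows that enter this regression
  ys <- as.vector(scale(y[ok]))
  Xs <- scale(X[ok, , drop = FALSE])

  # A constant variable has zero SD, so scale() returns NaN for it
  if (anyNA(ys) || anyNA(Xs)) return(out)

  unname(coef(lm(ys ~ Xs))[-1])   # drop the intercept
}

# Simulated data for one subject (purely illustrative)
set.seed(1)
y <- rnorm(30)
X <- matrix(rnorm(90), nrow = 30, ncol = 3)
y[c(2, 5)] <- NA
X[7, 2]    <- NA

b <- std_coef(y, X)
```

Because the helper returns a row of NA instead of stopping when a subject has too little data, it can be called once per subject and the results stacked into the coefficient matrix.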

Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~jbaron
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
On Sat, 21 Apr 2001, Jonathan Baron wrote:

            
This part can be easily solved. Use try() as a wrapper around your lm().
It captures errors and returns an object of class "try-error" instead of
crashing.
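
A minimal sketch of that suggestion applied to the loop from the original post. The simulated matrices and the name n_subj are assumptions for illustration; only try(), inherits(), lm() and coef() are the actual R functions involved.

```r
# Simulated stand-ins for the poster's matrices (subjects in rows,
# behaviors in columns); the real v1..v4 would be used instead.
set.seed(2)
n_subj <- 4
n_beh  <- 30
v1 <- matrix(rnorm(n_subj * n_beh), nrow = n_subj)
v2 <- matrix(rnorm(n_subj * n_beh), nrow = n_subj)
v3 <- matrix(rnorm(n_subj * n_beh), nrow = n_subj)
v4 <- matrix(rnorm(n_subj * n_beh), nrow = n_subj)
v1[3, ] <- NA   # subject 3 has no usable data, so lm() signals an error

# Pre-fill with NA so that failed subjects stay missing
r.v1 <- matrix(NA_real_, nrow = n_subj, ncol = 3)

for (i in seq_len(n_subj)) {
  fit <- try(lm(v1[i, ] ~ v2[i, ] + v3[i, ] + v4[i, ]), silent = TRUE)
  # Store the slopes only when the fit succeeded
  if (!inherits(fit, "try-error")) {
    r.v1[i, ] <- coef(fit)[2:4]
  }
}
```

Rows of r.v1 that remain NA mark the subjects whose regression could not be run, so they can be dropped before testing the coefficients across subjects.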

	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

1 day later
#
I have just finished loading Red Hat Linux 7.1, R-base 1.2.2-1 and almost
all of the contributed packages on my laptop. For those of you who want to
do the same, here's a brief run-down of the steps involved.

1. Pick which version of Red Hat Linux 7.1 you want. RH Linux 7.1 comes in
three versions: Red Hat Linux 7.1 ($39.95 US), Deluxe Workstation ($79.95
US) and Professional Server ($179.95). Most people go with the low end. I
got the Deluxe Workstation, which actually includes R-base (see below,
though). But almost all of what I did will work with the basic version.

2. Install Red Hat Linux. I won't go into the details, except to say that
there is a point in the install process where you can select extra packages
to install. If you go with the default installs, you won't get everything
you need later, and will end up going back and installing various packages
from the CDs to make everything work. It's much easier to know ahead of time
what you're going to need and select the packages when you install. Here's a
run-down of what you should select:

a. If you're going to build contributed packages, you will need "gcc" and
"g77". They can be found in the "Development" section. The full names of
these are "gcc-2.96-81.i386.rpm" and "gcc-g77-2.96-81.i386.rpm". If you just
select them, the installer will figure out the dependencies and load them as
well. If you wait till later, you will have to flip back and forth between
two CDs to satisfy all the dependencies. I did :-(. Do as I say, not as I
did :-).

b. If you're going to use ZIP files, you need to install "zip" and "unzip".
They don't appear to be loaded by default. They can be found in the
"Archiving" section. The full names are "unzip-5.41-3.i386.rpm" and
"zip-2.3-8.i386.rpm".

c. Some of the contributed packages require the Basic Linear Algebra
Subroutines (BLAS) and LAPACK. They can be found in the "Libraries" section.
The full names are "blas-3.0-9.i386.rpm" and "lapack-3.0-9.i386.rpm".

d. The database packages require the specified database to be loaded and
often the development libraries for the database as well. They can be found
in the "Databases" and "Libraries" sections. I only loaded "RODBC"; the
specific package required is "unixODBC-devel-1.8.13-2.i386.rpm", which is a
library and is found in the "Libraries" section.

e. The "XML" package requires "libxml-devel-1.8.10-1.i386.rpm" from the
"Libraries" section.

3. Install R. If you buy the Deluxe Workstation edition, the version of R on
the PowerTools CD is 1.2.0; specifically "R-base-1.2.0-6.i386.rpm". The VR
package requires a more recent version of R, R 1.2.2. I loaded
"R-base-1.2.2-1.i386.rpm" from CRAN.

4. Install the contributed packages you want. I installed *everything* from
CRAN except the databases I don't have. The only gotcha here is that the
"netCDF" package requires "netcdf-3.4-9.i386.rpm". This one is *not* on the
basic distribution. It can be found on the PowerTools CD which comes in the
Deluxe Workstation distribution. As a result, you won't be able to load it
during the install process; you'll need to install it afterwards but before
you build "netCDF". I'm sure it's available on the net somewhere, but I
didn't bother to search for it.
--
M. Edward (Ed) Borasky, Chief Scientist, Borasky Research
http://www.borasky-research.net  http://www.aracnet.com/~znmeb
mailto:znmeb at borasky-research.com  mailto:znmeb at aracnet.com

If there's nothing to astrology, how come so many famous men were born on
holidays?

#
Oops!! It seems I missed one :-). It turns out that "e1071" and "Matrix" use
"c++", not "gcc"! So in item 2.a., add "gcc-c++-2.96-81.i386.rpm" to the
list of packages to load. BTW, my environment is a Toshiba 2595CDT laptop, a
400 MHz Celeron with 192 MB of RAM and a 6 GB hard drive. It currently
dual-boots Windows 2000 Professional and Red Hat Linux 7.1. 4 GB of the
hard drive is Windows 2000 (FAT32, which the Linux OS can read from and
write to) and 2 GB is Linux (ext2 and swap, which the Windows OS can't
access). I have R running on both sides.
--
M. Edward (Ed) Borasky, Chief Scientist, Borasky Research
http://www.borasky-research.net  http://www.aracnet.com/~znmeb
mailto:znmeb at borasky-research.com  mailto:znmeb at aracnet.com

If there's nothing to astrology, how come so many famous men were born on
holidays?

#
"M. Edward Borasky" <znmeb at aracnet.com> writes:
One piece of advice: If the disk is big enough make a directory which
holds the entire content of the CDs' RPM directories. Makes it much
easier to fix problems with missing tools and utilities when you're on
the go...