I am looking for an elegant solution to the following problem. I have one that works, but it is ugly.

In a questionnaire, each of 80 subjects answered 8 questions about each of 30 different behaviors. My main method of analysis is within-subject regression, in which I predict the answer to one of the 8 questions from answers to some of the other questions - different subsets for different analyses. The 30 behaviors are the units of analysis. (That is, there is an 8x30 matrix for each subject. One of the 8 variables is the dependent variable, and some of the others are predictors.) There are lots of missing data, sometimes so much as to make a given subject's regression impossible for one analysis.

I also want standardized regression weights. The standardization should include just the items used in each regression. I've been using scale() on all the variables in order to get standardized weights, but it is not a simple matter to arrange this so as to get just the right subset of the 30 behaviors for each regression, because the behaviors that are missing for one analysis are not necessarily missing for another (even for the same subject).

I have been using a loop to do the regressions, e.g.,

  for (i in 1:numberofsubjects)
    r.v1[i,] <- lm(v1[i,] ~ v2[i,] + v3[i,] + v4[i,])$coef[2:4]

where v1 etc. are already selected and rescaled appropriately for the variables included. This gives me a matrix of the regression coefficients, which I can then test across subjects.

One thing I've got to do is make sure that each subject's regression will actually run, or omit the ones that won't, or else the whole thing bombs. I have been unable to get lm() to proceed without doing this with several extra steps. (I did read the help page.)

I've thought about modifying the lm() function - this would be my first attempt at such a thing - or else putting the whole thing in some kind of wrapper that uses scale() on a matrix. (I've been using it on vectors.) But the matrix would have to have just the right entries in the end.

I'm writing just in case there is some really obvious solution to this that I'm missing.

Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~jbaron
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
within-subject stdized regression w missing data
5 messages · Jonathan Baron, Thomas Lumley, M. Edward (Ed) Borasky +1 more
On Sat, 21 Apr 2001, Jonathan Baron wrote:
One thing I've got to do is make sure that each subject's regression will actually run, or omit the ones that won't, or else the whole thing bombs. I have been unable to get lm() to proceed without doing this with several extra steps. (I did read the help page.)
This part can be easily solved. Use try() as a wrapper around your lm(). It captures errors and returns an object of class "try-error" instead of crashing.

-thomas

Thomas Lumley, Asst. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle
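The try() suggestion might look roughly like the sketch below. The toy data, the helper name std_coef, and the choice to drop incomplete behaviors and rescale inside the helper are assumptions made for illustration; the thread itself only proposes wrapping lm() in try().

```r
# Hypothetical toy data: 5 subjects x 30 behaviors, one matrix per variable.
set.seed(1)
n_subj <- 5
n_beh  <- 30
make_var <- function() matrix(rnorm(n_subj * n_beh), n_subj, n_beh)
v1 <- make_var(); v2 <- make_var(); v3 <- make_var(); v4 <- make_var()
v2[2, ]    <- NA   # subject 2: one predictor is missing for every behavior
v1[3, 1:5] <- NA   # subject 3: a few behaviors are missing

# Standardized weights for subject i, using only the behaviors that are
# complete for the variables in this particular regression (an assumption
# about the preprocessing, not part of the try() advice itself).
std_coef <- function(i) {
  d <- data.frame(y = v1[i, ], x1 = v2[i, ], x2 = v3[i, ], x3 = v4[i, ])
  d <- d[complete.cases(d), ]
  d <- as.data.frame(scale(d))
  coef(lm(y ~ x1 + x2 + x3, data = d))[2:4]
}

# Rows stay NA for subjects whose regression raises an error.
r.v1 <- matrix(NA_real_, n_subj, 3)
for (i in 1:n_subj) {
  fit <- try(std_coef(i), silent = TRUE)
  if (!inherits(fit, "try-error")) r.v1[i, ] <- fit
}
r.v1
```

Note that try() only catches outright errors. When a subject has some complete behaviors but too few to estimate every weight, lm() may return NA coefficients without signalling an error, so the result still needs a check for NA rows.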
1 day later
I have just finished loading Red Hat Linux 7.1, R-base 1.2.2-1 and almost all of the contributed packages on my laptop. For those of you who want to do the same, here's a brief run-down of the steps involved.

1. Pick which version of Red Hat Linux 7.1 you want. RH Linux 7.1 comes in three versions: Red Hat Linux 7.1 ($39.95 US), Deluxe Workstation ($79.95 US) and Professional Server ($179.95). Most people go with the low end. I got the Deluxe Workstation, which actually includes R-base (see below, though). But almost all of what I did will work with the basic version.

2. Install Red Hat Linux. I won't go into the details, except to say that there is a point in the install process where you can select extra packages to install. If you go with the default installs, you won't get everything you need later, and will end up going back and installing various packages from the CDs to make everything work. It's much easier to know ahead of time what you're going to need and select the packages when you install. Here's a run-down of what you should select:

   a. If you're going to build contributed packages, you will need "gcc" and "g77". They can be found in the "Development" section. The full names of these are "gcc-2.96-81.i386.rpm" and "gcc-g77-2.96-81.i386.rpm". If you just select them, the installer will figure out the dependencies and load them as well. If you wait till later, you will have to flip back and forth between two CDs to satisfy all the dependencies. I did :-(. Do as I say, not as I did :-).

   b. If you're going to use ZIP files, you need to install "zip" and "unzip". They don't appear to be loaded by default. They can be found in the "Archiving" section. The full names are "unzip-5.41-3.i386.rpm" and "zip-2.3-8.i386.rpm".

   c. Some of the contributed packages require the Basic Linear Algebra Subroutines (BLAS) and LAPACK. They can be found in the "Libraries" section. The full names are "blas-3.0-9.i386.rpm" and "lapack-3.0-9.i386.rpm".

   d. The database packages require the specified database to be loaded and often the development libraries for the database as well. They can be found in the "Databases" and "Libraries" sections. I only loaded "RODBC"; the specific package required is "unixODBC-devel-1.8.13-2.i386.rpm", which is a library and is found in the "Libraries" section.

   e. The "XML" package requires "libxml-devel-1.8.10-1.i386.rpm" from the "Libraries" section.

3. Install R. If you buy the Deluxe Workstation edition, the version of R on the PowerTools CD is 1.2.0; specifically "R-base-1.2.0-6.i386.rpm". The VR package requires a more recent version of R, R 1.2.2. I loaded "R-base-1.2.2-1.i386.rpm" from CRAN.

4. Install the contributed packages you want. I installed *everything* from CRAN except the databases I don't have. The only gotcha here is that the "netCDF" package requires "netcdf-3.4-9.i386.rpm". This one is *not* on the basic distribution. It can be found on the PowerTools CD which comes in the Deluxe Workstation distribution. As a result, you won't be able to load it during the install process; you'll need to install it afterwards but before you build "netCDF". I'm sure it's available on the net somewhere, but I didn't bother to search for it.

--
M. Edward (Ed) Borasky, Chief Scientist, Borasky Research
http://www.borasky-research.net http://www.aracnet.com/~znmeb
mailto:znmeb at borasky-research.com mailto:znmeb at aracnet.com
If there's nothing to astrology, how come so many famous men were born on holidays?
Oops!! It seems I missed one :-). It turns out that "e1071" and "Matrix" use "c++", not "gcc"! So in item 2.a., add "gcc-c++-2.96-81.i386.rpm" to the list of packages to load.

BTW, my environment is a Toshiba 2595CDT laptop, a 400 MHz Celeron with 192 MB of RAM and a 6 GB hard drive. I am currently dual-booted Windows 2000 Professional and Red Hat Linux 7.1. 4 GB of the hard drive is Windows 2000 (FAT32, which the Linux OS can read from and write to) and 2 GB is Linux (ext2 and swap, which the Windows OS can't access). I have R running on both sides.

--
M. Edward (Ed) Borasky, Chief Scientist, Borasky Research
http://www.borasky-research.net http://www.aracnet.com/~znmeb
mailto:znmeb at borasky-research.com mailto:znmeb at aracnet.com
If there's nothing to astrology, how come so many famous men were born on holidays?
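For anyone wanting to confirm from within R that the build tools named in item 2 are on the PATH before building packages, here is a minimal sketch. The helper name missing_tools is hypothetical, and Sys.which() is a base function in current R that may not exist in releases as old as 1.2.2, so treat this as an illustration rather than something that was run on the setup described above.

```r
# Return the subset of `tools` that cannot be found on the PATH.
# Sys.which() gives "" for any executable it cannot locate.
missing_tools <- function(tools) {
  paths <- Sys.which(tools)
  tools[!nzchar(paths)]
}

# Tool names taken from the list above; g77 is specific to that era's gcc.
needed <- c("gcc", "g77", "g++", "zip", "unzip")
missing_tools(needed)
```

An empty result suggests the compilers and archivers are installed. It does not check the development libraries (BLAS, LAPACK, unixODBC, libxml, netcdf), which would have to be verified with the system's package manager.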
"M. Edward Borasky" <znmeb at aracnet.com> writes:
Oops!! It seems I missed one :-). It turns out that "e1071" and "Matrix" use "c++", not "gcc"! So in item 2.a., add "gcc-c++-2.96-81.i386.rpm" to the list of packages to load. BTW, my environment is a Toshiba 2595CDT laptop, a 400 MHz Celeron with 192 MB of RAM and a 6 GB hard drive. I am currently dual-booted Windows 2000 Professional and Red Hat Linux 7.1. 4 GB of the hard drive is Windows 2000 (FAT32, which the Linux OS can read from and write to) and 2 GB is Linux (ext2 and swap, which the Windows OS can't access). I have R running on both sides.
One piece of advice: If the disk is big enough, make a directory which holds the entire content of the CDs' RPM directories. Makes it much easier to fix problems with missing tools and utilities when you're on the go...
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907