joining columns as in a relational database
?merge says

    Merge two data frames by common columns or row names, or do other
    versions of database ``join'' operations.

and it has done all the examples of this sort of thing that I have ever
needed.
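
As a rough illustration of that point, here is a minimal sketch of a key-based column join done with merge(). The `students` and `schools` data frames are made-up toy stand-ins for a student-level and a school-level table, not the nlme data sets, and the sketch makes no claim about how cjoin itself is implemented.

```r
# Hypothetical toy tables: several students per school, one row per school.
students <- data.frame(
  School = c("A", "A", "B", "C"),
  Score  = c(12.5, 9.1, 15.0, 7.3),
  stringsAsFactors = FALSE
)
schools <- data.frame(
  School = c("A", "B", "C"),
  Sector = c("Public", "Catholic", "Public"),
  Size   = c(842, 1855, 1719),
  stringsAsFactors = FALSE
)

# The key should identify each row of the second frame uniquely
# (a primary key in relational database terms).
stopifnot(!anyDuplicated(schools$School))

# Join all school-level columns onto the student rows.
joined <- merge(students, schools, by = "School")

# Join a single school-level column by subsetting the second frame first.
sector_only <- merge(students, schools[, c("School", "Sector")], by = "School")
```

Note that merge() sorts the result on the key column by default, so the row order of the first frame is not necessarily preserved, and non-key columns with the same name in both frames are disambiguated with the suffixes ".x" and ".y".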
On 24 Jun 2003, Douglas Bates wrote:
In our recent workshop on "Multilevel Modeling in R" we discussed handling data for multilevel modeling. A classic example of such data is the test scores of students grouped into schools. We may wish to model the scores as functions of both student-level covariates and school-level covariates.

Such data are often organized in a multi-table format with a separate table for each level of information. The MathAchieve and MathAchSchool data frames in the nlme package are examples of such an organization. The HLM software requires the data to be organized like this.

To fit a model in R we need to create a composite table by "joining" the columns of the student-level and school-level tables, in the relational database sense of "join".

I have created a function to join the columns from two such frames according to the values of a key column. In relational database terms the key column must be a primary key for the second frame. I have called this function 'cjoin', by analogy to cbind. You can try

    data(MathAchieve, package = 'nlme')
    data(MathAchSchool, package = 'nlme')
    cjoin(MathAchieve, MathAchSchool, "School")
    cjoin(MathAchieve, MathAchSchool, "School", which = "Sector")

as examples.

Several questions:
 - Am I duplicating existing functionality?
 - Is cjoin a good name for such a function?
 - Would this be useful in base?
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595