Skip to content

merging to data.frames whose columns are different but follow a pattern.

2 messages · ManuelS, Dennis Murphy

#
Hi,

I'm working with panel data from the Swiss Houshold Panel (SHP). The data i
got came in the following way:
1.) 12 *different* /individual/ files - one for each year .
2.) 12 *different* /houshold/ files - again: one for each year

Each file came in the SPSS format (.sav). I implemented all the files in R
an managed (via rename, cbind, rbind, merge etc.) to get *two* files. 
The first file is a combination of all 12 /individual/ files orderd by
IDPERS (unique individual identification) and T (=Time 1...12) 
The second file contains all houshold waves (1-12) and is orderd by IDHOUS 
(unique houshold identification) and T (=1...12).

Now, to the problem:
As the following except shows: the individual id used in the individual file
(IDPERS) is derived from the houshold id (IDHOUS).
IDPERS  T SEX PC02 GHS Single 
1    4101  1   1    9   1     NA
2    4101  2   1   10   1     NA
3    4101  3   1   10   1     NA
4    4101  4   1   10   1     NA
.
.
.
13   4102  1   2    0   0     NA
14   4102  2   2    5   1     NA
15   4102  3   2    3   0     NA
16   4102  4   2    9   1     NA
.
.
.
26   5101  1   1    8   1     NA
27   5101  2   1    8   1     NA
28   5101  3   1    8   1     NA
29   5101  4   1    8   1     NA
IDHOUS  T COM2 NBPERS NBKID
1      41  1    2      5     3
2      41  2    2      5     3
3      41  3    2      5     3
4      41  4    2      5     3
.
.
.
13     51  1    2      1     0
14     51  2    2      1     0
15     51  3    2      1     0
16     51  4    2      1     0

I want to merge the two files (data.frames) according to their ID's and the
time dimension T. Since IDHOUS and IDPERS are not the same I cannot use
/merge() /. Doing the task "by hand" in a excel spread sheet or any
equivalent is possible but would probably take ages.
So my question is:

Does anyone know a way, a command, a trick to merge two files whose columns
are different but follow a pattern as described above?

I am gratful for any kind of input.

Manuel Steiner


--
View this message in context: http://r.789695.n4.nabble.com/merging-to-data-frames-whose-columns-are-different-but-follow-a-pattern-tp3937376p3937376.html
Sent from the R help mailing list archive at Nabble.com.
#
Hi:

Try this:

INDIVIDUAL <- transform(INDIVIDUAL, IDHOUS = IDPERS %/% 100)
merge(INDIVIDUAL, HOUSHOLD, by = c('IDHOUS', 'T'))

   IDHOUS T IDPERS SEX PC02 GHS Single COM2 NBPERS NBKID
1      41 1   4101   1    9   1     NA    2      5     3
2      41 1   4102   2    0   0     NA    2      5     3
3      41 2   4101   1   10   1     NA    2      5     3
4      41 2   4102   2    5   1     NA    2      5     3
5      41 3   4101   1   10   1     NA    2      5     3
6      41 3   4102   2    3   0     NA    2      5     3
7      41 4   4101   1   10   1     NA    2      5     3
8      41 4   4102   2    9   1     NA    2      5     3
9      51 1   5101   1    8   1     NA    2      1     0
10     51 2   5101   1    8   1     NA    2      1     0
11     51 3   5101   1    8   1     NA    2      1     0
12     51 4   5101   1    8   1     NA    2      1     0

HTH,
Dennis
On Tue, Oct 25, 2011 at 9:43 AM, ManuelS <steiner-manuel at web.de> wrote: