Skip to content

Table Transformation

4 messages · Christian Pilger, Uwe Ligges, Paul Johnson +1 more

#
Dear R-experts,

recently, I started to discover the world of R. I came across a problem,
that I was unable to solve by myself (including searches in R-help, etc.)

I have a flat table similar to

key1	key2	value1

abcd_1	BP	10
abcd_1	BSMP	1A
abcd_1	PD	25
abcd_2	BP	20
abcd_3	BP	80
abcd_4	IA	30
abcd_4	PD	70
abcd_4	PS	N

I wish to transform this table to obtain the following result:

	key2
key1	BP	BSMP	IA	PD	PS
abcd_1	"10"	"1A"	""	"25"	""			
abcd_2	"20"	""	""	""	""
abcd_3	"80"	""	""	""	""
abcd_4	""	""	"30"	"70"	"N"

I considered "table" and "xtabs" but I could not get the desired result: I
received cross-tables key1 vs. key2 that contained counts within the cells.

Can anybody help me?

Best wishes,

Christian
#
See ?reshape

Uwe Ligges
Christian Pilger wrote:
#
On Wed, Mar 4, 2009 at 11:58 AM, Christian Pilger
<christian.pilger at gmx.net> wrote:
I think we would say that a dataframe of the first type is in the
"long" format, while the other one you want is in the "wide" format.
I've done changes like that with the "reshape" function that is in the
stats package.

This example you propose is like making one column for each "country"
where key 1 is like the "year" in which the observation is made.
Right?

You don't have an easily cut-and-pasteable code example, so I've
generated a little working example. Here, x1 is key 1 and x2 is key 2.
[1] c1 c1 c1 c1 c1 c2 c2 c2 c2 c2 c3 c3 c3 c3 c3 c4 c4 c4 c4 c4
Levels: c1 c2 c3 c4
x1 x2           y
1  c1  1  0.02095747
2  c1  2  0.05926233
3  c1  3 -0.07561916
4  c1  4 -1.06272710
5  c1  5 -1.89202032
6  c2  1 -0.04549782
7  c2  2 -0.68333187
8  c2  3 -0.99151410
9  c2  4 -0.29070280
10 c2  5 -0.97655024
11 c3  1  0.33411223
12 c3  2 -0.24907340
13 c3  3 -0.25469819
14 c3  4  1.23956157
15 c3  5 -1.38162430
16 c4  1  0.50343661
17 c4  2 -0.58126964
18 c4  3  0.24256348
19 c4  4 -0.39398578
20 c4  5  0.01664450
x1         y.1         y.2         y.3        y.4        y.5
1  c1  0.02095747  0.05926233 -0.07561916 -1.0627271 -1.8920203
6  c2 -0.04549782 -0.68333187 -0.99151410 -0.2907028 -0.9765502
11 c3  0.33411223 -0.24907340 -0.25469819  1.2395616 -1.3816243
16 c4  0.50343661 -0.58126964  0.24256348 -0.3939858  0.0166445
Your case will have many missings, but I think the idea is the same.

HTH
#
On Wed, Mar 4, 2009 at 11:58 AM, Christian Pilger
<christian.pilger at gmx.net> wrote:
With the reshape package:

cast(mydf, key1 ~ key2)

You can find out more at http://had.co.nz/reshape

Hadley