Table Transformation - R-help

Wed, Mar 4, 2009 9:58 AM #

Dear R-experts,

recently, I started to discover the world of R. I came across a problem,
that I was unable to solve by myself (including searches in R-help, etc.)

I have a flat table similar to

key1	key2	value1

abcd_1	BP	10
abcd_1	BSMP	1A
abcd_1	PD	25
abcd_2	BP	20
abcd_3	BP	80
abcd_4	IA	30
abcd_4	PD	70
abcd_4	PS	N

I wish to transform this table to obtain the following result:

	key2
key1	BP	BSMP	IA	PD	PS
abcd_1	"10"	"1A"	""	"25"	""			
abcd_2	"20"	""	""	""	""
abcd_3	"80"	""	""	""	""
abcd_4	""	""	"30"	"70"	"N"

I considered "table" and "xtabs" but I could not get the desired result: I
received cross-tables key1 vs. key2 that contained counts within the cells.

Can anybody help me?

Best wishes,

Christian

View this message in context: http://www.nabble.com/Table-Transformation-tp22335545p22335545.html
Sent from the R help mailing list archive at Nabble.com.

Uwe Ligges

Wed, Mar 4, 2009 10:37 AM #

See ?reshape

Uwe Ligges

Christian Pilger wrote:

Paul Johnson

Wed, Mar 4, 2009 11:14 AM #

On Wed, Mar 4, 2009 at 11:58 AM, Christian Pilger

<christian.pilger at gmx.net> wrote:

I think we would say that a dataframe of the first type is in the
"long" format, while the other one you want is in the "wide" format.
I've done changes like that with the "reshape" function that is in the
stats package.

This example you propose is like making one column for each "country"
where key 1 is like the "year" in which the observation is made.
Right?

You don't have an easily cut-and-pasteable code example, so I've
generated a little working example. Here, x1 is key 1 and x2 is key 2.

[1] c1 c1 c1 c1 c1 c2 c2 c2 c2 c2 c3 c3 c3 c3 c3 c4 c4 c4 c4 c4
Levels: c1 c2 c3 c4

x1 x2           y
1  c1  1  0.02095747
2  c1  2  0.05926233
3  c1  3 -0.07561916
4  c1  4 -1.06272710
5  c1  5 -1.89202032
6  c2  1 -0.04549782
7  c2  2 -0.68333187
8  c2  3 -0.99151410
9  c2  4 -0.29070280
10 c2  5 -0.97655024
11 c3  1  0.33411223
12 c3  2 -0.24907340
13 c3  3 -0.25469819
14 c3  4  1.23956157
15 c3  5 -1.38162430
16 c4  1  0.50343661
17 c4  2 -0.58126964
18 c4  3  0.24256348
19 c4  4 -0.39398578
20 c4  5  0.01664450

x1         y.1         y.2         y.3        y.4        y.5
1  c1  0.02095747  0.05926233 -0.07561916 -1.0627271 -1.8920203
6  c2 -0.04549782 -0.68333187 -0.99151410 -0.2907028 -0.9765502
11 c3  0.33411223 -0.24907340 -0.25469819  1.2395616 -1.3816243
16 c4  0.50343661 -0.58126964  0.24256348 -0.3939858  0.0166445

Your case will have many missings, but I think the idea is the same.

HTH

Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

Hadley Wickham

Wed, Mar 4, 2009 11:30 AM #

On Wed, Mar 4, 2009 at 11:58 AM, Christian Pilger

<christian.pilger at gmx.net> wrote:

With the reshape package:

cast(mydf, key1 ~ key2)

You can find out more at http://had.co.nz/reshape

Hadley

http://had.co.nz/