Table Transformation
On Wed, Mar 4, 2009 at 11:58 AM, Christian Pilger
<christian.pilger at gmx.net> wrote:
Dear R-experts, recently, I started to discover the world of R. I came across a problem, that I was unable to solve by myself (including searches in R-help, etc.) I have a flat table similar to key1 ? ?key2 ? ?value1 abcd_1 ?BP ? ? ?10 abcd_1 ?BSMP ? ?1A abcd_1 ?PD ? ? ?25 abcd_2 ?BP ? ? ?20 abcd_3 ?BP ? ? ?80 abcd_4 ?IA ? ? ?30 abcd_4 ?PD ? ? ?70 abcd_4 ?PS ? ? ?N I wish to transform this table to obtain the following result: ? ? ? ?key2 key1 ? ?BP ? ? ?BSMP ? ?IA ? ? ?PD ? ? ?PS abcd_1 ?"10" ? ?"1A" ? ?"" ? ? ?"25" ? ?"" abcd_2 ?"20" ? ?"" ? ? ?"" ? ? ?"" ? ? ?"" abcd_3 ?"80" ? ?"" ? ? ?"" ? ? ?"" ? ? ?"" abcd_4 ?"" ? ? ?"" ? ? ?"30" ? ?"70" ? ?"N"
I think we would say that a dataframe of the first type is in the "long" format, while the other one you want is in the "wide" format. I've done changes like that with the "reshape" function that is in the stats package. This example you propose is like making one column for each "country" where key 1 is like the "year" in which the observation is made. Right? You don't have an easily cut-and-pasteable code example, so I've generated a little working example. Here, x1 is key 1 and x2 is key 2.
x1 <- gl(4,5, labels=c("c1","c2","c3","c4"))
x1
[1] c1 c1 c1 c1 c1 c2 c2 c2 c2 c2 c3 c3 c3 c3 c3 c4 c4 c4 c4 c4 Levels: c1 c2 c3 c4
x2 <- rep(1:5,4) df <- data.frame(x1, x2, y=rnorm(20)) df
x1 x2 y 1 c1 1 0.02095747 2 c1 2 0.05926233 3 c1 3 -0.07561916 4 c1 4 -1.06272710 5 c1 5 -1.89202032 6 c2 1 -0.04549782 7 c2 2 -0.68333187 8 c2 3 -0.99151410 9 c2 4 -0.29070280 10 c2 5 -0.97655024 11 c3 1 0.33411223 12 c3 2 -0.24907340 13 c3 3 -0.25469819 14 c3 4 1.23956157 15 c3 5 -1.38162430 16 c4 1 0.50343661 17 c4 2 -0.58126964 18 c4 3 0.24256348 19 c4 4 -0.39398578 20 c4 5 0.01664450
reshape(df, direction="wide", timevar="x2", idvar="x1")
x1 y.1 y.2 y.3 y.4 y.5 1 c1 0.02095747 0.05926233 -0.07561916 -1.0627271 -1.8920203 6 c2 -0.04549782 -0.68333187 -0.99151410 -0.2907028 -0.9765502 11 c3 0.33411223 -0.24907340 -0.25469819 1.2395616 -1.3816243 16 c4 0.50343661 -0.58126964 0.24256348 -0.3939858 0.0166445
Your case will have many missings, but I think the idea is the same. HTH
Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas