Newbie data organisation/structures question...

Wed, Dec 20, 2006 9:50 AM

On Wed, 2006-12-20 at 16:05 +0000, Gav Wood wrote:

Reading in your data using:

DF <- read.fwf("clipboard", widths = c(3, 3, 12),
               skip = 1)

colnames(DF) <- c("P", "T", "I")


Substitute your actual data file name for 'clipboard' above.


Note that I skip the header row, as the "T" causes problems, since it
wants to be converted to 'TRUE' (logical, not char) upon import,
screwing up the column widths. I then assign the colnames post import.
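
For a version of the import step that can be run without your file, here is
a rough sketch that writes a copy of the example data to a temporary file
first. The file layout (fixed-width columns of 3, 3 and 12 characters with a
header line) and the name 'tmp' are my assumptions for illustration, not
taken from your actual file.

```r
# Assumed layout: fixed-width columns of 3, 3 and 12 characters
tmp <- tempfile()
writeLines(c("P  T  I",
             "1  1  (1, 2, 3)",
             "2  1  (2, 4)",
             "1  2  (1, 3, 6, 7)",
             "2  2  (6)"), tmp)

# Skip the header line on import, then name the columns afterwards
DF <- read.fwf(tmp, widths = c(3, 3, 12), skip = 1)
colnames(DF) <- c("P", "T", "I")

# Remove the temporary file again
unlink(tmp)

DF
```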

This then gives me:

  P T            I
1 1 1    (1, 2, 3)
2 2 1       (2, 4)
3 1 2 (1, 3, 6, 7)
4 2 2          (6)

Given the manipulations that you appear to want to do, I would first
strip the parens from "I" to make subsequent operations easier:

DF$I <- gsub("\\(|\\)", "", DF$I)

So:

  P T          I
1 1 1    1, 2, 3
2 2 1       2, 4
3 1 2 1, 3, 6, 7
4 2 2          6


Now, split the character vector DF$I into its components and convert
them to numeric, so that DF$I becomes a list of numeric vectors:
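
One way this step could be written is sketched below with strsplit() and
lapply(). The data frame is rebuilt by hand here (an assumption, so that the
snippet runs on its own); with your data you would only need the lapply()
line.

```r
# Rebuild the example data so this snippet runs on its own
DF <- data.frame(P = c(1, 2, 1, 2), T = c(1, 1, 2, 2),
                 I = c("1, 2, 3", "2, 4", "1, 3, 6, 7", "6"),
                 stringsAsFactors = FALSE)

# Split each string on commas, then convert the pieces to numeric;
# as.numeric() tolerates the blank left after each comma
DF$I <- lapply(strsplit(DF$I, ","), as.numeric)

DF

# Look at the structure of 'DF'
str(DF)
```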

  P T          I
1 1 1    1, 2, 3
2 2 1       2, 4
3 1 2 1, 3, 6, 7
4 2 2          6

# Look at the structure of 'DF'

'data.frame':	4 obs. of  3 variables:
 $ P: num  1 2 1 2
 $ T: num  1 1 2 2
 $ I:List of 4
  ..$ : num  1 2 3
  ..$ : num  2 4
  ..$ : num  1 3 6 7
  ..$ : num 6


Now for your manipulations above:

1: The I when both P and T are given. e.g.:
P = 2, T = 2; I = (6)

  I
4 6
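
A call along the following lines should give that result. It is a sketch
using subset(); 'res' is just a name chosen for illustration, and the data
frame is rebuilt by hand so the snippet is self-contained.

```r
# Example data with I already stored as a list of numeric vectors
DF <- data.frame(P = c(1, 2, 1, 2), T = c(1, 1, 2, 2))
DF$I <- list(c(1, 2, 3), c(2, 4), c(1, 3, 6, 7), 6)

# Keep the rows where both conditions hold, and only column I
res <- subset(DF, P == 2 & T == 2, select = I)
res
```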


2: The concatenated vector of Is when P and a subset of T is given,
e.g.:
P = 1, T = 1:2;  Is = (1, 2, 3, 1, 3, 6, 7)

I1 I2 I3 I4 I5 I6 I7 
 1  2  3  1  3  6  7

or you can use:

[1] 1 2 3 1 3 6 7

which strips the name attributes from the vector.
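
Both forms can be sketched as follows, assuming the same hand-built data
frame as the example data. 'sel' and 'v' are names chosen for illustration.

```r
# Example data with I stored as a list of numeric vectors
DF <- data.frame(P = c(1, 2, 1, 2), T = c(1, 1, 2, 2))
DF$I <- list(c(1, 2, 3), c(2, 4), c(1, 3, 6, 7), 6)

# Rows with P == 1 and T in 1:2, keeping only column I
sel <- subset(DF, P == 1 & T %in% 1:2, select = I)

# Flatten the list column into one vector (names are added by unlist)
v <- unlist(sel)
v

# Same values with the names attribute dropped
as.vector(v)
```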



3: The length of that vector.

[1] 7
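
The length can be taken straight from the flattened vector. A small sketch,
again on a hand-built copy of the example data, with 'n' as an illustrative
name:

```r
# Example data with I stored as a list of numeric vectors
DF <- data.frame(P = c(1, 2, 1, 2), T = c(1, 1, 2, 2))
DF$I <- list(c(1, 2, 3), c(2, 4), c(1, 3, 6, 7), 6)

# Flatten the selected list elements and count the values
n <- length(unlist(subset(DF, P == 1 & T %in% 1:2, select = I)))
n
```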



4: A list of Is when either P or T is given. e.g.:
P = 2: I = (2, 4), (6)
T = 1: I = (1, 2, 3), (1, 3, 6, 7)

     I
2 2, 4
4    6

        I
1 1, 2, 3
2    2, 4
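
Those two results could be produced roughly as below. This is a sketch with
subset(); 'by_P' and 'by_T' are illustrative names. If you want a plain list
rather than a one-column data frame, extract the column with $I.

```r
# Example data with I stored as a list of numeric vectors
DF <- data.frame(P = c(1, 2, 1, 2), T = c(1, 1, 2, 2))
DF$I <- list(c(1, 2, 3), c(2, 4), c(1, 3, 6, 7), 6)

# All I for a given P, and all I for a given T
by_P <- subset(DF, P == 2, select = I)
by_T <- subset(DF, T == 1, select = I)
by_P
by_T

# The same selection as a plain list of numeric vectors
by_P$I
```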

Note that your example above for 'T == 1' in 4 is incorrect based upon
your example data. "(1, 3, 6, 7)" is on the row where T == 2.   :-)


See ?read.fwf, ?read.table, ?subset, ?split, ?gsub, ?lapply, ?unlist, ?Syntax and ?Comparison for more information.

HTH,

Marc Schwartz
