data manipulation
On Wed, 2005-04-13 at 20:56 -0400, Yoko Nakajima wrote:
Hello, my question is about the data handling. I have a data set that is lined as:

4 1 17 1 1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621 0.3640 0.3678 -0.5081 -0.2227 0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232 0.8673 -0.1033 -0.0796 -0.0341 -0.1716 -0.1801 -0.7014 0.6578 0.5611
4 1 17 2 1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621 0.3640 0.3678 -0.5081 -0.2227 0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232 0.8673 -0.1033 -0.0796 -0.0341 -0.1716 -0.1801 -0.7014 0.6578 0.5611

This means that 29 variables are together as a set. You saw two sets of them in example. I have about 1000 sets (of 29 variables) in my data.

When I "scan" this data set, the result comes with 7 columns and it is not possible, so far, to read the table by column wise, and thus it is not possible to analyze the data. I would like to know whether there is a way to solve this problem, say, by arranging columns or increasing the number of columns of data matrix by R.

Also, I would like to know how you could name each column of the data so that you could use the individual column separately.
You probably changed some default setting in scan(). By default, it treats whitespace as the field delimiter. Using your data above, which I saved in a file called 'test.dat':
# matrix() fills column by column by default; byrow = TRUE keeps each set of 29 values in one row
mat <- matrix(scan("test.dat"), ncol = 29, byrow = TRUE)
Read 58 items
dim(mat)
[1] 2 29
colnames(mat) <- paste("Col", 1:29, sep = "")
mat
     Col1 Col2 Col3 Col4 Col5    Col6    Col7    Col8    Col9  Col10 Col11
[1,]    4    1   17    1    1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621 0.364
[2,]    4    1   17    2    1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621 0.364
      Col12   Col13   Col14  Col15   Col16   Col17   Col18   Col19   Col20
[1,] 0.3678 -0.5081 -0.2227 0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232
[2,] 0.3678 -0.5081 -0.2227 0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232
      Col21   Col22   Col23   Col24   Col25   Col26   Col27  Col28  Col29
[1,] 0.8673 -0.1033 -0.0796 -0.0341 -0.1716 -0.1801 -0.7014 0.6578 0.5611
[2,] 0.8673 -0.1033 -0.0796 -0.0341 -0.1716 -0.1801 -0.7014 0.6578 0.5611
In this case, 'mat' is a matrix with 2 rows and 29 columns.
You can restructure this differently as per your requirements.
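As for using individual columns separately, here is a minimal sketch of one way to do it, assuming the values are read in record order (one set of 29 after another). The names 'txt', 'vals' and 'dat' are only illustrative, and the inline text stands in for the real file so that the example runs on its own:

```r
# Two records of 29 values each, copied from the example data.
# The line breaks are arbitrary; scan() ignores them.
txt <- "4 1 17 1 1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621 0.3640 0.3678
-0.5081 -0.2227 0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232 0.8673
-0.1033 -0.0796 -0.0341 -0.1716 -0.1801 -0.7014 0.6578 0.5611
4 1 17 2 1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621 0.3640 0.3678
-0.5081 -0.2227 0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232 0.8673
-0.1033 -0.0796 -0.0341 -0.1716 -0.1801 -0.7014 0.6578 0.5611"

# Read all whitespace-separated numbers into one long vector
vals <- scan(text = txt, quiet = TRUE)

# byrow = TRUE places each consecutive run of 29 values in its own row
mat <- matrix(vals, ncol = 29, byrow = TRUE)
colnames(mat) <- paste("Col", 1:29, sep = "")

# A data frame lets each column be referred to by name
dat <- as.data.frame(mat)

dat$Col4         # the fourth variable across all sets
mat[, "Col4"]    # the same column taken directly from the matrix
mean(dat$Col6)   # any single column can be summarised on its own
```

With the real file, replacing scan(text = txt, quiet = TRUE) by scan("test.dat") should give the same structure, with one row per set.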
HTH,
Marc Schwartz