
Data frame with 3 columns to matrix

8 messages · PIKAL Petr, Michael Bach, David Winsemius

#
Dear R Users,

Let's assume I have this data frame:

     x y   z
1 1.00 5 0.5
2 1.02 5 0.7
3 1.04 7 0.1
4 1.06 9 0.4

x and y columns are sorted and the values are not necessarily integers.  z
values are not sorted.  Now I would like to create a matrix out of this
with x as first column values and y as first row values.  Matrix element
a_11 shall be left NA.  The a_ij should have the z value for the
corresponding x and y pair.  The result shall be some sort of a grid and
then e.g. look like:

     [,1] [,2] [,3] [,4] [,5]
[1,]   NA    5    6    7    9 (y)
[2,] 1.00  0.5   NA   NA   NA
[3,] 1.02  0.7   NA   NA   NA
[4,] 1.04   NA   NA  0.1   NA
[5,] 1.06   NA   NA   NA  0.4
      (x)

This example is just for illustration.  The resulting matrix should have
more numeric values than NA's.

I hope I made myself clear.  Any hints on how to achieve this?  Is there
already a function that does it?  All searches I did pointed me to data
type frame to matrix conversion...

Kind Regards,
Michael Bach
#
Hi

r-help-bounces at r-project.org wrote on 19.04.2011 09:46:47:
I am not sure if this is the solution you want

tab<-xtabs(z~x+y, data=df)
tab[tab==0]<-NA
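
For reference, a self-contained sketch of the two lines above, using the data frame from the original question. The name `df` and the `read.table(text = ...)` construction are assumptions made for illustration. Note that `xtabs` sums z within each x/y cell and fills empty cells with 0, so a genuine z value of 0 would also be turned into NA by the second step.

```r
# Data frame from the original question (named df here for illustration)
df <- read.table(text = "
x y z
1.00 5 0.5
1.02 5 0.7
1.04 7 0.1
1.06 9 0.4", header = TRUE)

# Cross-tabulate: rows are the unique x values, columns the unique y values,
# and each cell holds the sum of z for that x/y combination
tab <- xtabs(z ~ x + y, data = df)

# xtabs fills combinations without data with 0; mark them as missing
tab[tab == 0] <- NA
tab
```

The result has one row per unique x and one column per unique y, so repeated y values (here 5) share a column.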

Regards
Petr
#
On Apr 19, 2011, at 3:46 AM, Michael Bach wrote:

            
Perhaps but only if the third row of your example was incorrectly  
constructed:
 >  dta <- read.table(textConnection("   x y   z
1 1.00 5 0.5
2 1.02 5 0.7
3 1.04 7 0.1
4 1.06 9 0.4"), header=TRUE)
# I normally use rd.txt(), my own combo fn of read.table and textConnection

 > mat <- matrix(NA, ncol=NROW(dta)+1, nrow=NROW(dta)+1)
 > mat[2:NROW(mat),1] <- dta[["x"]]
 > mat[1,2:NROW(mat)] <- dta[["y"]]
 > diag(mat) <- c(NA, dta[["z"]])
 > mat
      [,1] [,2] [,3] [,4] [,5]
[1,]   NA  5.0  5.0  7.0  9.0
[2,] 1.00  0.5   NA   NA   NA
[3,] 1.02   NA  0.7   NA   NA
[4,] 1.04   NA   NA  0.1   NA
[5,] 1.06   NA   NA   NA  0.4
David Winsemius, MD
West Hartford, CT
#
David Winsemius <dwinsemius at comcast.net> writes:
Thanks for your answer David,

but this yields a diagonal matrix only.  I think I did not make myself
clear enough.  In the original 3 column data frame, there could have
been a pair of x and y with identical y's but different x's and z's.
The way my data source is derived, there is a guarantee that there
are no two rows with identical x and y in the original data frame.  In
the end, x and y serve as a grid, with z values at each point in the
grid or NA's if there is no z value for an x and y pair.  The number of
rows in the data frame is then equal to the number of non-NA values in
the resulting matrix.

Another try: let's assume this original data frame:

  x  y z
1 2  5 1
2 2  6 1
3 3  7 1
4 3  8 1
5 3  9 1
6 5 10 2
7 5 11 2
8 5 12 2

Then I would like to get

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA    5    6    7    8    9   10   11   12
[2,]    2    1    1           
[3,]    2
[4,]    3              1    1    1
[5,]    3
[6,]    3
[7,]    5                             2    2    2
[8,]    5
[9,]    5

I left out all the NA's except the first.  They belong wherever there is
no z value, e.g. at x=5 and y=8.

Do you see what I mean?
#
Petr PIKAL <petr.pikal at precheza.cz> writes:
This looks right, but the resulting table does not have rows == columns
and there are also no missing values (0 or NA) which should be there.  I
think I did not make myself clear, please see also my reply to the
previous answer to my question by David Winsemius.

Thanks for your reply though!
#
On Apr 19, 2011, at 8:16 AM, Michael Bach wrote:

            
I do, ... now anyway. Your earlier data example had non-integer x and
y values, which made what I will now offer infeasible (or at the very
least ambiguous). Indexing with decimal numbers does not provoke an
error; the truncated value is silently used.  With integer indices you
can use a two-column matrix as an argument to "["

 > mat <- matrix(NA, nrow=max(dta[[1]])+1, ncol=max(dta[[2]])+1 )
 > mat[data.matrix(dta[,1:2])] <- dta[,3]
 > mat
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
[2,]   NA   NA   NA   NA    1    1   NA   NA   NA    NA    NA    NA
[3,]   NA   NA   NA   NA   NA   NA    1    1    1    NA    NA    NA
[4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
[5,]   NA   NA   NA   NA   NA   NA   NA   NA   NA     2     2     2
[6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
      [,13]
[1,]    NA
[2,]    NA
[3,]    NA
[4,]    NA
[5,]    NA
[6,]    NA

I leave the insertion of the first row and columns and removal of the  
extra columns induced by the mismatch of the values and row numbers to  
you, since .....
 > mat[, 4:12]
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
[2,]   NA    1    1   NA   NA   NA   NA   NA   NA
[3,]   NA   NA   NA    1    1    1   NA   NA   NA
[4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
[5,]   NA   NA   NA   NA   NA   NA    2    2    2
[6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
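
The two-column matrix indexing shown above can also handle non-integer x and y if each value is first mapped to its position among the sorted unique values. A minimal sketch under that assumption follows; the data frame is rebuilt inline, and the helper names `ux`, `uy` and `idx` are made up for illustration.

```r
dta <- data.frame(x = c(2, 2, 3, 3, 3, 5, 5, 5),
                  y = 5:12,
                  z = c(1, 1, 1, 1, 1, 2, 2, 2))

ux <- sort(unique(dta$x))   # grid values along the rows
uy <- sort(unique(dta$y))   # grid values along the columns

# One row per unique x, one column per unique y, all NA to start with
mat <- matrix(NA_real_, nrow = length(ux), ncol = length(uy),
              dimnames = list(as.character(ux), as.character(uy)))

# Two-column index matrix: (row position, column position) for each data row
idx <- cbind(match(dta$x, ux), match(dta$y, uy))
mat[idx] <- dta$z
mat
```

This gives one row per unique x rather than one row per data row, and needs no trimming of extra rows or columns afterwards.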
#
Hi


Michael Bach <phaebz at gmail.com> wrote on 19.04.2011 14:21:13:
They are there

try str(tab)

With little effort you can get exactly the structure you want.
  x  y zdata
1 2  5     1
2 2  6     1
3 3  7     1
4 3  8     1
5 3  9     1
6 5 10     2
7 5 11     2
8 5 12     2
tab<-xtabs(zdata~x+y, data)
tab[tab==0]<-NA
tab
   y
x    5  6  7  8  9 10 11 12
  2  1  1 
  3        1  1  1 
  5                 2  2  2
str(as.matrix(tab))
 xtabs [1:3, 1:8] 1 NA NA 1 NA NA NA 1 NA NA ...
 - attr(*, "dimnames")=List of 2
  ..$ x: chr [1:3] "2" "3" "5"
  ..$ y: chr [1:8] "5" "6" "7" "8" ...
 - attr(*, "class")= chr [1:2] "xtabs" "table"
 - attr(*, "call")= language xtabs(formula = zdata ~ x + y, data = data)

tab1<-cbind(as.numeric(rownames(tab)), tab)
rbind(c(NA,as.numeric(colnames(tab))), tab1)
      5  6  7  8  9 10 11 12
  NA  5  6  7  8  9 10 11 12
2  2  1  1 NA NA NA NA NA NA
3  3 NA NA  1  1  1 NA NA NA
5  5 NA NA NA NA NA  2  2  2

If you want to get rid of column and row names set them to NULL.
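
Putting the steps together, one possible end-to-end sketch is shown below. It swaps `xtabs` for `tapply`, which is a substitution and not what was used above: `tapply` leaves x/y combinations without data as NA, so real zeros in the data survive. The names `grid` and `res` are made up for illustration.

```r
data <- data.frame(x = c(2, 2, 3, 3, 3, 5, 5, 5),
                   y = 5:12,
                   zdata = c(1, 1, 1, 1, 1, 2, 2, 2))

# tapply returns NA for x/y combinations that do not occur in the data
grid <- with(data, tapply(zdata, list(x, y), sum))

# Prepend the x values as first column and the y values as first row
res <- rbind(c(NA, as.numeric(colnames(grid))),
             cbind(as.numeric(rownames(grid)), grid))

# Drop row and column names to get a plain numeric matrix
dimnames(res) <- NULL
res
```

Since no two rows share the same x and y, `sum` just passes the single z value through for each cell.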

Regards
Petr
#
David Winsemius <dwinsemius at comcast.net> writes:
Thanks for your tips and advice!

I will see what I can work out alone from here on...