Dear R Users,
Let's assume I have this data frame:
     x y   z
1 1.00 5 0.5
2 1.02 5 0.7
3 1.04 7 0.1
4 1.06 9 0.4
The x and y columns are sorted, and their values are not necessarily
integers. The z values are not sorted. Now I would like to create a
matrix out of this with the x values as the first column and the y values
as the first row. Matrix element a_11 shall be left NA. Each a_ij should
hold the z value for the corresponding x and y pair. The result shall be
some sort of a grid and then look, e.g., like this:
     [,1] [,2] [,3] [,4] [,5]
[1,]   NA    5    6    7    9  (y)
[2,] 1.00  0.5   NA   NA   NA
[3,] 1.02  0.7   NA   NA   NA
[4,] 1.04   NA   NA  0.1   NA
[5,] 1.06   NA   NA   NA  0.4
      (x)
This example is just for illustration. The resulting matrix should have
more numeric values than NA's.
I hope I made myself clear. Any hints on how to achieve this? Is there
already a function that does it? All my searches pointed me to plain
data frame to matrix type conversion...
Kind Regards,
Michael Bach
Data frame with 3 columns to matrix
8 messages · PIKAL Petr, Michael Bach, David Winsemius
Hi
r-help-bounces at r-project.org wrote on 19.04.2011 09:46:47:
I am not sure if this is the solution you want
tab <- xtabs(z ~ x + y, data = df)
tab[tab == 0] <- NA
Regards
Petr
On Apr 19, 2011, at 3:46 AM, Michael Bach wrote:
I hope I made myself clear.
Perhaps but only if the third row of your example was incorrectly
constructed:
> dta <- rd.txt(" x y z
1 1.00 5 0.5
2 1.02 5 0.7
3 1.04 7 0.1
4 1.06 9 0.4")
# rd.txt() is a helper function combining read.table() and textConnection()
> mat <- matrix(NA, ncol=NROW(dta)+1, nrow=NROW(dta)+1)
> mat[2:NROW(mat),1] <- dta[["x"]]
> mat[1,2:NROW(mat)] <- dta[["y"]]
> diag(mat) <- c(NA, dta[["z"]])
> mat
     [,1] [,2] [,3] [,4] [,5]
[1,]   NA  5.0  5.0  7.0  9.0
[2,] 1.00  0.5   NA   NA   NA
[3,] 1.02   NA  0.7   NA   NA
[4,] 1.04   NA   NA  0.1   NA
[5,] 1.06   NA   NA   NA  0.4
David Winsemius, MD West Hartford, CT
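For readers who do not have the rd.txt() helper used above, here is a minimal sketch of what such a function might look like. The name rd_txt and its arguments are assumptions made for illustration, not the actual helper from the reply; it only wraps read.table() around a textConnection(), as the comment in the reply describes.

```r
# Hypothetical stand-in for the rd.txt() helper mentioned in the reply:
# read a data frame from a character string via a text connection.
rd_txt <- function(txt, header = TRUE, ...) {
  con <- textConnection(txt)
  on.exit(close(con))          # close the connection when the function exits
  read.table(con, header = header, ...)
}

# The header line has one field fewer than the data lines, so read.table()
# takes the first column (1, 2, 3, 4) as row names.
dta <- rd_txt(" x y z
1 1.00 5 0.5
2 1.02 5 0.7
3 1.04 7 0.1
4 1.06 9 0.4")
str(dta)
```

In current versions of R, read.table(text = "...", header = TRUE) should give the same data frame without a helper.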
David Winsemius <dwinsemius at comcast.net> writes:
Perhaps but only if the third row of your example was incorrectly constructed:
Thanks for your answer David,
but this yields a diagonal matrix only. I think I did not make myself
clear enough. In the original 3 column data frame, there could be
rows with identical y's but different x's and z's (or identical x's
but different y's and z's). The way my data source is derived, there
is a guarantee that no two rows in the original data frame have the
same x and y. In the end, x and y serve as a grid, with a z value at
each point in the grid, or NA if there is no z value for an x and y
pair. The number of rows in the data frame is then equal to the number
of non-NA values in the resulting matrix.
Another try, let's assume this original data frame:
  x  y z
1 2  5 1
2 2  6 1
3 3  7 1
4 3  8 1
5 3  9 1
6 5 10 2
7 5 11 2
8 5 12 2
Then I would like to get
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
 [1,]   NA    5    6    7    8    9   10   11   12
 [2,]    2    1    1
 [3,]    2
 [4,]    3              1    1    1
 [5,]    3
 [6,]    3
 [7,]    5                             2    2    2
 [8,]    5
 [9,]    5
I left out all the NA's except the first one. They belong wherever there
is no z value, e.g. at x=5 and y=8.
Do you see what I mean?
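One way to read the request above is as a grid with one row per distinct x value and one column per distinct y value. The following is a rough sketch under that assumption (the hand-written example repeats each x value on several rows, which this sketch does not reproduce). The names xs, ys, grid and res are made up for illustration. Positions are looked up with match(), so x and y do not have to be integers, although match() compares values exactly.

```r
dta <- data.frame(x = c(2, 2, 3, 3, 3, 5, 5, 5),
                  y = 5:12,
                  z = c(1, 1, 1, 1, 1, 2, 2, 2))

xs <- sort(unique(dta$x))   # grid coordinates along the rows
ys <- sort(unique(dta$y))   # grid coordinates along the columns

# Start from an all-NA grid, then fill only the cells that have a z value.
grid <- matrix(NA_real_, nrow = length(xs), ncol = length(ys))
grid[cbind(match(dta$x, xs), match(dta$y, ys))] <- dta$z

# Put the y values in the first row and the x values in the first column,
# leaving element [1, 1] as NA.
res <- rbind(c(NA, ys), cbind(xs, grid))
dimnames(res) <- NULL
res
```

Because each (x, y) pair is guaranteed to occur at most once, the number of non-NA cells below the first row and right of the first column equals the number of rows in the data frame.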
Petr PIKAL <petr.pikal at precheza.cz> writes:
I am not sure if this is the solution you want
tab<-xtabs(z~x+y, data=df)
tab[tab==0]<-NA
This looks right, but the resulting table does not have as many rows as
columns, and there are also no missing values (0 or NA), which should be
there. I think I did not make myself clear; please see also my reply to
the previous answer to my question by David Winsemius. Thanks for your
reply though!
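A side note on the zero-versus-NA point: xtabs() fills empty cells with 0, so replacing every 0 by NA also blanks out any z value that is genuinely zero. The small sketch below uses made-up data (not from this thread) to show the difference, with tapply() as one possible alternative that leaves empty cells as NA.

```r
# Made-up data with a genuine zero at (x = 1, y = 7) and no row for (2, 5).
dta <- data.frame(x = c(1, 1, 2),
                  y = c(5, 7, 7),
                  z = c(0.5, 0, 0.1))

tab <- xtabs(z ~ x + y, data = dta)
tab[tab == 0] <- NA
sum(is.na(tab))    # counts the empty cell and also the genuine zero

# tapply() only applies the function to non-empty cells, so empty
# combinations stay NA while the genuine zero is kept.
grid <- tapply(dta$z, list(x = dta$x, y = dta$y), sum)
sum(is.na(grid))   # counts only the empty cell
grid
```

If z can never be zero in the real data, the xtabs() route loses nothing.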
On Apr 19, 2011, at 8:16 AM, Michael Bach wrote:
Do you see what I mean?
I do, ... now anyway. Your earlier data example had non-integer x and
y values, which made what I will now offer infeasible (or at the very
least ambiguous): indexing with decimal numbers does not provoke an
error; the truncated value is used instead. With integer indices you
can use a two-column matrix as an argument to "[":
> mat <- matrix(NA, nrow=max(dta[[1]])+1, ncol=max(dta[[2]])+1 )
> mat[data.matrix(dta[,1:2])] <- dta[,3]
> mat
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
[2,]   NA   NA   NA   NA    1    1   NA   NA   NA    NA    NA    NA
[3,]   NA   NA   NA   NA   NA   NA    1    1    1    NA    NA    NA
[4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
[5,]   NA   NA   NA   NA   NA   NA   NA   NA   NA     2     2     2
[6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
     [,13]
[1,]    NA
[2,]    NA
[3,]    NA
[4,]    NA
[5,]    NA
[6,]    NA
I leave the insertion of the first row and columns and removal of the
extra columns induced by the mismatch of the values and row numbers to
you, since .....
> mat[, 4:12]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
[2,]   NA    1    1   NA   NA   NA   NA   NA   NA
[3,]   NA   NA   NA    1    1    1   NA   NA   NA
[4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
[5,]   NA   NA   NA   NA   NA   NA    2    2    2
[6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
David Winsemius, MD West Hartford, CT
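The two indexing behaviours mentioned in this reply can be tried in isolation. This is only a small illustration with made-up values: a non-integer index is silently truncated, and a two-column matrix passed to "[" is read as one (row, column) pair per row.

```r
v <- c(10, 20, 30)
v[2.9]                            # the index is truncated, so this is v[2]

m <- matrix(NA_real_, nrow = 3, ncol = 3)
idx <- cbind(c(1, 3), c(2, 3))    # each row is one (row, column) pair
m[idx] <- c(0.5, 0.7)             # fills m[1, 2] and m[3, 3] only
m
```

This is why values such as 1.02 and 1.04 cannot serve directly as indices: both would be truncated to 1 and land in the same row.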
Hi
Michael Bach <phaebz at gmail.com> wrote on 19.04.2011 14:21:13:
I am not sure if this is the solution you want
tab<-xtabs(z~x+y, data=df)
tab[tab==0]<-NA
This looks right, but the resulting table does not have rows == columns
and there are also no missing values (0 or NA) which should be there.
They are there; try str(tab). With a little effort you can get exactly
the structure you want.
data
  x  y zdata
1 2  5     1
2 2  6     1
3 3  7     1
4 3  8     1
5 3  9     1
6 5 10     2
7 5 11     2
8 5 12     2
tab<-xtabs(zdata~x+y, data)
tab[tab==0]<-NA
tab
   y
x    5  6  7  8  9 10 11 12
  2  1  1
  3        1  1  1
  5                 2  2  2
str(as.matrix(tab))
 xtabs [1:3, 1:8] 1 NA NA 1 NA NA NA 1 NA NA ...
 - attr(*, "dimnames")=List of 2
  ..$ x: chr [1:3] "2" "3" "5"
  ..$ y: chr [1:8] "5" "6" "7" "8" ...
 - attr(*, "class")= chr [1:2] "xtabs" "table"
 - attr(*, "call")= language xtabs(formula = zdata ~ x + y, data = data)
tab1<-cbind(as.numeric(rownames(tab)), tab)
rbind(c(NA,as.numeric(colnames(tab))), tab1)
      5  6  7  8  9 10 11 12
  NA  5  6  7  8  9 10 11 12
2  2  1  1 NA NA NA NA NA NA
3  3 NA NA  1  1  1 NA NA NA
5  5 NA NA NA NA NA  2  2  2
If you want to get rid of column and row names set them to NULL.
Regards
Petr
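For completeness, the steps from this reply can be collected into one runnable piece, including the last suggestion of setting the row and column names to NULL. This is a sketch that rebuilds the example data frame by hand; the variable res is a name introduced here for illustration.

```r
data <- data.frame(x = c(2, 2, 3, 3, 3, 5, 5, 5),
                   y = 5:12,
                   zdata = c(1, 1, 1, 1, 1, 2, 2, 2))

tab <- xtabs(zdata ~ x + y, data)
tab[tab == 0] <- NA          # empty cells come out of xtabs() as 0

# The x values become the first column, the y values the first row.
tab1 <- cbind(as.numeric(rownames(tab)), tab)
res <- rbind(c(NA, as.numeric(colnames(tab))), tab1)

# Drop the row and column names to get a plain numeric matrix.
dimnames(res) <- NULL
res
```

The x and y values survive the round trip through character dimnames here because they are simple numbers; values with many decimal places may deserve a check after as.numeric().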
David Winsemius <dwinsemius at comcast.net> writes:
I leave the insertion of the first row and columns and removal of the extra columns
induced by the mismatch of the values and row numbers to you, since .....
Thanks for your tips and advice! I will see what I can work out alone from here on...