convert dataframe to matrix for cmdscale

4 messages · William Simpson, Henrique Dallazuanna

#
I have a dataframe like this (toy example):

x	y	z
"a"	"a"	0
"a"	"b"	1
"a"	"c"	2
"b"	"a"	.9
"b"	"b"	0
"b"	"c"	1.3
"c"	"a"	2.2
"c"	"b"	1.1
"c"	"c"	0

The observations are from a matrix like this:

c 2.2 1.1 0.0
b 0.9 0.0 1.3
a 0.0 1.0 2.0
  a   b   c

Notice that the observation a,b != b,a
That is because the two stimuli a & b are presented to the subject,
who judges how different they are. The stimuli are presented twice,
once in the order a,b and once in the order b,a. Subjects are not
perfectly consistent and so will not give exactly the same answer
twice. However it is reasonable to take the average of a,b and b,a.

I would like to do cmdscale or isoMDS on the data.
As I understand it, these take the data as a lower triangle. At least
that is how the eurodist example for cmdscale went.
So in my case I need
c
b           1.20
a      0.95 2.10
  a    b    c

Starting from the dataframe at the top of this posting, how do I get a
lower triangular matrix in this form, with the labels a, b, c (just
like eurodist)?

Thanks very much for any help.

Bill
#
On re-reading ?cmdscale I see that I can also use cmdscale(d) on a
full matrix rather than just the lower triangle.
 d: a distance structure such as that returned by 'dist' or a
          full symmetric matrix containing the dissimilarities.
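
A quick way to check that reading of ?cmdscale is to feed both input forms to cmdscale and compare. The sketch below uses the built-in eurodist data; the variable names (d_lower, d_full, fit_lower, fit_full, same_fit) are made up for illustration.

```r
# eurodist ships with R as a 'dist' object (lower triangle only).
d_lower <- eurodist

# as.matrix() expands it to the full symmetric matrix, keeping the labels.
d_full <- as.matrix(eurodist)

fit_lower <- cmdscale(d_lower, k = 2)
fit_full  <- cmdscale(d_full,  k = 2)

# Compare the two configurations (coordinates only, ignoring attributes).
same_fit <- isTRUE(all.equal(fit_lower, fit_full, check.attributes = FALSE))
print(same_fit)
```

If the two fits agree, either form can be used, and the only remaining job is to build one of them from the dataframe.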

So how to get a full symmetric matrix from a dataframe like this (toy example)?
 x       y       z
 "a"     "a"     0
 "a"     "b"     1
 "a"     "c"     2
 "b"     "a"     .9
 "b"     "b"     0
 "b"     "c"     1.3
 "c"     "a"     2.2
 "c"     "b"     1.1
 "c"     "c"     0

Replacing the corresponding cells in the matrix by their means, I need this:

 c 2.10 1.20 0.00
 b 0.95 0.00 1.20
 a 0.00 0.95 2.10
  a    b    c

E.g. in the original data, the cell (a,b) = 1 and (b,a) = 0.9. In the
final matrix, (a,b) = (b,a) = 0.95, the mean of the two.

 Thanks very much for any help.

Bill
#
Thanks Henrique

Here's what I came up with

temp <- read.table(filename, header=TRUE)
# put dataframe into matrix form
tempm <- tapply(temp[,3], temp[,1:2], c)
# add matrix to matrix flipped about diagonal and divide by 2
tempm <- (tempm + t(tempm))/2
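
For anyone wanting to try this without a data file, here is a self-contained sketch of the same steps on the toy data from the first message, carried through to cmdscale. The toy values are typed in directly instead of being read from a file, and the names m, m_sym, d and fit are made up for illustration. The as.dist() step is an addition to the code above: it is one way to get the labelled lower triangle asked about originally.

```r
# Toy data from the top of the thread, typed in directly.
temp <- data.frame(
  x = rep(c("a", "b", "c"), each = 3),
  y = rep(c("a", "b", "c"), times = 3),
  z = c(0, 1, 2, 0.9, 0, 1.3, 2.2, 1.1, 0)
)

# One judgment per (x, y) cell, so tapply() simply lays the values out
# as a 3 x 3 matrix with x as rows and y as columns.
m <- tapply(temp$z, temp[, c("x", "y")], c)

# Average each pair of order-dependent judgments: (a,b) with (b,a), etc.
m_sym <- (m + t(m)) / 2

# as.dist() keeps the lower triangle and the a/b/c labels, like eurodist.
d <- as.dist(m_sym)
print(d)

# Classical MDS; with 3 stimuli, k can be at most 2.
fit <- cmdscale(d, k = 2)
print(fit)
```

Since cmdscale also accepts the full symmetric matrix, cmdscale(m_sym, k = 2) should work as well, so the as.dist() step is optional.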

Bill
On Wed, Dec 10, 2008 at 1:29 PM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote: