
please help generate a square correlation matrix

Your expanded explanation helps clarify your intent. Herewith some
comments. Of course, feel free to ignore and not respond. And, as
always, my apologies if I have failed to comprehend your intent.

1. I would avoid any notion of "statistical significance" like the
plague. This is a purely exploratory exercise.

2. My understanding is that, for a pair of columns/vectors, you want
the proportion of rows in which exactly one value of the pair is 1, out
of the number of rows in which at least one of the two values is 1. In
R syntax, this is simply:

sum(xor(x, y)) / sum(x | y)  ,
where x and y are two columns of 1's and 0's
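
A small made-up example may make this concrete. The two vectors below are invented for illustration and are not taken from your data:

```r
## Two invented 0/1 vectors (not from the original data)
x <- c(1, 0, 1, 1, 0, 0)
y <- c(1, 1, 0, 0, 0, 1)

n_either   <- sum(x | y)        # rows where at least one value is 1: 5
n_one_only <- sum(xor(x, y))    # rows where exactly one value is 1: 4
prop <- n_one_only / n_either   # 4/5 = 0.8
prop
```

Only the first row has both values equal to 1, and the fifth row has neither, so 4 of the 5 rows that count have exactly one 1.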

Better yet might be to report both this *and* sum(x|y), so that you can
judge how much data each proportion rests on.
Here is a simple function that does this:

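## first, define a function that does above calculation:
## (the \(z) shorthand for function(z) needs R >= 4.1)
assoc <- \(z){
   x <- z[,1]; y <- z[,2]   # z is a 2-column matrix of 0's and 1's
   n <- sum(x|y)            # rows where at least one value is 1
   c(prop = sum(xor(x, y))/n, N = n)   # prop is NaN if n is 0
}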

## Now a function that uses it for the various combinations:

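somecor <- function(dat, func = assoc){
   dat <- as.matrix(dat)
   indx <- seq_len(ncol(dat))
   ## each column of w holds one pair of column indices
   w <- combn(indx, 2)
   ## apply func to each pair of columns; one result column per pair
   res <- combn(indx, 2, FUN = \(m) func(dat[, m]))
   rbind(w, res) |>
     t() |> round(digits = 2) |>
     'dimnames<-'(list(rep.int('', ncol(w)), c("", "", "prop", "N")))
}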

# Now apply it to your example data:

somecor(dat)
## which gives
     prop N
 1 2 0.67 6
 1 3 0.60 5
 1 4 0.57 7
 2 3 0.60 5
 2 4 0.33 6
 3 4 0.71 7

This seems more interpretable and directly useful to me. The more
interesting pairs are those with larger values of prop that also have
larger N, assuming I have interpreted you correctly.
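
If it helps, here is a rough sketch of two ways to look at those numbers: ranked by N and then prop, and laid out as a square matrix, which is what your subject line asks for. The values are copied by hand from the output above. The names res, ranked, ij and m, and the column labels i and j, are my own and purely illustrative.

```r
## Values copied from the output above; i and j are the column indices
res <- data.frame(i = c(1, 1, 1, 2, 2, 3),
                  j = c(2, 3, 4, 3, 4, 4),
                  prop = c(0.67, 0.60, 0.57, 0.60, 0.33, 0.71),
                  N = c(6, 5, 7, 5, 6, 7))

## Largest N first, ties broken by largest prop
ranked <- res[order(-res$N, -res$prop), ]
ranked

## Same proportions in a symmetric 4 x 4 layout; the diagonal stays NA
m <- matrix(NA_real_, 4, 4)
ij <- as.matrix(res[, c("i", "j")])
m[ij] <- res$prop          # upper triangle
m[ij[, 2:1]] <- res$prop   # mirror into the lower triangle
m
```

The diagonal is left as NA because a column compared with itself tells you nothing here. A square matrix can show only one number per pair, so N would need a second matrix of the same shape.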

Cheers,
Bert
On Sat, Jul 27, 2024 at 12:54 PM Yuan Chun Ding <ycding at coh.org> wrote: