
Remove highly correlated variables from a data frame or matrix

It can be converted between a data frame and a matrix. I am attaching
the whole file here for examination.

I basically want to remove all entries for pairs which have a high
value between them (the correlation was not calculated in R, but it is
a correlation, r2).
So, for example, I would not keep rs883504, because it has r2 > 0.8 for
all those rs...

             rs8069610  rs883504  rs8072394  rs4280293  rs4465638  rs12602378
rs56192520       0.582     0.903      0.582      0.582      0.811       0.302
rs3764410        0.598     0.928      0.598      0.598      0.836       0.311
rs145984817      0.638     0.975      0.638      0.638      0.879       0.344
rs1807401        0.638     0.975      0.638      0.638      0.879       0.344
rs1807402        0.638     0.975      0.638      0.638      0.879       0.344
rs35350506       0.638     0.975      0.638      0.638      0.879       0.344
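
To make the request concrete, here is a minimal base-R sketch of one possible reading of it: drop every column that has at least one r2 value above the cutoff. The excerpt above is retyped by hand as a matrix. The helper name drop_high_r2_cols and the cutoff argument are made up for illustration and are not from the attached file or any package. The rule itself is an assumption too. For a full square, symmetric r2 matrix a pairwise (greedy) removal may be closer to what is wanted.

```r
# Excerpt of the r2 table from this post, rebuilt by hand as a matrix.
r2 <- matrix(
  c(0.582, 0.903, 0.582, 0.582, 0.811, 0.302,
    0.598, 0.928, 0.598, 0.598, 0.836, 0.311,
    0.638, 0.975, 0.638, 0.638, 0.879, 0.344,
    0.638, 0.975, 0.638, 0.638, 0.879, 0.344,
    0.638, 0.975, 0.638, 0.638, 0.879, 0.344,
    0.638, 0.975, 0.638, 0.638, 0.879, 0.344),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("rs56192520", "rs3764410", "rs145984817",
      "rs1807401", "rs1807402", "rs35350506"),
    c("rs8069610", "rs883504", "rs8072394",
      "rs4280293", "rs4465638", "rs12602378")
  )
)

# Hypothetical helper (assumption, not from the post): drop every column
# that contains at least one r2 value above `cutoff`.
drop_high_r2_cols <- function(m, cutoff = 0.8) {
  m <- as.matrix(m)                          # also accepts a numeric data frame
  high <- apply(m > cutoff, 2, any, na.rm = TRUE)
  m[, !high, drop = FALSE]                   # keep the matrix shape
}

kept <- drop_high_r2_cols(r2)
colnames(kept)
```

On this excerpt the rule removes rs883504 and rs4465638, since both exceed 0.8 in every row shown, and keeps the other four columns.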
On Thu, Nov 14, 2019 at 2:29 PM Abby Spurdle <spurdle.a at gmail.com> wrote:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ro246_matrix.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20191114/2577162a/attachment.txt>