Message-ID: <763609346.1040209.1464024355110.JavaMail.yahoo@mail.yahoo.com>
Date: 2016-05-23T17:25:55Z
From: oslo
Subject: Choosing rows
In-Reply-To: <20160522123520.Horde.sAzcaYYDx5NUX5gf6zesx10@mail.sapo.pt>

Hi Rui,
Thanks so much for this. It works perfectly.
Regards,
Oslo

    On Sunday, May 22, 2016 7:35 AM, "ruipbarradas at sapo.pt" <ruipbarradas at sapo.pt> wrote:
 

 Hello,

First of all, it's better to post data using dput (see ?dput). Below, I give an example of that in the lines structure(...).
dat <-
structure(list(rs = c("rs941873", "rs634552", "rs11107175",
"rs12307687", "rs3917155", "rs1600640", "rs2871865",
"rs2955250", "rs228758", "rs224333", "rs4681725",
"rs7652177", "rs925098", "rs1662837", "rs10071837"
), n0 = c(81139462, 75282052, 94161719, 47175866, 76444685, 84603034,
99194896, 61959740, 42148205, 34023962, 56692321, 171969077,
17919811, 82168889, 33381581), Pvalue = c(1.52e-07, 0.108, 0.0285,
0.123, 0.68, 0.000275, 0.0709, 0.0317, 0.0772, 0.021, 0.000445,
0.000634, 5.55e-09, 8.66e-05, 0.000574), V1 = c("rs941873", "rs941873",
"rs941873", "rs12307687", "rs941873", "rs12307687", "rs12307687",
"rs12307687", "rs12307687", "rs10071837", "rs10071837", "rs10071837",
"rs925098", "rs925098", "rs925098")), .Names = c("rs", "n0",
"Pvalue", "V1"), row.names = c(NA, -15L), class = "data.frame")
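
For reference, a minimal sketch of how a structure(...) call like the one above is typically produced. The small data frame `df` and the name `df2` are hypothetical and used only for illustration; they are not part of the original data.

```r
# Hypothetical three-row data frame, for illustration only
df <- data.frame(rs = c("rs941873", "rs634552", "rs11107175"),
                 Pvalue = c(1.52e-07, 0.108, 0.0285),
                 stringsAsFactors = FALSE)

# dput() prints an R expression that recreates the object;
# that printed text is what gets pasted into a post
dput(df)

# Round trip: parsing and evaluating the deparsed text rebuilds the object
df2 <- eval(parse(text = deparse(df)))
all.equal(df, df2)
```

For a large data set, something like dput(head(df, 20)) keeps the posted sample short.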

Now, if I understand correctly, the following might do what you want.


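# Split the p-values into one vector per V1 group (groups come out sorted by name)
tmp <- split(dat[, "Pvalue"], dat[, "V1"])
# Flag each group's minimum; order(order(V1)) gives every row's position in the
# group-sorted vector, so indexing with it restores the original row order
idx <- unlist(lapply(tmp, function(x) x == min(x)))[order(order(dat[, "V1"]))]
rm(tmp)
# Keep only the flagged rows (tied minima within a group are all kept)
result <- dat[idx, ]
result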
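
As a cross-check on the split()/lapply() approach above, here is a shorter sketch of the same selection using ave(), followed by a variant that keeps exactly one row per group even when minima are tied. The data frame name `snp` is hypothetical; it holds a copy of three columns of the posted data so the example is self-contained.

```r
# Hypothetical self-contained copy of three columns of the posted data
snp <- data.frame(
  rs = c("rs941873", "rs634552", "rs11107175", "rs12307687", "rs3917155",
         "rs1600640", "rs2871865", "rs2955250", "rs228758", "rs224333",
         "rs4681725", "rs7652177", "rs925098", "rs1662837", "rs10071837"),
  Pvalue = c(1.52e-07, 0.108, 0.0285, 0.123, 0.68, 0.000275, 0.0709, 0.0317,
             0.0772, 0.021, 0.000445, 0.000634, 5.55e-09, 8.66e-05, 0.000574),
  V1 = c("rs941873", "rs941873", "rs941873", "rs12307687", "rs941873",
         "rs12307687", "rs12307687", "rs12307687", "rs12307687",
         "rs10071837", "rs10071837", "rs10071837",
         "rs925098", "rs925098", "rs925098"),
  stringsAsFactors = FALSE)

# ave() returns, for every row, the minimum Pvalue of that row's V1 group
keep <- snp$Pvalue == ave(snp$Pvalue, snp$V1, FUN = min)
by_ave <- snp[keep, ]            # original row order; ties are all kept

# Variant: sort by group then Pvalue, and keep the first row of each group.
# This returns exactly one row per V1 value, in sorted group order.
ord <- snp[order(snp$V1, snp$Pvalue), ]
one_per_group <- ord[!duplicated(ord$V1), ]
```

The first form preserves the original row order; the second is the safer choice if exactly one SNP per V1 value is required.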

Hope this helps,

Rui Barradas
Quoting oslo via R-help <r-help at r-project.org>:
Hi all,
I have a big data set (a small part is given below), and the V1 column contains repeated values; that is, rs941873, rs12307687, ... each appear many times. For each repeated name in V1, I need to choose only the one SNP (in the first column, named rs) that has the smallest Pvalue within that V1 group.
Your help is truly appreciated,
Oslo

| rs | n0 | Pvalue | V1 |
| rs941873 | 81139462 | 1.52E-07 | rs941873 |
| rs634552 | 75282052 | 1.08E-01 | rs941873 |
| rs11107175 | 94161719 | 2.85E-02 | rs941873 |
| rs12307687 | 47175866 | 1.23E-01 | rs12307687 |
| rs3917155 | 76444685 | 6.80E-01 | rs941873 |
| rs1600640 | 84603034 | 2.75E-04 | rs12307687 |
| rs2871865 | 99194896 | 7.09E-02 | rs12307687 |
| rs2955250 | 61959740 | 3.17E-02 | rs12307687 |
| rs228758 | 42148205 | 7.72E-02 | rs12307687 |
| rs224333 | 34023962 | 2.10E-02 | rs10071837 |
| rs4681725 | 56692321 | 4.45E-04 | rs10071837 |
| rs7652177 | 171969077 | 6.34E-04 | rs10071837 |
| rs925098 | 17919811 | 5.55E-09 | rs925098 |
| rs1662837 | 82168889 | 8.66E-05 | rs925098 |
| rs10071837 | 33381581 | 5.74E-04 | rs925098 |


	[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
