Removing rows in dataframe w/o duplicated values

This is ugly, but it gets what you want. 

dat[which(dat[, 1] %in% unique(dat[duplicated(dat[, 1], fromLast = TRUE), 1])), ]
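
A more direct form of the same idea, as a minimal sketch: flag every row whose id also occurs earlier or later in the column by calling duplicated() in both directions. It assumes the dat from the original question; the names keep and res_dup are only illustrative.

```r
# Sample data from the original question
dat <- data.frame(id = c(1, 1, 1, 2, 3, 3),
                  value = c(5, 6, 7, 4, 5, 4),
                  value2 = c(1, 4, 3, 3, 4, 3))

# duplicated() marks every repeat after the first occurrence;
# with fromLast = TRUE it marks every repeat before the last occurrence.
# A row whose id appears more than once is TRUE in at least one of the two.
keep <- duplicated(dat$id) | duplicated(dat$id, fromLast = TRUE)

# Keep only rows whose id is shared with at least one other row
res_dup <- dat[keep, ]
res_dup
```

Because the result is a plain logical index, the original row order and row names are preserved.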

AC Del Re wrote
Hi,

Is there an easy way to remove dataframe rows without duplicated values of
a specified column ('id')?  e.g.,

dat <- data.frame(id = c(1,1,1,2,3,3), value = c(5,6,7,4,5,4), value2 =
c(1,4,3,3,4,3))
dat

  id value value2
1  1     5      1
2  1     6      4
3  1     7      3
4  2     4      3
5  3     5      4
6  3     4      3

This is sample data and the real data has hundreds of rows. In this
case, only row 4 does not have a duplicated id and I would like to
remove it without using:

dat$id[4] <- NULL

Any help is appreciated!

AC


______________________________________________
R-help@ mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Hi:

Here's one way:

do.call(rbind, lapply(L, function(d) if(nrow(d) > 1) return(d)))
    id value value2
1.1  1     5      1
1.2  1     6      4
1.3  1     7      3
3.5  3     5      4
3.6  3     4      3

HTH,
Dennis

Sorry, you need this first:

L <- split(dat, dat$id)
do.call(rbind, lapply(L, function(d) if(nrow(d) > 1) return(d)))

D.
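
The split-and-recombine approach above can be spelled out step by step. This is a sketch only, assuming the dat from the original question; it swaps the anonymous if() function for Filter(), and the names big and res_split are illustrative. The original version appears to rely on the function returning NULL for single-row groups, which rbind then drops; Filter() makes that selection explicit.

```r
# Sample data from the original question
dat <- data.frame(id = c(1, 1, 1, 2, 3, 3),
                  value = c(5, 6, 7, 4, 5, 4),
                  value2 = c(1, 4, 3, 3, 4, 3))

# One data frame per distinct id, named "1", "2", "3"
L <- split(dat, dat$id)

# Keep only the groups that have more than one row
big <- Filter(function(d) nrow(d) > 1, L)

# Stack the remaining groups back into a single data frame
res_split <- do.call(rbind, big)
res_split
```

The row names of the result combine the group name and the original row name (for example 1.1 or 3.5); rownames(res_split) <- NULL resets them if that matters.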

One approach is the following:

dat <- data.frame(id = c(1,1,1,2,3,3), value = c(5,6,7,4,5,4),
     value2 = c(1,4,3,3,4,3))

ind <- ave(dat$id, dat$id, FUN = length) > 1
dat[ind, ]

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
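
To see why the ave() approach works, it may help to look at the intermediate vector. A minimal sketch, assuming the dat from the original question; n_per_id and res_ave are illustrative names.

```r
# Sample data from the original question
dat <- data.frame(id = c(1, 1, 1, 2, 3, 3),
                  value = c(5, 6, 7, 4, 5, 4),
                  value2 = c(1, 4, 3, 3, 4, 3))

# ave() applies FUN within each group and returns a vector as long as the input;
# with FUN = length, each row receives the number of rows sharing its id.
n_per_id <- ave(dat$id, dat$id, FUN = length)
n_per_id
# 3 3 3 1 2 2

# Rows whose id occurs more than once
ind <- n_per_id > 1
res_ave <- dat[ind, ]
res_ave
```

If id is a factor or character column rather than numeric, passing a numeric vector such as seq_along(dat$id) as the first argument to ave() is likely the safer choice, since the counts are written back into a vector of the first argument's type.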

> dat[ave(dat$id, dat$id, FUN = length) > 1, ]
   id value value2
1  1     5      1
2  1     6      4
3  1     7      3
5  3     5      4
6  3     4      3

David Winsemius, MD
West Hartford, CT