Removing rows in dataframe w/o duplicated values
6 messages · AC Del Re, Brad Patrick Schneid, Dennis Murphy +2 more
This is ugly, but it gets what you want.
dat[which(dat[,1] %in% unique((dat[duplicated(dat[,1], fromLast = T), 1]))),]
AC Del Re wrote
Hi,
Is there an easy way to remove dataframe rows without duplicated values of
a specified column ('id')? e.g.,
dat <- data.frame(id = c(1,1,1,2,3,3), value = c(5,6,7,4,5,4), value2 =
c(1,4,3,3,4,3))
dat
  id value value2
1  1     5      1
2  1     6      4
3  1     7      3
4  2     4      3
5  3     5      4
6  3     4      3
This is sample data and the real data has hundreds of rows. In this
case, only row 4 does not have a duplicated id and I would like to
remove it without using:
dat$id[4] <- NULL
Any help is appreciated!
AC
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
-- View this message in context: http://r.789695.n4.nabble.com/Removing-rows-in-dataframe-w-o-duplicated-values-tp4096582p4096672.html Sent from the R help mailing list archive at Nabble.com.
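The duplicated()-based one-liner at the top of this reply can probably be written more directly. The following is only a sketch of the same idea, not the poster's code: a row is kept when its id is flagged by duplicated() scanning from either end of the column. The names has_dup and res_dup are made up for this illustration.

```r
dat <- data.frame(id = c(1, 1, 1, 2, 3, 3),
                  value = c(5, 6, 7, 4, 5, 4),
                  value2 = c(1, 4, 3, 3, 4, 3))

# duplicated() flags every repeat after the first occurrence;
# with fromLast = TRUE it flags every repeat before the last occurrence.
# Combining both with | marks all rows whose id occurs more than once.
has_dup <- duplicated(dat$id) | duplicated(dat$id, fromLast = TRUE)

# Keep only the rows whose id is shared with at least one other row
res_dup <- dat[has_dup, ]
res_dup
```

With the sample data this keeps rows 1, 2, 3, 5 and 6 and drops row 4, the only row whose id is not repeated.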
Hi:
Here's one way:
do.call(rbind, lapply(L, function(d) if(nrow(d) > 1) return(d)))
    id value value2
1.1  1     5      1
1.2  1     6      4
1.3  1     7      3
3.5  3     5      4
3.6  3     4      3
HTH,
Dennis
On Tue, Nov 22, 2011 at 9:43 AM, AC Del Re <delre at wisc.edu> wrote:
Is there an easy way to remove dataframe rows without duplicated values of
a specified column ('id')?
[...]
Sorry, you need this first:
L <- split(dat, dat$id)
do.call(rbind, lapply(L, function(d) if(nrow(d) > 1) return(d)))
D.
On Tue, Nov 22, 2011 at 10:38 AM, Dennis Murphy <djmuser at gmail.com> wrote:
Hi:
Here's one way:
do.call(rbind, lapply(L, function(d) if(nrow(d) > 1) return(d)))
[...]
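The split/lapply/rbind approach above relies on two details that are easy to miss: an if() without an else returns NULL for the one-row groups, and rbind() drops NULL arguments when binding data frames. The sketch below makes the filtering step explicit with Filter() instead. It is one possible rewrite, not the original author's code, and groups, keep and res_split are illustrative names only.

```r
dat <- data.frame(id = c(1, 1, 1, 2, 3, 3),
                  value = c(5, 6, 7, 4, 5, 4),
                  value2 = c(1, 4, 3, 3, 4, 3))

# One data frame per distinct id, stored in a named list
groups <- split(dat, dat$id)

# Keep only the groups that contain more than one row
keep <- Filter(function(d) nrow(d) > 1, groups)

# Stack the surviving groups back into a single data frame.
# Row names are built from the list name and the original row name,
# which is where "1.1", "3.5" and so on come from.
res_split <- do.call(rbind, keep)
res_split

# rownames(res_split) <- NULL  # uncomment to replace them with 1..n
```

Unlike logical indexing on the original data frame, this route reorders rows by id, which makes no visible difference here because the sample data is already sorted by id.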
One approach is the following:
dat <- data.frame(id = c(1,1,1,2,3,3), value = c(5,6,7,4,5,4),
value2 = c(1,4,3,3,4,3))
ind <- ave(dat$id, dat$id, FUN = length) > 1
dat[ind, ]
I hope it helps.
Best,
Dimitris
On 11/22/2011 6:43 PM, AC Del Re wrote:
Is there an easy way to remove dataframe rows without duplicated values of
a specified column ('id')?
[...]
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
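To see why the ave() approach above works, it may help to look at the intermediate vector: ave() applies FUN within each group and writes the result back in the original row order, so every row receives the size of its own id group. The sketch below passes seq_along(dat$id) as the first argument, which is an assumption on my part rather than part of the reply; it keeps the counts as integers even if id were a character or factor column. The names n_per_id and res_ave are illustrative.

```r
dat <- data.frame(id = c(1, 1, 1, 2, 3, 3),
                  value = c(5, 6, 7, 4, 5, 4),
                  value2 = c(1, 4, 3, 3, 4, 3))

# For each row, the number of rows sharing its id.
# seq_along() supplies an integer vector of the right length, so the
# result stays numeric whatever the type of dat$id.
n_per_id <- ave(seq_along(dat$id), dat$id, FUN = length)
n_per_id

# Logical indexing keeps the original row order and row names
res_ave <- dat[n_per_id > 1, ]
res_ave
```

For the sample data the group sizes are 3, 3, 3, 1, 2, 2, so only row 4 fails the n_per_id > 1 test.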
On Nov 22, 2011, at 12:43 PM, AC Del Re wrote:
Is there an easy way to remove dataframe rows without duplicated values of
a specified column ('id')? e.g.,
[...]
> dat[ave(dat$id, dat$id, FUN=length) >1, ]
  id value value2
1  1     5      1
2  1     6      4
3  1     7      3
5  3     5      4
6  3     4      3
David Winsemius, MD
West Hartford, CT