
Keep only those values in a row in a data frame which occur only once.

3 messages · Ashim Kapoor, Jim Lemon, S Ellison

#
Dear All,

I have a file data.txt as follows:

Name_1,A,B,C
Name_2,E,F
Name_3,I,J,I,K,L,M

I will read this with:
my_data <- read.csv("data.txt", header=FALSE, col.names=paste0("V", 1:10),
                    fill=TRUE)

Then the data frame will have 10 columns. I am assuming that each row in
data.txt will have at most 10 entries.

Note: Each row in data.txt has a different number of entries, but each row
of the data frame will have 10 columns (some of them trailing blanks).
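
To reproduce the setup without a file on disk, a minimal sketch is shown here. It reads the same three lines from an inline string; the `text=` argument and `stringsAsFactors=FALSE` are additions for illustration and are not part of the original call.

```r
# Inline copy of the sample data.txt from the post, so no file is needed.
txt <- "Name_1,A,B,C
Name_2,E,F
Name_3,I,J,I,K,L,M"

# Ten column names force ten columns; fill=TRUE pads the short rows.
my_data <- read.csv(text = txt, header = FALSE,
                    col.names = paste0("V", 1:10), fill = TRUE,
                    stringsAsFactors = FALSE)

dim(my_data)  # 3 rows, 10 columns
str(my_data)  # padding cells may be "" or NA depending on the column
```

Checking `str()` is worthwhile because the padding cells are not necessarily all of one kind, which matters when deduplicating each row.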

My question is: how can I keep only the unique elements in each row? For
example, I want row 3 to be Name_3,I,J,K,L,M

Please note I don't want the 2nd I to appear.

How can I do this?

Best Regards,
Ashim
#
Hi Ashim,
One way is this, assuming that your data frame is named akdf:

akdf <- t(apply(akdf, 1, function(x) unique(x)[seq_along(x)]))

If you want factors instead of strings, more processing will be required.
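
As a rough sketch of what that extra processing might look like, the result matrix can be turned back into a data frame with its columns converted to factors. The inline sample data and the names `res` and `res_df` are assumptions made for this illustration, and this is only one possible way to get factors.

```r
# Sample data from the original post, read inline (akdf is the assumed name).
akdf <- read.csv(text = "Name_1,A,B,C\nName_2,E,F\nName_3,I,J,I,K,L,M",
                 header = FALSE, col.names = paste0("V", 1:10),
                 fill = TRUE, stringsAsFactors = FALSE)

# Indexing past the end of unique(x) pads the row with NA.
res <- t(apply(akdf, 1, function(x) unique(x)[seq_along(x)]))

# apply() returns a character matrix; convert it to a data frame of factors.
res_df <- as.data.frame(res, stringsAsFactors = TRUE)

res[3, ]  # row 3 with the second "I" removed
```

Note that the result of `apply()` is a character matrix, so any column types from the original data frame are lost before the conversion to factors.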
Jim
On Mon, Jun 12, 2017 at 3:23 PM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
#
Use unique() on each row and pad with NA?

Example:
uniq10 <- function(x, L=10) {
	u <- unique(x)               # drop repeated values, keep first occurrences
	c(u, rep(NA, L-length(u)) )  # pad with NA back up to L entries
}

as.data.frame(t(apply(tmp, 1, uniq10)))

assuming tmp is the name of your initial data frame.
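
One possible refinement, sketched here under assumptions, is to drop the padding cells (empty strings and NA) before calling unique(), so that a blank never counts as a value in its own right. The function name `uniq_row`, the inline sample data and the name `out` are hypothetical and not from the thread. The sketch also assumes that a row's name in V1 never matches one of its values.

```r
# Sample data from the original post, read inline (tmp is the assumed name).
tmp <- read.csv(text = "Name_1,A,B,C\nName_2,E,F\nName_3,I,J,I,K,L,M",
                header = FALSE, col.names = paste0("V", 1:10),
                fill = TRUE, stringsAsFactors = FALSE)

# Hypothetical variant of uniq10: ignore blank and NA cells, then pad with NA.
uniq_row <- function(x, L = length(x)) {
  u <- unique(x[!is.na(x) & x != ""])  # real values only, first occurrences
  c(u, rep(NA, L - length(u)))         # pad back out to L entries
}

out <- as.data.frame(t(apply(tmp, 1, uniq_row)), stringsAsFactors = FALSE)
out
```

Defaulting `L` to `length(x)` keeps the output the same width as the input, so the padding never goes negative for rows taken from the data frame itself.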

S Ellison



