Dear All,
I have a file data.txt as follows:
Name_1,A,B,C
Name_2,E,F
Name_3,I,J,I,K,L,M
I will read this with:
my_data <- read.csv("data.txt", header = FALSE,
                    col.names = paste0("V", 1:10), fill = TRUE)
Then the data frame will have 10 columns. I am assuming that each row in
data.txt has at most 10 entries.
Note: each row in data.txt has a different number of fields, but each row of
my_data will have 10 columns (some of them trailing blanks).
My question is: how can I keep only the unique elements in each row? For
example, I want row 3 to be Name_3,I,J,K,L,M
Please note that I don't want the 2nd I to appear.
How can I do this?
Best Regards,
Ashim
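For anyone who wants to reproduce the setup without creating data.txt, here is a minimal sketch that feeds the three sample lines to read.csv() through its text argument. The inline lines vector and stringsAsFactors = FALSE are assumptions added for illustration; they are not part of the original post.

```r
# The three sample lines from the post, supplied inline instead of data.txt.
lines <- c("Name_1,A,B,C",
           "Name_2,E,F",
           "Name_3,I,J,I,K,L,M")

# Supplying ten column names forces ten columns; fill = TRUE pads short rows.
my_data <- read.csv(text = lines, header = FALSE,
                    col.names = paste0("V", 1:10),
                    fill = TRUE, stringsAsFactors = FALSE)

dim(my_data)      # 3 rows, 10 columns
my_data[3, 1:7]   # row 3 still contains the repeated "I"
```

Note that the padding is not uniform: blank fields in character columns are read as empty strings, while columns that are blank in every row are read as NA.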
Keep only those values in a row in a data frame which occur only once.
3 messages · Ashim Kapoor, Jim Lemon, S Ellison
Hi Ashim,
One way is this, assuming that your data frame is named akdf:
akdf <- t(apply(akdf, 1, function(x) return(unique(x)[1:length(x)])))
If you want factors instead of strings, more processing will be required.
Jim
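The one-liner above relies on the fact that indexing a vector past its end yields NA, so unique(x)[1:length(x)] both removes the repeats and pads the result back to the original length. A small sketch on a single row, where the vector x is a made-up stand-in for row 3 with NA padding:

```r
# Hypothetical stand-in for row 3 of the data frame, padded with NA.
x <- c("Name_3", "I", "J", "I", "K", "L", "M", NA, NA, NA)

u <- unique(x)              # first occurrences only; the NAs collapse to one
padded <- u[seq_along(x)]   # positions beyond length(u) come back as NA

length(u)        # fewer elements than x
length(padded)   # same length as x
padded
```

Because the result of t(apply(...)) is a character matrix rather than a data frame, wrapping it in as.data.frame() may be needed if later code expects a data frame.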
On Mon, Jun 12, 2017 at 3:23 PM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
[...]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Use unique() on each row and pad with NA?
Example:
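# Keep the first occurrence of each value in x, then pad with NA to length L.
uniq10 <- function(x, L = 10) {
  u <- unique(x)
  c(u, rep(NA, L - length(u)))
}
# apply() returns one column per row of tmp, so transpose before converting.
as.data.frame(t(apply(tmp, 1, uniq10)))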
assuming tmp is the name of your initial data frame.
S Ellison
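One detail neither reply spells out: with fill=TRUE the blank fields in character columns arrive as empty strings, and unique() treats "" as a value like any other. The following is a sketch of one possible variant, not the posters' code, that reads blanks as NA and drops them before de-duplicating. The function name uniq_pad, the inline lines vector, and the na.strings setting are assumptions made for this example.

```r
# Sample lines from the original post, supplied inline instead of data.txt.
lines <- c("Name_1,A,B,C",
           "Name_2,E,F",
           "Name_3,I,J,I,K,L,M")

# Assumption: read blank fields as NA so all padding looks the same.
tmp <- read.csv(text = lines, header = FALSE,
                col.names = paste0("V", 1:10), fill = TRUE,
                na.strings = c("NA", ""), stringsAsFactors = FALSE)

# Hypothetical helper: drop the padding, keep first occurrences,
# then pad with NA back to the original row length.
uniq_pad <- function(x, L = length(x)) {
  u <- unique(x[!is.na(x)])
  c(u, rep(NA, L - length(u)))
}

# apply() works row by row and returns the rows as columns, hence t().
res <- as.data.frame(t(apply(tmp, 1, uniq_pad)), stringsAsFactors = FALSE)
names(res) <- names(tmp)
res[3, ]
```

Dropping the NAs before calling unique() keeps every real value at the front of the row, so the padding always ends up trailing.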