help matching rows of a data frame
Hi Terry,
I take your question to mean how to label distinct rows of a data frame. If
that is not your question, please clarify.
I found the row.match() function in the package prodlim that can be used to
solve this.
However, since you cannot depend on additional packages, I borrowed
the relevant code from the row.match() function.
Here is some obfuscated code to provide your answer in one line, per your
request (less obfuscated code follows just below).
Assuming your data frame is called 'df':
df[, ncol(df) + 1] <- match(
    do.call("paste", c(df[, , drop = FALSE], sep = "\\r")),
    do.call("paste", c(unique(df)[, , drop = FALSE], sep = "\\r")))
The last column of df now contains the 'label', i.e. the position of each
row within unique(df). Identical rows get the same label, and labels are
numbered in order of first appearance.
Somewhat less obfuscated:
getLabels <- function(df) {
    match(do.call("paste", c(df[, , drop = FALSE], sep = "\\r")),
          do.call("paste", c(unique(df)[, , drop = FALSE], sep = "\\r")))
}
myDataFrame$label <- getLabels(myDataFrame)
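
As a quick check, here is a small sketch of how this might look on a toy data frame. The data, the column names x and y, and the helper rowKey are made up for illustration; getLabels is the same idea as above, just split up for readability.

```r
# Hypothetical helper: collapse each row of a data frame into one string key.
rowKey <- function(d) {
    do.call("paste", c(d[, , drop = FALSE], sep = "\\r"))
}

# Label each row by the position of its key among the keys of unique(df).
getLabels <- function(df) {
    match(rowKey(df), rowKey(unique(df)))
}

# Made-up example data: rows 1 and 3 are identical, as are rows 2 and 5.
myDataFrame <- data.frame(x = c(1, 2, 1, 3, 2),
                          y = c("a", "b", "a", "c", "b"),
                          stringsAsFactors = FALSE)

myDataFrame$label <- getLabels(myDataFrame)
print(myDataFrame$label)
# [1] 1 2 1 3 2
```

Note that this relies on pasting the columns together, so two different rows could in principle collide if a value itself contains the separator string. For a small data set of ordinary values that is unlikely to matter.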
HTH,
Eric
On Mon, Sep 18, 2017 at 3:13 PM, Therneau, Terry M., Ph.D. <therneau at mayo.edu> wrote:
This question likely has a 1 line answer, I'm just not seeing it. (2, 3, or 10 lines is fine too.)

For a vector I can do
    group <- match(x, unique(x))
to get a vector that labels each element of x. What is an equivalent if x is a data frame?

The result does not have to be fast: the data set will have < 100 elements. Since this is inside the survival package, and that package is on the 'recommended' list, I can't depend on any package outside the recommended list.

Terry T.