Sorting vector based on pairs of comparisons
This is called topological sorting in some circles. The function below will give you one ordering that is consistent with the contraints but not all possible orderings. I couldn't find such a function in core R so I wrote one a while back based on Kahn's algorithm, as described in Wikipedia.
Smaller <- c("ASD", "DFE", "ASD", "SDR", "EDF", "ASD")
Larger <- c("SDR", "EDF", "KLM", "KLM", "SDR", "EDF")
matComp <- cbind(Smaller, Larger)
sortTopologically(matComp, unique(as.vector(matComp)))
[1] "ASD" "DFE" "EDF" "SDR" "KLM"
Bill Dunlap
TIBCO Software
wdunlap tibco.com
sortTopologically <- function(edgeMatrix, V)
{
# edgeMatrix is 2-column matrix which describes a partial
# ordering of a set of vertices. The first column is the 'from'
vertex,
# the second the 'to' vertex.
# V is the vector of all the vertices in the graph.
#
# Return a vector, L, consisting of the vertices in
# V in an order consistent with the partial ordering
# described by edgeMatrix.
# Throw an error if such an ordering is not possible.
#
# Use Kahn's algorithm (
https://en.wikipedia.org/wiki/Topological_sorting).
#
# Note that disconnected vertices will not be mentioned in edgeMatrix,
# but will be in V.
stopifnot(is.matrix(edgeMatrix),
ncol(edgeMatrix)==2,
!any(is.na(edgeMatrix)),
!any(is.na(V)),
all(as.vector(edgeMatrix) %in% V))
L <- V[0] # match the type of the input
S <- setdiff(V, edgeMatrix[, 2])
V <- setdiff(V, S)
while(length(S) > 0) {
n <- S[1]
# cat("Adding", n, "to L", "\n")
L <- c(L, n)
S <- S[-1]
mRow <- edgeMatrix[,1] == n
edgeMatrix <- edgeMatrix[ !mRow, , drop=FALSE ]
S <- c(S, setdiff(V, edgeMatrix[,2]))
V <- setdiff(V, S)
}
if (nrow(edgeMatrix) > 0) {
stop("There are cycles in the dependency graph")
}
L
}
On Thu, Mar 14, 2019 at 4:30 AM Pedro Conte de Barros <pbarros at ualg.pt>
wrote:
Dear All,
This should be a quite established algorithm, but I have been searching
for a couple days already without finding any satisfactory solution.
I have a matrix defining pairs of Smaller-Larger arbitrary character
values, like below
Smaller <- c("ASD", "DFE", "ASD", "SDR", "EDF", "ASD")
Larger <- c("SDR", "EDF", "KLM", "KLM", "SDR", "EDF"
matComp <- cbind(Smaller, Larger)
so that matComp looks like this
Smaller Larger
[1,] "ASD" "SDR"
[2,] "DFE" "EDF"
[3,] "ASD" "KLM"
[4,] "SDR" "KLM"
[5,] "EDF" "SDR"
[6,] "ASD" "EDF"
This matrix establishes six pairs of "larger than" relationships that
can be used to sort the unique values in the matrix,
> unique(as.vector(matComp))
[1] "ASD" "DFE" "SDR" "EDF" "KLM"
Specifically, I would like to get this:
sorted <- c("ASD", "DFE", "EDF", "SDR", "KLM")
or, equally valid (my matrix does not have the full information):
sorted <- c("DFE", "ASD", "EDF", "SDR", "KLM")
Preferably, I would get the different combinations of the unique values
that satisfy the "larger than" conditions in the matrix...
I am sure this is a trivial problem, but I could not find any algorithm
to solve it.
Any help would be highly appreciated
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.