high and lowest with names
8 messages · Ben qant, Carlos Ortega, Bert Gunter +2 more
which.max is even faster:
dims <- c(1000,1000)
tt <- array(rnorm(prod(dims)),dims)
# which
system.time( replicate(100, which(tt==max(tt), arr.ind=TRUE)) )
# which.max (& arrayInd)
system.time( replicate(100, arrayInd(which.max(tt), dims)) )
Best,
Denes
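One caveat on the comparison above: the two calls are not equivalent when the maximum occurs more than once. A minimal sketch, using a small made-up matrix `m` (not from the thread), shows that `which(..., arr.ind=TRUE)` reports every tied position while `which.max` reports only the first:

```r
# Made-up 2x2 matrix in which the maximum (5) appears twice.
m <- matrix(c(1, 5, 5, 2), nrow = 2)

# Every position that holds the maximum: one row per tie.
all_max <- which(m == max(m), arr.ind = TRUE)

# Only the first maximum in column-major order.
first_max <- arrayInd(which.max(m), dim(m))

nrow(all_max)
nrow(first_max)
```

So the speed gain from `which.max` comes with the assumption that a single location is enough.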
But it's simpler and probably faster to use R's built-in capabilities.
?which ## note the arr.ind argument!
As an example:
test <- matrix(rnorm(24), nrow = 4)
which(test==max(test), arr.ind=TRUE)
     row col
[1,]   2   6
So this gives the row and column indices of the max, from which row and
column names can easily be obtained from the dimnames attribute of the
matrix.
Note: This assumes that the object in question is a matrix, NOT a data
frame, for which it would be slightly more complicated.
-- Bert
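As a rough illustration of the last two points (names from dimnames, and the data frame case), here is a minimal sketch. The matrix values and the names `r1`, `r2`, `a`, `b`, `c` are invented for the example, and converting with `as.matrix` is only one possible way to handle a data frame; it assumes all columns are numeric.

```r
# Small named matrix with invented values; the maximum (9) sits at row "r2", column "a".
test <- matrix(c(3, 9, 1, 4, 7, 2), nrow = 2,
               dimnames = list(c("r1", "r2"), c("a", "b", "c")))

# Row and column indices of the maximum.
idx <- which(test == max(test), arr.ind = TRUE)

# Translate the indices into names.
max_row <- rownames(test)[idx[, "row"]]
max_col <- colnames(test)[idx[, "col"]]

# Data frame case (assumes all-numeric columns): convert to a matrix first.
df <- as.data.frame(test)
idx_df <- which(as.matrix(df) == max(df), arr.ind = TRUE)
df_row <- rownames(df)[idx_df[, "row"]]
df_col <- colnames(df)[idx_df[, "col"]]
```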
On Tue, Oct 11, 2011 at 3:06 PM, Carlos Ortega <cof at qualityexcellence.es> wrote:
Hi,
With this code you can find the row and column names for the largest value in your example:
r.m.tmp<-apply(dat,1,max)
r.max<-names(r.m.tmp)[r.m.tmp==max(r.m.tmp)]
c.m.tmp<-apply(dat,2,max)
c.max<-names(c.m.tmp)[c.m.tmp==max(c.m.tmp)]
It is straightforward to do the same for the smallest value and to build a function that calculates everything and returns a list.
Regards,
Carlos Ortega
www.qualityexcellence.es
2011/10/11 Ben qant <ccquant at gmail.com>
Hello,
I'm looking to get the values, row names and column names of the largest and smallest values in a matrix.
Example (except it does not include the names):
x <- swiss$Education[1:25]
dat = matrix(x,5,5)
colnames(dat) = c('a','b','c','d','e')
rownames(dat) = c('z','y','x','w','v')
dat
   a  b  c  d  e
z 12  7  6  2 10
y  9  7 12  8  3
x  5  8  7 28 12
w  7  7 12 20  6
v 15 13  5  9  1
n <- length(dat)
# top 10
sort(dat,partial=(n-9):n)[(n-9):n]
[1]  9 10 12 12 12 12 13 15 20 28
# bottom 10
sort(dat,partial=1:10)[1:10]
[1] 1 2 3 5 5 6 6 7 7 7
...except I need the rownames and colnames to go along for the ride with the values. Because of this, I am guessing the return value will need to be a list, since all of the values have different row and col names (which is fine).
Regards,
Ben
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On Oct 13, 2011, at 10:42 AM, Ben qant wrote:
Here is a more R-ish solution (speed unknown).
Really? The intermediate, potentially large, objects seem to be proliferating.
Courtesy of Mark Leeds (I modified it a bit to generalize it for a cnt input and to get both min and max). Again, it gets the cnt highest and lowest values in the entire matrix and displays the row and column names with each:
1) For max (or min) I would have thought that one could have much more
easily gathered the maximum and minimum locations with:
which(x == max(x), arr.ind=TRUE) # Bert Gunter's discarded suggestion
... and used the results as indices into x or rownames(x) or
colnames(x). But I made no earlier comments because it did not appear
that you had provided the swiss$Education object in a form that could
be easily extracted for testing. I see now that setting up a similar
object was fairly easy, but would encourage you to consider the `dput`
function for such problem construction in the future:
dat2 <- matrix(sample(1:25, 25), 5,5)
colnames(dat2) = c('a','b','c','d','e')
rownames(dat2) = c('z','y','x','w','v')
arrns <- which(dat2 == max(dat2), arr.ind=TRUE)
> arrns
  row col
v   5   1
> colnames(dat2)[arrns[,2]] ; rownames(dat2)[arrns[,1]]
[1] "a"
[1] "v"
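On the `dput` suggestion in point 1: a minimal sketch of how it could be used to share the example matrix. The object name `dat_copy` is made up for the illustration, and the round trip through `deparse`/`parse` only stands in for the recipient pasting the printed text into their own session.

```r
# The example matrix from the original question, built in one call.
dat <- matrix(swiss$Education[1:25], 5, 5,
              dimnames = list(c("z", "y", "x", "w", "v"),
                              c("a", "b", "c", "d", "e")))

# Prints a structure(...) expression that can be pasted into an email.
dput(dat)

# Round trip: evaluating that text rebuilds the same object.
dat_copy <- eval(parse(text = deparse(dat)))
identical(dat, dat_copy)
```

Anyone reading the post can then recreate the object without needing the original data source.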
2) For display of all results with row/column labels :
rbind(c(dat2), rownames(dat2)[row(dat2)], colnames(dat2)[col(dat2)])
3) For display of values of "bottom five" and top five:
dat2five <- which(dat2 <= c(dat2)[order(dat2)][5], arr.ind=TRUE)
rbind( dat2LT5= dat2[dat2five],
Rows = rownames(dat2)[ dat2five[,1] ],
Cols = colnames(dat2)[ dat2five[,2] ])
#--------------
        [,1] [,2] [,3] [,4] [,5]
dat2LT5 "2"  "3"  "5"  "1"  "4"
Rows    "x"  "w"  "y"  "y"  "x"
Cols    "a"  "a"  "c"  "d"  "d"
dat2topfive <- which(dat2 >= c(dat2)[rev(order(dat2))][5], arr.ind=TRUE)
rbind( dat2top5= dat2[dat2topfive],
Rows = rownames(dat2)[ dat2topfive[,1] ],
Cols = colnames(dat2)[ dat2topfive[,2] ])
#---------------
         [,1] [,2] [,3] [,4] [,5]
dat2top5 "24" "25" "23" "22" "21"
Rows     "z"  "v"  "y"  "w"  "v"
Cols     "a"  "a"  "b"  "e"  "e"
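Thresholding with `<=` or `>=` as in point 3 can return more than five cells when several values are tied at the cutoff. A minimal sketch of an alternative, assuming exactly n cells are wanted and that the matrix has both row and column names: take the first n positions from `order()` and convert them with `arrayInd()`. The helper name `top_cells` and its arguments are hypothetical, not something proposed in the thread.

```r
# Hypothetical helper: the n largest (or smallest) cells of a named matrix,
# returned with their row and column names. Assumes m has dimnames.
top_cells <- function(m, n = 5, decreasing = TRUE) {
  # Linear (column-major) positions of the n most extreme values.
  ord <- order(m, decreasing = decreasing)[seq_len(min(n, length(m)))]
  # Convert linear positions to (row, col) index pairs.
  pos <- arrayInd(ord, dim(m))
  data.frame(value = m[ord],
             row   = rownames(m)[pos[, 1]],
             col   = colnames(m)[pos[, 2]],
             stringsAsFactors = FALSE)
}

dat <- matrix(swiss$Education[1:25], 5, 5,
              dimnames = list(c("z", "y", "x", "w", "v"),
                              c("a", "b", "c", "d", "e")))

top3    <- top_cells(dat, 3)                      # three largest cells
bottom3 <- top_cells(dat, 3, decreasing = FALSE)  # three smallest cells
```

A data frame is used for the result so that the values stay numeric instead of being coerced to character, which is what `rbind` does when values and names are combined.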
x <- swiss$Education[1:25]
dat = matrix(x,5,5)
colnames(dat) = c('a','b','c','d','e')
rownames(dat) = c('z','y','x','w','v')
cnt = 10
#===============================================
print(dat)
   a  b  c  d  e
z 12  7  6  2 10
y  9  7 12  8  3
x  5  8  7 28 12
w  7  7 12 20  6
v 15 13  5  9  1
# MAKE IT A VECTOR FOR EASIER ORDERING
datasvec <- as.vector(dat)
# ORDER IT
datasvecordered <- order(datasvec)
# RECYCLE ROWS AND COLUMNS NAMES FOR EASIER MAPPING
recycledcols <- rep(colnames(dat),each=nrow(dat))
recycledrows <- rep(rownames(dat),times=ncol(dat))
# GET THE VALUES, THE ROW NAMES AND THE COLUMN NAMES
len = length(datasvecordered)
rr_len = length(recycledrows)
# (len-cnt+1):len selects exactly cnt values; (len-cnt):len would return cnt+1
rbind(datasvec[datasvecordered][(len-cnt+1):len],
recycledrows[datasvecordered][(rr_len-cnt+1):rr_len],
recycledcols[datasvecordered][(rr_len-cnt+1):rr_len])
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] "9"  "10" "12" "12" "12" "12" "13" "15" "20" "28"
[2,] "v"  "z"  "z"  "y"  "w"  "x"  "v"  "v"  "w"  "x"
[3,] "d"  "e"  "a"  "c"  "c"  "e"  "b"  "a"  "d"  "d"
rbind(datasvec[datasvecordered][1:cnt],
recycledrows[datasvecordered][1:cnt],
recycledcols[datasvecordered][1:cnt])
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] "1"  "2"  "3"  "5"  "5"  "6"  "6"  "7"  "7"  "7"
[2,] "v"  "z"  "y"  "x"  "v"  "z"  "w"  "w"  "z"  "y"
[3,] "e"  "d"  "e"  "a"  "c"  "c"  "e"  "a"  "b"  "b"
enjoy
ben
On Wed, Oct 12, 2011 at 11:47 AM, Ben qant <ccquant at gmail.com> wrote:
Hello,
This is my solution. This is pretty fast (tested with a larger data set)!
If you have a more elegant way to do it (of similar speed), please reply.
Thanks for the help!
################## get highest and lowest values and names of a matrix
# create sample data
x <- swiss$Education[1:25]
dat = matrix(x,5,5)
colnames(dat) = c('a','b','c','d','e')
rownames(dat) = c('z','y','x','w','v')
#my solution
nms = dimnames(dat) #get matrix row and col names
cnt = 10 # number of max and mins to get
tmp = dat
mxs = vector("list",cnt) # preallocate cnt empty slots for the maxes
mns = vector("list",cnt) # preallocate cnt empty slots for the mins
for(i in 1:cnt){
#get maxes
mx_dims = arrayInd(which.max(tmp), dim(tmp)) # get max dims for entire matrix; note: which.max also skips NA's
mx_nm = c(nms[[1]][mx_dims[1]],nms[[2]][mx_dims[2]]) #get names
mx = tmp[mx_dims] # get max value
mxs[[i]] = c(mx,mx_nm) # add max and dim names to list of maxes
tmp[mx_dims] = NA #removes last max so new one is found
#get mins (basically same as above)
mn_dims = arrayInd(which.min(tmp), dim(tmp))
mn_nm = c(nms[[1]][mn_dims[1]],nms[[2]][mn_dims[2]])
mn = tmp[mn_dims]
mns[[i]] = c(mn,mn_nm)
tmp[mn_dims] = NA
}
mxs
mns
# end
Regards,
Ben
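A small note on preallocating the result lists in a loop like the one above: `vector("list", cnt)` is the usual idiom, whereas `list("list", cnt)` builds a two-element list holding the string "list" and the number cnt. A minimal sketch of the difference, with variable names `a` and `b` chosen only for illustration:

```r
cnt <- 3

a <- list("list", cnt)    # two elements: the string "list" and the number 3
b <- vector("list", cnt)  # cnt empty (NULL) slots, ready to be filled

length(a)
length(b)
```

Either form ends up working when every slot from 1 to cnt is assigned, but the second makes the intent clearer and leaves no stray placeholder values if the loop stops early.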
On Tue, Oct 11, 2011 at 5:32 PM, "Dénes TÓTH" <tdenes at cogpsyphy.hu> wrote:
which.max is even faster:
dims <- c(1000,1000)
tt <- array(rnorm(prod(dims)),dims)
# which
system.time( replicate(100, which(tt==max(tt), arr.ind=TRUE)) )
# which.max (& arrayInd)
system.time( replicate(100, arrayInd(which.max(tt), dims)) )
Best,
Denes
David Winsemius, MD West Hartford, CT