Sorting alphanumerically

5 messages · Gabor Grothendieck, Marc Schwartz (via MN), Chuck Cleland +1 more

#
I'm trying to sort a DATAFRAME by a column "ID" that contains
alphanumeric data. Specifically, "ID" contains integers all preceded
by the character "g", as in:

g1, g6, g3, g19, g100, g2, g39

I am using the following code:

DATAFRAME=DATAFRAME[order(DATAFRAME1$ID),]

and was hoping it would sort the dataframe by ID in the following manner

g1, g2, g3, g6, g19, g39, g100

but it doesn't sort at all. Could anyone point out my mistake?

Thank you.

Mark
#
This was just discussed recently.  Try:

library(gtools)
?mixedorder
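
A minimal sketch of how mixedorder() might be applied to the IDs from the question. It assumes the add-on package gtools is installed (the code is skipped otherwise) and that the IDs are held in a character vector named ID; both names are illustrative.

```r
# IDs from the original question, as a character vector (assumed name: ID)
ID <- c("g1", "g6", "g3", "g19", "g100", "g2", "g39")

# gtools is an add-on package, so only run this part if it is available
if (requireNamespace("gtools", quietly = TRUE)) {
  # mixedorder() returns an ordering index that treats embedded numbers as numbers
  mixed_sorted <- ID[gtools::mixedorder(ID)]
  print(mixed_sorted)
}
```

For a data frame, the same index could be used to reorder rows, for example DATAFRAME[gtools::mixedorder(DATAFRAME$ID), ].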
On 2/24/06, mtb954 mtb954 <mtb954 at gmail.com> wrote:
#
On Fri, 2006-02-24 at 12:54 -0600, mtb954 mtb954 wrote:
The values are being sorted by character-based ordering, which may be
affected by your locale.

Thus, on my system, you end up with something like the following:
[1] "g1"   "g100" "g19"  "g2"   "g3"   "g39"  "g6"
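
The character-based ordering above can be reproduced with a short sketch. It assumes the IDs sit in a character vector named ID, and it uses method = "radix" (which sorts character vectors in the C locale) only so that the result does not depend on local collation settings. That choice is an assumption for illustration, not necessarily the call that produced the output shown.

```r
# IDs from the original question, as a character vector (assumed name: ID)
ID <- c("g1", "g6", "g3", "g19", "g100", "g2", "g39")

# Strings are compared character by character, so "g100" sorts before "g2"
char_sorted <- sort(ID, method = "radix")
char_sorted
```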


What you can do, based upon the presumption that the prefix of 'g' is
present as you describe above, is:
[1] "g1"   "g2"   "g3"   "g6"   "g19"  "g39"  "g100"


What this does is to use gsub() to strip the 'g' and then order by
numeric value.
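
A minimal sketch of the gsub() approach just described, assuming the IDs are in a character vector named ID. This is one way to write it, not necessarily the exact call used in this reply.

```r
# IDs from the original question, as a character vector (assumed name: ID)
ID <- c("g1", "g6", "g3", "g19", "g100", "g2", "g39")

# Remove the "g" prefix and convert what is left to numbers
num <- as.numeric(gsub("g", "", ID))

# order() gives the index permutation that sorts the numbers;
# indexing ID with it returns the IDs in numeric order
num_sorted <- ID[order(num)]
num_sorted
```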


HTH,

Marc Schwartz
#
Does this help?

ID <- paste("g", sample(1:100, 100, replace=FALSE), sep="")

ID
   [1] "g88"  "g5"   "g79"  "g67"  "g43"  "g21"  "g66"
   [8] "g9"   "g38"  "g86"  "g12"  "g85"  "g74"  "g34"
  [15] "g52"  "g95"  "g6"   "g22"  "g70"  "g87"  "g7"
  [22] "g83"  "g63"  "g42"  "g26"  "g65"  "g16"  "g97"
  [29] "g76"  "g2"   "g90"  "g23"  "g15"  "g82"  "g75"
  [36] "g58"  "g17"  "g20"  "g96"  "g91"  "g31"  "g33"
  [43] "g48"  "g32"  "g93"  "g54"  "g49"  "g36"  "g81"
  [50] "g57"  "g27"  "g14"  "g62"  "g10"  "g80"  "g71"
  [57] "g28"  "g37"  "g89"  "g8"   "g94"  "g68"  "g56"
  [64] "g92"  "g41"  "g11"  "g4"   "g99"  "g55"  "g60"
  [71] "g18"  "g69"  "g19"  "g64"  "g39"  "g1"   "g53"
  [78] "g44"  "g24"  "g100" "g35"  "g3"   "g40"  "g47"
  [85] "g51"  "g46"  "g61"  "g45"  "g50"  "g25"  "g13"
  [92] "g73"  "g77"  "g30"  "g84"  "g78"  "g29"  "g59"
  [99] "g72"  "g98"

ID[order(as.numeric(substr(ID, start=2, stop=nchar(ID))))]
   [1] "g1"   "g2"   "g3"   "g4"   "g5"   "g6"   "g7"
   [8] "g8"   "g9"   "g10"  "g11"  "g12"  "g13"  "g14"
  [15] "g15"  "g16"  "g17"  "g18"  "g19"  "g20"  "g21"
  [22] "g22"  "g23"  "g24"  "g25"  "g26"  "g27"  "g28"
  [29] "g29"  "g30"  "g31"  "g32"  "g33"  "g34"  "g35"
  [36] "g36"  "g37"  "g38"  "g39"  "g40"  "g41"  "g42"
  [43] "g43"  "g44"  "g45"  "g46"  "g47"  "g48"  "g49"
  [50] "g50"  "g51"  "g52"  "g53"  "g54"  "g55"  "g56"
  [57] "g57"  "g58"  "g59"  "g60"  "g61"  "g62"  "g63"
  [64] "g64"  "g65"  "g66"  "g67"  "g68"  "g69"  "g70"
  [71] "g71"  "g72"  "g73"  "g74"  "g75"  "g76"  "g77"
  [78] "g78"  "g79"  "g80"  "g81"  "g82"  "g83"  "g84"
  [85] "g85"  "g86"  "g87"  "g88"  "g89"  "g90"  "g91"
  [92] "g92"  "g93"  "g94"  "g95"  "g96"  "g97"  "g98"
  [99] "g99"  "g100"

The idea is to drop the leading "g", convert to numeric, and then order.
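
The same idea can be sketched for a whole data frame, which is what the original question asked about. The data frame below and its "value" column are hypothetical stand-ins; as.character() is included on the assumption that ID might be stored as a factor.

```r
# Hypothetical data frame standing in for the one in the question
DATAFRAME <- data.frame(
  ID = c("g1", "g6", "g3", "g19", "g100", "g2", "g39"),
  value = c(10, 60, 30, 190, 1000, 20, 390),
  stringsAsFactors = FALSE
)

# Convert to character first in case ID is a factor
id_chr <- as.character(DATAFRAME$ID)

# Drop the first character ("g") and convert the rest to numbers
id_num <- as.numeric(substr(id_chr, start = 2, stop = nchar(id_chr)))

# Reorder entire rows by the numeric part of ID
DATAFRAME <- DATAFRAME[order(id_num), ]
DATAFRAME
```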
#
Thank you all for your help.
Mark
On 2/24/06, Chuck Cleland <ccleland at optonline.net> wrote: