I'm trying to sort a DATAFRAME by a column "ID" that contains alphanumeric data. Specifically, "ID" contains integers all preceded by the character "g", as in: g1, g6, g3, g19, g100, g2, g39. I am using the following code: DATAFRAME=DATAFRAME[order(DATAFRAME1$ID),] and was hoping it would sort the dataframe by ID in the following manner: g1, g2, g3, g6, g19, g39, g100, but it doesn't sort at all. Could anyone point out my mistake? Thank you. Mark
Sorting alphanumerically
5 messages · Gabor Grothendieck, Marc Schwartz (via MN), Chuck Cleland +1 more
This was just discussed recently. Try: library(gtools) ?mixedorder
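A minimal sketch of the gtools suggestion, assuming the gtools package is installed (it is an add-on, not part of base R). The vector ID holds the sample values from the question. mixedorder() returns an index vector in the same way order() does, so the result can also be used to index the rows of a data frame.

```r
# Sample IDs from the question.
ID <- c("g1", "g6", "g3", "g19", "g100", "g2", "g39")

# gtools is an add-on package; do nothing if it is not installed.
if (requireNamespace("gtools", quietly = TRUE)) {
  idx <- gtools::mixedorder(ID)   # permutation, analogous to order()
  sorted_ids <- ID[idx]           # embedded numbers compared numerically
  print(sorted_ids)
}
```

For a data frame, the same index would go in the row position, along the lines of DATAFRAME[gtools::mixedorder(DATAFRAME$ID), ].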
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
On Fri, 2006-02-24 at 12:54 -0600, mtb954 mtb954 wrote:
I'm trying to sort a DATAFRAME by a column "ID" that contains alphanumeric data. Specifically, "ID" contains integers all preceded by the character "g", as in: g1, g6, g3, g19, g100, g2, g39. I am using the following code: DATAFRAME=DATAFRAME[order(DATAFRAME1$ID),] and was hoping it would sort the dataframe by ID in the following manner: g1, g2, g3, g6, g19, g39, g100, but it doesn't sort at all. Could anyone point out my mistake? Thank you. Mark
The values are being sorted by character-based ordering, which may be affected by your locale. Thus, on my system, you end up with something like the following:

ID[order(ID)]
[1] "g1" "g100" "g19" "g2" "g3" "g39" "g6"

What you can do, based upon the presumption that the prefix of 'g' is present as you describe above, is:

ID[order(as.numeric(gsub("g", "", ID)))]
[1] "g1" "g2" "g3" "g6" "g19" "g39" "g100"

What this does is use gsub() to strip the 'g' and then order by the resulting numeric value.

HTH,
Marc Schwartz
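Since the original question was about a data frame rather than a bare vector, here is a sketch of the same idea applied to rows. The data frame and its value column are made up for illustration. Note also that the posted call indexes DATAFRAME but computes the order from DATAFRAME1$ID; if both are meant to be the same object, the names need to match.

```r
# Hypothetical data frame standing in for the poster's DATAFRAME.
DATAFRAME <- data.frame(
  ID    = c("g1", "g6", "g3", "g19", "g100", "g2", "g39"),
  value = c(10, 60, 30, 190, 1000, 20, 390),
  stringsAsFactors = FALSE
)

# Strip the leading "g" (anchored with ^ so only the prefix is removed),
# convert the remainder to numeric, and order the rows on that key.
key <- as.numeric(sub("^g", "", DATAFRAME$ID))
DATAFRAME <- DATAFRAME[order(key), ]

DATAFRAME$ID
```

The trailing comma in DATAFRAME[order(key), ] matters: it selects rows in the new order and keeps every column.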
Does this help?
ID <- paste("g", sample(1:100, 100, replace=FALSE), sep="")
ID
[1] "g88" "g5" "g79" "g67" "g43" "g21" "g66"
[8] "g9" "g38" "g86" "g12" "g85" "g74" "g34"
[15] "g52" "g95" "g6" "g22" "g70" "g87" "g7"
[22] "g83" "g63" "g42" "g26" "g65" "g16" "g97"
[29] "g76" "g2" "g90" "g23" "g15" "g82" "g75"
[36] "g58" "g17" "g20" "g96" "g91" "g31" "g33"
[43] "g48" "g32" "g93" "g54" "g49" "g36" "g81"
[50] "g57" "g27" "g14" "g62" "g10" "g80" "g71"
[57] "g28" "g37" "g89" "g8" "g94" "g68" "g56"
[64] "g92" "g41" "g11" "g4" "g99" "g55" "g60"
[71] "g18" "g69" "g19" "g64" "g39" "g1" "g53"
[78] "g44" "g24" "g100" "g35" "g3" "g40" "g47"
[85] "g51" "g46" "g61" "g45" "g50" "g25" "g13"
[92] "g73" "g77" "g30" "g84" "g78" "g29" "g59"
[99] "g72" "g98"
ID[order(as.numeric(substr(ID, start=2, stop=nchar(ID))))]
[1] "g1" "g2" "g3" "g4" "g5" "g6" "g7"
[8] "g8" "g9" "g10" "g11" "g12" "g13" "g14"
[15] "g15" "g16" "g17" "g18" "g19" "g20" "g21"
[22] "g22" "g23" "g24" "g25" "g26" "g27" "g28"
[29] "g29" "g30" "g31" "g32" "g33" "g34" "g35"
[36] "g36" "g37" "g38" "g39" "g40" "g41" "g42"
[43] "g43" "g44" "g45" "g46" "g47" "g48" "g49"
[50] "g50" "g51" "g52" "g53" "g54" "g55" "g56"
[57] "g57" "g58" "g59" "g60" "g61" "g62" "g63"
[64] "g64" "g65" "g66" "g67" "g68" "g69" "g70"
[71] "g71" "g72" "g73" "g74" "g75" "g76" "g77"
[78] "g78" "g79" "g80" "g81" "g82" "g83" "g84"
[85] "g85" "g86" "g87" "g88" "g89" "g90" "g91"
[92] "g92" "g93" "g94" "g95" "g96" "g97" "g98"
[99] "g99" "g100"
The idea is to drop the leading "g", convert to numeric, and then order.
Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 452-1424 (M, W, F) fax: (917) 438-0894
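If the IDs ever carry more than one prefix, the same "split, convert, order" idea extends to two sort keys. This is only a sketch under the assumption that every ID is a non-digit prefix followed by digits and nothing else; the sample values below are invented.

```r
# Hypothetical IDs with more than one letter prefix.
ID <- c("g10", "h2", "g2", "h11", "g1", "h1")

prefix <- sub("[0-9]+$", "", ID)               # text before the trailing digits
number <- as.numeric(sub("^[^0-9]*", "", ID))  # digits after the prefix

# order() accepts several keys: sort by prefix first, then by number.
sorted_ids <- ID[order(prefix, number)]
sorted_ids
```

Ties on the prefix are broken by the numeric part, so "g2" sorts ahead of "g10" while all "g" entries still precede the "h" entries.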
Thank you all for your help. Mark