"order" issue

8 messages · jim holtman, (Ted Harding), Zoppoli, Gabriele (NIH/NCI) [G] +2 more

#
Hi everybody, this is a real dummy thing.

I sorted a matrix by a given column, and what I get is right until it comes to columns with both negative and positive values; then "order" orders everything from max to min within the negative values, and then AGAIN from max to min within the positive values!!!

Why isn't everything ordered from max to min, and that's it?

Thank you!!!

Attached is the txt file I use; try:

x=x[order(x[,2]),]

What I get is:

print(x)


          Product           A           B   Tissue
44  ME:MDA_MB_435     -0.1915    -0.16744 Melanoma
17     CNS:SNB_75    -0.23183     1.03945      CNS
37       LE:K_562    -0.58218      1.8581 Leukemia
43    ME:MALME_3M    -0.67327    -1.33493 Melanoma
49    ME:UACC_257    -0.72431    -1.84753 Melanoma
42         ME:M14    -0.73942    -0.73904 Melanoma
40          LE:SR    -0.93541     2.95346 Leukemia
25      CO:SW_620    -1.53265    -1.35446    Colon
63      RE:CAKI_1    -2.48443     0.43245    Renal
39   LE:RPMI_8226    -2.59561     -1.9448 Leukemia
26        LC:A549    -2.66221     0.71215     Lung
61        RE:A498    -2.89402     0.93287    Renal
9       BR:HS578T    -2.94118      1.1217   Breast
34    LC:NCI_H522    -2.94381      0.3859     Lung
66       RE:TK_10    -2.95281     1.26245    Renal
52 OV:NCI_ADR_RES    -3.04456     0.17046  Ovarian
57     OV:SK_OV_3    -3.04477     2.15405  Ovarian
53     OV:OVCAR_3     -3.0705    -0.31743  Ovarian
14     CNS:SF_295    -3.09348    -1.00095      CNS
54     OV:OVCAR_4    -3.13137    -0.47497  Ovarian
36       LE:HL_60    -3.16745    -3.16745 Leukemia
38      LE:MOLT_4    -3.20055    -1.72841 Leukemia
11  BR:MDA_MB_231    -3.24907     1.58326   Breast
59        PR:PC_3    -3.36612     1.39328 Prostate
19     CO:HCT_116    -3.39764     0.43061    Colon
12        BR:T47D    -3.41228     1.13818   Breast
22      CO:HCT_15    -3.45342     0.16357    Colon
64     RE:RXF_393    -3.49615     2.59144    Renal
28      LC:HOP_62     -3.4968     0.67884     Lung
60       RE:786_0     -3.5086     1.75056    Renal
35    LE:CCRF_CEM    -3.54526    -2.09262 Leukemia
29      LC:HOP_92    -3.60636     0.87116     Lung
21    CO:HCC_2998    -3.61457    -0.32362    Colon
13     CNS:SF_268    -3.63916     2.54378      CNS
20     CO:COLO205    -3.64656     0.54344    Colon
56     OV:OVCAR_8    -3.66053     -0.9594  Ovarian
24        CO:KM12    -3.68703     2.19991    Colon
55     OV:OVCAR_5     -3.7852     2.43038  Ovarian
8       BR:BT_549    -3.80239    -0.43099   Breast
15     CNS:SF_539    -3.86184     1.39114      CNS
65       RE:SN12C    -3.90776     0.85244    Renal
31     LC:NCI_H23    -3.91625    -1.14955     Lung
62        RE:ACHN    -3.96246    -0.62365    Renal
67       RE:UO_31    -3.99791    -1.09215    Renal
10        BR:MCF7    -4.00187     1.46303   Breast
51      OV:IGROV1    -4.02758     2.04324  Ovarian
23        CO:HT29    -4.11624    -0.02799    Colon
41     ME:LOXIMVI     -4.2572     0.37259 Melanoma
32   LC:NCI_H322M    -4.28534     1.66783     Lung
27        LC:EKVX    -4.32847     1.66042     Lung
58      PR:DU_145    -4.33961     1.57548 Prostate
30    LC:NCI_H226    -4.37408    -0.22311     Lung
33    LC:NCI_H460      0.0042     -0.6023     Lung
18       CNS:U251     0.01263     1.66389      CNS
16     CNS:SNB_19     0.16583     0.03737      CNS
45       ME:MDA_N     0.21077     0.05502 Melanoma
50     ME:UACC_62     0.52503      0.1605 Melanoma
46    ME:SK_MEL_2     0.55255     -1.6667 Melanoma
47   ME:SK_MEL_28      1.7425     1.45266 Melanoma
48    ME:SK_MEL_5     1.74749    -1.47817 Melanoma

Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD

Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppolig at mail.nih.gov
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: x.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100523/b7212e50/attachment.txt>
#
Run 'str' on your object to see whether you have factors where you
think you have numerics.
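
A minimal sketch of that check, using small made-up objects (df and m are hypothetical stand-ins, not the poster's data). It only shows what to look for in the str() output and how to test the sort column directly:

```r
# Hypothetical toy objects, not the data from the attached file.
df <- data.frame(Product = c("p1", "p2"), A = c(-0.5, 0.2))
m  <- cbind(Product = c("p1", "p2"), A = c("-0.5", "0.2"))

str(df)   # column A is listed as "num"
str(m)    # the whole object is listed as "chr": a character matrix

# Direct test of the column that will be passed to order()
df_ok <- is.numeric(df[, 2])   # numeric column in the data frame
m_ok  <- is.numeric(m[, 2])    # character column in the matrix
```

If the second test comes back FALSE for your own object, order() is comparing strings rather than numbers.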

What is the problem you are trying to solve?

On May 23, 2010, at 17:39, "Zoppoli, Gabriele (NIH/NCI) [G]" <zoppolig at mail.nih.gov> wrote:
#
On 23-May-10 21:39:06, Zoppoli, Gabriele (NIH/NCI) [G] wrote:
Somewhat strange indeed! The only further question I can think of
is to ask what "x" looked like before you re-ordered it.
Using the "x.txt" file you supplied, I get:

  x <- read.table("x.txt")
  str(x)
  # 'data.frame':   60 obs. of  4 variables:
  #  $ Product: Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 37 10 30
  #    36 42 35 33 18 56 32 ...
  #  $ A      : num  -0.192 -0.232 -0.582 -0.673 -0.724 ...
  #  $ B      : num  -0.167 1.039 1.858 -1.335 -1.848 ...
  #  $ Tissue : Factor w/ 9 levels "Breast","CNS",..: 6 2 4 6 6 6 4 3 9 4
  #    ...


so x[,2] and x[,3] are indeed numeric. Then (similar to yours):

  X<-x[order(x[,2]),]
  print(X)
  #           Product        A        B   Tissue
  # 30    LC:NCI_H226 -4.37408 -0.22311     Lung
  # 58      PR:DU_145 -4.33961  1.57548 Prostate
  # 27        LC:EKVX -4.32847  1.66042     Lung
  # 32   LC:NCI_H322M -4.28534  1.66783     Lung
  # 41     ME:LOXIMVI -4.25720  0.37259 Melanoma
  # 23        CO:HT29 -4.11624 -0.02799    Colon
  # 51      OV:IGROV1 -4.02758  2.04324  Ovarian
  # 10        BR:MCF7 -4.00187  1.46303   Breast
  # 67       RE:UO_31 -3.99791 -1.09215    Renal
  # 62        RE:ACHN -3.96246 -0.62365    Renal
  # 31     LC:NCI_H23 -3.91625 -1.14955     Lung
  # 65       RE:SN12C -3.90776  0.85244    Renal
  # 15     CNS:SF_539 -3.86184  1.39114      CNS
  # 8       BR:BT_549 -3.80239 -0.43099   Breast
  # 55     OV:OVCAR_5 -3.78520  2.43038  Ovarian
  # 24        CO:KM12 -3.68703  2.19991    Colon
  # 56     OV:OVCAR_8 -3.66053 -0.95940  Ovarian
  # 20     CO:COLO205 -3.64656  0.54344    Colon
  # 13     CNS:SF_268 -3.63916  2.54378      CNS
  # 21    CO:HCC_2998 -3.61457 -0.32362    Colon
  # 29      LC:HOP_92 -3.60636  0.87116     Lung
  # 35    LE:CCRF_CEM -3.54526 -2.09262 Leukemia
  # 60       RE:786_0 -3.50860  1.75056    Renal
  # 28      LC:HOP_62 -3.49680  0.67884     Lung
  # 64     RE:RXF_393 -3.49615  2.59144    Renal
  # 22      CO:HCT_15 -3.45342  0.16357    Colon
  # 12        BR:T47D -3.41228  1.13818   Breast
  # 19     CO:HCT_116 -3.39764  0.43061    Colon
  # 59        PR:PC_3 -3.36612  1.39328 Prostate
  # 11  BR:MDA_MB_231 -3.24907  1.58326   Breast
  # 38      LE:MOLT_4 -3.20055 -1.72841 Leukemia
  # 36       LE:HL_60 -3.16745 -3.16745 Leukemia
  # 54     OV:OVCAR_4 -3.13137 -0.47497  Ovarian
  # 14     CNS:SF_295 -3.09348 -1.00095      CNS
  # 53     OV:OVCAR_3 -3.07050 -0.31743  Ovarian
  # 57     OV:SK_OV_3 -3.04477  2.15405  Ovarian
  # 52 OV:NCI_ADR_RES -3.04456  0.17046  Ovarian
  # 66       RE:TK_10 -2.95281  1.26245    Renal
  # 34    LC:NCI_H522 -2.94381  0.38590     Lung
  # 9       BR:HS578T -2.94118  1.12170   Breast
  # 61        RE:A498 -2.89402  0.93287    Renal
  # 26        LC:A549 -2.66221  0.71215     Lung
  # 39   LE:RPMI_8226 -2.59561 -1.94480 Leukemia
  # 63      RE:CAKI_1 -2.48443  0.43245    Renal
  # 25      CO:SW_620 -1.53265 -1.35446    Colon
  # 40          LE:SR -0.93541  2.95346 Leukemia
  # 42         ME:M14 -0.73942 -0.73904 Melanoma
  # 49    ME:UACC_257 -0.72431 -1.84753 Melanoma
  # 43    ME:MALME_3M -0.67327 -1.33493 Melanoma
  # 37       LE:K_562 -0.58218  1.85810 Leukemia
  # 17     CNS:SNB_75 -0.23183  1.03945      CNS
  # 44  ME:MDA_MB_435 -0.19150 -0.16744 Melanoma
  # 33    LC:NCI_H460  0.00420 -0.60230     Lung
  # 18       CNS:U251  0.01263  1.66389      CNS
  # 16     CNS:SNB_19  0.16583  0.03737      CNS
  # 45       ME:MDA_N  0.21077  0.05502 Melanoma
  # 50     ME:UACC_62  0.52503  0.16050 Melanoma
  # 46    ME:SK_MEL_2  0.55255 -1.66670 Melanoma
  # 47   ME:SK_MEL_28  1.74250  1.45266 Melanoma
  # 48    ME:SK_MEL_5  1.74749 -1.47817 Melanoma

and now the values in X[,2] are indeed in the correct numerical order,
yet essentially the same command as yours has been executed.

I have not succeeded in reproducing your result by ordering on other
columns of "x" or on the row-names of "x".

So it is a mystery! The only thing I can think of is that the
columns of "x" (as seen by R) are different from what you think
they should be. Since your file "x.txt" looks like the value
of "x" after your re-ordering, it is impossible to test such
guesses on the original "x".

Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 23-May-10                                       Time: 23:31:25
------------------------------ XFMail ------------------------------
#
This is what I get:

str(x)

 chr [1:60, 1:4] "ME:SK_MEL_5" "ME:SK_MEL_28" "ME:SK_MEL_2" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:60] "48" "47" "46" "50" ...
  ..$ : chr [1:4] "Product" "hsa.miR.204" "hsa.miR.210" "Tissue"

It doesn't make much sense to me...

I would like to have the second column ordered from min to max, or from max to min (with the argument decreasing=TRUE), but "order" seems to reorder everything without treating negative numbers as smaller than positive ones...


Gabriele Zoppoli, MD
#
crazy stuff!!! I tried to reload the txt file, and now it's working...

this is the original (attached)

thanks!

Gabriele Zoppoli, MD
#
On May 23, 2010, at 6:32 PM, Zoppoli, Gabriele (NIH/NCI) [G] wrote:

            
How did you bring that text file into R? Both Ted and I are getting:

 > str(x)
'data.frame':	60 obs. of  4 variables:
  $ Product: Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 37 10 30 36 42 35 33 18 56 32 ...
  $ A      : num  -0.192 -0.232 -0.582 -0.673 -0.724 ...
  $ B      : num  -0.167 1.039 1.858 -1.335 -1.848 ...
  $ Tissue : Factor w/ 9 levels "Breast","CNS",..: 6 2 4 6 6 6 4 3 9 4 ...

Your "x" is a 60 x 4 matrix of all character elements.

If I try:
x[ order(as.character(x[,2])),]

I get the same behavior as you describe.
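
A small sketch of that effect, using five values copied from column A of the listing at the top of the thread. The variable names are made up for illustration. method = "radix" is my own addition: it is available in recent versions of R and compares strings in the C locale, so the string ordering shown here does not depend on the session's locale settings.

```r
# Five values from column A, stored as character (as in the poster's matrix)
a_chr <- c("-0.1915", "0.0042", "-4.37408", "1.74749", "-0.58218")

# Ordering the strings: compared character by character, so every
# string starting with "-" comes before every string starting with a digit
ord_chr <- order(a_chr, method = "radix")
a_chr[ord_chr]
# "-0.1915" "-0.58218" "-4.37408" "0.0042" "1.74749"

# Ordering the numbers: convert first, then order
ord_num <- order(as.numeric(a_chr))
a_chr[ord_num]
# "-4.37408" "-0.58218" "-0.1915" "0.0042" "1.74749"
```

The first result has the same shape as the listing in the original question: negatives running from closest to zero down to the most negative, followed by the positives.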
#
When you "reloaded" the txt file (with what function?) it
probably was made into a "data.frame", with some columns
factors or characters and some columns numerics.  It looks
like your original problem arose after you converted that
data.frame into a "matrix", all of whose columns must be
the same (character in this case).  Sorting character
representations of numbers is different than sorting the
numbers as numbers.
  > sort(c(1, 0.05, 0.0000, -0.10, -2))
  [1] -2.00 -0.10  0.00  0.05  1.00
  > sort(as.character(c(1, 0.05, 0.0000, -0.10, -2)))
  [1] "-0.1" "-2"   "0"    "0.05" "1"

Use str(x) again to see if this is what is happening. 
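
To illustrate the data.frame-to-matrix step described above, here is a hedged sketch on a made-up three-row data frame (df and its columns are hypothetical). It shows that as.matrix() on mixed columns gives a character matrix, while sorting the data frame itself keeps the numbers numeric:

```r
# Hypothetical toy data frame with one text column and one numeric column
df <- data.frame(Product = c("p1", "p2", "p3"),
                 A = c(1, 0.05, -2),
                 stringsAsFactors = FALSE)

# A matrix can hold only one type, so mixed columns are coerced to character
m_all <- as.matrix(df)
type_all <- typeof(m_all)      # "character"

# Converting only the numeric column(s) keeps the values numeric
m_num <- as.matrix(df["A"])
type_num <- typeof(m_num)      # "double"

# Sorting the data frame directly avoids the problem altogether
df_sorted <- df[order(df$A), ]
```

One way to read this: unless a matrix is really needed, keeping the data in a data.frame lets each column keep its own type.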

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
after read.delim:

'data.frame':   60 obs. of  4 variables:
 $ Cell       : Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 23 51 20 25 34 16 44 3 60 55 ...
 $ hsa-miR-204: num  -4.37 -4.34 -4.33 -4.29 -4.26 ...
 $ hsa-miR-210: num  -0.223 1.575 1.66 1.668 0.373 ...
 $ Tissue     : Factor w/ 9 levels "Breast","CNS",..: 5 8 5 5 6 3 7 1 9 9 ...

before:

 chr [1:60, 1:4] "ME:SK_MEL_5" "ME:SK_MEL_28" "ME:SK_MEL_2" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:60] "48" "47" "46" "50" ...
  ..$ : chr [1:4] "Product" "hsa.miR.204" "hsa.miR.210" "Tissue"

Looks like the issue is that, after the first time I read the txt file with read.delim, I removed the first three rows by doing

x=x[-c(1:3),]

because the first three rows were characters (parameters like "probe name", "chromosomal position", etc.)
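
A hedged sketch of what probably happens in that situation, on a tiny made-up tab-separated text (the column names and the three annotation rows are invented for illustration). The column type is decided when the file is read, so dropping the text rows afterwards does not turn the column back into numbers; it has to be converted explicitly:

```r
# Hypothetical file content: header, three annotation rows, then data
txt <- paste("Cell\tmiR",
             "probe\tp1",
             "chrom\tchr1",
             "strand\tplus",
             "c1\t-0.5",
             "c2\t0.2",
             "c3\t-3.1",
             sep = "\n")

raw <- read.delim(text = txt, stringsAsFactors = FALSE)
raw <- raw[-(1:3), ]                 # drop the annotation rows
class_before <- class(raw$miR)       # still "character"

# Convert explicitly. If the column had been read as a factor instead,
# as.numeric(as.character(raw$miR)) would be the usual idiom.
raw$miR <- as.numeric(raw$miR)
sorted <- raw[order(raw$miR), ]
```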

So maybe R remembers that the columns were characters and not numeric... How would you "explain" to R (sorry for the naive wording, but I've learnt R by myself over time and I misuse some words; I hope it's clear anyway) that a matrix is all numeric? Doing as.numeric(x) turns everything into one long column of numbers, but loses the matrix structure...
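
Two possible ways to handle this, sketched on a small made-up character matrix (m and its values are hypothetical, with a few numbers borrowed from the listing above). as.numeric() drops the dim attribute, which is why the result comes back as one long vector; changing the storage mode in place keeps the dimensions and dimnames.

```r
# Hypothetical character matrix with one text column and two number columns
m <- cbind(Product = c("p1", "p2", "p3", "p4"),
           A = c("-0.1915", "0.0042", "-4.37408", "1.74749"),
           B = c("-0.16744", "-0.6023", "-0.22311", "-1.47817"))
rownames(m) <- c("44", "33", "30", "48")

# Option 1: leave the matrix as it is, but order on the numeric values
m_sorted <- m[order(as.numeric(m[, "A"])), ]

# Option 2: pull out the number columns and change their storage mode;
# unlike as.numeric(), this keeps dim and dimnames
num <- m[, c("A", "B")]
storage.mode(num) <- "double"
str(num)
```

Option 2 only makes sense for columns that really contain numbers; a text column such as Product would become NA.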

Thank you all guys! You're really precious!

Gabriele Zoppoli, MD