Incorrect behavior for ordering timepoints in "reshape" (PR#7669)
On Feb 7, 2005, at 6:38 PM, Peter Dalgaard wrote:
davclark@nyu.edu writes:
Full_Name: Dav Clark
Version: 2.0.1
OS: OS X 10.3
Submission from: (NULL) (128.122.87.35)
When the timepoints that reshape uses (in direction="long") are
negative or fractional, the time label is assigned incorrectly. It is
easier to give an example than to describe the problem abstractly:
Assume you have a data.frame header with values related to
peri-stimulus time like this:
"HRF -5" "HRF -2.5" "HRF 0" "HRF 2.5" ... "HRF 10"
And you give reshape a split argument of a space " ".
Then the labels will be assigned strangely, based on alphabetical
ordering. So the above list order maps to:
-2.5, -5, 0, 10, ... 2.5
Items under the "HRF -5" column in wide format receive a -2.5 label,
items under "HRF 2.5" receive a label of 10, and so on.
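A rough illustration of the mismatch described above, using only the labels quoted in the report (the elided ones are left out). Passing method = "radix" to sort() is an assumption of this sketch: it forces C-locale ordering so the result does not depend on the local collation. It is not a claim about what reshape does internally.

```r
# Time labels as they would look after splitting the column names on " ".
# Only the labels quoted in the report are used.
labels <- c("-5", "-2.5", "0", "2.5", "10")

# Sorting the labels as character strings (C-locale order).
sorted_chr <- sort(labels, method = "radix")
# "-2.5" "-5" "0" "10" "2.5"

# Converting to numeric first and then sorting.
sorted_num <- sort(as.numeric(labels))
# -5 -2.5 0 2.5 10

# Pair each wide-format column with the label it would end up with.
data.frame(column = paste("HRF", labels), assigned = sorted_chr)
```

With this input, "HRF -5" is paired with -2.5 and "HRF 2.5" with 10, matching the symptoms in the report.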
Somewhere, the time labels are being used before conversion to
numbers. But, reshape returns an error if it is not possible to convert
the timepoints to numeric! So obviously, more functionality could be
provided, or at least the documentation should reflect the current
shortfall.
For completeness, here is a minimal example demonstrating the bug:
df <- data.frame(id="S1", V1="from -2", V2="from -1")
names(df)[2:3] <- c("vals.-2", "vals.-1")
df
reshape(df, direction="long", varying=2:3)
Hmm, this looks messed up even without the negatives. The guess()
function inside reshape always sorts before converting to numeric, so
you get the 1 10 11 2 3 4 5 6 7 8 9 effect. What is worse, the sorting
decouples the values from the variable names, as demonstrated by
modifying your example slightly:
reshape(df, direction="long", varying=3:2)
      id time    vals
S1.-1 S1   -1 from -1
S1.-2 S1   -2 from -2
I'm not at all sure I understand what was supposed to happen here,
perhaps the sort in
varying <- unique(nn[, 1])
times <- sort(unique(nn[, 2]))
is a thinko? Over to Thomas, I think.
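To make the decoupling concrete, here is a simplified stand-in for the name-splitting step. It is not the reshape source; the names nms, times_sorted and times_kept are made up for illustration, and method = "radix" is again used only to get locale-independent string ordering. Whether dropping the sort is the right fix for reshape itself is left open here, as it is in the thread.

```r
# Column names in the order they are passed as 'varying' in the example.
nms <- c("vals.-2", "vals.-1")

# Split each name into a variable part and a time part on a literal ".".
nn <- do.call(rbind, strsplit(nms, ".", fixed = TRUE))

# Sorting the time strings reorders them relative to the columns.
times_sorted <- sort(unique(nn[, 2]), method = "radix")  # "-1" "-2"

# Keeping first-appearance order preserves the column/time pairing.
times_kept <- as.numeric(unique(nn[, 2]))                # -2 -1

data.frame(column = nms, sorted = times_sorted, kept = times_kept)
```

In the sorted variant the column "vals.-2" lines up with the label -1, which is the kind of mismatch discussed above.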
Just to throw it out there, my current solution is to convert to
integers, then run the following on the row numbers:
new.nums <- formatC(new.nums, flag="0",
width=max(nchar(new.nums)))
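A small sketch of why the zero-padding workaround helps, using made-up integer values for new.nums. It only covers non-negative integers; it does not address negative or fractional labels.

```r
# Hypothetical row numbers standing in for the integer time labels.
new.nums <- c(1L, 2L, 10L, 11L)

# Without padding, string sorting gives the "1 10 11 2" order.
unpadded <- sort(as.character(new.nums), method = "radix")

# Zero-pad to a common width so that string order equals numeric order.
padded <- formatC(new.nums, flag = "0", width = max(nchar(new.nums)))
padded_sorted <- sort(padded, method = "radix")
# "01" "02" "10" "11"

# The padded labels still convert back to the original numbers.
as.numeric(padded_sorted)
```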
But thanks for the observation, I was scratching my head so hard it
hurt.
DC