Help with Order

4 messages · Duncan Murdoch, David Winsemius, Steve Sidney

#
Dear List

As a fairly new R programmer, I seem to have run into a strange problem, 
probably due to my inexperience with R.

After reading and merging successive files into a single data frame, I find 
that order() does not sort the data as expected.

I have multiple references in each file but each file refers to measurement 
data obtained at a different time.

Here's the code

library(reshape)
# Enter file name to Read & Save data
FileName=readline("Enter File name:\n")
# Find first occurrence of file
for ( round1 in 1 : 6) {
ReadFile=paste(round1,"C_",FileName,"_Stats.csv", sep="")
if (file.exists(ReadFile))
break
}

x = data.frame(read.csv(ReadFile, header=TRUE),rnd=round1)
for ( round2 in (round1+1) : 6) {
#
ReadFile=paste(round2,"C_",FileName,"_Stats.csv", sep="")
if (file.exists(ReadFile)) {
y = data.frame(read.csv(ReadFile, header=TRUE),rnd = round2)
    if (round2 == (round1 +1))
    z=data.frame(merge(x,y,all=TRUE))
    else
    z=data.frame(merge(y,z,all=TRUE))
}
}
ordered = order(z$lab_id)

results = z[ordered,]

res = data.frame(lab=results[,"lab_id"], bw=results[,"ZBW"], wi=results[,"ZWI"],
                 pf_zbw=0, pf_zwi=0, r=results[,"rnd"])


#
# Establish no of samples recorded
nsmpls = length(res[,c("lab")])

# Evaluate Z_scores for Between Lab Results
for ( i in 1 : nsmpls) {
if (res[i,"bw"] > 3 | res[i,"bw"] < -3)
res[i,"pf_zbw"]=1
}
# Evaluate Z_scores for Within Lab Results
for ( i in 1 : nsmpls) {
if (res[i,"wi"] > 3 | res[i,"wi"] < -3)
res[i,"pf_zwi"]=1
}

dd = melt(res, id=c("lab","r"), "pf_zbw")
b = cast(dd, lab ~ r)

If anyone can see why the ordering only works for about 55 of 70 records 
and can steer me in the right direction, I would be obliged.

Thanks very much
#
On 11/01/2010 7:37 AM, Steve Sidney wrote:
I can't try out your code, but I'd guess it's due to conversion of 
strings to factors.  Sorting factors will sort them by their numerical 
value, not by the strings.

So the solution is to set stringsAsFactors=FALSE, either in each 
read.csv call, or globally with options(stringsAsFactors=FALSE).

Duncan Murdoch
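
A minimal sketch of the effect described above. The lab IDs are hypothetical, and the level order is set by hand to mimic levels accumulating in file order as successive files are merged; that is an assumption about what happened here, not something taken from the original data.

```r
# Hypothetical lab IDs. The explicit `levels` argument imitates levels that
# were appended in the order the files were read (an assumption).
lab_id <- factor(c("L10", "L02", "L07", "L02"),
                 levels = c("L10", "L02", "L07"))

# order() on a factor sorts by the underlying integer codes,
# i.e. by the position of each value in levels(lab_id)
by_codes <- as.character(lab_id[order(lab_id)])

# order() on the character values sorts by the strings themselves
by_text <- as.character(lab_id[order(as.character(lab_id))])

print(by_codes)
print(by_text)
```

When the level order happens to match alphabetical order the two results agree, which may be why only some of the records looked out of place.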
#
On Jan 11, 2010, at 7:49 AM, Duncan Murdoch wrote:

Following Duncan's hypothesis, perhaps change this to:
ordered = order(as.character(z$lab_id))
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
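
Both suggested fixes can be sketched on a small inline CSV. The column names `lab_id` and `ZBW` come from the original post; the values and the `csv_text` variable are made up for illustration. Note that from R 4.0.0 onward `read.csv` defaults to `stringsAsFactors = FALSE`, so the first form is mainly relevant to older versions of R.

```r
# Made-up data standing in for one of the *_Stats.csv files
csv_text <- "lab_id,ZBW\nL10,0.4\nL02,-3.2\nL07,1.1"

# Fix 1: keep lab_id as character when reading, then order as usual
z <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)
results <- z[order(z$lab_id), ]

# Fix 2: if lab_id is already a factor, order on its character values
zf <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = TRUE)
results_f <- zf[order(as.character(zf$lab_id)), ]

print(results)
print(results_f)
```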
#
David, Duncan

Thanks for the swift response.

You guys hit the nail on the head. That's exactly what the problem was.

All the best
Steve
----- Original Message ----- 
From: "David Winsemius" <dwinsemius at comcast.net>
To: "Duncan Murdoch" <murdoch at stats.uwo.ca>
Cc: "Steve Sidney" <sbsidney at mweb.co.za>; <r-help at r-project.org>
Sent: Monday, January 11, 2010 3:49 PM
Subject: Re: [R] Help with Order