Sorry to repeat the message; I am not sure whether the HTML version was
received. Apologies for the duplication.
Dear list
I am trying to count the number of occurrences in a column of a data frame, and
there is missing data identified by NA.
I am able to melt and cast the data correctly, as well as sum the occurrences
using margins and sum.
Here are the melt and cast commands
bw = melt(res, id=c("lab","r"), "pf_zbw")
b = cast(bw, lab ~ r, sum, margins = T)
Sample Data (before using sum and margins)
     lab  1  2  3  4  5  6
1  4er66  1 NA  1  0 NA  0
2  4gcyi  0  0  1  0  0  0
3  5d3hh  0  0  0 NA  0  0
4  5d3wt  0  0  0  0  0  0
.
. lines deleted to save space
.
69 v3st5 NA NA  1 NA NA NA
70 a22g5 NA  0 NA NA NA NA
71 b5dd3 NA  0 NA NA NA NA
72 g44d2 NA  0 NA NA NA NA
Data after using sum and margins
     lab 1 2 3 4 5 6 (all)
1  4er66 1 0 1 0 0 0     2
2  4gcyi 0 0 1 0 0 0     1
3  5d3hh 0 0 0 0 0 0     0
4  5d3wt 0 0 0 0 0 0     0
5  6n44r 0 0 0 0 0 0     0
.
. lines deleted to save space
.
70 a22g5 0 0 0 0 0 0     0
71 b5dd3 0 0 0 0 0 0     0
72 g44d2 0 0 0 0 0 0     0
73 (all) 5 2 4 3 5 7    26
Using length just tells me how many total rows there are.
What I need to do is count how many rows have valid data, in this case
either a one (1) or a zero (0) in b.
I have a report to construct for tomorrow (Monday), so any help would be
appreciated.
Regards
Steve
Help using Cast (Text) Version
16 messages · David Winsemius, Ista Zahn, Simon Knapp +2 more
On Jan 17, 2010, at 5:31 AM, Steve Sidney wrote:
What I need to do is count how many rows there is valid data, in this case either a one (1) or a zero (0) in b
I'm guessing that you mean to apply that test to the column in b labeled "(all)". If that's the case, then something like (obviously untested):
sum( b$'(all)' == 1 | b$'(all)' == 0)
David Winsemius, MD Heritage Laboratories West Hartford, CT
David
Thanks, I'll try that... but no, what I need is the total (1's) for each of the rows, labelled 1-6 at the top of each col in the table provided.
What I guess I am not sure of is how to identify the col after the melt and cast.
Steve
----- Original Message -----
From: "David Winsemius" <dwinsemius at comcast.net>
To: "Steve Sidney" <sbsidney at mweb.co.za>
Cc: <r-help at r-project.org>
Sent: Sunday, January 17, 2010 4:39 PM
Subject: Re: [R] Help using Cast (Text) Version
Hi Steve,
It's still not clear to me what you want. Please give a minimal example so I can understand what you're trying to do.
-Ista
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org
On Jan 17, 2010, at 11:56 AM, Steve Sidney wrote:
David Thanks, I'll try that......but no what I need is the total (1's) for each of the rows, labelled 1-6 at the top of each col in the table provided.
Part of my confusion with your request (which remains unaddressed) is what you mean by "valid". The melt-cast operation has turned a bunch of NA's into 0's which are now indistinguishable from the original 0's. So I don't see any way that operating on "b" could tell you the numbers you are asking for. If you were working on the original data, "res", you might have gotten the column-wise "valid" counts of column 2 with something like:
sum( !is.na(res[,2]) )
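The point about working from "res" rather than "b" can be sketched in base R. This is only an illustration under stated assumptions: the toy data frame below is made up, and it assumes that res holds one row per lab per round attended, with column names as in Steve's melt call. Summing an empty group gives 0, which would explain why a lab/round cell with no row shows up as 0 once sum is used as the aggregation function.

```r
# Toy stand-in for `res`: one row per lab per round in which it took part.
# Lab ids, rounds and flags are made up for illustration.
res <- data.frame(
  lab    = c("4er66", "4er66", "4gcyi", "4gcyi", "5d3hh"),
  r      = c(1, 3, 1, 2, 2),
  pf_zbw = c(1, 1, 0, 0, 0)
)

# The sum of an empty vector is 0, so a lab/round combination that has
# no row at all is reported as 0 when the cells are aggregated with sum.
sum(numeric(0))

# Participants per round: number of rows recorded for each round.
participants <- tapply(res$pf_zbw, res$r, length)

# Fails per round: number of 1's among those rows.
fails <- tapply(res$pf_zbw, res$r, sum)

participants
fails
```

Counting on the long data avoids having to tell a real 0 from a filled-in 0 after the cast.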
What I guess I am not sure of is how to identify the col after the melt and cast.
The cast object represents columns as a list of vectors. The i-th column is b[[i]] which could be further referenced as a vector. So the j-th row entry for the i-th column would be b[[i]][j].
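As a small illustration of that indexing, here is a sketch on a plain data frame. It assumes the cast result can be indexed like an ordinary data frame; b_demo and its values are made up and are not Steve's data.

```r
# Plain data frame standing in for the cast result `b` (values are made up).
b_demo <- data.frame(
  lab = c("4er66", "4gcyi", "5d3hh"),
  `1` = c(1, 0, 0),
  `2` = c(NA, 0, 0),
  check.names = FALSE   # keep the round numbers as column names
)

i <- 3   # third column, i.e. the one headed "2"
j <- 1   # first row

col_by_position <- b_demo[[i]]     # whole column as a vector
col_by_name     <- b_demo[["2"]]   # same column, selected by its header
entry           <- b_demo[[i]][j]  # single entry; missing in this toy data

identical(col_by_position, col_by_name)
is.na(entry)
```

Selecting by header is often easier to read than counting positions, since the first column holds the lab ids and shifts every round along by one.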
David Winsemius, MD Heritage Laboratories West Hartford, CT
Bingo,
I knew it was something simple and that I wasn't seeing the wood for the
trees.
David, Ista
Apologies for the vague description, which I thought was clear enough. I now
understand that I need to count the 1's, and to sum the total of 1's and 0's
while ignoring the NA's, working from res as David correctly identified. As
you have quite correctly said, by the time the data is melt-cast there is no
way to distinguish between NA's and 0's.
Here is the original code so that you can see where res comes from; Ista, I
hope that this is now clearer for you.
library(reshape)
# Enter file name to Read & Save data
FileName=readline("Enter File name:\n")
SampleName=readline("Enter Sample (A,B or C):\n")
#for (sname in 1 : 3) {
#if (sname == 1)
# SampleName = "A"
# if (sname == 2)
# SampleName = "B"
# if (sname == 3)
# SampleName = "C"
#for ( fname in 1 : 4) {
#if (fname == 1)
# FileName = "SPC"
# if (fname == 2)
# FileName = "Coli"
# if (fname == 3)
# FileName = "Colif"
# if (fname == 4)
# FileName = "Ecoli"
# Find first occurrence of file
for ( rloop1 in 1 : 6) {
  ReadFile=paste(rloop1,SampleName,"_",FileName,"_Stats.csv", sep="")
  if (file.exists(ReadFile))
    break
}
x = data.frame(read.csv(ReadFile, header=T),rnd=rloop1)
for ( rloop2 in (rloop1+1) : 6) {
  ReadFile=paste(rloop2,SampleName,"_",FileName,"_Stats.csv", sep="")
  if (file.exists(ReadFile)) {
    y = data.frame(read.csv(ReadFile, header=T),rnd = rloop2)
    if (rloop2 == (rloop1+1))
      z=merge(x,y,all=T)
    z=merge(y,z,all=T)
    ## The next piece of code is where there are not two successive rounds of data.
    ## It must be modified for each year's summary
    ##
    ##if ( (FileName == "Coli") & (SampleName == "B")) {
    ##  if (rloop2 == (rloop1+3))
    ##    z=merge(x,y,all=T)
    ##  z=merge(y,z,all=T)
    ## }
    ##
    ##
  }
}
results <- z
res = data.frame(
  lab=results[,"lab_id"], bw=results[,"ZBW"], wi=results[,"ZWI"],
  pf_zbw=0, pf_zwi=0, r=results[,"rnd"])
#
# Establish no of samples recorded
nsmpls = length(res[,c("lab")])
# Evaluate Z_scores for Between Lab Results
for ( i in 1 : nsmpls) {
  if (res[i,"bw"] > 3 | res[i,"bw"] < -3)
    res[i,"pf_zbw"]=1
}
# Evaluate Z_scores for Within Lab Results
for ( i in 1 : nsmpls) {
  if (res[i,"wi"] > 3 | res[i,"wi"] < -3)
    res[i,"pf_zwi"]=1
}
# Melt and Cast the 'res' frame and then order it
bw = melt(res, id=c("lab","r"), "pf_zbw")
# b = cast(bw, lab ~ r)
# bw_eval = b[order(as.character(b$lab)),]
##### Code for summing the no of Fails for Between Results
#bsum = cast(bw, lab ~ r, margins=TRUE, sum)
## Save Summary of Between Results
## FileSaveBw=paste(SampleName,"_",FileName,"_2009Between.csv",sep="")
## write.csv(bw_eval,file=FileSaveBw)
##
##
##
####
# Melt and Cast the 'res' frame and then order it
wi = melt(res, id=c("lab","r"), "pf_zwi")
w = cast(wi, lab ~ r)
wi_eval = w[order(as.character(w$lab)),]
##### Code for summing the no of Fails for Within Results
#wsum = cast(wi, lab ~ r, margins=TRUE, sum)
##
## Save Summary of Within Results
## FileSaveWi=paste(SampleName,"_",FileName,"_2009Within.csv",sep="")
## write.csv(wi_eval,file=FileSaveWi)
###### cat ("File Name: ",FileName,"Sample Name: ",SampleName, "\n")
# }
#}
end
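The two flagging loops in the script above could probably be written without a loop. The following is only a sketch, not the code Steve ran: res_demo and its z-scores are made up, and only the column names bw, wi, pf_zbw and pf_zwi are taken from the script.

```r
# Toy stand-in for `res`; the z-scores are made up for illustration.
res_demo <- data.frame(
  lab = c("4er66", "4gcyi", "5d3hh", "5d3wt"),
  bw  = c(3.4, -0.2, -3.1, 3.0),
  wi  = c(0.5, 4.2, NA, -2.9)
)

# 1 where |z| > 3, 0 otherwise. A missing z-score stays NA here,
# whereas if (NA > 3 | NA < -3) inside a loop would stop with an error.
res_demo$pf_zbw <- as.integer(abs(res_demo$bw) > 3)
res_demo$pf_zwi <- as.integer(abs(res_demo$wi) > 3)

res_demo
```

A z-score of exactly 3 is not flagged, which matches the strict comparisons in the loops.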
Once again thanks for your interest
Steve
----- Original Message -----
From: "David Winsemius" <dwinsemius at comcast.net>
To: "Steve Sidney" <sbsidney at mweb.co.za>
Cc: <r-help at r-project.org>
Sent: Sunday, January 17, 2010 7:36 PM
Subject: Re: [R] Help using Cast (Text) Version
Well now I am totally baffled!
Using
sum( !is.na(b[,3]))
I get the total of all col 3 except those that are NA. Great, that solves the first problem.
What I can't seem to do is use the same logic to count all the 1's in that col, which are there before I use the cast with margins. So it seems to me that somehow
sum((b[,3]) == 1)
is wrong, and that is the part of my understanding that's missing.
My guess is that before using margins and sum in the cast statement the col is a character type, and in order for == 1 to work I need to convert this to an integer.
Hope this helps you to understand the problem.
Regards
Steve
Your help is much appreciated
On Jan 17, 2010, at 4:37 PM, Steve Sidney wrote:
Well now I am totally baffled !!!!!!!!!! Using sum( !is.na(b[,3])) I get the total of all col 3 except those that are NA - Great solves the first problem What I can't seem to do is use the same logic to count all the 1's in that col, which are there before I use the cast with margins. So it seems to me that somehow sum((b[,3]) == 1) is wrong and is the part of my understanding that's missing. My guess is that before using margins and sum in the cast statement the col is a character type and in order for == 1 to work I need to convert this to an integer.
You can test your theory with:
sum(as.integer(b[,3]) == 1)
Or you could post some reproducible data using dput ....
David.
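One possible explanation, not confirmed anywhere in the thread, is that the column type is fine and the NA entries are what make the count come back as NA. A minimal sketch with a made-up vector standing in for b[,3]:

```r
# Made-up column standing in for b[, 3] before sum/margins are applied.
col3 <- c(1, 0, 0, NA, 1, NA)

n_present <- sum(!is.na(col3))              # entries that are 0 or 1
n_ones_na <- sum(col3 == 1)                 # NA, because NA == 1 is itself NA
n_ones    <- sum(col3 == 1, na.rm = TRUE)   # 1's among the entries present

n_present
n_ones_na
n_ones

# Checking the character-column theory directly.
is.character(col3)

# Comparing text with a number coerces the number to text, so "1" == 1 is TRUE.
"1" == 1
```

If is.character() returns FALSE on the real column, the na.rm = TRUE route is the one to try first.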
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100118/7504ee24/attachment.pl>
Hi David
Thanks for your patience, as well as thanks to Dennis Murphy and James Rome for trying to help.
I have tried all your suggestions but still no joy. In order to try and resolve the problem I am attaching the following files; I hope the system allows this.
1) Test_data_res.txt (produced with dput; this is all the data to be evaluated)
2) Test_data_b.txt (after performing the melt-cast; see the code)
3) Annual Results NLA WMS Ver1.r (the code for one of the parameters to be evaluated, in this case SPC)
Background: the data is from a laboratory Proficiency Testing Scheme, and the z-scores outside the |3| range are identified as "fails". My code assigns a 1 or 0 depending on this evaluation and, because not every lab participates in every round, NA is assigned where there are no results.
What I am looking for is the following for each round (1-6):
a) The total number of participants, which in this case are represented by the 1's and 0's per round
b) The total number of 1's, i.e. fails, per round
Regards
Steve
----- Original Message -----
From: "David Winsemius" <dwinsemius at comcast.net>
To: "Steve Sidney" <sbsidney at mweb.co.za>
Cc: <r-help at r-project.org>
Sent: Monday, January 18, 2010 12:38 AM
Subject: Re: [R] Help using Cast (Text) Version
-------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Test_data_b.txt URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100118/d34a9fa5/attachment.txt> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Test_data_res.txt URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100118/d34a9fa5/attachment-0001.txt>
On Jan 18, 2010, at 7:58 AM, Steve Sidney wrote:
What I am looking for is the following for each round (1-6) a) The total number of participants which in this case are represented by 1's and 0' per round
> apply(b[,-1], 2, function(x) sum(is.na(x) ) )
[1] 32 21 21 18 14 15
b) The total number of 1's, ie Fails per round
> apply(b[,-1], 2, sum, na.rm=TRUE )
[1] 5 2 4 3 5 7
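The two per-round counts can also be sketched with colSums on a small made-up table. Note that is.na() on its own counts the missing entries in each round, so the number of participants would be its complement, !is.na(). The data frame below is hypothetical and only mimics the shape of the cast result before sum and margins are applied.

```r
# Made-up stand-in for the cast result: NA marks a round a lab did not attend.
b_demo <- data.frame(
  lab = c("4er66", "4gcyi", "5d3hh", "v3st5"),
  `1` = c(1, 0, 0, NA),
  `2` = c(NA, 0, 0, NA),
  `3` = c(1, 1, 0, 1),
  check.names = FALSE   # keep the round numbers as column names
)
rounds <- b_demo[, -1]  # drop the lab id column

missing_per_round      <- colSums(is.na(rounds))         # what is.na() counts
participants_per_round <- colSums(!is.na(rounds))        # a) 1's and 0's present
fails_per_round        <- colSums(rounds, na.rm = TRUE)  # b) number of 1's

missing_per_round
participants_per_round
fails_per_round
```

For every round the missing and participant counts add up to the number of labs, which is a quick check that nothing was dropped.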
Regards
Steve
----- Original Message -----
From: "David Winsemius" <dwinsemius at comcast.net>
To: "Steve Sidney" <sbsidney at mweb.co.za>
Cc: <r-help at r-project.org>
Sent: Monday, January 18, 2010 12:38 AM
Subject: Re: [R] Help using Cast (Text) Version
On Jan 17, 2010, at 4:37 PM, Steve Sidney wrote:
Well now I am totally baffled!
Using sum( !is.na(b[,3])) I get the total of all col 3 except those that are NA. Great, that solves the first problem.
What I can't seem to do is use the same logic to count all the 1's in that col, which are there before I use the cast with margins.
So it seems to me that somehow is wrong and is the part of my understanding that's missing.
My guess is that before using margins and sum in the cast statement the col is a character type, and in order for == 1 to work I need to convert this to an integer.
You can test your theory with:
sum(as.integer(b[,3]) == 1)
Or you could post some reproducible data using dput ....
--
David.
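A small sketch that may help when testing the theory. The vectors are invented, and the idea that NA is what gets in the way is an assumption, not something confirmed in this thread. In R, comparing NA with 1 gives NA, and sum() over anything containing NA is NA unless na.rm=TRUE is given. A character column, by contrast, still compares equal to 1.

```r
# Invented column with a missing entry.
x <- c(1, 0, NA, 1)

# NA == 1 is NA, and sum() of a vector containing NA is NA.
count_plain <- sum(x == 1)

# Dropping the NA comparisons gives the number of 1's.
count_ones <- sum(x == 1, na.rm = TRUE)

# A character column still compares equal to 1, because R coerces the
# number to the string "1" before comparing.
x_chr <- c("1", "0", NA, "1")
count_chr <- sum(x_chr == 1, na.rm = TRUE)

# as.integer() on character digits gives the same count.
count_int <- sum(as.integer(x_chr) == 1, na.rm = TRUE)

print(c(count_ones, count_chr, count_int))
```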
Hope this helps you to understand the problem.
Regards
Steve
Your help is much appreciated
----- Original Message -----
From: "David Winsemius" <dwinsemius at comcast.net>
To: "Steve Sidney" <sbsidney at mweb.co.za>
Cc: <r-help at r-project.org>
Sent: Sunday, January 17, 2010 7:36 PM
Subject: Re: [R] Help using Cast (Text) Version
On Jan 17, 2010, at 11:56 AM, Steve Sidney wrote:
David Thanks, I'll try that......but no what I need is the total (1's) for each of the rows, labelled 1-6 at the top of each col in the table provided.
Part of my confusion with your request (which remains unaddressed) is what you mean by "valid". The melt-cast operation has turned a bunch of NA's into 0's which are now indistinguishable from the original 0's. So I don't see any way that operating on "b" could tell you the numbers you are asking for. If you were working on the original data, "res", you might have gotten the column-wise "valid" counts of column 2 with something like:
sum( !is.na(res[,2]) )
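A base-R sketch of why a sum cannot distinguish "no result" from "zero fails". It assumes the aggregation reaches sum() with an empty or all-NA set of values, which the thread does not spell out. The data frame res_toy and its column names are invented for illustration.

```r
# Invented long-format results: labB has no row for round 2,
# and labC has a recorded NA in round 1.
res_toy <- data.frame(
  lab = c("labA", "labA", "labB", "labC", "labC"),
  r   = c(1, 2, 1, 1, 2),
  pf  = c(1, 0, 0, NA, 1),
  stringsAsFactors = FALSE
)

# The sum of an empty set of values is 0, so an aggregation by sum has
# no way to mark "nothing was there".
empty_sum <- sum(numeric(0))

# With NAs dropped, a cell holding only NA also sums to 0.
na_sum <- sum(NA, na.rm = TRUE)

# Counting valid entries on the original data keeps the distinction.
valid_per_round <- tapply(!is.na(res_toy$pf), res_toy$r, sum)
print(valid_per_round)
```

Counting on the un-aggregated data, as suggested above, keeps the NA's available for is.na() to find.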
What I guess I am not sure of is how to identify the col after the melt and cast.
The cast object represents columns as a list of vectors. The i-th column is b[[i]], which could be further referenced as a vector. So the j-th row entry for the i-th column would be b[[i]][j].
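The indexing described above can be sketched on an ordinary data frame, used here as a stand-in for the cast result. The values are invented, and it is an assumption that the cast object behaves like a data frame in this respect.

```r
# A plain data frame standing in for the cast result (invented values).
b_toy <- data.frame(
  lab = c("labA", "labB", "labC"),
  r1  = c(1, 0, NA),
  r2  = c(0, NA, 1),
  stringsAsFactors = FALSE
)

i <- 2  # column position
j <- 2  # row position

col_i <- b_toy[[i]]     # the whole i-th column as a vector
entry <- b_toy[[i]][j]  # j-th element of that vector
same  <- b_toy[j, i]    # matrix-style indexing reaches the same cell

print(entry)
```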
Steve
----- Original Message -----
From: "David Winsemius" <dwinsemius at comcast.net>
To: "Steve Sidney" <sbsidney at mweb.co.za>
Cc: <r-help at r-project.org>
Sent: Sunday, January 17, 2010 4:39 PM
Subject: Re: [R] Help using Cast (Text) Version
David Winsemius, MD Heritage Laboratories West Hartford, CT
On Jan 18, 2010, at 8:53 AM, David Winsemius wrote:
apply(b[,-1], 2, function(x) sum(is.na(x) ) )
[1] 32 21 21 18 14 15
Ooops, forgot the negation operator to turn not(NA) into TRUE:
> apply(b[,-1], 2, function(x) sum(!is.na(x) ) )
[1] 40 51 51 54 58 57
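To see what the negation changes, here is a small sketch on invented data (the column names r1 and r2 are made up). Summing is.na() counts the labs with no result, while summing !is.na() counts the participants.

```r
# Invented 0/1/NA results for four labs over two rounds.
rounds <- data.frame(
  r1 = c(1, 0, NA, 0),
  r2 = c(NA, NA, 1, 0)
)

# is.na() marks the missing entries, so summing it counts non-participants.
n_missing <- apply(rounds, 2, function(x) sum(is.na(x)))

# !is.na() flips TRUE and FALSE, so summing it counts labs with a result.
n_present <- apply(rounds, 2, function(x) sum(!is.na(x)))

# The two counts add up to the number of rows in every column.
stopifnot(all(n_missing + n_present == nrow(rounds)))

print(n_present)
```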
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD Heritage Laboratories West Hartford, CT
David
Excellent! It is exactly what I was looking for.
Two very small questions to conclude:
1) I don't understand the significance of the -1 in the square brackets.
2) I'm not sure I really understand how function(x) works in this context.
If you can point me towards a doc that explains this in simple terms I would be obliged. Don't expect you to have to provide the answer.
Once again many thanks for your patience and help
Regards
Steve
----- Original Message -----
From: "David Winsemius" <dwinsemius at comcast.net>
To: "David Winsemius" <dwinsemius at comcast.net>
Cc: "Steve Sidney" <sbsidney at mweb.co.za>; <r-help at r-project.org>
Sent: Monday, January 18, 2010 3:58 PM
Subject: Re: [R] Help using Cast (Text) Version
On Jan 18, 2010, at 10:26 AM, Steve Sidney wrote:
David Excellent !!!!! It its exactly what I was looking for. Two very small questions to conclude 1) I don't understand the significance of the -1 in the sq brackets.
It removes the first column from the object that is offered to apply(). I didn't think it would have made any sense to offer the vector of text entries, but now that I think about it again, it would have given you a total count as a check, I suppose. Read up on "negative indexing": ?"["
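A short sketch of negative indexing, using invented objects (v and df_toy are made-up names): a negative index drops the named positions rather than selecting them.

```r
# On a vector, a negative index drops that position.
v <- c(10, 20, 30, 40)
v[-1]         # 20 30 40
v[-c(1, 4)]   # 20 30

# On a data frame, [, -1] drops the first column (here the lab labels).
df_toy <- data.frame(
  lab = c("labA", "labB"),
  r1  = c(1, NA),
  r2  = c(0, 1),
  stringsAsFactors = FALSE
)
without_lab <- df_toy[, -1]
names(without_lab)   # "r1" "r2"

# Keeping the label column would count every row, which is the
# "total count as a check" mentioned above.
total_rows <- sum(!is.na(df_toy$lab))
```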
2) Not sure I really understand how function(x)works in this context.
The apply function with a "2" argument takes individual columns of matrices, arrays or data.frames and offers them to the function that follows. In this case each successive column temporarily becomes "x", and then the body of that function works on "x" and returns a value for the sum of the !is.na() values, i.e. the count of the non-missing entries in that column.
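One way to picture this is to write the same computation as an explicit loop. This is only a sketch on invented data, and count_valid is a made-up helper name; it is not meant as a description of how apply() is implemented internally.

```r
# Invented 0/1/NA results for three labs over two rounds.
m <- data.frame(r1 = c(1, NA, 0), r2 = c(NA, NA, 1))

# The function that apply() will call once per column.
count_valid <- function(x) sum(!is.na(x))

# apply(m, 2, f): call f on each column and collect the results.
via_apply <- apply(m, 2, count_valid)

# The same thing written out as a loop over columns.
via_loop <- integer(ncol(m))
for (k in seq_len(ncol(m))) {
  x <- m[[k]]                    # the k-th column temporarily plays the role of x
  via_loop[k] <- count_valid(x)
}
names(via_loop) <- names(m)

print(via_apply)
```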
If you can point me towards a doc that explains this in simple terms I would be obliged. Don't expect you to have to provide the answer.
Any of the introductory texts should explain the various forms of indexing and the use of the apply family of functions. They are both central to effective R programming.
David
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
If you can point me towards a doc that explains this in simple terms I would be obliged. Don't expect you to have to provide the answer.
Any of the introductory texts should explain the various forms of indexing and the use of the apply family of functions. They are both central to effective R programming.
See also the plyr package, http://had.co.nz/plyr, which tries to provide a more uniform interface to the apply family of functions. I've also tried to document everything in one place, so hopefully it's a bit easier to learn and to see how all the different functions fit together. Hadley
Hi Hadley
Thanks. I have downloaded the intro and the material and will work through it once I get a chance.
Thanks for your interest
Regards
Steve
----- Original Message -----
From: "hadley wickham" <h.wickham at gmail.com>
To: "David Winsemius" <dwinsemius at comcast.net>
Cc: "Steve Sidney" <sbsidney at mweb.co.za>; <r-help at r-project.org>
Sent: Monday, January 18, 2010 9:43 PM
Subject: Re: [R] Help using Cast (Text) Version