"BANNISTER, Keith" <keith.bannister at astrium.eads.net> 09/20/05
09:46AM >>>
Hi,
I'd like to use R to do what excel pivot tables do, and plot
results.
R does not have pivot tables and I hope that it never does.
My experience with pivot tables is that they encourage poor initial
design followed
by post-hoc twiddling that is not easily reproducible.
R encourages proper initial design followed by fixing the core design
in cases
where things don't turn out the way you intended.
In R I prefer to work with script files and save the file. If the
table or graph
does not turn out the way I intended, then I just edit the script file
and rerun it.
While this may be a little more work than clicking on a pivot table at
first, in the
long run I find it saves more time.
Consider the situation where you create a table/graph, then a month
later your
boss/client/coworker finds some typos in the original data and needs
the table
and/or graph recreated with the corrected data (or maybe a new dataset
that
needs a similar graph/table). With the pivot table you need to try and
remember
everything that you clicked on and click on it again. With the R
script file you
just fix the data (or load in the new data) and rerun the script and
you're done.
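A minimal sketch of that edit-and-rerun workflow, under some assumptions: the column names SNR and timeError are borrowed from the example script further down, the helper name summarise.by.snr is hypothetical, and the "typo" in the data is invented purely for illustration.

```r
# Hypothetical helper: summarise timeError within each SNR level.
summarise.by.snr <- function(df) {
  data.frame(
    SNR  = sort(unique(df$SNR)),
    mean = as.vector(tapply(df$timeError, df$SNR, mean)),
    sd   = as.vector(tapply(df$timeError, df$SNR, sd))
  )
}

# Original data, as first received.
original <- data.frame(
  SNR       = rep(c(4, 6, 8), each = 3),
  timeError = c(1.3, 2.1, 1.2, 2.1, 2.2, 2.1, 3.2, 3.7, 3.1)
)

# A month later: pretend the second value was a typo (invented correction).
corrected <- original
corrected$timeError[2] <- 1.2

# Rerunning the same code on the corrected data regenerates the table.
print(summarise.by.snr(original))
print(summarise.by.snr(corrected))
```

Because the whole analysis lives in one function, the corrected table comes from the same call with different input, with no clicks to remember.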
OK, enough of my ranting, on to helping with your problem.
I've never used R before, and I've managed to do something, but it's
quite a
lot of code to do something simple. I can't help but think I'm not
"Doing it
the R way".
I could be using R for the wrong thing, in which case, please tell
me off.
[snip]
"by" is a bit of an overkill for this situation, tapply will probably
work better.
try this basic script as a starting place:
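### start ###
# example data: three timeError measurements at each SNR level
my.df <- data.frame( SNR=rep( c(4,6,8), each=3),
    timeError = c(1.3,2.1,1.2,2.1,2.2,2.1,3.2,3.7,3.1))
# mean and standard deviation of timeError within each SNR level
tmp.mean <- tapply( my.df$timeError, my.df$SNR, mean)
tmp.sd <- tapply( my.df$timeError, my.df$SNR, sd)
# take x from the names of the tapply result so that it lines up with
# the means even when the data are not sorted by SNR
tmp.x <- as.numeric(names(tmp.mean))
# plot the means, leaving room for bars of +/- 3 standard deviations
plot( tmp.x, tmp.mean,
    ylim=range(tmp.mean+3*tmp.sd,tmp.mean-3*tmp.sd),
    xlab='SNR',ylab='timeError')
segments(tmp.x, tmp.mean-3*tmp.sd, tmp.x, tmp.mean+3*tmp.sd,
    col='green')
### optional: caps on the bars, then redraw the means on top
points(tmp.x, tmp.mean+3*tmp.sd, pch='-',cex=3,col='green')
points(tmp.x, tmp.mean-3*tmp.sd, pch='-',cex=3,col='green')
points(tmp.x, tmp.mean)
### end script ###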
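For something closer to a pivot table proper, tapply also accepts a list of grouping factors and returns a two-way table. The sketch below is only an illustration: the receiver column and its values are invented, since the original data set was snipped from this thread.

```r
# Hypothetical data: the 'receiver' column is invented for illustration.
pivot.df <- data.frame(
  SNR       = rep(c(4, 6, 8), each = 4),
  receiver  = rep(c("A", "B"), times = 6),
  timeError = c(1.3, 2.1, 1.2, 2.0, 2.1, 2.2, 2.1, 2.4, 3.2, 3.7, 3.1, 3.6)
)

# Rows are SNR levels, columns are receivers, cells are mean timeError.
pivot.mean <- tapply(pivot.df$timeError,
                     list(SNR = pivot.df$SNR, receiver = pivot.df$receiver),
                     mean)
print(pivot.mean)
```

Swapping mean for sd, length, or any other summary function gives the corresponding table from the same call.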
This may be even simpler with an add-on package. A quick search shows
the following functions (package in parens) that may help:
plotCI(gplots) Plot Error Bars and Confidence Intervals
errbar(Hmisc) Plot Error Bars
xYplot(Hmisc) xyplot and dotplot with Matrix Variables to
Plot Error Bars and Bands
plotCI(plotrix) Plot confidence intervals/error bars
errbar(sfsmisc) Scatter Plot with Error Bars
plotCI(sfsmisc) Plot Confidence Intervals / Error Bars
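As a rough sketch of how one of these might replace the segments() and points() calls, here is plotCI from plotrix applied to the same example data. It assumes plotrix is installed (the code skips the plot otherwise), and the choice of 3 standard deviations for the bar half-width simply mirrors the script above.

```r
# Same example data as in the script above.
my.df <- data.frame(SNR = rep(c(4, 6, 8), each = 3),
                    timeError = c(1.3, 2.1, 1.2, 2.1, 2.2, 2.1, 3.2, 3.7, 3.1))

# Per-level summaries as plain vectors.
grp.mean <- tapply(my.df$timeError, my.df$SNR, mean)
grp.x    <- as.numeric(names(grp.mean))
grp.mean <- as.vector(grp.mean)
grp.sd   <- as.vector(tapply(my.df$timeError, my.df$SNR, sd))

# uiw is the half-width of each bar; the lower width defaults to the same.
if (requireNamespace("plotrix", quietly = TRUE)) {
  plotrix::plotCI(grp.x, grp.mean, uiw = 3 * grp.sd,
                  xlab = "SNR", ylab = "timeError")
} else {
  message("plotrix is not installed; skipping the plot")
}
```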
Appreciate any helpful hints from the pros.
hope this helps,
Cheers!
p.s. We've been having rather a good time around the office recently
with
"International Talk Like a Pirate Day" (www.yarr.org.uk). R fits in
very
well: "I be usin' Arrrgghhhh for my post processin'".
Keith Bannister
Greg Snow, Ph.D.
Statistical Data Center, LDS Hospital
Intermountain Health Care
greg.snow at ihc.com
(801) 408-8111
On 9/20/05, Greg Snow <greg.snow at ihc.com> wrote:
"BANNISTER, Keith" <keith.bannister at astrium.eads.net> 09/20/05
09:46AM >>>
[snip]
Just one comment here lest we be arguing against a strawman.
I agree that reproducibility can be a problem with pivot tables
created interactively, and the same applies to just about anything
done interactively in Excel. It should also be realized, though, that
Excel is completely programmable, like R, using VBA or any language
(including R!) via its COM object interface.
The fact that Excel has both an interactive interface and a script-based
interface whereas R has only a script-based interface puts it ahead, not
behind, R in at least some respects.
Sorry, but I can't resist: That very much depends on whether
you are doing something that is appropriate to be done
in a spreadsheet. The set of tasks appropriate for R is
very much larger than the set appropriate for Excel.
http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html
Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
On 9/20/05, Patrick Burns <pburns at pburns.seanet.com> wrote:
Gabor Grothendieck wrote:
...
I certainly don't want to be an apologist for Excel, but I would
not assess its domain of applicability to be a subset of that of
R. I agree with most of the points made in the link you cited,
but it's mainly concerned with stretching the use of spreadsheets
to situations where R would be better.
At the same time, the domain where spreadsheets are appropriate
and preferable is very large and probably exceeds the domain where R
is preferable to Excel, because the financial, accounting
and budgetary work done by every organization is mostly in the domain
of applicability of Excel. Also, I think the link overstates the case,
at least in reference to Excel, since some of the criticisms can
be overcome using Excel's scripting capability.