
Transforming simulation data which is spread across many files into a barplot

7 messages · Ian Bentley, Hadley Wickham, Bert Gunter +1 more

#
On Fri, Jun 11, 2010 at 1:32 PM, Ian Bentley <ian.bentley at gmail.com> wrote:
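# Load data
library(plyr)

# `base` is the directory that holds the .log files
paths <- dir(base, pattern = "\\.log$", full.names = TRUE)
names(paths) <- basename(paths)

# ldply() reads every file and stacks the results; the names of `paths`
# become the .id column. The column names below are an assumption about
# the file layout; adjust them to match the real files.
df <- ldply(paths, read.table, col.names = c("sent", "received"))

# Compute averages:
avg <- ddply(df, ".id", summarise,
  sent = mean(sent),
  received = mean(received)
)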

You can read more about plyr at http://had.co.nz/plyr.

Hadley
#
Ouch! Lousy plot. Instead, plot the 50 (mean sent, mean received) pairs as a
y vs. x scatterplot to see the relationship.
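
A minimal sketch of such a scatterplot, for illustration only. The data here are simulated stand-ins for the 50 per-file averages, and putting mean sent on the x axis and mean received on the y axis is an assumption, since the message does not say which variable goes where.

```r
# Simulated stand-in for the 50 per-file averages (not real results).
set.seed(1)
sent <- cumsum(runif(50, 5, 15))
avg <- data.frame(
  sent     = sent,
  received = 0.8 * sent + rnorm(50, sd = 5)
)

# One point per file: mean received (y) against mean sent (x).
plot(avg$sent, avg$received,
     xlab = "Mean sent", ylab = "Mean received",
     main = "Mean received vs. mean sent, one point per file")
```

With real data, `avg` would be the per-file summary computed earlier in the thread rather than simulated values.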

Bert Gunter
Genentech Nonclinical Biostatistics
 
 

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Hadley Wickham
Sent: Friday, June 11, 2010 11:53 AM
To: Ian Bentley
Cc: r-help at r-project.org
Subject: Re: [R] Transforming simulation data which is spread across
many files into a barplot
#
Try this:

base <- "file" # replace as appropriate
N <- 50
filenames <- paste(base, seq_len(N)*100, ".log", sep = "")
mat <- sapply(filenames, function(fn)
	colMeans(read.table(fn, col.names = c("Sent", "Received")))
)
barplot(mat)
On Fri, Jun 11, 2010 at 2:32 PM, Ian Bentley <ian.bentley at gmail.com> wrote:
#
So two time series? Fair enough. But less is more. Plot them as separate
series of points connected by lines, with different colors for the two
series. Or as two trellis plots. You may also wish to overlay a smooth to
help the reader see the "trend" (e.g., via a loess or other nonparametric
smooth, or perhaps just a fitted line).
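
As a rough sketch of the first option (two colored series with a loess smooth overlaid), using base graphics. All values are simulated for illustration; the `complexity` vector, the axis labels, and the colors are assumptions, not something taken from the original data.

```r
# Simulated stand-in data: one mean per file for each series (not real results).
set.seed(1)
complexity <- seq_len(50) * 100                      # hypothetical complexity levels
sent       <- 0.05 * complexity + rnorm(50, sd = 10)
received   <- 0.04 * complexity + rnorm(50, sd = 10)

# Two series as points joined by lines, one color per series.
matplot(complexity, cbind(sent, received), type = "b", pch = 1, lty = 1,
        col = c("blue", "red"),
        xlab = "Complexity", ylab = "Mean per file")

# Overlay a loess smooth for each series to show the trend.
# `complexity` is already sorted, so the fitted values can be drawn directly.
lines(complexity, fitted(loess(sent ~ complexity)), col = "blue", lwd = 2)
lines(complexity, fitted(loess(received ~ complexity)), col = "red", lwd = 2)

legend("topleft", legend = c("sent", "received"),
       col = c("blue", "red"), lty = 1, pch = 1)
```

For the trellis variant, `lattice::xyplot()` with one panel per series would be the usual route; a straight fitted line from `lm()` could replace the loess smooth if the trend looks linear.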

The only part of a bar that conveys information is the top. The rest of the
fill is "chartjunk" (Tufte's term) and distracts. 


Bert Gunter
Genentech Nonclinical Biostatistics
 
 

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Ian Bentley
Sent: Friday, June 11, 2010 12:15 PM
To: Bert Gunter
Cc: r-help at r-project.org; Hadley Wickham
Subject: Re: [R] Transforming simulation data which is spread
across many files into a barplot

I'm not trying to see the relation between sent and received, but rather to
show how these grow across the increasing complexity of the 50 data points.
On 11 June 2010 15:02, Bert Gunter <gunter.berton at gene.com> wrote:

            