
Histogram and Density on the same graph

11 messages · Trafim, Derek Eder, (Ted Harding) +4 more

#
The lattice function panel.identify() allows the labeling of mouse-selected plot points. 

But ... how does one save the resulting plot? E.g., to png() or pdf()

Thank you! 



# example code
library(lattice)
xyplot(rnorm(10,0,1)~rnorm(10,0,1))
trellis.focus() # left click on plot panel to select 
panel.identify() # left click on selected points.  Right click to exit process
trellis.unfocus()



Derek N. Eder
Gothenburg University

"The most dangerous thing in the jungle is not the snakes, the spiders, the tigers, ...
The most dangerous thing in the jungle is your mind!"
#
On 30-Nov-09 11:09:12, Trafim wrote:
plot() initiates a completely new plot, and therefore erases what
was there before. Use lines(), or points(), instead:

  x <- seq(1,40,1)
  y <- 2*x+1+5*rnorm(length(x))
  
  hist(y,freq = FALSE)
  lines(density(y))

lines() *adds* extra elements (line segments) to an *existing* plot.

See ?plot and ?lines (and also ?points) for more explanation.
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 30-Nov-09                                       Time: 11:27:49
------------------------------ XFMail ------------------------------
#
Trafim,

If you are plotting more than one variable on the same plot, e.g. by using
the lines() or points() function, then the limits of the X and Y axes are
set based on the first variable you plot. So, you would have to set the xlim
and ylim to the limits of the variable with the widest range, otherwise you
would sometimes see some data left out.
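
A minimal sketch of this advice for the histogram-plus-density case discussed in this thread: compute both pieces without drawing them, take their joint range, and only then plot. The names h, d, xlim_all and ylim_all are illustrative and do not come from the original posts.

```r
# Illustrative sketch: derive shared axis limits before drawing anything.
set.seed(12345)
x <- seq(1, 40, 1)
y <- 2 * x + 1 + 5 * rnorm(length(x))

h <- hist(y, plot = FALSE)   # histogram computed but not drawn
d <- density(y)              # kernel density estimate

# Joint limits covering both the bars and the curve
xlim_all <- range(h$breaks, d$x)
ylim_all <- range(0, h$density, d$y)

hist(y, freq = FALSE, xlim = xlim_all, ylim = ylim_all)
lines(d)
```

Because the limits are derived from the objects themselves, nothing has to be read off a printed summary and rounded by hand.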
#
To follow up and expand on Hrishi's advice above:

When you create a plot using plot(), or by a function such as
hist() which (without "add=TRUE") will use plot(), by default R
chooses the limits on the X range and the Y range according to
the values encountered in whatever is being plotted. This will
be done by a rather general rule designed to achieve a "pretty"
result. Subsequent additions to the plot, using say lines() or
points(), will be made using the same limits.

Therefore, to ensure that everything that you want, which you
will add in separate stages, will be wholly visible, you first
need to ascertain what the necessary limits will be by inspecting
all of the elements to find their global minimum and maximum.

Going back to your original example (and using set.seed() for
a reproducible result):

  set.seed(12345)
  x <- seq(1,40,1)
  y <- 2*x+1+5*rnorm(length(x))

  hist(y,freq = FALSE)
  lines(density(y))

You will see that, although the maximum value of density(y) does
not go above the Y range already allocated for hist(y), it seems
that the X range does go beyond the X range which was allocated.
So have a look at density(y):

  density(y)
  # Call:
  #        density.default(x = y)
  # Data: y (40 obs.);      Bandwidth 'bw' = 10.7
  #        x                 y            
  #  Min.   :-28.188   Min.   :1.297e-05  
  #  1st Qu.:  8.843   1st Qu.:1.261e-03  
  #  Median : 45.874   Median :8.337e-03  
  #  Mean   : 45.874   Mean   :6.744e-03  
  #  3rd Qu.: 82.905   3rd Qu.:1.104e-02  
  #  Max.   :119.937   Max.   :1.243e-02  

Therefore the full X range for density(y) needs to be (-30,120). So
start by setting a suitable xlim for the hist(y), and then put
in the lines() for density(y):

  hist(y,freq = FALSE,xlim=c(-30,120))
  lines(density(y))

Now you have the full plot of density(y), but now the X-axis
which is shown only ranges over (0,100). You can change this,
but will not find out how by looking at ?hist, since the secret
is hidden in the "..." which are described as "further arguments
and graphical parameters passed to 'plot.histogram' and thence
to 'title' and 'axis' (if 'plot=TRUE')."

So you need to look at ?plot.histogram which will in turn pass you
on to "...: further graphical parameters to 'title' and 'axis'."

At this point you are almost there, but need to realise that what
you should be looking at is ?axis. Here you find the parameter "at".
So try augmenting the hist() command by setting an "at":

  hist(y,freq = FALSE,xlim=c(-30,120),at=10*(-3:12))
  lines(density(y))

This produces an axis over (-30,120), but with a warning that
"at is not a graphical parameter" (which is a bit mysterious,
since in fact it is what does the job); but the tic-marks
are not placed at every desired value (-30, 100 and 120 are
omitted) -- I confess I do not understand why!
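
One commonly used way to avoid the warning, offered as a sketch and not as the definitive answer: suppress the default X axis with the graphical parameter xaxt = "n", then draw it with an explicit call to axis(), which does take "at" as an argument. The name ticks is illustrative. Note that axis() may still leave out labels that would overlap their neighbours; las or cex.axis can help with that.

```r
set.seed(12345)
x <- seq(1, 40, 1)
y <- 2 * x + 1 + 5 * rnorm(length(x))

ticks <- seq(-30, 120, by = 10)   # same positions as 10*(-3:12)

# Draw the histogram without its default X axis ...
hist(y, freq = FALSE, xlim = c(-30, 120), xaxt = "n")
# ... then add the axis explicitly, where "at" is a proper argument
axis(1, at = ticks, las = 2)      # las = 2: perpendicular labels, less overlap
lines(density(y))
```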

I give the above explanation in detail to illustrate that, when
you go beyond basic use of R's graphical functions you may have
to embark on a possibly lengthy search through a chain of links
in the documentation before you find what you are looking for.
R's graphics, despite apparent simplicity for default simple
usages, is in fact very complicated, and the documentation is
Byzantine!

In this particular case, Aysun's suggestion of first plotting
density(y) and then adding the histogram is simpler -- but now
bear in mind that the heights of the histogram bars will go
above the default limits set when density(y) is plotted -- see:

  plot(density(y))
  hist(y,freq=FALSE,add=TRUE)

So use "ylim" to specify this:

  plot(density(y),xlim=c(-30,120),ylim=c(0,0.015))
  hist(y,freq=FALSE,add=TRUE)

Hoping this helps!
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 30-Nov-09                                       Time: 13:55:18
------------------------------ XFMail ------------------------------
#
It's easy with the ggplot2 package

set.seed(1234)
dataset <- data.frame(x = seq(1,40,1))
dataset$Response <- with(dataset, 2*x+1+5*rnorm(length(x)))

library(ggplot2)
ggplot(dataset, aes(x = Response)) +
  geom_histogram(aes(y = ..density..)) +
  geom_density(colour = "red")
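
In more recent ggplot2 releases (3.4.0 and later, as far as I am aware) the ..density.. notation is deprecated in favour of after_stat(). A sketch of the same plot in that form follows; bins = 10 is an assumption of mine, added only to avoid the default-bin message, and the object name p is illustrative.

```r
library(ggplot2)

set.seed(1234)
dataset <- data.frame(x = seq(1, 40, 1))
dataset$Response <- with(dataset, 2 * x + 1 + 5 * rnorm(length(x)))

# after_stat(density) rescales the bars to density so they share
# a Y scale with the kernel density curve
p <- ggplot(dataset, aes(x = Response)) +
  geom_histogram(aes(y = after_stat(density)), bins = 10) +
  geom_density(colour = "red")
print(p)
```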

HTH,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Research Institute for Nature and Forest
(Instituut voor natuur- en bosonderzoek, INBO)
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Trafim
Sent: Monday 30 November 2009 12:01
To: r-help at r-project.org
Subject: [R] Histogram and Density on the same graph

Dear all,

I cannot find a function which would allow drawing hist and density on
the same graph.

x <- seq(1,40,1)
y <- 2*x+1+5*rnorm(length(x))

hist(y,freq = FALSE)
plot(density(y))

thanks a lot for the help


#
On Mon, Nov 30, 2009 at 12:01:12PM +0100, Trafim wrote:
The package descr has the function histkdnc() which plots a
histogram with kernel density and normal curve. I maintain the
package but this function was written by Dirk Enzmann.