
how to use by() and hist()

3 messages · David White, Brian Ripley, Mark Myatt

#
Hello,

I'm using R 1.2.2 on Sun Solaris.

I have a data frame with 4 levels of factor "type". See the example data
frame below.
     type            token     variance
20   ku n031ku10.10msmeanc  77199422
21   ku n031ku11.10msmeanc  55682249
22   ku n031ku12.10msmeanc  52003965
23   ti n031ti01.10msmeanc  54511040
24   ti n031ti02.10msmeanc  58940197
25   ti n031ti03.10msmeanc  46918442

I'd like to plot histograms by token type as listed in column 1 above.
I tried
by(allvariances[,3], allvariances[,1], hist)
This worked, but it produced histograms with different numbers of bins
for each token type.

I also tried
by(allvariances[,3], allvariances[,1], hist(breaks=4))
to specify the number of bins. hist() then complained that I had not
supplied the data to be plotted.

Can anyone offer any pointers?

A related question: I'd like to see subsets of the data based on the
levels of factor "type". So in the example above, I'd like to see, say,
all observations of type "tu". Any pointers on how to do that?

Thanks,

David

S. David White
sdavidwhite at bigfoot.com
Columbus, Ohio

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
On Sun, 6 May 2001, David White wrote:

            
by(allvariances[,3], allvariances[,1], function(x) hist(x, breaks=4))

or

by(allvariances[,3], allvariances[,1], hist, breaks=4)

(see the documentation of the ... argument in ?by).
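
As a minimal sketch of why the original attempt failed and why the two forms above work: hist(breaks=4) is a complete call, so R evaluates it on the spot, before by() sees anything, and it has no data at that point. The toy data frame and the plot = FALSE argument are assumptions added here so the example runs without opening a graphics device; they are not from the original post.

```r
# Toy data standing in for the poster's data frame (values are made up).
set.seed(1)
allvariances <- data.frame(
  type     = factor(rep(c("ku", "ti"), each = 20)),
  variance = c(rnorm(20, mean = 60, sd = 10), rnorm(20, mean = 50, sd = 5))
)

# hist(breaks = 4) is evaluated immediately and fails: no data was supplied.
failed <- inherits(try(hist(breaks = 4), silent = TRUE), "try-error")

# Form 1: wrap the call in an anonymous function that receives each group.
h1 <- by(allvariances$variance, allvariances$type,
         function(x) hist(x, breaks = 4, plot = FALSE))

# Form 2: pass the function itself; by() forwards the extra arguments to it.
h2 <- by(allvariances$variance, allvariances$type,
         hist, breaks = 4, plot = FALSE)

# Both forms give one histogram object per level of type.
identical(h1[[1]]$counts, h2[[1]]$counts)
```

Dropping plot = FALSE draws the histograms, as in the calls above.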
For the related question, just index the data frame, for example

allvariances[allvariances$type %in% "tu", ]

(The reason for not using == is the handling of missing values.)
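
A small illustration of that difference, assuming a made-up data frame in which one value of type is missing:

```r
# Toy data with one missing type (all values are made up for illustration).
allvariances <- data.frame(
  type     = factor(c("ku", "tu", NA, "tu", "ti")),
  variance = c(77, 55, 52, 54, 58)
)

# == gives NA where type is missing, and indexing rows by NA
# returns a row consisting entirely of NAs.
with_eq <- allvariances[allvariances$type == "tu", ]

# %in% always gives TRUE or FALSE, so only the real matches are kept.
with_in <- allvariances[allvariances$type %in% "tu", ]

nrow(with_eq)  # 3: two matches plus one all-NA row
nrow(with_in)  # 2
```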
1 day later
#
David White <dwhite at ling.ohio-state.edu> writes:
The syntax for by() is 

        by(data, INDICES, FUN, ...) 

with:

        ...     further arguments to FUN.

Try:


        by(allvariances[,3], allvariances[,1], hist, breaks=4)

The arguments to hist() are specified in the ... list.
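
One caveat, offered as a sketch rather than as part of the original answer: when breaks is a single number, hist() treats it as a suggestion and picks "pretty" break points for each group, so the bins can still differ between types. If identical bins are wanted, a vector of break points covering the whole column can be passed through ... in the same way. The toy data, the name common_breaks, and plot = FALSE are assumptions made for illustration.

```r
# Toy data standing in for the poster's data frame (values are made up).
set.seed(42)
allvariances <- data.frame(
  type     = factor(rep(c("ku", "ti"), each = 30)),
  variance = c(rnorm(30, mean = 60, sd = 10), rnorm(30, mean = 50, sd = 3))
)

# Six break points give five equal-width bins spanning the whole column.
common_breaks <- seq(min(allvariances$variance),
                     max(allvariances$variance),
                     length.out = 6)

# The vector is forwarded to hist(), so every group is binned identically.
h <- by(allvariances$variance, allvariances$type,
        hist, breaks = common_breaks, plot = FALSE)

identical(h[[1]]$breaks, h[[2]]$breaks)
```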
To see all subsets use by:

        by(allvariances, allvariances$type, print)

as this uses the default print() method for the data.frame.

To see a single subset use, surprise, subset():

        subset(allvariances, type == "tu")
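
A short sketch of how subset() behaves, using a made-up data frame with one missing type. The condition is evaluated inside the data frame, which is why the bare name type works there, and rows where the condition is NA are dropped. The droplevels() step is an optional extra, not something the original answer mentions.

```r
# Toy data with one missing type (all values are made up for illustration).
allvariances <- data.frame(
  type     = factor(c("ku", "tu", NA, "tu", "ti")),
  variance = c(77, 55, 52, 54, 58)
)

# type is looked up inside allvariances; the NA row is dropped.
tu_rows <- subset(allvariances, type == "tu")
nrow(tu_rows)  # 2

# The unused levels stay on the factor until they are dropped explicitly.
levels(tu_rows$type)
levels(droplevels(tu_rows)$type)
```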

I hope that helps.

Mark

--
Mark Myatt

