
MASS fitdistr with plyr or data.table?

Hi:

Here's one way to do this with plyr and data.table.

## plyr
As Hadley inferred, when using ddply(), it's convenient to write a
function for a generic (sub-)data frame and have it return a data
frame. Here's my function and call:

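f <- function(d) {
    require(MASS)   # provides fitdistr()
    # Fit a Weibull to this group's wind speeds and keep the two estimates
    est <- fitdistr(d$wind_speed, 'weibull')$estimate
    # One-row data frame; ddply() stacks these, one row per group
    data.frame(shape = est[1], scale = est[2])
}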

Notice that f() takes a data frame d as input, uses the wind_speed
component of d to fit the Weibull, and returns a data frame with names
for the estimates.
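
A sketch of the ddply() call, run on simulated data because the original weib.test.too isn't reproduced in this message. The simulated values, the seed, and the name res are my assumptions; the column names site and wind_speed come from the function and output shown here.

```r
library(MASS)   # fitdistr()
library(plyr)   # ddply()

# Hypothetical stand-in for the original data: 10 sites, 100 wind speeds each
set.seed(1)
weib.test.too <- data.frame(
    site       = rep(1:10, each = 100),
    wind_speed = rweibull(1000, shape = 3, scale = 8)
)

# Same f() as in the post: data frame in, one-row data frame out
f <- function(d) {
    est <- fitdistr(d$wind_speed, 'weibull')$estimate
    data.frame(shape = est[1], scale = est[2])
}

# Split by site, apply f() to each piece, stack the one-row results
res <- ddply(weib.test.too, 'site', f)
res
```

On the original data the result was: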
   site    shape    scale
1     1 3.063853 8.049467
2     2 2.945982 8.067252
3     3 2.879392 7.999636
4     4 3.097191 8.084453
5     5 3.091117 8.012450
6     6 2.943254 7.912792
7     7 2.957455 7.947545
8     8 2.975732 7.901587
9     9 3.045563 8.061838
10   10 2.995324 8.056820
Warning messages:
1: In dweibull(x, shape, scale, log) : NaNs produced
2: In dweibull(x, shape, scale, log) : NaNs produced
<snip the other eight>
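
The NaN warnings come from dweibull() being evaluated at a non-positive shape or scale while the optimizer searches. They are usually harmless if the fit converges. One possible way to avoid them, sketched here on a hypothetical simulated sample, is to pass a lower bound, which fitdistr() hands on to optim(); the bound of 0.001 is an arbitrary choice of mine.

```r
library(MASS)

# Hypothetical sample standing in for one site's wind speeds
set.seed(1)
x <- rweibull(100, shape = 3, scale = 8)

# 'lower' is passed through to optim(), keeping both parameters positive
est <- fitdistr(x, 'weibull', lower = 0.001)$estimate
est
```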

## data.table

In writing a function for data.table, you want a variable as input and
a list as output. We therefore modify the above function slightly:

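g <- function(x) {
    require(MASS)   # provides fitdistr()
    # Fit a Weibull to the vector x and keep the two estimates
    est <- fitdistr(x, 'weibull')$estimate
    # Named list; data.table turns each element into a column
    list(shape = est[1], scale = est[2])
}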

library(data.table)
weib.test.dt <- data.table(weib.test.too, key = 'site')
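
A sketch of the grouped call, again on simulated data since the original weib.test.too isn't reproduced in this message. The simulated values, the seed, and the name res.dt are my assumptions.

```r
library(MASS)         # fitdistr()
library(data.table)

# Hypothetical stand-in for the original data: 10 sites, 100 wind speeds each
set.seed(1)
weib.test.too <- data.frame(
    site       = rep(1:10, each = 100),
    wind_speed = rweibull(1000, shape = 3, scale = 8)
)

# Same g() as in the post: vector in, named list out
g <- function(x) {
    est <- fitdistr(x, 'weibull')$estimate
    list(shape = est[1], scale = est[2])
}

weib.test.dt <- data.table(weib.test.too, key = 'site')

# Apply g() to wind_speed within each site; list names become columns
res.dt <- weib.test.dt[, g(wind_speed), by = site]
res.dt
```

On the original data the result was: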
      site    shape    scale
 [1,]    1 3.063853 8.049467
 [2,]    2 2.945982 8.067252
 [3,]    3 2.879392 7.999636
 [4,]    4 3.097191 8.084453
 [5,]    5 3.091117 8.012450
 [6,]    6 2.943254 7.912792
 [7,]    7 2.957455 7.947545
 [8,]    8 2.975732 7.901587
 [9,]    9 3.045563 8.061838
[10,]   10 2.995324 8.056820
## <warnings snipped>

HTH,
Dennis
On Wed, Apr 27, 2011 at 1:55 PM, Justin Haynes <jtor14 at gmail.com> wrote: