truncpareto() - doesn't like my data and odd error message

8 messages · John Hillier, Peter Dalgaard

#
Dear All,


I am attempting to describe a distribution of height data.  It appears roughly linear on a log-log plot, so Pareto seems sensible.  However, the data are only reliable in a limited range (e.g. 2000 to 4800 m). So, I would like to fit a Pareto distribution to the reliable (i.e. truncated) section of the data.


I found truncpareto(), and implemented one of its example uses successfully.  Specifically, the third one at http://www.inside-r.org/packages/cran/vgam/docs/paretoff (also see p.s.).


When I try to run my data, I get the output below. Inputs shown with chevrons.
H_to_fit.Height
   Min.   :2000
   1st Qu.:2281
   Median :2666
   Mean   :2825
   3rd Qu.:3212
   Max.   :4794
Error in eval(expr, envir, enclos) :
  the value of argument 'lower' is too high (requires '0 < lower < min(y)')


This is odd as the usage format is - truncpareto(lower, upper), and varying 2000 to 1900 and 2100 makes no difference. Neither do smaller or larger variations. From the summary I think that my lowest input is 2000, which I am taking as min(y). I have also played with the upper limit.  pdataH has 2117 observations in it.


Is this a data format thing? i.e. of pdataH (I tried a few things, but to no avail)

Is truncpareto sensitive to not converging?

Am I using completely the wrong command?


Thank you in advance for any assistance you can give.


John


p.s. - Example that I did get to run, from <http://www.inside-r.org/packages/cran/vgam/docs/paretoff>.


# Upper truncated Pareto distribution
lower <- 2; upper <- 8; kay <- exp(2)
pdata3 <- data.frame(y = rtruncpareto(n = 100, lower = lower,
                                      upper = upper, shape = kay))
fit3 <- vglm(y ~ 1, truncpareto(lower, upper), data = pdata3, trace = TRUE)
coef(fit3, matrix = TRUE)
c(fit3@misc$lower, fit3@misc$upper)


and output
+                                       upper = upper, shape = kay))
VGLM    linear loop  1 :  loglikelihood = 12.127363
VGLM    linear loop  2 :  loglikelihood = 12.130407
VGLM    linear loop  3 :  loglikelihood = 12.130407
            loge(shape)
(Intercept)    1.955295
[1] 2 8


-------------------------
Dr John Hillier
Senior Lecturer - Physical Geography
Loughborough University
01509 223727
#
Umm, it doesn't seem to have a column called "y"?
#
Thank you Peter,

I believe this might be the way the error message is hard-coded (i.e. it always uses 'y' to describe the input).  Anyway, I changed the first line to
This makes the input 'y' instead of 'H_to_fit.Height', but makes no difference to the outcome/error message.

John

#
Also if you simultaneously change the 2000 to say 1999?

-p
On 10 Mar 2016, at 09:22, John Hillier <J.Hillier at lboro.ac.uk> wrote:
#
Thank you Peter,

Yes, it seems to do the same even if I simultaneously make that change.  Output below.
y       
 Min.   :2000  
 1st Qu.:2281  
 Median :2666  
 Mean   :2825  
 3rd Qu.:3212  
 Max.   :4794
Error in eval(expr, envir, enclos) : 
  the value of argument 'upper' is too low (requires 'max(y) < upper')

#
Look closer....

-pd
#
Dear Peter,

Thank you. Apologies for not looking closer.  It is the end of a long day. Fixed now, and I have learnt more about correctly interpreting R's manual pages.

For the record .... 

Summary: If input to truncpareto() is not explicitly called 'y' it can produce error messages about the values 'lower', which might be confusing.  So, ensure input is called 'y', and that 'lower' and 'upper' are just outside the range of y.

John

#
Actually, the issue is that the left-hand side of the model formula should be the name of the variable under consideration. If you write y ~ 1, there had better be a "y"; without the renaming, you could equally have used H_to_fit.Height ~ 1.

-pd
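
The point about the left-hand side of the formula can be illustrated in base R alone. This is only a rough sketch under stated assumptions: the data frame below is a five-value stand-in built from the summary output quoted in this thread (not the real pdataH), and the stray `y` object is hypothetical. Whether a leftover `y` is what actually triggered the original 'lower' message is a guess, not something the thread confirms.

```r
# Stand-in for pdataH (assumption): five values copied from the summary
# output quoted earlier in this thread, not the real 2117 observations.
pdataH <- data.frame(H_to_fit.Height = c(2000, 2281, 2666, 3212, 4794))

# The response is found by the name on the left-hand side of the formula.
mf_ok <- model.frame(H_to_fit.Height ~ 1, data = pdataH)
nrow(mf_ok)

# A stray `y` in the workspace (hypothetical, e.g. left over from another
# example) is picked up silently when the data frame has no column "y".
y <- c(2.1, 3.5, 7.9)
mf_stray <- model.frame(y ~ 1, data = pdataH)
nrow(mf_stray)

# The two range conditions quoted in the error messages, checked by hand.
lower <- 1999
upper <- 4800
h <- pdataH$H_to_fit.Height
bounds_ok <- (0 < lower) && (lower < min(h)) && (max(h) < upper)
bounds_ok
```

With VGAM loaded, the corresponding fit would presumably follow the pattern of the example in the first message, i.e. vglm(H_to_fit.Height ~ 1, truncpareto(lower, upper), data = pdataH), with 'lower' and 'upper' chosen just outside the range of the data.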