producing a QQ plot.

7 messages · Philip Wong, Stefan Grosse, Joshua Wiley +1 more

#
Hello everyone, I'm a beginner in stats and R, and I'm using R 2.10.1.  I need
to create a multivariate QQ plot.  There are 8 variables, each with 55
observations.  An example of my data so far, just to get my point across:
country     village group    av_expen  P2ary_ed  no_fisher
1       Cook Islands    Aitutaki     D  5239.12747 0.6666667  666.99986
2       Cook Islands     Mangaia     C  4587.36188 0.6021505  207.69228
3       Cook Islands  Palmerston     B  7784.31874 0.1666667   24.00000
...
53 Wallis And Futuna  All Futuna     D 11023.30674 0.2789855 1056.63143
54 Wallis And Futuna      Halalo     B  8783.54979 0.2794118  153.51715
55 Wallis And Futuna     Vailala     A 12231.95400 0.2395833  100.00000

This is where my problem starts.  I used the following command to work out
the Mahalanobis distances before plotting the QQ plot, but it gave the
following error:
Error in FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...) : 
  non-numeric argument to binary operator
In addition: Warning messages:
1: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
3: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
4: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
5: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
6: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
7: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
8: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
9: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
10: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA

Then I thought that maybe I had given the wrong input for the variable.  So I
adjusted the variable, as in the command below, to see whether my assumption
was correct, but I still got the same error:
I have absolutely no clue where I went wrong and don't know how to fix it.
Anyway, I thought I would skip mahalanobis, try the QQ plot directly and see
what happens.  This is the command and the error I got from it:
qqplot(qchisq(ppoints(data),ncol(data),data)
Error in qchisq(p, df, lower.tail, log.p) : 
  Non-numeric argument to mathematical function
#
Dear Philip,

It is difficult to tell what is wrong without a reproducible example.
It would be very helpful if you would provide sample data.  That said,
the most obvious issue from what you have provided is that some of
your data is character.  mean() and var() will not work with character
data.  It needs to be numeric or coercible to numeric.  I would try
specifically excluding the character data (e.g., data[,3:5] from what
I can make out).
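
A minimal sketch of that suggestion, assuming the data frame is called `data` as in the original post. The toy data frame below only reuses a few rows shown earlier in the thread, and `num` and `d2` are names introduced here for illustration. In the sample as printed, the numeric columns appear to start at column 4, so selecting columns by type may be safer than selecting them by position.

```r
# Toy data in the shape of the poster's data frame (values copied from the
# rows shown in the thread; object names are illustrative assumptions).
data <- data.frame(
  country   = c("Cook Islands", "Cook Islands", "Cook Islands",
                "Wallis And Futuna", "Wallis And Futuna", "Wallis And Futuna"),
  village   = c("Aitutaki", "Mangaia", "Palmerston",
                "All Futuna", "Halalo", "Vailala"),
  group     = c("D", "C", "B", "D", "B", "A"),
  av_expen  = c(5239.12747, 4587.36188, 7784.31874,
                11023.30674, 8783.54979, 12231.95400),
  P2ary_ed  = c(0.6666667, 0.6021505, 0.1666667,
                0.2789855, 0.2794118, 0.2395833),
  no_fisher = c(666.99986, 207.69228, 24.00000,
                1056.63143, 153.51715, 100.00000)
)

# Keep only the numeric columns; names and group labels are dropped.
num <- data[, sapply(data, is.numeric), drop = FALSE]

# Squared Mahalanobis distance of each row from the vector of column means.
d2 <- mahalanobis(num, colMeans(num), cov(num))

# Chi-square QQ plot: under multivariate normality the squared distances
# are approximately chi-square with ncol(num) degrees of freedom.
qqplot(qchisq(ppoints(nrow(num)), df = ncol(num)), d2,
       xlab = "Chi-square quantiles",
       ylab = "Squared Mahalanobis distance")
abline(0, 1)
```

With only a handful of rows the plot is not informative; the point is that mean(), cov() and mahalanobis() run once the non-numeric columns are removed.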

HTH,

Josh
On Sat, Mar 27, 2010 at 2:45 AM, Philip Wong <tomb_fighter at hotmail.com> wrote:

  
    
#
Hello, here are the first rows of the data.

country	village	group	av_expen	P2ary_ed	no_fisher	B_Leth	B_Lutjan	Wt_Leth	Wt_Lutjan
Cook Islands	Aitutaki	D	5239.127472	0.666666667	666.9998558	3.286283997	1.971519001	520.6454552	126.2441843
Cook Islands	Mangaia	C	4587.361877	0.602150538	207.69228	0.330248	1.846795	0	0
Cook Islands	Palmerston	B	7784.318736	0.166666667	24.00000002	1.384456001	0.233746	0	57.76351477
Cook Islands	Rarotonga	A	8793.256543	0.764285714	223.8639163	6.790178998	0.751358	51.51418019	30.5970125
French Polynesia	Fakarava	B	7937.3952	0.36	255.3600002	7.485009002	6.282185007	62.28921398	60.39332797
French Polynesia	Maatea	D	12135.84	0.316455696	293.7499998	1.270781	0.526468	1002.39553	648.4578044
French Polynesia	Mataiea	D	12718.57548	0.341880342	2082.386008	2.117207998	0.340852	1830.16527	4239.861263
French Polynesia	Raivavae	B	8741.5104	0.285714286	325.0665956	20.121207	4.458011998	63.49777279	0
French Polynesia	Tikehau	D	6295.66	0.240384615	114.0832839	5.183129001	7.178272997	900.4192224	935.3617853
#
On 27.03.2010 10:45, Philip Wong wrote:
As Joshua already pointed out: you are applying mathematical functions to
names. Your data contain, e.g., village names.

You can exclude those with subset(), generating a new data.frame.

If you want averages by category, e.g. by village, have a look at the
doBy package. It has a function summaryBy() that is well suited to generating
aggregated data with multiple functions.
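
A rough sketch of both ideas, assuming a data frame named `data` with the column names from the thread. The toy rows are copied from the sample data posted above, and `num_only` and `by_group` are names made up here. Base R's aggregate() is swapped in for doBy's summaryBy() so that the sketch runs without extra packages.

```r
# Toy data: six rows copied from the sample posted in the thread.
data <- data.frame(
  country  = c("Cook Islands", "Cook Islands", "French Polynesia",
               "French Polynesia", "French Polynesia", "French Polynesia"),
  village  = c("Aitutaki", "Palmerston", "Fakarava",
               "Maatea", "Mataiea", "Raivavae"),
  group    = c("D", "B", "B", "D", "D", "B"),
  av_expen = c(5239.127472, 7784.318736, 7937.3952,
               12135.84, 12718.57548, 8741.5104),
  P2ary_ed = c(0.666666667, 0.166666667, 0.36,
               0.316455696, 0.341880342, 0.285714286)
)

# Drop the name/label columns, giving a new all-numeric data frame.
num_only <- subset(data, select = -c(country, village, group))

# Group means with base R (used here in place of doBy::summaryBy).
by_group <- aggregate(data[, c("av_expen", "P2ary_ed")],
                      by = list(group = data$group), FUN = mean)
print(by_group)
```

`num_only` can then be passed to functions such as colMeans() or cov(), which fail on the name columns.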

hth
Stefan
#
It is a bit of a side note really, but a convenient way to provide
data (particularly when it is complex) is via dput().  Not only is
this easier to read in, it preserves classes and other handy info.
For instance, once I had played around to get "Cook" and "Islands"
into one column (since there was a space) I could use:

dput(data, file="clipboard") # data is the object being written; it is
# output to the clipboard, which works decently in Windows at least

to get:

######################################################
structure(list(country = structure(c(1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L), .Label = c("Cook Islands", "French Polynesia"), class = "factor"),
    village = structure(c(1L, 4L, 6L, 8L, 2L, 3L, 5L, 7L, 9L), .Label
= c("Aitutaki",
    "Fakarava", "Maatea", "Mangaia", "Mataiea", "Palmerston",
    "Raivavae", "Rarotonga", "Tikehau"), class = "factor"), group =
structure(c(4L,
    3L, 2L, 1L, 2L, 4L, 4L, 2L, 4L), .Label = c("A", "B", "C",
    "D"), class = "factor"), av_expen = c(5239.127472, 4587.361877,
    7784.318736, 8793.256543, 7937.3952, 12135.84, 12718.57548,
    8741.5104, 6295.66), P2ary_ed = c(0.666666667, 0.602150538,
    0.166666667, 0.764285714, 0.36, 0.316455696, 0.341880342,
    0.285714286, 0.240384615), no_fisher = c(666.9998558, 207.69228,
    24.00000002, 223.8639163, 255.3600002, 293.7499998, 2082.386008,
    325.0665956, 114.0832839), B_Leth = c(3.286283997, 0.330248,
    1.384456001, 6.790178998, 7.485009002, 1.270781, 2.117207998,
    20.121207, 5.183129001), B_Lutjan = c(1.971519001, 1.846795,
    0.233746, 0.751358, 6.282185007, 0.526468, 0.340852, 4.458011998,
    7.178272997), Wt_Leth = c(520.6454552, 0, 0, 51.51418019,
    62.28921398, 1002.39553, 1830.16527, 63.49777279, 900.4192224
    ), Wt_Lutjan = c(126.2441843, 0, 57.76351477, 30.5970125,
    60.39332797, 648.4578044, 4239.861263, 0, 935.3617853)), .Names =
c("country",
"village", "group", "av_expen", "P2ary_ed", "no_fisher", "B_Leth",
"B_Lutjan", "Wt_Leth", "Wt_Lutjan"), class = "data.frame", row.names = c(NA,
-9L))
###########################################

This is easily retrievable by copying the entire block of text and using:

dget("clipboard") # read the data into R
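
The same round trip can be sketched without the clipboard, which might help on systems where "clipboard" is not available. This assumes a temporary file is an acceptable stand-in; `x`, `y` and `tf` are illustrative names and the two rows are taken from the sample data.

```r
# A small data frame with a factor column, to show that classes survive.
x <- data.frame(
  country  = factor(c("Cook Islands", "French Polynesia")),
  av_expen = c(5239.127472, 7937.3952)
)

tf <- tempfile()      # temporary file used in place of the clipboard
dput(x, file = tf)    # write the structure(...) text representation
y <- dget(tf)         # read it back into a new object

str(y)                # country should still be a factor
unlink(tf)            # remove the temporary file
```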


Best regards,


Josh
#
On Mar 27, 2010, at 6:45 AM, Joshua Wiley wrote:

            
Just a further note that I hope in no way diminishes the value of  
advice to use dput(); one could also type:

data1 <-    # and then paste the copied material to the console.

It has the advantage of being general to all OS's while the device  
"clipboard" is not. Had you been on a Mac, you would have needed to use:

data_object <-  dget( pipe("pbpaste") )

David Winsemius, MD
West Hartford, CT