Message-ID: <0836935D-106F-4662-B747-C55AD85F5212@me.com>
Date: 2013-11-06T16:52:22Z
From: Marc Schwartz
Subject: Basic question: why does a scatter plot of a variable against itself works like this?
In-Reply-To: <CANdJ3dUC7gEOtoqQc__BoagPXPykfB-w0hgKwBvpPcGOnYS6Rw@mail.gmail.com>

On Nov 6, 2013, at 10:40 AM, Tal Galili <tal.galili at gmail.com> wrote:

> Hello all,
> 
> I just noticed the following behavior of plot:
> x <- c(1,2,9)
> plot(x ~ x) # this is just like doing:
> plot(x)
> # when maybe we would like it to give this:
> plot(x ~ c(x))
> # the same as:
> plot(x ~ I(x))
> 
> I was wondering if there is some reason for this behavior.
> 
> 
> Thanks,
> Tal


Hi Tal,

In your example:

  plot(x ~ x)

the formula method of plot() is called, which essentially does the following internally:

> model.frame(x ~ x)
  x
1 1
2 2
3 9

Note that the result has only a single column, because 'x' appears on both sides of the formula. The plot is therefore based upon 'y' = c(1, 2, 9) and 'x' = 1:3, where 1:3 is NOT the row names of the resulting data frame but the indices of the vector elements in the 'x' column.

This is just like:

  plot(c(1, 2, 9))
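
If you want to check the single-column case yourself, something along these lines should do it. This is only a minimal sketch in base R, and the name 'mf' is mine, chosen for illustration:

```r
x <- c(1, 2, 9)

# 'mf' is an illustrative name, not anything plot() uses internally
mf <- model.frame(x ~ x)

ncol(mf)         # number of columns in the model frame
names(mf)        # name(s) of those columns
seq_along(mf$x)  # the element indices that end up on the horizontal axis
```

With only one column available, there is nothing left to serve as a separate horizontal-axis variable, so the values are plotted against their indices.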


On the other hand:

> model.frame(x ~ c(x))
  x c(x)
1 1    1
2 2    2
3 9    9

> model.frame(x ~ I(x))
  x I(x)
1 1    1
2 2    2
3 9    9


In both of the above cases, you get two columns of data back, so the result is essentially:

  plot(c(1, 2, 9), c(1, 2, 9))
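
The two-column cases can be checked in the same way. Again, this is just a sketch in base R; 'mf_c' and 'mf_I' are names I made up for illustration:

```r
x <- c(1, 2, 9)

# Illustrative names only
mf_c <- model.frame(x ~ c(x))
mf_I <- model.frame(x ~ I(x))

ncol(mf_c)   # columns when the right-hand side is c(x)
ncol(mf_I)   # columns when the right-hand side is I(x)
names(mf_c)
names(mf_I)

# Both columns carry the same values, hence the points fall on a diagonal
all(mf_c[[1]] == mf_c[[2]])
```

Wrapping the right-hand side in c() or I() makes it a different expression from the response, so it is kept as its own column.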


Regards,

Marc Schwartz