thr3ads.net - R help - [R] Basic question: why does a scatter plot of a variable against itself works like this? [Nov 2013]

Tal Galili

2013-Nov-06 16:40 UTC

[R] Basic question: why does a scatter plot of a variable against itself works like this?

Hello all,

I just noticed the following behavior of plot:
x <- c(1,2,9)
plot(x ~ x) # this is just like doing:
plot(x)
# when maybe we would like it to give this:
plot(x ~ c(x))
# the same as:
plot(x ~ I(x))

I was wondering if there is some reason for this behavior.


Thanks,
Tal



----------------Contact
Details:-------------------------------------------------------
Contact me: Tal.Galili@gmail.com |
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
----------------------------------------------------------------------------------------------

Marc Schwartz

2013-Nov-06 16:52 UTC

On Nov 6, 2013, at 10:40 AM, Tal Galili <tal.galili at gmail.com> wrote:
> Hello all,
> 
> I just noticed the following behavior of plot:
> x <- c(1,2,9)
> plot(x ~ x) # this is just like doing:
> plot(x)
> # when maybe we would like it to give this:
> plot(x ~ c(x))
> # the same as:
> plot(x ~ I(x))
> 
> I was wondering if there is some reason for this behavior.
> 
> 
> Thanks,
> Tal

Hi Tal,

In your example:

  plot(x ~ x)

the formula method of plot() is called, which essentially does the following
internally:
> model.frame(x ~ x)
  x
1 1
2 2
3 9

Note that there is only a single column in the result. Thus, the plot is based
upon 'y' = c(1, 2, 9), while 'x' = 1:3, which is NOT the row
names for the resultant data frame, but the indices of the vector elements in
the 'x' column.

This is just like:

  plot(c(1, 2, 9))

On the other hand:
> model.frame(x ~ c(x))
  x c(x)
1 1    1
2 2    2
3 9    9

> model.frame(x ~ I(x))
  x I(x)
1 1    1
2 2    2
3 9    9

In both of the above cases, you get two columns of data back, thus the result is
essentially:

  plot(c(1, 2, 9), c(1, 2, 9))
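
The column counts can be checked directly at the console. This is a minimal sketch using only base R; the names mf_same and mf_wrapped are illustrative and do not appear in the thread:

```r
x <- c(1, 2, 9)

# Same name on both sides: the model frame keeps a single column.
mf_same <- model.frame(x ~ x)

# Wrapping the right-hand side in I() makes it a distinct expression,
# so the model frame keeps two columns.
mf_wrapped <- model.frame(x ~ I(x))

ncol(mf_same)      # 1
ncol(mf_wrapped)   # 2
names(mf_wrapped)  # "x" "I(x)"
```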

Regards,

Marc Schwartz

William Dunlap

2013-Nov-06 16:59 UTC

It probably happens because plot(formula) makes one call to terms(formula) to
analyze the formula. terms() says there is one variable in the formula,
the response, so plot(x ~ x) is the same as plot(seq_along(x), x).
If you give it plot(~x), terms() also says there is one variable, but
no response, so you get the same plot as plot(x, rep(1, length(x))).
This is also the reason that plot(y1+y2 ~ x1+x2) makes one plot of the sum of
y1 and y2 for each term on the right side instead of 4 plots: plot(x1,y1),
plot(x1,y2), plot(x2,y1), and plot(x2,y2).
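
What terms() records can be inspected without any data. The following is a small sketch using base R only; the object names t_both, t_rhs and t_multi are illustrative:

```r
# terms() works on the unevaluated formula, so the variables need not exist.
t_both  <- terms(x ~ x)              # same name on both sides
t_rhs   <- terms(~ x)                # no response at all
t_multi <- terms(y1 + y2 ~ x1 + x2)  # compound left-hand side

attr(t_both, "response")      # 1: the first variable is the response
attr(t_rhs, "response")       # 0: there is no response
attr(t_multi, "term.labels")  # "x1" "x2": one term per right-hand variable
attr(t_multi, "variables")    # list(y1 + y2, x1, x2): the left side is one expression
```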

One could write a plot function that called terms separately on the left and
right sides of the formula.
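
One possible shape for such a function is sketched below. It is only an illustration: plot_each_pair is a hypothetical name, and it swaps in all.vars() on each side of the formula rather than calling terms() twice, so it only sees plain variable names and ignores transformations such as log(x).

```r
# Hypothetical helper: draw one scatter plot per (right-side, left-side)
# pair of variable names, handling the two sides of the formula separately.
plot_each_pair <- function(formula, data = environment(formula)) {
  stopifnot(inherits(formula, "formula"), length(formula) == 3L)
  lhs <- all.vars(formula[[2L]])  # variable names on the left side
  rhs <- all.vars(formula[[3L]])  # variable names on the right side
  pairs <- expand.grid(x = rhs, y = lhs, stringsAsFactors = FALSE)
  for (k in seq_len(nrow(pairs))) {
    # Look each name up in 'data' first, then in the formula's environment.
    xv <- eval(as.name(pairs$x[k]), data, environment(formula))
    yv <- eval(as.name(pairs$y[k]), data, environment(formula))
    plot(xv, yv, xlab = pairs$x[k], ylab = pairs$y[k])
  }
  invisible(pairs)  # return the pairs that were plotted
}
```

Under these assumptions, plot_each_pair(x ~ x) would draw x against itself, and plot_each_pair(y1 + y2 ~ x1 + x2, data = d) would draw four plots for a data frame d holding those columns.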

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

Barry Rowlingson

2013-Nov-06 17:38 UTC

Interestingly, fitting an LM with x on both sides gives a warning, and
then drops it from the RHS, leaving you with just an intercept:
> lm(x~x,data=d)
Call:
lm(formula = x ~ x, data = d)

Coefficients:
(Intercept)
          4

Warning messages:
1: In model.matrix.default(mt, mf, contrasts) :
  the response appeared on the right-hand side and was dropped
2: In model.matrix.default(mt, mf, contrasts) :
  problem with term 1 in model.matrix: no columns are assigned

Yet there's no numerical problem fitting a line through the points once the
predictor is given a different name:

 > d$xx=d$x
 > lm(x~xx,data=d)

Call:
lm(formula = x ~ xx, data = d)

Coefficients:
(Intercept)           xx
  5.128e-16    1.000e+00

It seems to be R saying "Ummm did you really mean to do this? It's
kinda dumb".

I suppose this could occur if you had a nested loop over all columns
in a data frame, fitting an LM with every column, and didn't skip if
i==j

Except of course it doesn't:

 - fit with two indexes set to one:
> i=1;j=1
> lm(d[,i]~d[,j])
Call:
lm(formula = d[, i] ~ d[, j])

Coefficients:
(Intercept)       d[, j]
  5.128e-16    1.000e+00

- fit with two ones:
> lm(d[,1]~d[,1])
Call:
lm(formula = d[, 1] ~ d[, 1])

Coefficients:
(Intercept)
          4

Warning messages:
1: In model.matrix.default(mt, mf, contrasts) :
  the response appeared on the right-hand side and was dropped
2: In model.matrix.default(mt, mf, contrasts) :
  problem with term 1 in model.matrix: no columns are assigned

Obviously this can all be explained in terms of R (or lm's, or
model.matrix's) evaluation schemes, but it seems far from intuitive.
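
The difference between the two cases appears to come down to whether the two sides of the formula are the same unevaluated expression, not whether they hold the same values. A small sketch of that comparison follows; the definition of d is an assumption (a one-column data frame consistent with the intercept of 4 above), and f_index and f_const are illustrative names:

```r
# Assumed data: the thread never shows how d was built.
d <- data.frame(x = c(1, 2, 9))
i <- 1
j <- 1

f_index <- d[, i] ~ d[, j]  # same values, but different expressions
f_const <- d[, 1] ~ d[, 1]  # literally the same expression on both sides

# Compare the left side ([[2]]) with the right side ([[3]]) as language objects.
identical(f_index[[2]], f_index[[3]])  # FALSE
identical(f_const[[2]], f_const[[3]])  # TRUE
```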

Barry



