On Jul 18, 2012, at 05:11 , darnold wrote:
> Hi,
>
> I see a lot of folks verify the regression identity SST = SSE + SSR
> numerically, but I cannot seem to find a proof. I wonder if any folks on
> this list could guide me to a mathematical proof of this fact.
>
Wrong list, isn't it?
http://stats.stackexchange.com/ is -----> _that_ way...
Anyway: any mathematical statistics textbook should have it somewhere. There are
two basic approaches, depending on the level of abstraction one expects from
students.
First principles: Write out SST = sum((y - ybar)^2) = sum(((y - yhat) + (yhat - ybar))^2),
expand the square, and use the normal equations to show that the cross-product
term 2*sum((y - yhat)(yhat - ybar)) is zero. This is a bit tedious, but
straightforward in principle.
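Since the original question mentioned checking the identity numerically: here is a
quick R sketch (x and y are made-up example data) showing the cross-product term
vanish for a model with an intercept, and hence SST = SSE + SSR:

## Made-up data and a simple regression with an intercept
set.seed(1)
x <- rnorm(20); y <- 2 + 3 * x + rnorm(20)
fit  <- lm(y ~ x)
yhat <- fitted(fit)
ybar <- mean(y)
sum((y - yhat) * (yhat - ybar))   # cross-product term: ~ 0 up to rounding error
c(SST = sum((y - ybar)^2),        # and so the sums of squares add up
  SSE.plus.SSR = sum((y - yhat)^2) + sum((yhat - ybar)^2))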
Linear algebra: The least squares fitted values are the orthogonal projection of y
onto a subspace of R^N (N = number of observations). If the model contains an
intercept, both yhat and ybar*1 lie in that subspace, so the vector of residuals
y - yhat is orthogonal to the vector (yhat - ybar), and the N-dimensional version
of the Pythagorean theorem gives
||yhat - ybar||^2 + ||y - yhat||^2 = ||y - ybar||^2
since the three vectors involved form a right-angled triangle.
(http://en.wikipedia.org/wiki/Pythagorean_theorem, scroll down to "Inner
product spaces".)
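A sketch of the projection view, again with made-up data and again assuming an
intercept column in the design matrix:

## Residuals are orthogonal to the column space of X, hence to yhat - ybar
set.seed(2)
x <- runif(25); y <- 1 - 2 * x + rnorm(25)
X    <- model.matrix(~ x)                   # design matrix, includes intercept
H    <- X %*% solve(crossprod(X)) %*% t(X)  # hat (projection) matrix
yhat <- drop(H %*% y)
e    <- y - yhat                            # residual vector
ybar <- mean(y)
crossprod(e, yhat - ybar)                   # ~ 0: the right angle
c(sum((yhat - ybar)^2) + sum(e^2),          # ||yhat - ybar||^2 + ||y - yhat||^2
  sum((y - ybar)^2))                        # ||y - ybar||^2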
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com