thr3ads.net - R help - [R] Curve Fitting/Regression with Multiple Observations [Apr 2010]

Kyeong Soo (Joseph) Kim

2010-Apr-27 17:30 UTC

[R] Curve Fitting/Regression with Multiple Observations

I recently came to realize the true power of R for statistical
analysis -- mainly for post-processing of data from large-scale
simulations -- and have been converting many of my existing Python (SciPy)
scripts to ones based on R and/or Perl.

In the middle of this conversion, I revisited the problem of curve
fitting for simulation data with multiple observations resulting from
repetitions.

In the past, I first processed the simulation data (i.e., multiple y's
from repetitions) to get a mean with a confidence interval for each
value of x (the independent variable) and then applied a spline procedure
to those mean values only (i.e., unique pairs of (x_i, y_i) for i=1,
2, ...) to get a smoothed curve. Because of the rather large confidence
intervals, however, the resulting curves were hardly smooth enough for
my purpose, so I had to fix the function to an exponential and use the
least squares method to fit its parameters to the data.

From a plot with confidence intervals, it's rather easy for one to
visually and manually(?) figure out a smoothed curve for it.
So I'm thinking right now of directly applying a spline (or whatever
regression procedure suits this purpose) to the simulation data with
repetitions rather than to the means. The simulation data in this case
look like this (assuming three repetitions):

# x    y
1      1.2
1      0.9
1      1.3
2      2.2
2      1.7
2      2.0
...      ....
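
For readers following along, here is a minimal sketch in R of the earlier workflow described above: per-x means with confidence intervals, a spline through the means, and an exponential fitted by least squares. Everything in it is an assumption made for illustration: the `sim` data frame is simulated (it is not the poster's data), the exponential shape and noise level are invented, and `splinefun` is only a guess at what "spline procedure" meant.

```r
# Made-up data for illustration only: 10 x values, 3 repetitions each.
set.seed(1)
sim <- data.frame(x = rep(1:10, each = 3))
sim$y <- 1.0 * exp(0.25 * sim$x) + rnorm(nrow(sim), sd = 0.3)

# Per-x mean and the half-width of a 95% t-based confidence interval.
means <- aggregate(y ~ x, data = sim, FUN = mean)
n     <- aggregate(y ~ x, data = sim, FUN = length)$y
s     <- aggregate(y ~ x, data = sim, FUN = sd)$y
means$half_width <- qt(0.975, df = n - 1) * s / sqrt(n)

# Interpolating spline through the means only (assumed earlier approach).
spline_means <- splinefun(means$x, means$y)

# Exponential model fitted by nonlinear least squares to all raw points.
# The starting values are rough guesses; nls() needs them to converge.
exp_fit <- nls(y ~ a * exp(b * x), data = sim, start = list(a = 1, b = 0.2))
coef(exp_fit)
```

An interpolating spline passes through every mean exactly, which is one way the noise in the means could end up in the curve.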

So my idea is to let the spline procedure handle the fluctuations in the
data (i.e., in the repetitions) by itself.
But I wonder whether this direct application of spline procedures to
data with multiple observations makes sense from the statistical
analysis (i.e., theoretical) point of view.
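
The idea above, fitting directly to the data with repetitions, can be sketched with `smooth.spline` from base R. This is only one possible reading of "spline procedure", and the data are again simulated for illustration (the `sim` data frame and its shape are assumptions). According to its documentation, `smooth.spline` pools tied x values, so repeated x can be passed in as they are.

```r
# Made-up data for illustration only: 10 x values, 3 repetitions each.
set.seed(1)
sim <- data.frame(x = rep(1:10, each = 3))
sim$y <- 1.0 * exp(0.25 * sim$x) + rnorm(nrow(sim), sd = 0.3)

# Smoothing spline fitted to all 30 raw observations, ties in x included.
# The amount of smoothing is chosen by generalized cross-validation,
# which is the default when neither df nor spar is given.
fit_raw <- smooth.spline(sim$x, sim$y)

# Evaluate the fitted curve on a fine grid for plotting or later use.
grid  <- seq(min(sim$x), max(sim$x), length.out = 100)
curve <- predict(fit_raw, grid)   # list with components x and y

plot(y ~ x, data = sim, pch = 20)
lines(curve$x, curve$y)
```

For a least-squares criterion, the sum of squares over all repetitions differs from the replication-weighted sum of squares over the per-x means only by a term that does not depend on the curve, so the two views are closely related.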

It may be a stupid question and quite obvious to many, but personally
I don't know where to start.
It would be greatly appreciated if anyone could shed some light on this.

Many thanks in advance,
Joseph

Bert Gunter

2010-Apr-27 18:13 UTC

[R] Curve Fitting/Regression with Multiple Observations

Joseph:

I believe you need to stop inventing your own statistical methods and
consult a professional statistician. I do not think this list is the proper
place to look for a statistics tutorial when your statistical background
appears to be so inadequate for the task.

Sorry to be so direct -- perhaps I am wrong in my assessment. But if I am
even close, would you like an accountant to fix your car or an auto mechanic
to do your taxes?

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
 
 

Gabor Grothendieck

2010-Apr-27 18:35 UTC

[R] Curve Fitting/Regression with Multiple Observations

This will compute a loess curve and plot it:

example(loess)                        # runs the loess help examples, which create cars.lo
plot(dist ~ speed, cars, pch = 20)    # raw data
lines(cars$speed, fitted(cars.lo))    # fitted loess curve

Also this directly plots it but does not give you the values of the
curve separately:

library(lattice)
xyplot(dist ~ speed, cars, type = c("p", "smooth"))
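
Applied to data with repetitions like those in the question, the same loess approach might look like the sketch below. The `sim` data frame is simulated purely for illustration (its shape and noise level are assumptions), and the dashed lines at plus or minus two standard errors are only a rough pointwise band. Using `predict` on a grid returns the values of the curve separately from the plot.

```r
# Made-up data for illustration only: 10 x values, 3 repetitions each.
set.seed(1)
sim <- data.frame(x = rep(1:10, each = 3))
sim$y <- 1.0 * exp(0.25 * sim$x) + rnorm(nrow(sim), sd = 0.3)

# Fit loess to all raw observations; repeated x values are allowed.
sim.lo <- loess(y ~ x, data = sim)

# Curve values and standard errors on a grid inside the observed x range
# (by default loess does not extrapolate outside that range).
grid <- data.frame(x = seq(1, 10, by = 0.25))
pred <- predict(sim.lo, newdata = grid, se = TRUE)

plot(y ~ x, data = sim, pch = 20)
lines(grid$x, pred$fit)
lines(grid$x, pred$fit + 2 * pred$se.fit, lty = 2)
lines(grid$x, pred$fit - 2 * pred$se.fit, lty = 2)
```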


