Hi there,
I have a data frame DF holding short-term trade data for over 600 people,
in time order. Below is the simplified structure of the data.
    id invest payout
[1]  1     10     -1
[2]  1     33     33
[3]  1     20     -5
[4]  2    200     33
[5]  2     33    -20
[6]  3      5     -5
[7]  3      5     -5
id is each person's id. Each person has invested many times in the sampling
period, and the rows are in temporal order.
What I want to check is the correlation between invest and payout.
1. How do I run the regression for each person, with "invest" being
divided by the mean or median of that person's "invest"?
2. How do I plot a graph with y axis being invest/mean(invest) and x axis
being payout, all 600 people's dots superimposed on one graph?
I tried to use
for (i in 1:(dim (DF)[1]-1))
{
if (DF[i,1] == DF[i+1,1]) id.lm <- lm(invest ~ payout, data=DF)
}
But I don't know how to superimpose graphs onto each other.
Thanks a lot!
Su
Here is a start. This should create the plot:

> x <- read.table(textConnection(" id invest payout
+ 1 10 -1
+ 1 33 33
+ 1 20 -5
+ 2 200 33
+ 2 33 -20
+ 3 5 -5
+ 3 5 -5"), header = TRUE)
> closeAllConnections()
> # normalize 'invest' by the mean
> x$investNorm <- ave(x$invest, x$id, FUN = function(a) a / mean(a))
> x
  id invest payout investNorm
1  1     10     -1  0.4761905
2  1     33     33  1.5714286
3  1     20     -5  0.9523810
4  2    200     33  1.7167382
5  2     33    -20  0.2832618
6  3      5     -5  1.0000000
7  3      5     -5  1.0000000
> # plot the points
> plot(x$payout, x$investNorm)

On Sat, Apr 16, 2011 at 1:17 PM, Su Jiangdong <sujiangdong at gmail.com> wrote:
> 1. How do I run the regression for each person, with the "invest" being
> divided by the mean or median of the person's "invest"?
> 2. How do I plot a graph with y axis being invest/mean(invest) and x axis
> being payout, all 600 people's dots superimposed on one graph?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
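The question also mentions dividing by the median rather than the mean. A minimal sketch of that variant, assuming the same toy data as above; the column name investNormMed is made up for illustration and is not from the thread:

```r
# Toy data from the thread, built directly as a data frame
x <- data.frame(
  id     = c(1, 1, 1, 2, 2, 3, 3),
  invest = c(10, 33, 20, 200, 33, 5, 5),
  payout = c(-1, 33, -5, 33, -20, -5, -5)
)

# Divide each person's 'invest' by that person's median 'invest'.
# ave() applies FUN within each id group and returns a vector
# aligned with the original rows.
x$investNormMed <- ave(x$invest, x$id, FUN = function(a) a / median(a))

print(x)
```

With only two trades for a person, the median equals the mean, so the two normalizations only differ for people with three or more trades.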
This will add the regression line to it:

> x <- read.table(textConnection(" id invest payout
+ 1 10 -1
+ 1 33 33
+ 1 20 -5
+ 2 200 33
+ 2 33 -20
+ 3 5 -5
+ 3 5 -5"), header = TRUE)
> closeAllConnections()
> # normalize 'invest' by the mean
> x$investNorm <- ave(x$invest, x$id, FUN = function(a) a / mean(a))
> x
  id invest payout investNorm
1  1     10     -1  0.4761905
2  1     33     33  1.5714286
3  1     20     -5  0.9523810
4  2    200     33  1.7167382
5  2     33    -20  0.2832618
6  3      5     -5  1.0000000
7  3      5     -5  1.0000000
> # plot the points
> plot(x$payout, x$investNorm)
> abline(lm(x$investNorm ~ x$payout))

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
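The line above is a single fit pooled over everyone. For question 1 (one regression per person), one possible approach is to split the data by id and fit lm() on each piece. This is only a sketch on the toy data, not something proposed in the thread; the names fits and coefs are made up for illustration:

```r
# Toy data from the thread
x <- data.frame(
  id     = c(1, 1, 1, 2, 2, 3, 3),
  invest = c(10, 33, 20, 200, 33, 5, 5),
  payout = c(-1, 33, -5, 33, -20, -5, -5)
)
x$investNorm <- ave(x$invest, x$id, FUN = function(a) a / mean(a))

# One regression per person: split the rows by id, fit each subset
fits <- lapply(split(x, x$id), function(d) lm(investNorm ~ payout, data = d))

# Collect intercept and slope per id (rows are ids). A slope is NA when
# a person's payout never varies, as for id 3 here.
coefs <- t(sapply(fits, coef))
print(coefs)

# All points on one graph, colored by person
plot(x$payout, x$investNorm, col = as.integer(factor(x$id)), pch = 19,
     xlab = "payout", ylab = "invest / mean(invest)")

# Overlay each person's fitted line, skipping fits with an NA coefficient
for (f in fits) {
  if (!anyNA(coef(f))) abline(f, lty = 2)
}
```

With 600 people the per-person lines will likely be too crowded to read, so it may be more useful to look at the distribution of slopes, for example hist(coefs[, "payout"]).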