Hi everyone,

I've got a question concerning the function Box.test for testing autocorrelation in my data.

My data consist of daily returns of several stocks over time (first row = time, all other rows = stock returns). I intend to perform a Box-Ljung test on the returns of each stock. Since I have about 3000 stocks in my list, I can't run the test by hand for each one, and Box.test only works on univariate series. My goal is a list with the p-value from each of the 3000 tests (that is, a list with 3000 p-values). Any hint how to do this? I tried the function mapply, but it didn't work.

Many thanks in advance & best regards
S.B.
Did you try regular apply? If you have univariate input, there's no reason to use the multivariate mapply. Or more generally:

    apply(P[-1,],1,function(p) Box.test(p)$p.value)

Michael

On Tue, Sep 27, 2011 at 4:45 AM, Samir Benzerfa <benzerfa@gmx.ch> wrote:
> I've got a question concerning the function Box.test for testing
> autocorrelation in my data. [...]
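To make the apply suggestion concrete, here is a minimal sketch on made-up data. The matrix `ret`, its stock names, and the helper `lb_p` are hypothetical; the layout assumes one stock per row, as described in the question, and there is no time row here, so nothing is dropped with `[-1, ]`. The `type = "Ljung-Box"` argument is added because the question asks for the Box-Ljung variant.

```r
# Hypothetical returns: 3 stocks (rows) observed over 50 days (columns).
set.seed(1)
ret <- matrix(rnorm(3 * 50, sd = 0.01), nrow = 3,
              dimnames = list(c("stockA", "stockB", "stockC"), NULL))

# Helper returning only the p-value of a Ljung-Box test on one series.
lb_p <- function(x) Box.test(x, type = "Ljung-Box")$p.value

# MARGIN = 1 passes each row to lb_p as a plain numeric vector.
p_by_row <- apply(ret, 1, lb_p)

# The same series stored column-wise need MARGIN = 2 instead.
p_by_col <- apply(t(ret), 2, lb_p)

print(p_by_row)
```

The second argument of apply is what decides whether rows or columns are treated as the individual series, so it has to match how the data are actually stored.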
Send this again using dput() to give a plain text output and I'll look at it.

Also, I think you should probably look into the difference between a row and a column.

Michael

On Tue, Sep 27, 2011 at 11:48 AM, Samir Benzerfa <benzerfa@gmx.ch> wrote:
> Many thanks for your hint. I tried regular apply now. However, it still
> doesn't work. Function apply works fine with other regular functions like
> sum or mean. But for the function Box.test(x,…) it gives me the following
> error message:
>
>   Error in Box.test(…) :
>     x is not a vector or univariate time series
>
> For simplicity, I tried to do the test with a simple 2x20 matrix for 2
> stocks (see below), but it still does not work. It works well if I do the
> test individually for each row -> Box.test(x[,1],…) and Box.test(x[,2],…)
>
>        BANK.ABC  ABC.MATERIAL
>  1  0.000000000   0.000000000
>  2  0.000000000   0.000000000
>  3  0.000000000   0.000000000
>  4  0.003181659  -0.008194479
>  5 -0.006386799  -0.008352074
>  6  0.028028724   0.008352074
>  7 -0.015347692   0.004116566
>  8 -0.015910002   0.016086820
>  9  0.003228970   0.019305155
> 10 -0.013062473  -0.011479818
> 11  0.000000000   0.000000000
> 12 -0.038090050  -0.011791525
> 13  0.021189299  -0.008042720
> 14 -0.003460532  -0.008194479
> 15 -0.010550182  -0.012589127
> 16  0.017443890   0.016705694
> 17  0.010139631   0.000000000
> 18 -0.017033339   0.012120633
> 19  0.010299957   0.023271342
> 20  0.000000000  -0.007619397
>
> Any other hints? My goal is to do the Box.test for each row (for each
> stock) separately. So I want R to take each row one by one and perform the
> test.
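The error message reported in this exchange is the one Box.test gives when it receives an object with more than one column. The sketch below reproduces that on made-up data; the matrix `X` only borrows the two column names from the message, its values are random, and how the original call ended up passing the whole object is not something this sketch assumes.

```r
# Hypothetical 20 x 2 matrix of returns, one stock per column.
set.seed(42)
X <- matrix(rnorm(40, sd = 0.01), ncol = 2,
            dimnames = list(NULL, c("BANK.ABC", "ABC.MATERIAL")))

# Handing Box.test the whole matrix fails; capture the message instead of stopping.
msg <- tryCatch(Box.test(X), error = function(e) conditionMessage(e))
print(msg)

# Handing it one column at a time works, because each column is a plain vector.
p1 <- Box.test(X[, 1])$p.value
p2 <- Box.test(X[, 2])$p.value
print(c(p1, p2))
```

So when this error shows up inside an apply call, a reasonable first check is whether the function really receives a single row or column rather than the full object.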
Plaintext data looks like this:

P <- structure(list(X77.BANK = c(0, 0, 0, 0.003181659, -0.006386799,
0.028028724, -0.015347692, -0.015910002, 0.00322897, -0.013062473,
0, -0.03809005, 0.021189299, -0.003460532, -0.010550182, 0.01744389,
0.010139631, -0.017033339, 0.010299957, 0), A...A.MATERIAL = c(0,
0, 0, -0.008194479, -0.008352074, 0.008352074, 0.004116566, 0.01608682,
0.019305155, -0.011479818, 0, -0.011791525, -0.00804272, -0.008194479,
-0.012589127, 0.016705694, 0, 0.012120633, 0.023271342, -0.007619397
)), .Names = c("X77.BANK", "A...A.MATERIAL"), row.names = c(NA,
20L), class = "data.frame")

Can you provide your test code exactly? The following both work for me:

R> apply(P,2,Box.test)
$X77.BANK

        Box-Pierce test

data:  newX[, i]
X-squared = 3.1825, df = 1, p-value = 0.07443


$A...A.MATERIAL

        Box-Pierce test

data:  newX[, i]
X-squared = 0.3258, df = 1, p-value = 0.5682

R> apply(P,2,function(x) Box.test(x)$p.value)
      X77.BANK A...A.MATERIAL
    0.07443097     0.56815070

Michael

On Tue, Sep 27, 2011 at 5:42 PM, Samir Benzerfa <benzerfa@gmx.ch> wrote:
> Please find the plain text output in the attachment.
>
> Yes, I'm sorry! I apologize for the confusion of rows and columns.
> Actually, I want to perform the test for each column and not row.
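One detail worth noting: the output in this exchange is labelled "Box-Pierce test", which is the default of Box.test, while the original question asked for Box-Ljung. The sketch below runs both variants column by column on the two series posted in the thread. Using vapply instead of apply, and the result names `p_bp` and `p_lb`, are choices made for this illustration, not something from the thread.

```r
# The two return series posted in the thread, one stock per column.
P <- data.frame(
  X77.BANK = c(0, 0, 0, 0.003181659, -0.006386799, 0.028028724,
               -0.015347692, -0.015910002, 0.00322897, -0.013062473,
               0, -0.03809005, 0.021189299, -0.003460532, -0.010550182,
               0.01744389, 0.010139631, -0.017033339, 0.010299957, 0),
  A...A.MATERIAL = c(0, 0, 0, -0.008194479, -0.008352074, 0.008352074,
                     0.004116566, 0.01608682, 0.019305155, -0.011479818,
                     0, -0.011791525, -0.00804272, -0.008194479,
                     -0.012589127, 0.016705694, 0, 0.012120633,
                     0.023271342, -0.007619397)
)

# vapply walks over the columns of a data frame and checks that every
# result is a single number, which is handy with thousands of columns.
p_bp <- vapply(P, function(x) Box.test(x)$p.value, numeric(1))
p_lb <- vapply(P, function(x) Box.test(x, type = "Ljung-Box")$p.value,
               numeric(1))

print(rbind(Box.Pierce = p_bp, Ljung.Box = p_lb))
```

At lag 1 the Ljung-Box statistic is the Box-Pierce statistic scaled by (n + 2) / (n - 1), so with only 20 observations the two p-values differ noticeably.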
Well, you could do the following to throw out the NAs on a column basis before they are passed to Box.test:

    apply(P, 2, function(x) Box.test(na.omit(x))$p.value)

and that probably will work for you. This will only throw out the NAs in each column and won't cancel across columns.

However, all the standard points about throwing away data apply: specifically, you should ask yourself why you have NA returns on a certain day -- perhaps that is itself deeply significant -- and make sure the pattern is random (enough).

Hope this helps,

Michael Weylandt

On Wed, Sep 28, 2011 at 5:11 AM, Samir Benzerfa <benzerfa@gmx.ch> wrote:
> Many many thanks! My code looked like this:
>
>     apply(P,2, FUN=Box.test(P)$p.value)
>
> So the problem was that I didn't include "function(x)" before
> Box.test(x)$p.value and that I used my object "P" in the brackets instead of
> x. It works fine now.
>
> I have one last question regarding this problem: The help file for Box.test
> states that "Missing values are not handled". Any idea what that technically
> means? Does it mean that NA's are included in the calculations or does R
> just skip missing values? Since I have many NA values in my data I would
> like R to exclude these values temporarily for the calculations.
> Specifically I'd like R to consider each column without any NA's before
> performing the Box.test for that column and then pass to the next one, such
> that I get the test results for each column as if there were no NA's in it.
> I tried to do this with na.rm=T or na.action=… as an argument for the
> function apply, but R refuses (error message: error in FUN(…); unused
> argument na.rm=T). I know that I could clean my data from NA's by using
> na.omit or na.exclude, but this cancels all rows which include an NA and
> since I have about 3'000 columns (stocks) and each one includes several
> different NA's I would end up with very few data. Any hints for this issue?
>
> Best, S.B.
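The difference between dropping NAs for the whole table and dropping them one column at a time can be sketched as follows. The data frame `Q`, its column names, and the positions of the missing values are all made up for illustration; `type = "Ljung-Box"` is again an addition to match the original question.

```r
# Hypothetical returns for 3 stocks over 20 days, with NAs on different days.
set.seed(7)
Q <- as.data.frame(matrix(rnorm(60, sd = 0.01), ncol = 3,
                          dimnames = list(NULL, c("s1", "s2", "s3"))))
Q$s1[c(2, 9)] <- NA
Q$s2[c(5, 6, 15)] <- NA

# na.omit on the whole data frame removes every row that has an NA anywhere,
# so all three stocks lose the same five days.
rows_left <- nrow(na.omit(Q))

# na.omit inside the function only sees one column at a time, so each stock
# keeps all of its own non-missing observations.
n_used <- vapply(Q, function(x) length(na.omit(x)), integer(1))
p_vals <- vapply(Q, function(x) Box.test(na.omit(x), type = "Ljung-Box")$p.value,
                 numeric(1))

print(rows_left)
print(n_used)
print(p_vals)
```

One thing to keep in mind with this approach: na.omit closes the gaps, so the observations on either side of a missing day are treated as consecutive when the autocorrelation is computed.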