I have never used R-help to pose a question to the R-users community; is sending this email the right way to do so?

I am trying to use the ddply function in the plyr package to accomplish the following. I have a data frame of the type:

     ticker monthend_n wgtdiff    ret
 156     AA   19990228  0.7172  -2.58
 545   AAPL   19990228 -0.0828 -15.48
 925   ABCW   19990228  0.0966  -7.36
1041   ABFS   19990228  0.1320  -8.89
1165    ABI   19990228  0.2355   4.61
1482    ABS   19990228  0.1668  -6.56
1563    ABT   19990228  0.1650  -0.27
1790   ACAT   19990228  0.1540 -13.82
2498    ACN   19990228  0.0000  12.15
2532    ACO   19990228  0.1320   8.48
2857    ACV   19990228  0.1540  -6.54
2942   ACXM   19990228  0.0000  -6.13
3303   ADCT   19990228  0.1035   1.73
3568    ADM   19990228  0.1540   0.33
4072   ADSK   19990228 -0.1035  -9.19
4672    AEH   19990228  0.1650     NA
4673   AEIC   19990228  0.1314  -6.95
4867    AEP   19990228  0.1540  -3.62
 157     AA   19990331  0.1932   1.70
 546   AAPL   19990331  0.0330   3.23
1005    ABF   19990331  0.1540 -20.51
1166    ABI   19990331  0.2860   8.33
1255    ABK   19990331  0.0966  -3.57
1483    ABS   19990331  0.0000  -4.50
1564    ABT   19990331  0.3955   1.08
1733    ABX   19990331  0.2340  -3.53
2533    ACO   19990331  0.0966   5.26
3304   ADCT   19990331  0.2925  17.75
3418    ADI   19990331  0.2688  18.70
3724    ADP   19990331  0.1540 -38.43
4514    AEE   19990331  0.1540  -1.31
4868    AEP   19990331 -0.0966  -4.65

I am trying to generate quintile cutoff points across the distribution of tickers for every month, using the command:

    result <- ddply(test, .(monthend_n), .fun=cut, test$wgtdiff,5)

I get the message:

    Error in cut.default(piece, ...) : 'x' must be numeric

I tried creating a monthly list of data frames, extracting the wgtdiff column and passing that into the cut function, but that did not work either (as below):

    pieces <- split(test,test$monthend_n)
    vectors<- lapply(pieces,"[[","wgtdiff")
    quintiles <- lapply(vectors,cut(vectors[1:2],5))
    Error in cut.default(vectors[1:2], 5) : 'x' must be numeric

However, the cut function does the job correctly when I pass it only an individual month's data, as below:

    first <- pieces[[1]]
    quintiles <- cut(first$wgtdiff,5)
    levels(quintiles)

What is the correct way to solve this problem?

Thanks for your help, everyone!

Hi,

Try:

    ddply(test, .(monthend_n), mutate, quintiles = cut(wgtdiff, 5))

A.K.

On Monday, January 13, 2014 5:32 PM, Amitabh Dugar <cleverchap at yahoo.com> wrote:
> I am trying to generate quintile cutoff points across the distribution of tickers for every month [...]
> What is the correct way to solve this problem?

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
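
For anyone who wants to stay in base R, here is a minimal sketch of the same per-month idea. It assumes a small sample taken from the posted rows, not the full data. Both failing attempts gave cut() something non-numeric: ddply handed it each monthly piece, which is a whole data frame, and cut(vectors[1:2], 5) was evaluated on a list before lapply ever ran. Passing the function itself, with its extra argument after it, avoids that.

```r
# A few of the posted rows, same column names as the original data frame
test <- data.frame(
  ticker     = c("AA", "AAPL", "ABCW", "ABFS", "ABI",
                 "AA", "AAPL", "ABF", "ABI", "ABK"),
  monthend_n = rep(c(19990228, 19990331), each = 5),
  wgtdiff    = c(0.7172, -0.0828, 0.0966, 0.1320, 0.2355,
                 0.1932,  0.0330, 0.1540, 0.2860, 0.0966)
)

# Split the wgtdiff column by month: one numeric vector per month
vectors <- split(test$wgtdiff, test$monthend_n)

# Give lapply() the function and its extra argument; it then calls
# cut(v, breaks = 5) once for each month's vector v
quintiles <- lapply(vectors, cut, breaks = 5)

# Interval labels for each month
lapply(quintiles, levels)

# ave() does the same per-month cut and returns the bin number (1 to 5)
# for every row, in the original row order
test$bin <- ave(test$wgtdiff, test$monthend_n,
                FUN = function(v) as.integer(cut(v, breaks = 5)))
test
```

The ave() call is only one way to attach the result to the rows; the mutate approach above does the same job inside ddply.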

On Jan 13, 2014, at 1:29 PM, Amitabh Dugar wrote:

> I am trying to generate quintile cutoff points across the distribution of tickers for every month, using the command:
>> result <- ddply(test, .(monthend_n), .fun=cut, test$wgtdiff,5)
>
> I get the message:
> Error in cut.default(piece, ...) : 'x' must be numeric
>
> [...]
>
> What is the correct way to solve this problem?

This will deliver classification results within categories of monthend_n. You should not need to supply the data name as test$wgtdiff.

    result <- ddply(test, .(monthend_n), summarise, cut(wgtdiff, breaks=5))

David Winsemius
Alameda, CA, USA
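
One point that may matter for the original goal: cut(x, 5) divides the range of x into five equal-width intervals, which is not the same as quintiles (five groups of roughly equal size). If true quintile cutoffs are wanted, a hedged sketch is to compute the breaks with quantile() and pass them to cut(). The vector below is a handful of the posted February values, used only for illustration.

```r
# Ten of the posted wgtdiff values for 19990228 (illustrative subset)
wgtdiff <- c(0.7172, -0.0828, 0.0966, 0.1320, 0.2355,
             0.1668,  0.1650, 0.1540, 0.0000, 0.1320)

# Equal-width bins: what cut(x, 5) produces
equal_width <- cut(wgtdiff, breaks = 5)
table(equal_width)

# Quintile cutoffs: the 0%, 20%, ..., 100% points of the distribution
cutoffs <- quantile(wgtdiff, probs = seq(0, 1, by = 0.2), na.rm = TRUE)
cutoffs

# include.lowest = TRUE keeps the minimum value in the first bin;
# labels = FALSE returns the bin number instead of an interval label
by_quintile <- cut(wgtdiff, breaks = cutoffs,
                   include.lowest = TRUE, labels = FALSE)
table(by_quintile)
```

A caveat: when many values are tied (the full data has several rows at 0.1540 and 0.0000), two quantiles can coincide, and cut() then stops with an error that the breaks are not unique. In that case the breaks need to be deduplicated or the ties handled some other way, and the groups will no longer be equal in size.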