thr3ads.net - R help - [R] data.frame and ddply [Apr 2010]

If this information is useful, please help other people find it:
Share via:

arnaud Gaboury

2010-Apr-15 16:04 UTC

[R] data.frame and ddply

Dear group,

I have this following data.frame:

> c      DESCRIPTION CREATED.DATE QUANITY CLOSING.PRICE
26 PRM HGH GD ALU   2010-04-09      -1    2,415.9000
27 PRM HGH GD ALU   2010-04-09       1    2,415.9000
28 PRIMARY NICKEL   2010-03-04       1   25,755.7100
29 PRIMARY NICKEL   2010-03-05      -1   25,755.7100
30 PRIMARY NICKEL   2010-03-10      -1   25,760.8600
31 PRIMARY NICKEL   2010-03-10       1   25,760.8600
32 STANDARD LEAD    2010-04-01       1    2,355.9600
33 STANDARD LEAD    2010-04-01      -1    2,355.9600
34 STANDARD LEAD    2010-04-01       1    2,355.9600
35 STANDARD LEAD    2010-04-01      -1    2,355.9600
36 STANDARD LEAD    2010-04-01      -1    2,355.9600
37 STANDARD LEAD    2010-04-01       1    2,355.9600
38 STANDARD LEAD    2010-04-01      -1    2,355.9600
39 STANDARD LEAD    2010-04-06       1    2,357.1200
40 STANDARD LEAD    2010-04-08       1    2,420.7300
41 SPCL HIGH GRAD   2010-04-08       1    2,420.7300
42 SPCL HIGH GRAD   2010-04-08      -1    2,420.7300
43 SPCL HIGH GRAD   2010-04-09      -1    2,421.0500
44 SPCL HIGH GRAD   2010-04-09       1    2,421.0500
45 SPCL HIGH GRAD   2010-04-09      -1    2,421.0500
46 SPCL HIGH GRAD   2010-04-09       1    2,421.0500
47 SPCL HIGH GRAD   2010-04-09       1    2,421.0500
48 SPCL HIGH GRAD   2010-01-13       1    2,388.4300
49 SPCL HIGH GRAD   2010-01-25      -1    2,388.4300

The goal is to get a smaller df, with only PRM HGH GD ALU, PRIMARY NICKEL,
STANDARD LEAD, STANDARD LEAD as rows, with QUANITY column as a sum of
QUANITY, DATE column as the max CREATED.DATE with the corresponding
CLOSING.PRICE.

If I pass this :

    >  op=ddply(c, c("DESCRIPTION"), summarise,
POSITIONsum(QUANITY),DATE=max(CREATED.DATE))

It returns this :
> op     DESCRIPTION POSITION       DATE
1 PRIMARY NICKEL        0 2010-03-10
2 PRM HGH GD ALU        0 2010-04-09
3 SPCL HIGH GRAD        2 2010-04-09
4 STANDARD LEAD         0 2010-04-06

Not bad, as I have my 4 elements, the sum of QUANITY for each one, the max
DATE for each one, BUT I would like to add the CLOSING.PRICE column with the
CLOSING.PRICE corresponding to the max date.
I have no idea how to do this.

Any help would be appreciated.

TY

arnaud Gaboury

2010-Apr-16 13:28 UTC

head link

[R] data.frame and ddply

Dear group,

Here is my df :


futures <-
structure(list(CONTRAT = c("WHEAT May/10 ", "WHEAT May/10 ",
"WHEAT May/10 ", "WHEAT May/10 ", "COTTON NO.2 May/10
", "COTTON NO.2 May/10
", 
"COTTON NO.2 May/10 ", "PLATINUM Jul/10 ", " SUGAR
NO.11 May/10 ",
" SUGAR NO.11 May/10 ", " SUGAR NO.11 May/10 ", " SUGAR
NO.11 May/10 ",
" SUGAR NO.11 May/10 ", "ROBUSTA COFFEE (10) May/10 ",
"ROBUSTA COFFEE (10)
May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10
",
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10
",
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10
",
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10
",
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10
"),
    QUANTITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1, 
    1, 1, 2, 1, 1, 1, 1, 2, 1, 1), SETTLEMENT = c("467.7500", 
    "467.7500", "467.7500", "467.7500",
"78.1300", "78.1300",
    "78.1300", "1,739.4000", "16.5400",
"16.5400", "16.5400",
    "16.5400", "16.5400", "1,353.0000",
"1,353.0000", "1,353.0000",
    "1,353.0000", "1,353.0000", "1,353.0000",
"1,353.0000", "1,353.0000",
    "1,353.0000", "1,353.0000", "1,353.0000",
"1,353.0000")), .Names c("CONTRAT",
"QUANTITY", "SETTLEMENT"), row.names = c(NA, 25L), class =
"data.frame")

Here is my code :
>opfut=ddply(futures, c("CONTRAT","SETTLEMENT"),
summarise, POSITIONsum(QUANTITY))
Here is the output:
> opfut                      CONTRAT SETTLEMENT POSITION
1         SUGAR NO.11 May/10     16.5400        5
2         COTTON NO.2 May/10     78.1300        3
3            PLATINUM Jul/10  1,739.4000       -1
4 ROBUSTA COFFEE (10) May/10  1,353.0000       15
5               WHEAT May/10    467.7500        4

It is almost exactly what I want, except I am expecting the POSITION column
before the SETTLEMENT column. How can I modified my code to obtain this?

TY



***************************
Arnaud Gaboury
Mobile: +41 79 392 79 56
BBM: 255B488F

David Winsemius

2010-Apr-16 15:54 UTC

head link

[R] data.frame and ddply

On Apr 16, 2010, at 9:28 AM, arnaud Gaboury wrote:
> Dear group,
>
> Here is my df :
>
>
> futures <-
> structure(list(CONTRAT = c("WHEAT May/10 ", "WHEAT May/10
",
> "WHEAT May/10 ", "WHEAT May/10 ", "COTTON NO.2
May/10 ", "COTTON NO.
> 2 May/10
> ",
> "COTTON NO.2 May/10 ", "PLATINUM Jul/10 ", " SUGAR
NO.11 May/10 ",
> " SUGAR NO.11 May/10 ", " SUGAR NO.11 May/10 ", "
SUGAR NO.11 May/10
> ",
> " SUGAR NO.11 May/10 ", "ROBUSTA COFFEE (10) May/10 ",
"ROBUSTA
> COFFEE (10)
> May/10 ",
> "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10
",
> "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10
",
> "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10
",
> "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10
",
> "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10
"),
>    QUANTITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1,
>    1, 1, 2, 1, 1, 1, 1, 2, 1, 1), SETTLEMENT = c("467.7500",
>    "467.7500", "467.7500", "467.7500",
"78.1300", "78.1300",
>    "78.1300", "1,739.4000", "16.5400",
"16.5400", "16.5400",
>    "16.5400", "16.5400", "1,353.0000",
"1,353.0000", "1,353.0000",
>    "1,353.0000", "1,353.0000", "1,353.0000",
"1,353.0000",
> "1,353.0000",
>    "1,353.0000", "1,353.0000", "1,353.0000",
"1,353.0000")), .Names > c("CONTRAT",
> "QUANTITY", "SETTLEMENT"), row.names = c(NA, 25L),
class =
> "data.frame")
>
> Here is my code :
>
>> opfut=ddply(futures, c("CONTRAT","SETTLEMENT"),
summarise, POSITION> sum(QUANTITY))
>
> Here is the output:
>
>> opfut
>                      CONTRAT SETTLEMENT POSITION
> 1         SUGAR NO.11 May/10     16.5400        5
> 2         COTTON NO.2 May/10     78.1300        3
> 3            PLATINUM Jul/10  1,739.4000       -1
> 4 ROBUSTA COFFEE (10) May/10  1,353.0000       15
> 5               WHEAT May/10    467.7500        4
>
> It is almost exactly what I want, except I am expecting the POSITION  
> column
> before the SETTLEMENT column. How can I modified my code to obtain  
> this?
Most compact way:
opfut[, c(1,3,2)]

Better readability:
opfut[, c("CONTRAT", "POSITION", "SETTLEMENT")]

And that particular operation is really basic stuff, suggesting that  
you should go back to the basic R documents and spend some more time  
educating yourself. Either of those extract operations could have been  
appended on the ddply call as well, e.g.:

  opfut=ddply(futures, c("CONTRAT","SETTLEMENT"), summarise,
POSITION=
sum(QUANTITY))[, c(1,3,2)]
  opfut
#----------------
                        CONTRAT POSITION SETTLEMENT
1          SUGAR NO.11 May/10         5    16.5400
2         COTTON NO.2 May/10\n        1    78.1300
3          COTTON NO.2 May/10         2    78.1300
4             PLATINUM Jul/10        -1 1,739.4000
5 ROBUSTA COFFEE (10)\nMay/10         1 1,353.0000
6  ROBUSTA COFFEE (10) May/10        14 1,353.0000
7                WHEAT May/10         4   467.7500

("Functional programming")


-- 

David Winsemius, MD
West Hartford, CT

R help - Apr 2010 - data.frame and ddply

[R] data.frame and ddply

[R] data.frame and ddply

[R] data.frame and ddply