thr3ads.net - R help - [R] Min of [Sep 2011]

bradford

2011-Sep-13 17:17 UTC

[R] Min of

With the help of Andrie on StackOverflow.com, I was able to learn about
ddply.  I have another question that is more trivial and cannot seem to find
help on IRC and do not want to bother Andrie again.  I can't seem to figure
out what to google for, so I thought I'd ask here.

I have:
library(plyr)
df_diff <- ddply(df, .(SOURCE), summarize,
TIME_DIFF=-unclass(diff(REQUEST_DATE)))
df_diff
  SOURCE TIME_DIFF
1      A      7.55
2      A      5.55
3      A      3.40
4      D     35.00
5      D    563.00
6      D     37.00
7      D     35.00
8      D    996.00

... with a lot more records.

I want to essentially sort SOURCE asc, TIME_DIFF asc and output the top 15
lowest TIME_DIFFS for each SOURCE.  How do I do this?

Also, what is the data type of df_diff called so that I can look into it
some more?


David Winsemius

2011-Sep-13 18:54 UTC

[R] Min of

On Sep 13, 2011, at 1:17 PM, bradford wrote:
> With the help of Andrie on StackOverflow.com, I was able to learn  
> about
> ddply.  I have another question that is more trivial and cannot seem  
> to find
> help on IRC and do not want to bother Andrie again.
It's doubtful that he would have considered it a bother. Just post a  
question and anyone up for rep points could do it. I certainly haven't  
noticed that Andrie is slacking off despite his 14+K points.
>  I can't seem to figure
> out what to google for, so I thought I'd ask here.
>
> I have:
> library(plyr)
> df_diff <- ddply(df, .(SOURCE), summarize,
> TIME_DIFF=-unclass(diff(REQUEST_DATE)))
> df_diff
>  SOURCE TIME_DIFF
> 1      A      7.55
> 2      A      5.55
> 3      A      3.40
> 4      D     35.00
> 5      D    563.00
> 6      D     37.00
> 7      D     35.00
> 8      D    996.00
>
> ... with a lot more records.
>
> I want to essentially sort SOURCE asc, TIME_DIFF asc and output the  
> top 15
> lowest TIME_DIFFS for each SOURCE.  How do I do this?
You might (I say "might" in the absence of a reproducible example for
testing) do this with ave:

df_diff[ with( df_diff, ave(TIME_DIFF, SOURCE,
               FUN = function(x) rank(x, ties.method = "first")) < 16), ]
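
A minimal sketch of the same idea on toy data, using only base R so it runs without plyr. The values are the eight rows printed in the question; the names n_keep, rnk and top_n are illustrative and not from the original post. It uses rank() inside ave(), because rank() gives each row's position within its group, and it then sorts the result by SOURCE and TIME_DIFF as the question asks.

```r
# Toy data copied from the printed df_diff in the question.
df_diff <- data.frame(
  SOURCE    = c("A", "A", "A", "D", "D", "D", "D", "D"),
  TIME_DIFF = c(7.55, 5.55, 3.40, 35, 563, 37, 35, 996)
)

n_keep <- 2  # the question asks for 15; 2 keeps the toy output short

# Within-group rank of TIME_DIFF. ties.method = "first" yields distinct
# integer ranks, so at most n_keep rows are kept per SOURCE.
rnk <- ave(df_diff$TIME_DIFF, df_diff$SOURCE,
           FUN = function(x) rank(x, ties.method = "first"))

# Keep the n_keep smallest values per group ...
top_n <- df_diff[rnk <= n_keep, ]
# ... then sort by SOURCE ascending, TIME_DIFF ascending.
top_n <- top_n[order(top_n$SOURCE, top_n$TIME_DIFF), ]
print(top_n)
```

Groups with fewer than n_keep rows are simply returned whole, which is probably the behaviour wanted here.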

>
> Also, what is the data type of df_diff called so that I can look  
> into it
> some more?
The letters in a **ply call tell you: the first letter is the input
class and the second is the output class. If the second letter is a
"d", as in ddply, the call returns a data frame.
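
To check this on an actual object, base R's class(), is.data.frame() and str() report the type directly. A small sketch, assuming a stand-in data frame built by hand instead of the real ddply result:

```r
# Stand-in for the ddply result (hypothetical two-row example).
df_diff <- data.frame(SOURCE = c("A", "D"), TIME_DIFF = c(3.40, 35))

class(df_diff)          # the class name to search for in the help pages
is.data.frame(df_diff)  # logical check
str(df_diff)            # compact view of the columns and their types
```

From there, ?data.frame is the help page to read.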
David Winsemius, MD
West Hartford, CT
