thr3ads.net - R help - [R] a pickle with ranks and reals? [Aug 2003]

John Christie

2003-Aug-22 04:44 UTC

[R] a pickle with ranks and reals?

I predicted that y would increase as x increased.  However, I only made 
the prediction on the ranks of the scores.  The ranks don't correlate 
with predicted.  And, I don't think a regression on the ranks is 
warranted.  However, the actual scores do yield a significant slope for 
b, and a significant R^2 using a linear regression (y is the value and 
x is the predicted rank).  What should my argument be here?  Should I 
have endorsed using the actual scores instead of ranks to begin with, 
for some reason that doesn't have anything to do with my current result? :)

Oh, on another note, I can use rcorr to get the Spearman correlations, 
but I'd like to be able to just add
the ranks as a column.  I was going to just use order and add a simple 
factor.  But, that doesn't deal with ties correctly.
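
One possible approach to the ranks column, sketched under the assumption that base R's rank() is acceptable here: by default it gives tied values the average of the ranks they span, which order() does not do. The data frame and the column names rank.x and rank.y below are made up for illustration.

```r
# Hypothetical data frame with a tie in each column (made up for illustration).
s <- data.frame(x = c(4, 5, 5, 2, 3),
                y = c(7, 1, 9, 4, 7))

# rank() gives tied values the average of the ranks they span
# (ties.method = "average" is the default).
s$rank.x <- rank(s$x)
s$rank.y <- rank(s$y)

# The Pearson correlation of the rank columns is the Spearman correlation.
r.ranks    <- cor(s$rank.x, s$rank.y)
r.spearman <- cor(s$x, s$y, method = "spearman")
```

With the ranks stored as ordinary columns, they can be passed to cor() or any other function like the raw scores.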

And, I also wanted to analyze correlations subject by subject and 
compare my two groups.  However, there doesn't seem to be a good way to 
get this.  I tried using "by" with "cor".  However, this requires 
binding x and y which causes cor to return a matrix (if you could pass 
it x and y separately it would just return a number).

given

data frame s
x	y	subj
4	7	harry
5	1	harry
6	9	harry
2	4	steve
3	7	steve
...

i'd like to be able to produce

r	subj
.12	harry
.52	steve
...

any tips?

John Christie

2003-Aug-22 12:18 UTC

[R] a pickle (solved first part now need r's from data)

On Friday, August 22, 2003, at 01:44  AM, John Christie wrote:
> I predicted that y would increase as x increased.  However, I only 
> made the prediction on the ranks of the scores.  The ranks don't 
> correlate with predicted.  And, I don't think a regression on the 
> ranks is warranted.  However, the actual scores do yield a significant 
> slope for b, and a significant R^2 using a linear regression (y is the 
> value and x is the predicted rank).  What should my argument be here?  
> Should I have endorsed using the actual scores instead of ranks to 
> begin for some reason that doesn't have anything to do with my current 
> result? :)
OK, now I realize that I should probably not have been correlating 
ranks in the first place because my real data may have had a 
non-linear, but still steadily increasing, slope.  The ranks would tend 
to increase variance where the slope was low and ruin my chance of 
finding an effect.
> Oh, on another note, I can use rcorr to get the Spearman correlations, 
> but I'd like to be able to just add
> the ranks as a column.  I was going to just use order and add a simple 
> factor.  But, that doesn't deal with ties correctly.
still don't have these yet.
> And, I also wanted to analyze correlations subject by subject and 
> compare my two groups.  However, there doesn't seem to be a good way 
> to get this.  I tried using "by" with "cor".  However, this requires
> binding x and y which causes cor to return a matrix (if you could pass 
> it x and y separate it would just return a number).
>
> given
>
> data frame s
> x	y	subj
> 4	7	harry
> 5	1	harry
> 6	9	harry
> 2	4	steve
> 3	7	steve
> ...
>
> i'd like to be able to produce
>
> r	subj
> .12	harry
> .52	steve
> ...
>
> any tips?

Thomas W Blackwell

2003-Aug-22 13:03 UTC

[R] a pickle with ranks and reals?

John  -

Here are two equivalent solutions to your final question:

data <- data.frame(x=seq(15), y=sample(seq(15), 15),
                   subj=sample(c("harry","steve","nathan","john"),
                               15, replace=TRUE))

result.1 <- unclass(by(data, data$subj, function(dd) cor(dd$x, dd$y)))

result.2 <- unclass(by(data, data$subj, function(dd) cor(dd[c(1,2)])[1,2]))

I guess I prefer  result.1  since the code is easier to read,
even though it does bury literal column names into the code.

The "function(dd)" stuff is a very common construction in  by(),
sapply(), lapply()  calls.  It defines a little function
in-line, without ever naming it, and passes it as the third
argument to  by().  I use this all the time, when I need to
rearrange the order, or do a little bit of subscripting (as here),
in the arguments of a function (cor()) which I would otherwise
just pass directly as the third argument to  by().
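
As a rough illustration of the in-line function idea, the sketch below compares a named helper with the same function written in-line. The data frame s is patterned on the sample in the question (one extra row is invented so that each subject has three points), and the helper name cor.xy is hypothetical. It uses split() with sapply() rather than by(), just to show the same construction in another of the functions mentioned.

```r
# Hypothetical data, patterned on the sample in the original question.
s <- data.frame(x = c(4, 5, 6, 2, 3, 1),
                y = c(7, 1, 9, 4, 7, 5),
                subj = c("harry", "harry", "harry", "steve", "steve", "steve"))

# Named helper: takes one subject's rows and returns a single correlation.
cor.xy <- function(dd) cor(dd$x, dd$y)

# The named helper and the equivalent in-line function give the same result:
# a numeric vector with one element per subject, named by subject.
r.named  <- sapply(split(s, s$subj), cor.xy)
r.inline <- sapply(split(s, s$subj), function(dd) cor(dd$x, dd$y))
```

The in-line form saves inventing a name for a function that is only used once.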

I'll let others comment on my use of  unclass()  here.  The
goal was to get a numeric vector with a names attribute, so
it can be incorporated into further processing.  I'm surprised
just how much tinkering it took to get this all to work.

This might actually make a useful example to add to the help
page for  by().
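
For the two-column layout asked for in the question (r and subj), one possible sketch is to pull the values and the group labels out of the by() result and put them in a data frame. The data frame s repeats the five sample rows from the question; the names r.by and result are made up for illustration, and this is only one of several ways to do it.

```r
# The five sample rows shown in the original question.
s <- data.frame(x = c(4, 5, 6, 2, 3),
                y = c(7, 1, 9, 4, 7),
                subj = c("harry", "harry", "harry", "steve", "steve"))

# One correlation per subject, as in result.1 above.
r.by <- by(s, s$subj, function(dd) cor(dd$x, dd$y))

# Strip the "by" class and array attributes to get plain numbers,
# and take the subject labels from the dimnames of the by() result.
result <- data.frame(r    = as.vector(unclass(r.by)),
                     subj = dimnames(r.by)[[1]])
```

Note that steve has only two rows in this sample, so his correlation is necessarily 1 (or -1); real data would need more points per subject for the per-subject r to mean much.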

-  tom blackwell  -  u michigan medical school  -  ann arbor  -

On Fri, 22 Aug 2003, John Christie wrote:
> . . .  And, I also wanted to analyze correlations subject by subject and
> compare my two groups.  However, there doesn't seem to be a good way to
> get this.  I tried using "by" with "cor".  However, this requires
> binding x and y which causes cor to return a matrix (if you could pass
> it x and y separate it would just return a number).
>
> given
>
> data frame s
> x	y	subj
> 4	7	harry
> 5	1	harry
> 6	9	harry
> 2	4	steve
> 3	7	steve
> ...
>
> i'd like to be able to produce
>
> r	subj
> .12	harry
> .52	steve
> ...
>
> any tips?
