thr3ads.net - R help - [R] How to reference or sort rownames in a data frame [May 2007]

Robert A. LaBudde

2007-May-27 20:55 UTC

[R] How to reference or sort rownames in a data frame

As I was working through elementary examples, I was using dataset 
"plasma" of package "HSAUR".

In performing a logistic regression of the data, and making the 
diagnostic plots (R-2.5.0)

data(plasma,package='HSAUR')
plasma_1<- glm(ESR ~ fibrinogen * globulin, data=plasma, family=binomial())
layout(matrix(1:4,nrow=2))
plot(plasma_1)

I find that data points corresponding to rownames 17 and 23 are 
outliers and high leverage.

I would then like to perform a fit without these two rows.

In principle this should be easy, using an update() with subset=-c(17,23).

The problem is that the rownames in this dataset are not ordered, 
and, in fact, the relevant rows are 30 and 31, not 17 and 23.

This brings up the following (elementary?) questions:

1. How do you reference rows in "subset=" for which you know the 
rownames, but not the row numbers?

2. How do you discover the rows corresponding to particular 
rownames? (Using plasma[rownames(plasma)==17,] shows the data, but 
NOT the row number!) (Probably the same answer as in Q. 1 above.)

3. How do you sort (order) the rows of an existing data frame so that 
the rownames are in order?

I don't seem to know the magic words to find the answers to these 
questions in the help systems.

Obviously this can be done by writing new, brute force, functions 
scanning the subscripts, but there must be an (obvious?) direct way 
of doing this more elegantly.

Thanks for any pointers.
================================================================
Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: ral at lcfltd.com
Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
824 Timberlake Drive                     Tel: 757-467-0954
Virginia Beach, VA 23464-3239            Fax: 757-467-2947

"Vere scire est per causas scire"

Gabor Grothendieck

2007-May-28 02:29 UTC

On 5/27/07, Robert A. LaBudde <ral at lcfltd.com> wrote:
> As I was working through elementary examples, I was using dataset
> "plasma" of package "HSAUR".
>
> In performing a logistic regression of the data, and making the
> diagnostic plots (R-2.5.0)
>
> data(plasma,package='HSAUR')
> plasma_1<- glm(ESR ~ fibrinogen * globulin, data=plasma, family=binomial())
> layout(matrix(1:4,nrow=2))
> plot(plasma_1)
>
> I find that data points corresponding to rownames 17 and 23 are
> outliers and high leverage.
>
> I would then like to perform a fit without these two rows.
>
> In principle this should be easy, using an update() with subset=-c(17,23).
>
> The problem is that the rownames in this dataset are not ordered,
> and, in fact, the relevant rows are 30 and 31, not 17 and 23.
>
> This brings up the following (elementary?) questions:
>
> 1. How do you reference rows in "subset=" for which you know the
> rownames, but not the row numbers?
Use a logical vector:

   rownames(plasma) %in% c(17, 23)
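
The vector above is TRUE for the rows named 17 and 23, so to leave them out of a fit it has to be negated before it goes into subset=. A minimal sketch of that, assuming a small made-up data frame `d` and an lm() fit in place of plasma and the glm() (both are stand-ins for illustration, not the thread's data):

```r
# Toy data frame (hypothetical) whose row names are not in row order
d <- data.frame(x = c(1.2, 3.4, 2.2, 5.1, 4.0),
                y = c(2.1, 3.9, 2.8, 6.2, 4.4),
                row.names = c("3", "17", "1", "23", "2"))

# TRUE for the rows to keep, i.e. every row NOT named 17 or 23.
# %in% compares as character, so the numbers 17 and 23 match "17" and "23".
keep <- !(rownames(d) %in% c(17, 23))

fit_all  <- lm(y ~ x, data = d)
fit_drop <- update(fit_all, subset = keep)   # refit without the two rows

nobs(fit_drop)   # number of rows used in the refit
```

For the model in the question the same idea would presumably read update(plasma_1, subset = !(rownames(plasma) %in% c(17, 23))).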
>
> 2. How do you discover the rows corresponding to particular
> rownames? (Using plasma[rownames(plasma)==17,] shows the data, but
> NOT the row number!) (Probably the same answer as in Q. 1 above.)
  which(rownames(plasma) %in% c(17, 23)) # 30, 31
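
A small sketch of the difference between which() and match() here, again on a made-up data frame `d` (hypothetical, not plasma). which() returns the positions in row order, while match() returns them in the order the names were asked for. Positions are needed for negative indexing, because a negative subscript cannot be a character row name.

```r
# Toy data frame (hypothetical) with shuffled row names
d <- data.frame(x = 1:5, row.names = c("3", "17", "1", "23", "2"))

# Positions of the named rows, in row order
pos <- which(rownames(d) %in% c(17, 23))

# Positions in the order the names are listed
pos_asked <- match(c("23", "17"), rownames(d))

# Selecting by name works directly with character subscripts ...
d[c("17", "23"), , drop = FALSE]

# ... but dropping rows needs the positions
d_dropped <- d[-pos, , drop = FALSE]
```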
>
> 3. How do you sort (order) the rows of an existing data frame so that
> the rownames are in order?

  plasma[order(as.numeric(rownames(plasma))), ]
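
The as.numeric() matters because row names are stored as character strings, so ordering them directly compares them as text rather than as numbers. A short sketch on a made-up data frame `d` (hypothetical) that shows both orderings side by side:

```r
# Toy data frame (hypothetical) with numeric-looking row names
d <- data.frame(x = 1:4, row.names = c("10", "2", "1", "23"))

# Ordering the names as text: compared character by character
by_text <- d[order(rownames(d)), , drop = FALSE]

# Ordering the names as numbers, as in the reply above
by_number <- d[order(as.numeric(rownames(d))), , drop = FALSE]

rownames(by_text)
rownames(by_number)
```

If the row names are not all numeric, as.numeric() gives NA with a warning for the ones it cannot convert, so this form only suits data sets like plasma whose row names are all numbers.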
