thr3ads.net - R help - [R] How does the data.frame function generate column names? [Jan 2011]

H Roark

2011-Jan-23 21:53 UTC

[R] How does the data.frame function generate column names?

Hi all,

I'm a new R user and am confused about how R behaves when converting a
vector to a data frame when using the data.frame function.  I'm specifically
interested in cases where the vector is expressed as a subset of another data
frame.  For example, say I want to create a data frame from the last three rows
of the third column of the data frame, d, that I've created below:

a<-(1:10)
b<-(11:20)
c<-(21:30)
d<-data.frame(a,b,c)

To do that, I know that I could do:

e<-d[8:10,"c"]
f<-data.frame(e)

However, I would like for the single column in the data frame, f, to be named
"c".  Obviously, I could just use the vector,
c<-d[8:10,"c"], in place of the vector e.  However, I wonder why I
can't do:

g<-data.frame(d[8:10,"c"])

This expression returns the proper values, but the resulting variable is named
"d.8.10...c.." and not "c" as I expected it to be named.

Could someone explain the mechanics of this statement and tell me why it
produced such an oddly named variable?  I'm especially confused as to why I
get the result I expect if I use the data.frame function on multiple vectors, as
in:

g2<-data.frame(d[8:10,c("b","c")]) 

which produces a data frame with columns named "b" and "c".

Many thanks in advance,
Alec
 		 	   		  

Joshua Wiley

2011-Jan-24 00:22 UTC

Hi,

Welcome to R!  What you have run into is a feature of how subsetting
works.  By default, "[" drops the result to the lowest possible
dimension, so selecting a single column returns a plain vector, and a
plain vector carries no column name.  Because the vector has no name,
data.frame() builds one from the text of the expression you passed,
d[8:10, "c"].  That text is not a valid name, so the disallowed
characters (the brackets, ":", the comma, the space, and the quotes)
are converted to periods (.), which gives "d.8.10...c..".  When the
argument already has column names, as in your two-column example where
the subset is still a data frame, those names are kept instead.  Here
is some code (you should be able to copy and paste), with comments that
explain a bit further and hopefully give you a better feel for
indexing and creating data frame objects.

Cheers,

Josh

################################################
## your data (in one step)
d <- data.frame(a = 1:10, b = 11:20, c = 21:30)

## because only one column of 'd' is selected, the result is
## dropped to the lowest possible dimension (a plain vector),
## which loses the column name; drop = FALSE prevents that
f <- data.frame(d[8:10, "c", drop = FALSE])

## another option is to explicitly name the column
g <- data.frame(c = d[8:10, "c"])

## here you have selected two columns so there must
## be at least two dimensions, and names are kept
g2 <- data.frame(d[8:10, c("b", "c")])

## to "see" what is happening
d[8:10, "c", drop = FALSE]
d[8:10, "c", drop = TRUE] # default

## for more details, see the documentation
?"["  # see the "drop" argument description
?data.frame # under the "value" section on names

################################################
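
To make the naming step concrete, here is a small sketch that imitates
what data.frame() appears to do with an unnamed argument: turn the
expression into text with deparse(), then clean that text with
make.names().  The variable names expr, txt and nm are only for
illustration, and the real internals handle more cases than this.

```r
d <- data.frame(a = 1:10, b = 11:20, c = 21:30)

## the unnamed argument, captured as an unevaluated expression
expr <- quote(d[8:10, "c"])

## step 1: expression -> text
txt <- deparse(expr)

## step 2: text -> syntactically valid name
## (each disallowed character becomes a period)
nm <- make.names(txt)
nm

## compare with the name data.frame() actually assigns
identical(names(data.frame(d[8:10, "c"])), nm)

## with drop = FALSE the subset is still a data frame,
## so its own column name is available and gets used
names(d[8:10, "c", drop = FALSE])
```

Printing nm should show "d.8.10...c..", the same odd name from the
original question.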


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
