Basic question:
when I use names() to extract the name of a dataframe element, why does it
have "." instead of " " between words?
Context:
I'm importing a CSV file of survey results for analysis. I read them like
this:
df <- read.csv("surveydata.csv",nrows=40,header=TRUE,
na.string=c("N/A",""),comment.char="",strip.white=TRUE)
To do a summary of the responses to a question, I can now do something like
this:
table(df[13])
1 2 3 4
13 13 4 2
and can then do a barplot with:
barplot(table(df[13]))
Which is fine. But since the first row of my data file includes the questions
themselves, I want to use those questions as chart titles, like this:
barplot(table(df[13]),main=names(df[13]))
This gives an unexpected result: the title has a "." where every space should
be between words, like this: "How.many.cats.do.you.own"
I can't figure out why, or how to get spaces instead of "." to show up. I've
tried using gsub but without success, to substitute " " for "."
I think I'm confused about something fundamental here, and hope someone has
the patience to enlighten me. I confess to being a bit vague in my
understanding of R's handling of arrays, dataframes, and vectors. My
background is in C programming and I keep looking for a "string"...
Thank you very much. Please reply or copy me directly if you respond.
Tom Arnold
Managing Partner, Summit Media Partners LLC
Visit our web site at http://www.summitmediapartners.com
Tom,
You can use check.names = FALSE in your read.csv(...) call. Or you can use
gsub as in:
names(df2) = gsub("\\.", " ", names(df2))
The "." must be escaped in gsub if used in the pattern argument.
Regards,
Sundar
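To illustrate why the escape matters, here is a minimal sketch. The column name is the one from the question; the variable names (nm, unescaped, escaped, literal) are made up for illustration. In a regular expression an unescaped "." matches any single character, which may be why the earlier gsub attempt did not give the expected result.

```r
# Column name as it looks after read.csv() has replaced blanks with periods
nm <- "How.many.cats.do.you.own"

# Unescaped "." is a regex wildcard, so every character is replaced by a space
unescaped <- gsub(".", " ", nm)

# Escaping the period makes the pattern match literal periods only
escaped <- gsub("\\.", " ", nm)

# fixed = TRUE treats the pattern as a plain string, so no escape is needed
literal <- gsub(".", " ", nm, fixed = TRUE)

print(escaped)
```

Either the escaped pattern or fixed = TRUE should work here; which one to use is mostly a matter of taste.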
Dear Tom,
When you read the data into a data frame via read.csv, the character
strings in the first row of the data file, which you've indicated is to be
interpreted as a header, are used for column names; in the process, blanks
are converted to periods, since nonstandard names including blanks are more
difficult to deal with; names(df[13]) just returns the name of column 13 in
the data frame. You could use the gsub function to recover the blanks --
something like gsub("\\."," ", names(df[2])). Alternatively, you could
specify the argument check.names=FALSE to read.csv to avoid substituting
periods for blanks in the first place, but this probably isn't a good idea.
See ?read.csv for details.
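A small sketch of both approaches follows, using an inline CSV string in place of surveydata.csv. The data and the names csv_text, checked, unchecked and title are invented for illustration. With check.names=TRUE (the default), read.csv makes the header entries syntactically valid names, which is where the periods come from.

```r
# Hypothetical stand-in for the survey file: the header row holds the questions
csv_text <- "Respondent,How many cats do you own\n1,2\n2,1\n3,2\n"

# Default behaviour (check.names = TRUE): blanks in the header become periods
checked <- read.csv(text = csv_text, header = TRUE)
print(names(checked)[2])

# check.names = FALSE keeps the header text exactly as it is in the file
unchecked <- read.csv(text = csv_text, header = TRUE, check.names = FALSE)
print(names(unchecked)[2])

# A column whose name contains blanks cannot be reached with a bare df$name;
# it needs [[ ]] (or backticks around the name)
cats <- unchecked[["How many cats do you own"]]

# Recovering the blanks from the checked name, e.g. for use as a plot title
title <- gsub("\\.", " ", names(checked)[2])
print(title)
```

The resulting title can then be passed as main= to barplot, as in the original call. Note that gsub only restores blanks: any other character that was turned into a period (a question mark, say) would also come back as a blank. The extra quoting needed for names with blanks may be part of why check.names=FALSE is not the more convenient route.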
I hope that this helps,
John
At 09:33 AM 2/11/2003 -0700, Tom Arnold wrote:
>Basic question:
>when I use names() to extract the name of a dataframe element, why does it
>have "." instead of " " between words?
-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------