I have a data frame where the last two columns - "month" and "year" - are
character vectors. The "year" vector holds two-digit years (e.g. "97" for
1997, "07" for 2007, etc.).
What I want to do is create a variable "Year" that is of mode numeric and
where each record is a four-digit number (1997, 2007, ...).
I have about 40000 rows in the data frame, and the observations cover 10
years (so there are multiple rows for each year).
I tried the following, but the program runs and runs, and if I abort it,
all the years in "Year" are 1997:

    for(i in 1:dim(database)[1]){
        if(database$year[i]>90) {
            database$Year[i] <- as.numeric(database$year[i])+1900 } else {
            database$Year[i] <- as.numeric(database$year[i])+2000
        }
    }

Thanks in advance for explanations.
Regards,
JM
--
Jonas Malmros
Stockholm University
Stockholm, Sweden
try this:

> x <- data.frame(month=as.character(sample(1:12,1000,TRUE)),
+     year=as.character(sample(c(0:7, 95:99), 1000, TRUE)), stringsAsFactors=FALSE)
> # convert to integer
> x$Year <- as.integer(x$year)
> # now add the century
> x$Year <- ifelse(x$Year < 10, x$Year+2000, x$Year+1900)
> head(x)
  month year Year
1    10   96 1996
2     9    5 2005
3     8   95 1995
4     1    7 2007
5    11    3 2003
6     2   99 1999

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

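A possible explanation for the "all 1997" symptom, offered as a sketch rather than a diagnosis of the original data: when the column "Year" does not exist yet, assigning to database$Year[1] creates the whole column and recycles that single value down every row, so stopping the loop early leaves the first computed year almost everywhere. The small data frame `db` below is made up for illustration and is not the poster's data; the `> 90` cutoff is taken from the original loop.

```r
# Hypothetical three-row stand-in for the poster's 40000-row data frame.
db <- data.frame(year = c("97", "07", "99"), stringsAsFactors = FALSE)

# First iteration of the loop: the column "Year" does not exist yet,
# so this creates it and recycles the single value to every row.
db$Year[1] <- as.numeric(db$year[1]) + 1900
after_first_step <- db$Year

# Vectorized alternative: convert once, then pick the century for all rows.
yy <- as.numeric(db$year)
db$Year <- ifelse(yy > 90, yy + 1900, yy + 2000)
vectorized <- db$Year
```

The vectorized form also avoids modifying the data frame once per row, which is the likely reason the loop "runs and runs" on 40000 rows.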
Without actually running the code, one problem seems to be that the line

    if(database$year[i]>90)

is comparing a character string to 90, not the number that you are
expecting.
You could try this instead:

    database$year <- as.numeric(database$year)
    database$Year <- ifelse(database$year > 90, database$year + 1900, database$year + 2000)

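To illustrate the point about comparing characters with numbers, here is a minimal sketch. In R, when a character value is compared with a number, the number is coerced to a string and the comparison is done as text. The vector `year_chr` is invented for the example, and the cutoff of 10 follows the other reply in this thread; unpadded values such as "7" are where the two comparisons disagree.

```r
# Made-up two-digit years, some without a leading zero.
year_chr <- c("97", "07", "7", "5")

# Character vs. number: 10 is coerced to "10" and the comparison is textual,
# so "7" and "5" are NOT considered smaller than "10".
text_cmp <- year_chr < 10

# Convert first, then compare: now the comparison is numeric.
num_cmp <- as.numeric(year_chr) < 10
```

Converting with as.numeric() (or as.integer()) before the comparison keeps the result independent of how the strings happen to be padded.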
Jason Law
Statistician
City of Portland, Bureau of Environmental Services
Water Pollution Control Laboratory
6543 N Burlington Avenue
Portland, OR 97203-5452
jlaw at bes.ci.portland.or.us
-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org]On Behalf Of Jonas Malmros
Sent: Monday, December 17, 2007 4:53 PM
To: r-help at r-project.org
Subject: [R] Why is conversion not working?
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.