I have a data frame whose last two columns, "month" and "year", are character vectors. The "year" vector holds two-digit years (e.g. "97" for 1997, "07" for 2007, etc.).

What I want to do is create a variable "Year" of mode numeric in which each record is a four-digit number (1997, 2007, ...).

I have about 40000 rows in the data frame; the observations cover 10 years (so there are multiple rows for each year).

I tried the following, but the program runs and runs, and if I abort it all the years in "Year" are 1997:

for(i in 1:dim(database)[1]){
    if(database$year[i]>90) {
        database$Year[i] <- as.numeric(database$year[i])+1900 } else {
        database$Year[i] <- as.numeric(database$year[i])+2000
    }
}

Thanks in advance for explanations.

Regards,
JM

--
Jonas Malmros
Stockholm University
Stockholm, Sweden
try this:

> x <- data.frame(month=as.character(sample(1:12,1000,TRUE)),
+     year=as.character(sample(c(0:7, 95:99), 1000, TRUE)), stringsAsFactors=FALSE)
> # convert to integer
> x$Year <- as.integer(x$year)
> # now add the century
> x$Year <- ifelse(x$Year < 10, x$Year+2000, x$Year+1900)
>
> head(x)
  month year Year
1    10   96 1996
2     9    5 2005
3     8   95 1995
4     1    7 2007
5    11    3 2003
6     2   99 1999
>

On Dec 17, 2007 7:52 PM, Jonas Malmros <jonas.malmros at gmail.com> wrote:
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
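The vectorised conversion above can also be wrapped in a small helper and applied to the whole column at once, with no loop. This is only a minimal sketch: the function name `expand_year` and the sample data are hypothetical, and the pivot of 90 is taken from the loop in the original question (years above the pivot are read as 19xx, all others as 20xx).

```r
# Hypothetical helper (not from the thread): expand two-digit year strings
# to four-digit numeric years in a single vectorised step.
expand_year <- function(yy, pivot = 90) {
  yy_num <- as.numeric(yy)   # "07" -> 7, "97" -> 97
  ifelse(yy_num > pivot, yy_num + 1900, yy_num + 2000)
}

# Made-up data standing in for the poster's `database`
database <- data.frame(month = c("1", "12", "6"),
                       year  = c("97", "07", "99"),
                       stringsAsFactors = FALSE)

# One assignment fills the whole column
database$Year <- expand_year(database$year)
print(database)
```

Because `ifelse()` works on the entire vector, the cost does not grow with one data-frame assignment per row, which is the likely reason the original loop felt so slow on 40000 rows.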
Without actually running the code, one problem seems to be that the line

if(database$year[i]>90)

is comparing a character string to 90, not the numbers that you are expecting. You could try this instead:

database$year <- as.numeric(database$year)
database$Year <- ifelse(database$year > 90, database$year + 1900, database$year + 2000)

Jason Law
Statistician
City of Portland, Bureau of Environmental Services
Water Pollution Control Laboratory
6543 N Burlington Avenue
Portland, OR 97203-5452
jlaw at bes.ci.portland.or.us

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jonas Malmros
Sent: Monday, December 17, 2007 4:53 PM
To: r-help at r-project.org
Subject: [R] Why is conversion not working?

[...]
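To illustrate the point about comparing a character value with a number: when one side of `>` is a string, R converts the number to a string as well and compares the two as text. The short sketch below uses made-up values. Note that for zero-padded two-digit strings such as "97" and "07" the text comparison happens to agree with the numeric one, so converting first is mainly a safeguard against values where the two differ.

```r
# Made-up values: two typical two-digit years plus one value ("100")
# where text and numeric comparison disagree.
yy <- c("97", "07", "100")

# 90 is coerced to "90", so these are string comparisons
lexical <- yy > 90

# Convert first to get a true numeric comparison
numeric_cmp <- as.numeric(yy) > 90

print(data.frame(yy, lexical, numeric_cmp))
```

The third row differs because "100" sorts before "90" as text (the comparison is decided by the first character), while 100 is greater than 90 as a number.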