Folks,

I thought it should be straightforward, but after a few hours of poking around I decided it's best to post my question on this list.

I have a data frame consisting of a (large) number of date columns, which are read in from a csv file as character strings. I want to convert them to Date type. Following is an example, where the first column is of integer type, while the rest are of type character.

> head(df)
  TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
1  250897         7/1/2010      7/31/2010  PSTRUCT    Z
2  250617         8/1/2010      8/31/2010  PSTRUCT    Z
3  250364         4/1/2011      6/30/2011      PLR    Z
4  250176         4/1/2011      6/30/2011      PLR    Z
5  250176         4/1/2011      6/30/2011      PLR    Z
6  250364         4/1/2011      6/30/2011      PLR    Z

> sapply(df, class)
         TRANSNO TRANS.START_DATE   TRANS.END_DATE         DIVISION             FASB
       "integer"      "character"      "character"      "character"      "character"

I thought it was just a matter of using apply with as.Date,

df[,c(2,3)] = apply(df[,c(2,3)], 2, function(x)as.Date(x,"%m/%d/%Y"))

Well, the Date conversion fails and I got,

  TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
1  250897            14791          14821  PSTRUCT    Z
2  250617            14822          14852  PSTRUCT    Z
3  250364            15065          15155      PLR    Z
4  250176            15065          15155      PLR    Z
5  250176            15065          15155      PLR    Z
6  250364            15065          15155      PLR    Z

The character columns are indeed converted, but they became numbers, not Date type. OK, that's strange, and so I started reading the help pages.

It turns out that in apply, the result is coerced to one of the basic vector types. And apparently Date is not one of the basic vector types. From the help page:

"In all cases the result is coerced by as.vector to one of the basic vector types before the dimensions are set, so that (for example) factor results will be coerced to a character array."

The question then is how type conversion can be carried out on some columns of a data frame without using a loop.

Thanks.

Horace Tso
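The numbers in the output above are the internal representation of the dates: a Date is stored as a count of days since 1970-01-01 plus a class attribute, and apply() returns a plain matrix, which cannot carry that attribute per column. A minimal sketch of the effect, using a small made-up data frame (the column names `start` and `end` are placeholders, not the poster's):

```r
# Two of the date strings from the example above.
x <- c("7/1/2010", "7/31/2010")

d <- as.Date(x, format = "%m/%d/%Y")
class(d)    # "Date"
unclass(d)  # 14791 14821, i.e. days since 1970-01-01

# Hypothetical two-column data frame of date strings.
df <- data.frame(
  start = c("7/1/2010", "8/1/2010"),
  end   = c("7/31/2010", "8/31/2010"),
  stringsAsFactors = FALSE
)

# apply() converts the data frame to a matrix, calls the function on
# each column, and assembles the results into a new matrix. The Date
# class is lost along the way, leaving only the day counts.
m <- apply(df, 2, function(col) as.Date(col, format = "%m/%d/%Y"))
is.numeric(m)        # TRUE
inherits(m, "Date")  # FALSE
```

So the conversion itself succeeds inside the function; it is the reassembly into a matrix that discards the class.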
The short answer is: don't use apply() on data frames. Use lapply() to loop over the columns of a data frame.

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Horace Tso
> Sent: Tuesday, June 08, 2010 2:19 PM
> To: r-help at r-project.org
> Subject: [R] type conversion with apply or not
>
> The question then is how type conversion can be carried out
> on some columns of a data frame without using a loop.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
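As a rough illustration of the lapply() advice, the sketch below rebuilds the first three rows of the example data and converts the date columns in place. Picking the columns by a name pattern is an assumption made here for the "large number of date columns" case (it presumes every date column's name ends in `_DATE`); selecting by position or by an explicit vector of names works the same way.

```r
# First three rows of the example data, typed in by hand.
df <- data.frame(
  TRANSNO          = c(250897L, 250617L, 250364L),
  TRANS.START_DATE = c("7/1/2010", "8/1/2010", "4/1/2011"),
  TRANS.END_DATE   = c("7/31/2010", "8/31/2010", "6/30/2011"),
  DIVISION         = c("PSTRUCT", "PSTRUCT", "PLR"),
  FASB             = "Z",
  stringsAsFactors = FALSE
)

# Assumption: date columns can be recognised by a "_DATE" suffix.
date_cols <- grep("_DATE$", names(df), value = TRUE)

# lapply() returns a list with one element per column, so each
# element keeps its Date class when assigned back into df.
df[date_cols] <- lapply(df[date_cols], as.Date, format = "%m/%d/%Y")

col_classes <- sapply(df, class)
print(col_classes)

# With an explicit format, strings that do not match become NA
# instead of raising an error, so it is worth counting them.
na_counts <- sapply(df[date_cols], function(col) sum(is.na(col)))
print(na_counts)
```

Because a data frame is a list of columns, lapply() never builds an intermediate matrix, which is why the class survives here and not with apply().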
You need lapply here:

df[2:3] <- lapply(df[2:3], as.Date, '%m/%d/%Y')

On Tue, Jun 8, 2010 at 6:19 PM, Horace Tso <Horace.Tso@pgn.com> wrote:
> The question then is how type conversion can be carried out on some columns
> of a data frame without using a loop.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O