This is a best practices / style question.

The way I use RODBC is something like this:

> foo <- sqlQuery(db, "select * from foo")
> apply(foo, 1, function(row) {...})

That is, I use apply to iterate over each result -- row -- in the RODBC-produced dataframe. Is this how one generally wants to do this?

My concern is that when apply iterates over the rows, it uses as.matrix() to convert the dataframe to a matrix, which for columns of mixed types means a character representation of the whole thing. Thus my database's carefully planned data types (that RODBC carefully preserved when returning query results) get completely lost as I process the data. I've taken to judiciously sprinkling as.numeric() and friends here and there, but this is just begging for bugs.

In other words, what is the smart way to process a dataframe by rows? Or is there, by chance, a specific technique or practice that is available for RODBC results but not for dataframes in general?

Thank you for your thoughts.
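The coercion described in the question can be reproduced without a database. The following is a minimal sketch only: `foo` here is a small hypothetical data frame standing in for an RODBC result, and the column names `id` and `name` are made up for illustration.

```r
# Hypothetical stand-in for an RODBC result: one numeric and one character column.
foo <- data.frame(id = c(1, 2, 3),
                  name = c("a", "b", "c"),
                  stringsAsFactors = FALSE)

# apply() converts the data frame with as.matrix() before iterating.
# With mixed column types the resulting matrix is character.
m <- as.matrix(foo)
is.character(m)

# Inside the function, each row therefore arrives as a character vector.
row_classes <- apply(foo, 1, function(row) class(row))
unique(row_classes)

# An all-numeric data frame is not affected: its matrix stays numeric.
num_only <- data.frame(x = c(1, 2), y = c(3.5, 4.5))
is.numeric(as.matrix(num_only))
```

So the loss of types depends on the mix of columns: a single character or factor column is enough to turn every value in every row into a string.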
Jack Tanner <ihok <at> hotmail.com> writes:

: In other words, what is the smart way to process a dataframe by rows? Or
: is there, by chance, a specific technique or practice that is available
: for RODBC results but not for dataframes in general?

I don't know about the best way, but here is one way that does not convert to character:

R> data(iris)
R> irish <- head(iris)
R> irish
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
R> f <- function(i) with(irish[i,], Sepal.Length + Sepal.Width)
R> sapply(1:nrow(irish), f)
[1] 8.6 7.9 7.9 7.7 8.6 9.3
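The index-based approach above keeps column types because each `irish[i, ]` is a one-row data frame, not a row of a matrix. As a hedged sketch of the same idea on mixed-type data, the example below uses a hypothetical data frame `foo` and a hypothetical helper `describe_row` (neither comes from the thread). It also swaps in `seq_len()` and `vapply()` for `1:nrow()` and `sapply()`, which is a stylistic choice, not part of the original answer.

```r
# Hypothetical mixed-type data frame standing in for a query result.
foo <- data.frame(id = c(1L, 2L, 3L),
                  price = c(9.5, 12.0, 7.25),
                  label = c("a", "b", "c"),
                  stringsAsFactors = FALSE)

# foo[i, ] is a one-row data frame, so every column keeps its own type.
describe_row <- function(i) {
  row <- foo[i, ]
  stopifnot(is.integer(row$id), is.numeric(row$price), is.character(row$label))
  paste0(row$label, ": ", row$price * 2)   # arithmetic works without as.numeric()
}

# seq_len() yields an empty sequence for a zero-row result, unlike 1:nrow(foo).
# vapply() checks that each call returns exactly one character string.
result <- vapply(seq_len(nrow(foo)), describe_row, character(1))
result
# [1] "a: 19"   "b: 24"   "c: 14.5"
```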
On Sun, 28 Nov 2004 21:25:24 -0500, Jack Tanner <ihok at hotmail.com> wrote:

>In other words, what is the smart way to process a dataframe by rows? Or
>is there, by chance, a specific technique or practice that is available
>for RODBC results but not for dataframes in general?

I would just use a for() loop if I didn't care about the speed too much. If I did, I'd avoid dealing with rows of dataframes: access using dataframe indexing is slow. Depending on what your function is, you're probably better off extracting the columns of the dataframe as vectors, and working with those.

Duncan Murdoch
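Both suggestions can be sketched on the `iris` subset used earlier in the thread. This is an illustration under assumptions, not Duncan's own code: the variable names `by_loop`, `by_column`, and `by_mapply` are made up, and `mapply()` is added here as one possible way to walk several column vectors in parallel when the per-row step is not itself vectorized.

```r
irish <- head(iris)

# Option 1: a plain for() loop over row indices. Simple, but slow on large data.
by_loop <- numeric(nrow(irish))
for (i in seq_len(nrow(irish))) {
  by_loop[i] <- irish$Sepal.Length[i] + irish$Sepal.Width[i]
}

# Option 2: extract the columns as vectors and operate on them whole.
# Each column keeps its own type, and no row indexing is involved.
by_column <- irish$Sepal.Length + irish$Sepal.Width

# Variant of option 2 (an assumption, not from the thread): mapply() calls the
# function once per position across the column vectors.
by_mapply <- mapply(function(len, wid) len + wid,
                    irish$Sepal.Length, irish$Sepal.Width)

by_column
# [1] 8.6 7.9 7.9 7.7 8.6 9.3
```

All three give the same sums as the `sapply()` version; the column-wise form avoids data frame row indexing entirely.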