Hi Thanks much for the prompt response to my earlier enquiry on packages for regression analyses. Along the same topic(?), I have another question about which I could use some input. I am retreiving data from a MySQL database using RODBC. The table has many BLOB columns and each BLOB column has data in the format "id1 \t id2 \t measure \n id3 \t id4 \t measure...." (i.e. multiple rows compressed as one long string) I am retreiving them as follows. dataFromDB <- sqlQuery(channel, "select uncompress(columnName) from tableName"); I am looking for ways to convert this long "string" into a table/dataframe in R, making it easier for further post processing etc without reading/writing it to a file first. Although by doing write.table and reading it in again, I got the result in a data frame, with the \t and \n interpreted correctly, I wish to sidestep this as I need to carry out this analyses for over 4 million such entries. I tried write.table(dataFromDB, file="FileName"); dataFromFile <- read.table(FileName, sep="\t") dataFromFile is of the form 92_8_nmenA 993_7_mpul 1.042444 92_8_nmenA 3_5_cpneuA 0.900939 190_1_rpxx 34_4_ctraM 0.822532 190_1_rpxx 781_6_pmul 0.870016 Your input on the above is greatly appreciated. Thanks Lalitha ____________________________________________________________________________________ Never miss an email again!
On Thu, 2007-03-01 at 08:34 -0800, lalitha viswanath wrote:> Hi > Thanks much for the prompt response to my earlier > enquiry on packages for regression analyses. > Along the same topic(?), I have another question about > which I could use some input. > > I am retreiving data from a MySQL database using > RODBC. > The table has many BLOB columns and each BLOB column > has data in the format > "id1 \t id2 \t measure \n id3 \t id4 \t measure...." > (i.e. multiple rows compressed as one long string) > > I am retreiving them as follows. > > dataFromDB <- sqlQuery(channel, "select > uncompress(columnName) from tableName"); > > > I am looking for ways to convert this long "string" > into a table/dataframe in R, making it easier for > further post processing etc without reading/writing it > to a file first. > > Although by doing write.table and reading it in again, > I got the result in a data frame, with the \t and \n > interpreted correctly, I wish to sidestep this as I > need to carry out this analyses for over 4 million > such entries. > I tried > write.table(dataFromDB, file="FileName"); > dataFromFile <- read.table(FileName, sep="\t") > dataFromFile is of the form > > 92_8_nmenA 993_7_mpul 1.042444 > 92_8_nmenA 3_5_cpneuA 0.900939 > 190_1_rpxx 34_4_ctraM 0.822532 > 190_1_rpxx 781_6_pmul 0.870016 > > Your input on the above is greatly appreciated. > Thanks > LalithaThe easiest way might be to use a textConnection(). Let's say that you have read in your data as above and you have a column called 'blob':> dataFromDBblob 1 id1 \t id2 \t measure \n id3 \t id4 \t measure #Open textConnection. Note coercion to character BLOB <- textConnection(as.character(dataFromDB$blob)) # Read in the column DF <- read.table(BLOB, sep = "\t") # Close the connection close(BLOB)> DFV1 V2 V3 1 id1 id2 measure 2 id3 id4 measure See ?textConnection HTH, Marc Schwartz
Use 'strsplit' to create a vector and then 'rbind' into a matrix or create a dataframe:> x <- c("id1\tid2\tid3\tid4", "1\t2\t3\t4", "5\t6\t7\t8") > x[1] "id1\tid2\tid3\tid4" "1\t2\t3\t4" "5\t6\t7\t8"> (y <- strsplit(x, "\t"))[[1]] [1] "id1" "id2" "id3" "id4" [[2]] [1] "1" "2" "3" "4" [[3]] [1] "5" "6" "7" "8"> do.call(rbind, y)[,1] [,2] [,3] [,4] [1,] "id1" "id2" "id3" "id4" [2,] "1" "2" "3" "4" [3,] "5" "6" "7" "8">Jim Holtman "What is the problem you are trying to solve?" ----- Original Message ---- From: lalitha viswanath <lalithaviswanath@yahoo.com> To: r-help@stat.math.ethz.ch Sent: Thursday, March 1, 2007 11:34:19 AM Subject: [R] Query about data manipulation Hi Thanks much for the prompt response to my earlier enquiry on packages for regression analyses. Along the same topic(?), I have another question about which I could use some input. I am retreiving data from a MySQL database using RODBC. The table has many BLOB columns and each BLOB column has data in the format "id1 \t id2 \t measure \n id3 \t id4 \t measure...." (i.e. multiple rows compressed as one long string) I am retreiving them as follows. dataFromDB <- sqlQuery(channel, "select uncompress(columnName) from tableName"); I am looking for ways to convert this long "string" into a table/dataframe in R, making it easier for further post processing etc without reading/writing it to a file first. Although by doing write.table and reading it in again, I got the result in a data frame, with the \t and \n interpreted correctly, I wish to sidestep this as I need to carry out this analyses for over 4 million such entries. I tried write.table(dataFromDB, file="FileName"); dataFromFile <- read.table(FileName, sep="\t") dataFromFile is of the form 92_8_nmenA 993_7_mpul 1.042444 92_8_nmenA 3_5_cpneuA 0.900939 190_1_rpxx 34_4_ctraM 0.822532 190_1_rpxx 781_6_pmul 0.870016 Your input on the above is greatly appreciated. Thanks Lalitha ____________________________________________________________________________________ Never miss an email again! ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ____________________________________________________________________________________ The fish are biting. [[alternative HTML version deleted]]