At 12:01 03/04/04 +0200, you wrote:
>From: solares at unsl.edu.ar
>To: R-help at stat.math.ethz.ch
>Date: Fri, 2 Apr 2004 12:47:48 -0300 (ART)
>Message-ID: <50155.209.13.250.66.1080920868.squirrel at inter17.unsl.edu.ar>
>Subject: [R] convert excell file to text with RODBC package
>
>Hi, i can convert excell to list in R with package RODBC ()but i don't
>understand 2 mistake
>1) Don't read the last row of the table excell
>2) Don' t take the hours

See below

>My excell file call prueba4.xls and have the following rows:
>where prueba4.xls was make in excell (office xp) and have one spreadsheet
>call "Hoja1", you see each rows of she:
>Día Hora col1 col2 col3 col4 col5 col6 col7 col8
>15/12/2003 12:14:59 217 2760 8,2 35 79,6 86,4
>15/12/2003 12:15:00 217 2764 8,2 35 79,6 86,4
>15/12/2003 12:15:01 217 2758 8,3 35 79,6 86,4
>15/12/2003 12:15:02 217 2760 8,3 35 79,6 86,4
>15/12/2003 12:15:03 217 2755 8,3 35 79,6 86,4
>15/12/2003 12:15:04 217 2766 8,3 35 79,6 86,4
>15/12/2003 12:15:05 217,1 2766 8,3 35,1 79,6 86,4
>15/12/2003 12:15:06 217,1 2758 8,3 35,1 79,6 86,4
>15/12/2003 12:15:07 217,1 2768 8,3 35,1 79,6 86,4

That seems to have 9 rows of data. There seem to be either 8 columns or 10 columns of data.

>My code (i use the R 1.7.1 for windows xp) is the following:
>
> > library(RODBC)
> > canal<-odbcConnectExcel("c:/prueba4.xls")
> > tablas<-sqlTables(canal)
> > tablas
>    TABLE_CAT TABLE_SCHEM TABLE_NAME   TABLE_TYPE REMARKS
>1 c:\\prueba4        <NA>     Hoja1$ SYSTEM TABLE    <NA>
>2 c:\\prueba4        <NA>     Hoja2$ SYSTEM TABLE    <NA>
>3 c:\\prueba4        <NA>     Hoja3$ SYSTEM TABLE    <NA>
> > tbl<-sqlFetch(canal,substr(tablas[1,3],1,nchar(tablas[1,3])-1))
> > tbl[1]
>                  Día
>1 2003-12-15 00:00:00
>2 2003-12-15 00:00:00
>3 2003-12-15 00:00:00
>4 2003-12-15 00:00:00
>5 2003-12-15 00:00:00
>6 2003-12-15 00:00:00
>7 2003-12-15 00:00:00
>8 2003-12-15 00:00:00
>9 2003-12-15 00:00:00
>
>¿Which is the mistake?.

Well, that has 9 rows, the same as Hoja1 in prueba4.xls, so I do not understand your question 1.

What exactly do you mean by question 2? Do you mean it is not reading the hours column (Hora), or do you mean it takes a long time? It looks to me as though column one, despite its label of day (Día), actually contains some much more complex Excel format.

Michael Dewey
m.dewey at iop.kcl.ac.uk
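The checks asked for above (how many rows arrived, and whether Hora is among the columns) can be sketched in a few lines of base R. This is only an illustration on a hand-built data frame standing in for `tbl`. The column names `Dia` and `Hora`, the text storage of each field, and the choice of UTC are all assumptions, not what RODBC necessarily returns for this workbook.

```r
# Hand-built stand-in for a few rows of Hoja1 (values taken from the post).
# ASSUMPTION: column names Dia/Hora and character storage are hypothetical.
tbl <- data.frame(
  Dia  = c("15/12/2003", "15/12/2003", "15/12/2003"),
  Hora = c("12:14:59", "12:15:00", "12:15:05"),
  col1 = c("217", "217", "217,1"),
  stringsAsFactors = FALSE
)

# How many rows and columns arrived, what are they called, how are they stored?
print(dim(tbl))
print(names(tbl))
print(sapply(tbl, class))

# Combine the day and the time of day into a single date-time value.
# ASSUMPTION: UTC is chosen arbitrarily; the post does not state a time zone.
stamp <- as.POSIXct(paste(tbl$Dia, tbl$Hora),
                    format = "%d/%m/%Y %H:%M:%S", tz = "UTC")
print(stamp)

# The sheet uses decimal commas, so swap them for points before converting.
col1 <- as.numeric(sub(",", ".", tbl$col1, fixed = TRUE))
print(col1)
```

Running the same `dim()`, `names()` and `sapply(tbl, class)` calls on the real `tbl` from `sqlFetch()` would show directly whether a row or the Hora column went missing.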
David L. Van Brunt, Ph.D.
2004-Apr-04 23:44 UTC
[R] Can't seem to finish a randomForest.... Just goes and goes!
Playing with randomForest, samples run fine. But on real data, no go.

Here's the setup: OS X, same behavior whether I'm using R-Aqua 1.8.1 or the Fink compile-of-my-own with X-11, R version 1.8.1. This is on OS X 10.3 (aka "Panther"), G4 800 MHz with 512M physical RAM. I have not altered the Startup options of R.

Data set is read in from a text file with "read.table", and has 46 variables and 1,855 cases. The DV is categorical, 0 or 1. Most of the IVs are either continuous, or correctly read in as factors. The largest factor has 30 levels.... Only the DV seems to need identifying as a factor to force class trees over regression. Trying the following:

> Mydata$V46<-as.factor(Mydata$V46)
> Myforest.rf<-randomForest(V46~.,data=Mydata,ntrees=100,mtry=7,proximities=FALSE, importance=FALSE)

5 hours later, R.bin was still taking up 75% of my processor. When I've tried this with larger data, I get errors referring to the buffer (sorry, not in front of me right now).

Any ideas on this? The data don't seem horrifically large. Seems like there are a few options for setting memory size, but I'm not sure which of them to try tweaking, or if that's even the issue.
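Two things in the call above may be worth checking, sketched here on synthetic data since the real `Mydata` is not shown. First, the arguments documented for `randomForest()` are `ntree` and `proximity`. The spellings `ntrees` and `proximities` are not matched and are probably swallowed by `...`, which would leave the default number of trees in force. Second, a factor with k unordered levels admits 2^(k-1) - 1 distinct two-way splits, so a 30-level factor is a plausible (but here unconfirmed) reason for a very long run. The data frame, its column names and the small `mtry` below are hypothetical.

```r
# Synthetic stand-in for Mydata (hypothetical; not the poster's data).
set.seed(1)
n <- 200
Mydata <- data.frame(
  V1  = rnorm(n),
  V2  = factor(sample(letters[1:5], n, replace = TRUE), levels = letters[1:5]),
  V46 = sample(0:1, n, replace = TRUE)
)
Mydata$V46 <- factor(Mydata$V46, levels = c(0, 1))

# Count the levels of every factor column to spot very wide factors.
is_fac <- sapply(Mydata, is.factor)
n_levels <- sapply(Mydata[is_fac], nlevels)
print(n_levels)

# Number of distinct two-way partitions of k unordered levels.
binary_splits <- function(k) 2^(k - 1) - 1
print(binary_splits(30))  # 2^29 - 1 = 536870911

# Fit only if the package is installed; note ntree and proximity (singular).
if (requireNamespace("randomForest", quietly = TRUE)) {
  fit <- randomForest::randomForest(
    V46 ~ ., data = Mydata,
    ntree = 100, mtry = 2,          # mtry = 2 because there are 2 predictors
    proximity = FALSE, importance = FALSE,
    do.trace = 10                   # report progress every 10 trees
  )
  print(fit$ntree)
}
```

With `do.trace` set, a run that prints nothing for a long time suggests it is stuck growing the first trees. Dropping or recoding the widest factor and refitting would be one way to test whether that factor is the cause.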