Wensui Liu
2005-Mar-12 14:08 UTC
[R] any book and tutorial about how to manipulate data with R/S+
In the real world, data manipulation can take even more time and effort than statistical analysis and modeling. Does anyone know of a good book or tutorial about data manipulation? Thank you so much. -- WenSui Liu, MS MA Senior Decision Support Analyst Division of Health Policy and Clinical Effectiveness
Thomas Schönhoff
2005-Mar-12 14:34 UTC
[R] any book and tutorial about how to manipulate data with R/S+
Hello,

On Saturday, 12 March 2005 15:08, Wensui Liu wrote:
> In the real world, data manipulation can take even more time and
> effort than statistical analysis and modeling.
>
> Does anyone know of a good book or tutorial about data manipulation?
> Thank you so much.

Well, it would be much easier to help if you could tell us exactly what you are looking for. Anyway, the R manuals recommend some introductory material on doing statistics in R. If I remember correctly, there is also some advice in the general FAQ on r-cran.org. If you are looking for introductory material on data manipulation in R, the book by Peter Dalgaard, Introductory Statistics with R, is worth considering.

Not long ago there was a similar question on this list, and the answers covered the whole range of available books on statistics in S/R. Have a look at http://maths.newcastle.edu.au/~rking/R/, you will be overwhelmed.

Last but not least, the contributed section of the r-cran website has some case-oriented tutorials, e.g. on data mining and similar topics.

Regards,
Thomas
Michael Grant
2005-Mar-13 02:53 UTC
[R] any book and tutorial about how to manipulate data with R/S+
Wensui,

Here is an answer from a different perspective. Reading between the lines, you may be involved in 'remedial' data preparation at times. Depending on exactly what kind of tasks you are talking about, you MAY be well advised to work in a database--that is why they exist. It just depends on what you have to do.

I work with environmental data, and I work with some really junky data at times. I have to spend lots of effort grooming and combining data originally collected for reasons other than the one at hand, from disparate idiosyncratic sources, with information in both similar and very dissimilar formats, data of varying completeness, etc. I have to process data qualifiers, strip numbers out of strings, put them in--on and on. And of course it is different from record to record. This is just the nature of the beast.

Another element is doing these same tasks over. One sometimes does not get the data in one shot. I remediate data, construct datasets, and process them. Then I get additional and/or corrected data and have to do it again. This kind of thing is probably easier to track in DBs or spreadsheets. I would never try doing these tasks in R (or SPSS/SAS for that matter). Excel works up to a point, but I also go into MS Access, exploiting its visual query building and VB capabilities. As much as I dislike MS (I've been bitten too many times), I have to admit that the ability to easily construct (visual) queries, browse the results, etc., has been very useful.

This kind of remedial preparation is sometimes easy and sometimes brutal. The point here is that as the complexity of your data preparation increases, it may be more efficient to do it in applications more appropriate to the task. Where the breakpoint lies is a function of your own capabilities and inclinations in R (SPSS, SAS), Excel, Access or whatever. The one thing I know is that the problems of data prep, in my world at least, have always been there and will likely remain. I accept it and move on.

The approach(es) you develop should be influenced by the frequency of such efforts and the size of the datasets typically involved. BTW, one truism is that project managers do not seem capable of understanding that just because something is in a computer does not mean it is ready to give them what they want :O(. Gee, this stuff takes work...as you seem well aware. I steel myself for the task by reminding myself that writing and running the R programming is an enjoyable reward for my toil. R is fun. SPSS never was. I have not worked much with SAS because--and this is a consideration--I can't afford a seat at home.

BTW, if some DB appears appropriate, then learn some SQL--even if you use Access. There is always RODBC out there, and it may be useful down the road. If you don't want to do all this, then get an intern, graduate student, postdoc, or new career ;O).

Best regards,
Michael Grant
Graduate School of Applied Brute Force in the Sciences
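For readers wondering what "process data qualifiers, strip numbers out of strings" can look like when the data are small and fairly regular, here is a minimal sketch in base R. It is an illustration only, not the poster's method (he prefers a database for this kind of work). The sample readings, the "<"/">" qualifier convention, and the variable names are all assumptions made up for the example.

```r
# Hypothetical raw readings, as they might arrive from mixed sources.
# "<0.5" carries a censoring qualifier, "ND" means non-detect (assumed).
raw <- c("<0.5", "12.3 mg/L", "ND", "7", "  4.10J", NA)

# Pull out a leading "<" or ">" qualifier; use "" when there is none.
has_qual  <- grepl("^[[:space:]]*[<>]", raw)
qualifier <- ifelse(has_qual,
                    sub("^[[:space:]]*([<>]).*$", "\\1", raw),
                    "")

# Strip the first number out of each string; entries with no digits
# (such as "ND" or a missing value) stay NA.
has_num <- grepl("[0-9]", raw)
value   <- rep(NA_real_, length(raw))
value[has_num] <- as.numeric(
  sub("^[^0-9.]*([0-9]*\\.?[0-9]+).*$", "\\1", raw[has_num])
)

# Keep the raw text next to the parsed columns so every step can be checked.
cleaned <- data.frame(raw = raw, qualifier = qualifier, value = value,
                      stringsAsFactors = FALSE)
print(cleaned)
```

Keeping the untouched `raw` column beside the parsed ones makes it easier to redo or audit the cleanup when corrected data arrive later. Real records that differ "from record to record" will need more rules than this, which is where the database approach described above may pay off.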