Sascha Morach
2007-Mar-15 10:30 UTC
[R] Hardware for a new Workstation for best performance using R
Hi,

we are looking for a new workstation for big datasets/applications (easily up to 100'000 records and up to 300 variables) using R. As an example: variable selection for a multivariate regression using stepAIC.

What is the best configuration for a workstation to reach a high performance level for computing with R?

Single core or multi core (is R together with the nws package really able to take advantage of multi-core processors; any experience/benchmarks on that)?

Shall we use Linux instead of Windows? If yes, how compatible are graphics produced on Linux if we want to use them afterwards on Windows? And what are the advantages of using Linux instead of Windows?

What kind of workstations are you using (hardware and operating system) for big data computations? And are you satisfied with them?

I'm quite familiar with PC or server hardware.

Thanks in advance
Sascha Morach
Andrew Perrin
2007-Mar-15 13:48 UTC
[R] Hardware for a new Workstation for best performance using R
I can speak to some of these issues. I don't know how much benefit you can get from SMP for *single* instances of R, though.

1.) Multicore will be helpful, at least, if you are running several instances of R at once. So, for example, if you have people running two different models at the same time, the OS can use separate processors or cores for each instance.

2.) Yes, by all means you should use Linux instead of Windows. The graphics output is completely compatible with whatever applications you want to paste them into on Windows. Linux is cheaper, more stable, and better at using the system's resources.

3.) If you're doing big datasets, you certainly need a 64-bit processor, operating system, and R. Consider, perhaps, a dual-Athlon XP 64 machine with a big pile of RAM?

Andy

----------------------------------------------------------------------
Andrew J Perrin - andrew_perrin (at) unc.edu - http://perrin.socsci.unc.edu
Assistant Professor of Sociology; Book Review Editor, _Social Forces_
University of North Carolina - CB#3210, Chapel Hill, NC 27599-3210 USA
New Book: http://www.press.uchicago.edu/cgi-bin/hfs.cgi/00/178592.ctl
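As a rough check on the memory point, the raw size of a dataset with the dimensions given in the original post can be estimated as below. This is only a back-of-the-envelope sketch: it assumes every variable is stored as an 8-byte double, and the names `n_rows`, `n_cols`, and `raw_mib` are illustrative, not taken from the thread.

```r
# Back-of-the-envelope memory estimate.
# Assumption: all 300 variables are stored as doubles (8 bytes each).
n_rows <- 100000          # records, from the original post
n_cols <- 300             # variables, from the original post
bytes_per_double <- 8

raw_bytes <- n_rows * n_cols * bytes_per_double
raw_mib <- raw_bytes / 2^20   # convert bytes to MiB

print(round(raw_mib, 1))      # roughly 229 MiB for one copy of the data
```

A single copy is modest, but model fitting usually holds several copies at once (the data frame, a model matrix, intermediate results), so the working set during something like repeated refitting can be several times this figure.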
Thomas Lumley
2007-Mar-19 15:25 UTC
[R] Hardware for a new Workstation for best performance using R
On Thu, 15 Mar 2007, Andrew Perrin wrote: (in part)
> 2.) Yes, by all means you should use linux instead of windows. The
> graphics output is completely compatible with whatever applications you
> want to paste them into on Windows.

This turns out not to be the case.

It is not trivial to produce good graphics off Windows for adding to Microsoft Office documents (regrettably an important case for many people). There has been much discussion of this on the R-sig-mac mailing list, for example, where PNG bitmaps (at sufficiently high resolution) seem to be the preferred method.

-thomas

Thomas Lumley                    Assoc. Professor, Biostatistics
tlumley at u.washington.edu      University of Washington, Seattle
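A minimal sketch of the high-resolution PNG route, using the `png()` device from grDevices. The file name, the 6 x 4 inch size, and the 300 dpi resolution are assumptions chosen for illustration; the thread only says "sufficiently high resolution". The built-in `cars` dataset stands in for real data.

```r
# Sketch: write a high-resolution PNG that can be inserted into an Office document.
# File name, size, and resolution are illustrative assumptions.
png_file <- file.path(tempdir(), "scatter_300dpi.png")

png(png_file, width = 6, height = 4, units = "in", res = 300)
plot(cars, main = "Stopping distance vs. speed")
dev.off()   # close the device so the file is written
```

Whether `png()` works on a machine without a display depends on how R was built; `capabilities("png")` reports whether the device is available.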
Gabor Grothendieck
2007-Mar-19 15:43 UTC
[R] Hardware for a new Workstation for best performance using R
On 3/19/07, Thomas Lumley <tlumley at u.washington.edu> wrote:
> It is not trivial to produce good graphics off Windows for adding to
> Microsoft Office documents (regrettably an important case for many
> people). There has been much discussion of this on the R-sig-mac mailing
> list, for example, where PNG bitmaps (at sufficiently high resolution)
> seem to be the preferred method.

On Windows one can produce metafile output directly from R. This is a Windows vector graphics format, so it retains resolution when enlarged or shrunk, and it also works well with Microsoft Office. This would likely give superior results (maximum resolution, more flexibility in post-processing, easier to do, better interfacing with Office) compared with transferring graphics from another OS, particularly PNG, which is bitmapped rather than vector-based.
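A sketch of the metafile route, assuming the `win.metafile()` device, which exists only in R for Windows. The file name and dimensions are illustrative, and the platform check is added here so the snippet does nothing harmful elsewhere.

```r
# Sketch: produce a Windows metafile (vector format) directly from R.
# win.metafile() is only available in R on Windows, hence the guard.
emf_file <- file.path(tempdir(), "scatter.emf")   # illustrative file name

if (.Platform$OS.type == "windows") {
  win.metafile(emf_file, width = 6, height = 4)   # size in inches
  plot(cars, main = "Stopping distance vs. speed")
  dev.off()
} else {
  message("win.metafile() is not available on this platform")
}
```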
Thomas Lumley
2007-Mar-19 18:05 UTC
[R] Hardware for a new Workstation for best performance using R
On Mon, 19 Mar 2007, Gabor Grothendieck wrote:
> On Windows one can produce metafile output directly from R.

Yes, indeed. However, this fact is of limited help when working on another operating system, which was the focus of the original question.

-thomas
Gabor Grothendieck
2007-Mar-19 18:14 UTC
[R] Hardware for a new Workstation for best performance using R
On 3/19/07, Thomas Lumley <tlumley at u.washington.edu> wrote:
> Yes, indeed. However, this fact is of limited help when working on another
> operating system, which was the focus of the original question.

What was being discussed included "Yes, by all means you should use linux instead of windows." The subsequent discussion seemed to support that, including the suggestion that producing graphics on Linux is just as good as producing them on Windows even if they are intended to be transferred to Microsoft Office on Windows. In fact, there are a number of advantages to producing them on Windows if you intend to use Microsoft Office there.
Peter Dalgaard
2007-Mar-19 18:23 UTC
[R] Hardware for a new Workstation for best performance using R
Thomas Lumley wrote:
> It is not trivial to produce good graphics off Windows for adding to
> Microsoft Office documents (regrettably an important case for many
> people). There has been much discussion of this on the R-sig-mac mailing
> list, for example, where PNG bitmaps (at sufficiently high resolution)
> seem to be the preferred method.

One option for people who are paying for software anyway is to install the Adobe Acrobat Writer software for generating PDF files from Word. This also allows you to include output from the pdf() device, and the end result comes out really nice.
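For reference, a minimal sketch of producing a figure with the `pdf()` device mentioned above. The file name and page size are illustrative assumptions; how the resulting PDF is then combined with a Word document depends on the Acrobat setup and is not shown.

```r
# Sketch: write a vector PDF figure with the pdf() device.
# File name and dimensions (in inches) are illustrative assumptions.
pdf_file <- file.path(tempdir(), "scatter.pdf")

pdf(pdf_file, width = 6, height = 4)
plot(cars, main = "Stopping distance vs. speed")
dev.off()   # close the device so the file is complete
```

The `pdf()` device is part of base R on all platforms, so the same script should run unchanged on Linux and Windows.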
Dalphin, Mark
2007-Mar-19 18:28 UTC
[R] Hardware for a new Workstation for best performance using R
On Mon, 19 Mar 2007, Thomas Lumley wrote:
> Yes, indeed. However, this fact is of limited help when working on another
> operating system, which was the focus of the original question.

One solution which has not been covered here is to use both operating systems. For example, I need to present in PowerPoint, yet my work is done under Linux, where I have substantially more RAM and CPU power.

Typically, I'll run my analysis under Linux and then take advantage of the binary compatibility of the .RData file and move my final values from Linux to Windows via Samba; I may delete large intermediate results before the transfer to compensate for my lack of RAM under Windows. Some small scripts, which may have been developed under Linux, are used to create the plots which are placed in my PowerPoint presentations. By and large, the plots developed under Linux drop right into the Windows presentations, although there are occasional font size difficulties that require adjustments.

Mark Dalphin

----------------------
Mark Dalphin
Dept Comp Biol, M/S AW2/D3262
Amgen, Inc.
1201 Amgen Court W
Seattle, WA 98119
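The two-machine workflow described above might look roughly like the following sketch. The object names (`big_intermediate`, `final_values`), the file name, and the use of a temporary directory in place of a Samba share are all assumptions made so the example is self-contained; both halves run in one session here, whereas in practice they would run on different machines.

```r
# --- On the Linux machine: run the analysis, keep only what is needed ---
set.seed(1)
big_intermediate <- matrix(rnorm(1e5), ncol = 10)  # stand-in for a large intermediate result
final_values <- colMeans(big_intermediate)         # small result needed for plotting

rm(big_intermediate)                               # drop the large object before saving

# In practice this path would point at a directory shared via Samba.
rdata_file <- file.path(tempdir(), "final_values.RData")
save(final_values, file = rdata_file)

# --- On the Windows machine: load the saved values and plot from them ---
restored <- new.env()
load(rdata_file, envir = restored)                 # .RData files are portable across platforms
str(restored$final_values)                         # plotting scripts would start from here
```

Loading into a separate environment is a design choice for this sketch; it keeps the restored objects from silently overwriting anything already in the workspace.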