thr3ads.net - R help - [R] Reasons to Use R [Apr 2007]

Lorenzo Isella

2007-Apr-05 15:02 UTC

[R] Reasons to Use R

Dear All,
The institute I work for is organizing an internal workshop for High
Performance Computing (HPC).
I am planning to attend it and talk a bit about fluid dynamics, but
there is also quite a lot of interest devoted to data post-processing
and management of huge data sets.
A lot of people are interested in image processing/pattern recognition
and statistics applied to geography/ecology, but I would like not to
post this on too many lists.
The final aim of the workshop is understanding hardware requirements
and drafting a list of the equipment we would like to buy. I think
this could be the venue to talk about R as well.
Therefore, even if it is not exactly a typical mailing list question,
I would like to have suggestions about where to collect info about:
(1)Institutions (not only academia) using R
(2)Hardware requirements, possibly benchmarks
(3)R & clusters, R & multiple CPU machines, R performance on different
hardware.
(4)finally, a list of the advantages for using R over commercial
statistical packages. The money-saving in itself is not a reason good
enough and some people are scared by the lack of professional support,
though this mailing list is simply wonderful.

Kind Regards

Lorenzo Isella

Schmitt, Corinna

2007-Apr-05 15:35 UTC

[R] Reasons to Use R

Dear Mr. Isella,

I have just started my PhD thesis, for which I need to work with R. A good
source is Bioconductor (www.bioconductor.org), a collection of R packages
for bioinformatics.
Another institute which has good experience with R is the HKI in Jena, Germany.
Perhaps you can contact Mrs. Radke to get more information or speakers for your
workshop. Both are mainly concerned with bioinformatics methods but can
perhaps help you.

A good reason to use R is that computations are much quicker and you can
import and export files from many other programs and languages.

Happy Easter,
C.Schmitt

**************************************************************************
Corinna Schmitt, Dipl.Inf.(Bioinformatik)
Fraunhofer Institut für Grenzflächen- & Bioverfahrenstechnik
Nobelstrasse 12, B 3.24
70569 Stuttgart
Germany

phone: +49 711 9704044 
fax: +49 711 9704200
e-mail: Corinna.Schmitt at igb.fraunhofer.de
http://www.igb.fraunhofer.de

 

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Joel J. Adamson

2007-Apr-05 17:18 UTC

[R] Reasons to Use R

Lorenzo Isella writes:

 > (4)finally, a list of the advantages for using R over commercial
 > statistical packages.

Here's my entry on the list, as this was a topic of conversation over
lunch: it's better than the proprietary statistical software I use
most of the time.  By better I mean that the language is consistent,
the features are all well-documented and none of it appears to have
been rushed out onto the market.  The proprietary software that I use
most of the time at work seems hurriedly cobbled together, whereas R (like
LaTeX, Emacs, Linux...) doesn't give me that feeling.

 > The money-saving in itself is not a reason good
 > enough

Interesting ;)  I know what you mean -- it may even make them
suspicious.

Joel
-- 
Joel J. Adamson
Biostatistician
Pediatric Psychopharmacology Research Unit
Massachusetts General Hospital
Boston, MA  02114
(617) 643-1432
(303) 880-3109


John Kane

2007-Apr-06 00:32 UTC

[R] Reasons to Use R

--- Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
>
> (4)finally, a list of the advantages for using R over commercial
> statistical packages. The money-saving in itself is not a reason good
> enough and some people are scared by the lack of professional support,
> though this mailing list is simply wonderful.

Given that I can do as much with R as with commercial software, if not
more (in most cases), 'cost' is a very significant factor for me as an
independent consultant.

A major advantage of R is the money saved.  Have a look at
http://www.spss.com/stores/1/Software_Full_Version_C2.cfm
and convince me that cost (for an independent contractor) is not a good
reason.

Stephen Tucker

2007-Apr-06 09:19 UTC

[R] Reasons to Use R

Hi Lorenzo,

I don't think I'm qualified to provide solid information on the first
three questions, but I'd like to drop a few thoughts on (4). While
there is no shortage of language advocates out there, I'd like to
join in for this once. My background is in chemical engineering and
atmospheric science; I've done simulation on a smaller scale but spend
much of my time analyzing large sets of experimental data. I am
comfortable programming in Matlab, R, Python, C, Fortran, Igor Pro,
and I also know a little IDL but have not programmed in it
extensively.

As you are probably aware, among these I would count Matlab, R,
Python, and IDL as good candidates for processing large data sets, as
they are high-level languages and can read and write netCDF files
(which I imagine will be used to transfer data).

Each language boasts an impressive array of libraries, but what I
think gives R the advantage for analyzing data is the level of
abstraction in the language. I am extremely impressed with the objects
available to represent data sets, and the functions support them very
well: I need to carry around fewer objects to hold information about my
data (and I don't have to "unpack" them to feed them into functions).
The language is also very "expressive" in
that it lets you write a procedure in many different ways, some
shorter, some more readable, depending on what your situation
requires. System commands and text processing are integrated into the
language, and the input/output facilities are excellent, in terms of
data and graphics. Once I have my data object I am only a few
keystrokes away from splitting, sorting, and visualizing multivariate
data; even after several years I keep discovering new functions for
basic things like manipulation of data objects, descriptive statistics,
and plotting. Truly, an analyst's needs have been well anticipated.

And this is a recent obsession of mine, which I was introduced to
through Python, but the functional programming support in R is
amazing. By using higher-order functions like lapply(), I rarely
rely on for-loops, which have often caused me trouble in the past
because I had forgotten to re-initialize a variable, or incremented
the wrong variable, etc. Though I'm definitely not militant about
functional programming, in general I try to write functions and then
apply them to the data (if the functions don't exist in R already),
often through higher-order functions such as lapply(). This approach
keeps most variables out of the global namespace and so I am less
likely to reassign a value to a variable that I had intended to
keep. It also makes my code more modular so that I can re-use bits of
my code as my analysis inevitably grows much larger than I had
originally intended.
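
A minimal sketch of the loop-versus-lapply() contrast described above. The
list `samples` and the helper `summarise_sample` are hypothetical names
chosen for illustration; they do not come from the original post.

```r
# Hypothetical data: three samples of measurements held in a named list.
samples <- list(a = c(1, 2, 3), b = c(4, 5, 6, 7), c = c(10, 20))

# A small helper, written once and then applied to every sample.
summarise_sample <- function(v) {
  c(n = length(v), mean = mean(v), spread = diff(range(v)))
}

# Loop version: the result container and the index are managed by hand,
# which is where forgotten re-initialisations tend to creep in.
loop_result <- vector("list", length(samples))
names(loop_result) <- names(samples)
for (i in seq_along(samples)) {
  loop_result[[i]] <- summarise_sample(samples[[i]])
}

# Higher-order version: no index variable and no pre-allocated container.
apply_result <- lapply(samples, summarise_sample)

# Both approaches should produce the same list.
print(identical(loop_result, apply_result))
```

Because the work lives in a function, the only new names left in the
workspace are the helper and the result, which is the namespace benefit
the paragraph describes.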

Furthermore, my code in R ends up being much, much shorter than code I
imagine writing in other languages to accomplish the same task; I
believe this leads to fewer places for errors to occur, and the nature
of the code is immediately comprehensible (though a series of nested
functions can get pretty hard to read at times), not to mention it
takes less effort to write. This also makes it easier to interact with
the data, I think, because after making a plot I can set up for the
next plot with only a few function calls instead of setting out to
write a block of code with loops, etc.

I have actually recommended R to colleagues who needed to analyze the
information from large-scale air quality/ global climate simulations,
and they are extremely pleased. I think the capability for statistics
and graphics is well-established enough that I don't need to do a
hard-sell on that so much, but R's language is something I get very
excited about. I do appreciate all the contributors who have made this
available.

Best regards,
ST


bogdan romocea

2007-Apr-06 13:47 UTC

[R] Reasons to Use R

> (1)Institutions (not only academia) using R
http://www.r-project.org/useR-2006/participants.html
> (2)Hardware requirements, possibly benchmarks
Since you mention huge data sets, GNU/Linux running on 64-bit machines
with as much RAM as your budget allows.
> (3)R & clusters, R & multiple CPU machines,
> R performance on different hardware.
OpenMosix, Quantian for clusters; the archive for multiple CPUs (this
was asked quite a few times). It may be best to measure R performance
on different hardware by yourself, using your own data and code.
> (4)finally, a list of the advantages for using R over
> commercial statistical packages.
I'd say it's not R vs. commercial packages, but S vs. the rest of the
world. Check http://www.insightful.com/ , much of what they say is
applicable to R. Make the case that S is vastly superior directly, not
just through a list of reasons: take a few data sets and show how they
can be analyzed with S compared to other choices. Both R and S-Plus
are likely to significantly outperform most other software, depending
on the kind of work that needs to be done.
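
A minimal sketch of measuring performance yourself, as suggested under
(3), using base R's system.time(). The simulated data frame `d` and the
linear-model workload are placeholders for your own data and code.

```r
# Hypothetical workload: fit a linear model to simulated data.
set.seed(1)
n <- 1e5
d <- data.frame(x = rnorm(n))
d$y <- 2 * d$x + rnorm(n)

# system.time() reports user, system, and elapsed seconds for the expression.
timing <- system.time(fit <- lm(y ~ x, data = d))

# Elapsed wall-clock time in seconds; the value depends on the machine.
print(timing[["elapsed"]])
```

Running the same script on each candidate machine gives numbers that are
directly comparable for that workload.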


Roland Rau

2007-Apr-06 15:06 UTC

[R] Reasons to Use R

Hi Lorenzo,

On 4/5/07, Lorenzo Isella <lorenzo.isella@gmail.com> wrote:
> I would like to have suggestions about where to collect info about:
> (1)Institutions (not only academia) using R

A starting point might be to look at the R-project homepage and its
members and donors list. This is, of course, not a comprehensive list, but
at least it can give an overview of the diverse backgrounds in which
people are using R, even if it is only the tip of the iceberg.

> (2)Hardware requirements, possibly benchmarks

Maybe you should also mention that you can run R just from a USB stick if
you want (see R for Windows FAQ 2.6).

> (3)R & clusters, R & multiple CPU machines, R performance on
> different hardware.

Have a look at the 'R Installation and Administration' manual; it gives a
nice overview of how many platforms R runs on.

Best,
Roland

Ramon Diaz-Uriarte

2007-Apr-06 19:18 UTC

[R] Reasons to Use R

Dear Lorenzo,

I'll try not to repeat what others have answered before.

On 4/5/07, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
> The institute I work for is organizing an internal workshop for High
> Performance Computing (HPC).(...)
> (1)Institutions (not only academia) using R
You can count my institution too. Several groups. (I can provide more
details off-list if you want).
> (2)Hardware requirements, possibly benchmarks
> (3)R & clusters, R & multiple CPU machines, R performance on
> different hardware.
We do use R on commodity off-the-shelf clusters; our two clusters run
Debian GNU/Linux on both 32-bit machines (Xeons) and 64-bit machines
(dual-core AMD Opterons). We use parallelization quite a bit, with MPI
(mainly via the Rmpi and papply packages). One convenient feature is
that (once the LAM universe is up and running) whether we are using the
4 cores in a single box or the maximum available 120 is completely
transparent. Using R and MPI is, really, a piece of cake.
That said, there are things that I miss; in particular, I often wish R
were Erlang or Oz because of their straightforward fault-tolerant
distributed computing and built-in abstractions for distribution and
concurrency. The issue of multithreading has come up several times on
this list and is something that some people miss.
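
A minimal sketch of the "same code on 4 cores or 120" idea. Note the
substitution: it uses the `parallel` package that ships with current
versions of R and two local worker processes, not the Rmpi/papply setup
described above. The function `boot_mean` and the vector `obs` are
hypothetical.

```r
library(parallel)

# Hypothetical CPU-bound task: the mean of one bootstrap resample.
# Seeding inside the function makes each task reproducible on any worker.
boot_mean <- function(seed, data) {
  set.seed(seed)
  mean(sample(data, replace = TRUE))
}

obs <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.3)

# Two local worker processes; a larger or MPI-backed cluster object
# would be created here instead, and the call below would stay the same.
cl <- makeCluster(2)
par_result <- parLapply(cl, 1:8, boot_mean, data = obs)
stopCluster(cl)

# The same computation run serially, for comparison.
ser_result <- lapply(1:8, boot_mean, data = obs)
print(identical(par_result, ser_result))
```

Only the line that creates the cluster knows how many workers exist; the
call that distributes the work does not change with the cluster size.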

I am not sure how much R is used in the usual HPC realms. It is my
understanding that "traditional HPC" is still dominated by things
such as HPF, and C with MPI, OpenMP, UPC, or Cilk. The usual answer
to "but R is too slow" is "but you can write Fortran or C code for the
bottlenecks and call it from R". I guess you could use, say, UPC in
the C code that is linked to R, but I have no experience with that. And
I think this code can become a pain to write and maintain (especially
if you want to play around with what you try to parallelize, etc). My
feeling (based on no information or documentation whatsoever) is that
how far R can be stretched or extended into HPC is still an open question.

> (4)finally, a list of the advantages for using R over commercial
> statistical packages. The money-saving in itself is not a reason good
> enough and some people are scared by the lack of professional support,
> though this mailing list is simply wonderful.
>
(In addition to all the already mentioned answers)
Complete source code availability. Being able to look at the C source
code for a few things has been invaluable for me.
And, of course, an extremely active, responsive, and vibrant
community that, among other things, has contributed packages and code
for an incredible range of problems.


Best,

R.

P.S. I'd be interested in hearing about the responses you get to your
presentation.


-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz

Jorge Cornejo-Donoso

2007-Apr-09 15:18 UTC

[R] Reasons to Use R

The size of the DB is an issue with R. We are still using SAS because R
can't handle our DB, and of course we don't want to sacrifice resolution,
because the data collection is expensive (at least in fisheries and
oceanography). So I think that R needs to improve its handling of big DBs.
For now I can only use R for graph preparation and some data analysis; we
can't do the main work in R, and that is really sad.


-----Original message-----
    From: "Wilfred Zegwaard" <wilfred.zegwaard at gmail.com>
    Sent: 08/04/07 21:47:29
    To: "r-help at stat.math.ethz.ch" <r-help at stat.math.ethz.ch>
    Subject: Re: [R] Reasons to Use R

    Dear Johann and Gabor,

    It comes down to what amounts to a large dataset. There are hundreds
    of datasets R can't handle, probably thousands or more. I noticed on
    my computer (which is nothing more than an average PC) that R breaks
    down after 250 MB of memory. I also note that SPSS, Matlab, etc. break
    down too.

    I'm not a SAS user, but I have worked with SAS in the past. It's very
    good as I remember, but that was ten years ago. And it's a "dollar
    machine", I've been told: you add dollars to SAS as you add dollars to
    a Porsche. I haven't got it, and I've been told it isn't necessary for
    most statistical applications; R is sufficient for those. The datasets
    I use are often not that big (the way I like it).
    About three years ago I spoke to somebody who has worked with it and
    said "its database system is excellent and statistically profound".
    Someone with a PhD, so he is probably right.

    Monte Carlo simulations are computationally time-consuming, but they
    can probably be done in R. I haven't seen any libraries for it (they
    might be there). It has been done with S (the commercial counterpart
    of R), so probably with R too. If you combine Monte Carlo simulation
    with large datasets you will probably run into problems with a
    conventional R system. What I've been told in those instances is "buy
    a new computer" / "add memory and buy a new processor"... and don't
    smoke hashish.

    That wasn't good advice, because the guy who told me smoked hashish
    like hell and drank Pastis (blue liquor) like water. I kicked him out.
    But that's another story.

    Cheers,

    Wilfred

    (I drink wine and tailor-made beer, and only on occasion. That's why.
    His simulations were good, I've been told.)

Gabor Grothendieck

2007-Apr-09 15:32 UTC

[R] Reasons to Use R

What about the S-Plus question?  S-Plus stores objects in files
whereas R stores them in memory.
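
A minimal sketch of what "stores them in memory" means in practice on the
R side, using only base R functions. The vector `x` and the temporary
file are illustrative; nothing here reproduces S-Plus behaviour.

```r
# One million double-precision values, held entirely in RAM.
x <- numeric(1e6)

# Size of the object in memory: about 8 bytes per value plus a small header.
size_bytes <- as.numeric(object.size(x))
print(object.size(x), units = "MB")

# An object reaches disk only when it is written out explicitly.
path <- tempfile(fileext = ".rds")
saveRDS(x, path)
y <- readRDS(path)
print(identical(x, y))
unlink(path)
```

The working set therefore has to fit in available memory, which is why
the replies in this thread keep returning to 64-bit machines and RAM.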

On 4/9/07, Jorge Cornejo-Donoso <jorgecornejo at uach.cl> wrote:
> I have a Dell with 2 Intel Xeon 3.0 processors and 2 GB of RAM.
> The problem is the DB size.
>
> -----Original message-----
> From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
> Sent: Monday, April 09, 2007 11:28
> To: Jorge Cornejo-Donoso
> CC: r-help at stat.math.ethz.ch
> Subject: Re: [R] Reasons to Use R
>
> Have you tried 64 bit machines with larger memory or do you mean that you
> can't use R on your current machines?
>
> Also have you tried S-Plus?  Will that work for you? The transition from
> that to R would be less than from SAS to R.
>

Greg Snow

2007-Apr-09 16:20 UTC

[R] Reasons to Use R

Here are a couple more thoughts to add to what you have already received:

You mentioned that price is not at issue, but there are other costs than
money that you may want to look at.  On my work machine I have R,
S-PLUS, SAS, SPSS, and a couple of other stats programs; on my laptop
and home computers I have R installed.  So, if a deadline is looming and
I am working on a project mainly in R, it is easy to work on it on the
bus or at home (or in a boring meeting), the same does not work for a
SAS or SPSS project (Hmm, thinking about this now, maybe I need to do
less in R :-).

R and S-PLUS are very flexible/customizable: if you have a certain plot
that you make often, you can write your own function/script to do it
automatically. Most other programs will give you their standard plot, and
then you have to modify it to meet your specifications.  With Sweave (and
the ODF and HTML extensions) you can automate whole reports, which is very
useful for things that you do month after month.
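
A minimal sketch of wrapping a frequently made plot in a function, as
described above. The function name `plot_with_fit` and its styling are
hypothetical; the example data is R's built-in `cars` data set.

```r
# Hypothetical house style: a scatter plot with a least-squares line.
plot_with_fit <- function(x, y, ...) {
  plot(x, y, pch = 19, col = "grey40", ...)
  fit <- lm(y ~ x)
  abline(fit, lwd = 2)
  invisible(fit)  # return the model without printing it
}

# One call now produces the customised plot for any pair of variables.
fit <- plot_with_fit(cars$speed, cars$dist,
                     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
print(coef(fit))
```

Extra arguments pass through `...` to plot(), so labels and titles can
still vary from one use to the next.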

And what I think is the biggest advantage of R and S-PLUS is that they
strongly encourage you to think about your data.  Other programs (at
least those that I am familiar with) tend to have one specific way of
treating your data, and expect you to modify your data to fit that
program's model.  These models can be overrestrictive (force you to restructure
your data to fit their model) or underrestrictive (allow things that
should really be separate data objects to be combined into a single
"dataset") and sometimes both.  S on the other hand allows many
different ways to store and work with your data, and as you analyze the
data, different branches of new analysis open up depending on early
results rather than just getting stock output for a procedure.  If all
you want is a black box where data goes in one end and a specific answer
comes out the other, then most programs will work; but if you want to
really understand what your data has to tell you, then R/S-PLUS makes
this easy and natural.

Hope this helps,


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
 
 

Gabor Grothendieck

2007-Apr-09 16:43 UTC

[R] Reasons to Use R

I might be wrong about this but I thought that the licenses for at least
some of the commercial packages do let you make a copy of the one
you have at work for home use.

On 4/9/07, Greg Snow <Greg.Snow at intermountainmail.org> wrote:
> On my work machine I have R, S-PLUS, SAS, SPSS, and a couple of other
> stats programs; on my laptop and home computers I have R installed.
> So, if a deadline is looming and I am working on a project mainly in
> R, it is easy to work on it on the bus or at home (or in a boring
> meeting), the same does not work for a SAS or SPSS project (Hmm,
> thinking about this now, maybe I need to do less in R :-).

halldor bjornsson

2007-Apr-09 21:22 UTC

[R] Reasons to Use R

Dear Lorenzo,

Thanks for starting a great thread here. Like others, I would like to
hear a summary if you make one.

My institute uses R for internal data processing and analysis. Below
are some of our reasons, and yes, cost (or lack thereof) is not the
only one.

First, prior to the rise of R we already had a number of people using
Splus, and our main compute server had licenses for Splus. As the
institution moved from Sun Unix servers to Linux workstations and
servers, the licensing issue became important. Having to service many
licenses (one per workstation, and several on the servers) is time
consuming for overworked IT staff. Furthermore, our Splus programs
that ran routinely on the servers could all easily be made to run on R.
Hence, this was really a no-brainer.

Second, R runs on both Windows and Linux (and Solaris and Macs,
although the last one is not really an issue for us). We have written
some user programs that are tailor-made for the work we do; these we
bundle into R packages, which can then be used on both Windows and
Linux. This was a very important consideration for us.

Third, user community. Even with commercial solutions (such as Matlab)
the quality of the user community is very important; if we had felt
that R did not have an active and responsive community we probably
would have been more hesitant. Needless to say, R has an incredibly
active community, which makes it an attractive environment.
Furthermore, other institutions in our field are also adopting R, at
least in the research departments.

Fourth, R is a good choice for many of the things that we do (data
analysis of varying complexity, good graphics, maptools [working with
shapefiles], etc.). It was therefore an obvious candidate for us from
the start.

Now, R does not have everything we want. One thing missing is a decent
R-DB2 connection; on Windows the excellent RODBC works fine, but ODBC
support on Linux is a hassle. The big file issue is there, but many
of our files are GRIB, which is a format that is generally not
supported by anyone.... Furthermore, object graphics, à la Python's
matplotlib (and of course Matlab), is not there, but would be very
handy. However, that being said, it is easy to make publication (print
and web) quality graphics with R. And of course, as always with Open
Source, if you miss something badly enough, why not do it (or have it
done) yourself and add it to the package?

We have not used R much for large NetCDF datasets; there are other
tools (such as the CDO package, which also supports GRIB) that are
better oriented for this.

We have used R on Solaris, Linux (several different flavours) and
Windows (since W98).  We currently use it on our primary production
servers (RedHat Enterprise Edition), but we have not used it in a
parallel setting. We have not used R for making on-the-fly
calculations and graphics for the web, although this is clearly
possible.

I hope this helps. I have found this thread to be a good one.

Sincerely,
Halldór


-- 
Halldór Björnsson
Deildarstj. Ranns. & Þróun
Veðursvið Veðurstofu Íslands

Halldór Bjornsson
Weatherservice R & D
Icelandic Met. Office

Bi-Info (http://members.home.nl/bi-info)

2007-Apr-09 22:23 UTC

[R] Reasons to Use R

Licensing is a big issue in software. What I prefer is an easy
license, one that lets me work on another PC without paying a lot of
money. R produces quite good results and is widely used. That makes it
a statistical package that I want.
The other thing is that working with large datasets requires "some"
effort by software makers to get it working. I doubt whether R is
capable of working consistently with large datasets. That is an issue,
I think. I have done some comparisons between SPSS and R, and R seems
to perform all right, so I can do computations with it. Nonetheless,
I think its data handling is not quite as good as that of SAS.

When I started doing statistics there were about three packages: SPSS,
SAS and BMDP (at least, these were the ones available). On a PC you
were required to use SPSS.
Nowadays there are hundreds, some with excellent database facilities,
and you can compute the newest statistical tests, or an exotic one. I
haven't got a clue how to work with the new database facilities. dBase
was my only database education and everything has changed. So I cannot
say whether R is capable of working with large datasets in relation to
databases. I really don't know. The only thing I know is that if I
compute a ChiSq, it works on a relatively large dataset (not Fisher
tests, by the way). The same goes for a likelihood procedure, or
tabulations including non-parametrics, or factor analysis.   But
databases are an issue, I've been told by a guy who works with R. SAS
was a better option, he told me.

So what's the big deal about S using files instead of memory like R? I
don't get the point. Isn't there enough swap space for S? (Who cares
anyway: it works, doesn't it?) Or are there any problems with S and
large datasets? I don't get it. You use them, Greg. So you might
discuss that issue.

Wilfred

The licences keep changing: some have allowed home use in the past but
don't now, and for some you can get an additional licence for home at a
discounted price. For some it depends on the type of licence you have
at work (currently our SAS licence is such that the 3 people in my
group can all have it installed, but at most 1 can be using it at any
1 time; how does that affect installing/using it at home?).  I may be
able to install some of the software at home also, but for most of them
I have given up trying to figure out the legality of it and so I have
not installed them at home, to be on the safe side.

Some of the doctors I work with who are also affiliated with the local
university have mentioned that they can get a discounted academic
version of SAS and could use that, but my interpretation of the academic
licence that one showed me (probably not the most recent) said (in my
interpretation, I am not a lawyer) that if they published the results
without paying a licence upgrade fee, they would be violating the
licence (the academic version was intended for teaching only).

The R licence on the other hand is pretty clear that I can install it
and use it pretty much anywhere I want.

You are right in correcting me, R is not the only package that can be
used on multiple computers.  I do think it is the most straightforward
of the good ones.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111

Alan Zaslavsky

2007-Apr-11 15:06 UTC

[R] Reasons to Use R

Right: SAS objects (at least in the base and statistics components of the 
system -- there are dozens of add-ons for particular markets) are simple 
databases.  The predominant model for data manipulation and statistical 
calculation is a row by row operation that creates modified rows and/or 
accumulates totals.  This was pretty much the only way things could be 
done in the days when real (and typically virtual) memory was much smaller 
than it now is.  It can be a pretty efficient model for calculations that 
fit that pattern.  One downside of course is that a line of R code can 
easily turn into 30 lines of SAS with data steps, sort steps, steps to 
accumulate totals, etc.

As noted by a couple of previous writers, S-Plus might be regarded as 
somewhat intermediate in its model in that objects constitute files but 
rows do not correspond to chunks of adjacent bytes in memory or filespace.

I have thought for a long time that a facility for efficient rowwise 
calculations might be a valuable enhancement to S/R.  The storage of the 
object would be handled by a database and there would have to be an 
efficient interface for pulling a row (or small chunk of rows) out of the 
database repeatedly; alternatively the operations could be conducted inside
the database.  Basic operations of rowwise calculation and cumulation
(such as forming a column sum or a sum of outer-products) would be
written in an R-like syntax and translated into an efficient set of
operations that work through the database.  (Would be happy to share
some jejune notes on this.)  However the main answer to this problem
in the R world seems to have been Moore's Law.  Perhaps somebody could
tell us more about the S-Plus large objects library, or the work that
Doug Bates is doing on efficient calculations with large datasets.
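
As a rough illustration of the rowwise idea above (not the proposed
R-like facility itself, which does not exist in this thread), here is a
minimal Python sketch using only the standard-library sqlite3 module.
It pulls small chunks of rows out of a database and accumulates column
sums and the sum of outer products, so the full table is never held in
memory. The table name "obs", the columns "x" and "y", the helper name
rowwise_accumulate and the chunk size are all hypothetical, chosen only
for the demo.

```python
import sqlite3


def rowwise_accumulate(conn, table, columns, chunk_size=1000):
    """Stream rows in small chunks and accumulate the row count, the
    column sums, and the sum of outer products (the X'X matrix).

    Assumes `table` and `columns` are trusted identifiers, since they
    are interpolated directly into the SQL text.
    """
    p = len(columns)
    col_sums = [0.0] * p
    cross = [[0.0] * p for _ in range(p)]
    n = 0
    cur = conn.execute(f"SELECT {', '.join(columns)} FROM {table}")
    while True:
        chunk = cur.fetchmany(chunk_size)  # only this chunk is in memory
        if not chunk:
            break
        for row in chunk:
            n += 1
            for i in range(p):
                col_sums[i] += row[i]
                for j in range(p):
                    cross[i][j] += row[i] * row[j]
    return n, col_sums, cross


# Hypothetical demo data: a tiny in-memory table standing in for a large one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (x REAL, y REAL)")
conn.executemany(
    "INSERT INTO obs VALUES (?, ?)",
    [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)],
)

n, col_sums, cross = rowwise_accumulate(conn, "obs", ["x", "y"], chunk_size=2)
print(n, col_sums, cross)

# The alternative mentioned above: let the database do the accumulation.
db_sums = conn.execute("SELECT SUM(x), SUM(x * y) FROM obs").fetchone()
print(db_sums)
```

The first call streams the rows to the client two at a time; the last
query is the "inside the database" variant, where only the totals
travel back.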

 	Alan Zaslavsky
 	zaslavsk at hcp.med.harvard.edu
> Date: Tue, 10 Apr 2007 16:27:50 -0600
> From: "Greg Snow" <Greg.Snow at intermountainmail.org>
> Subject: Re: [R] Reasons to Use R
> To: "Wensui Liu" <liuwensui at gmail.com>
>
> I think SAS has the database part built into it.  I have heard 2nd hand
> of new statisticians going to work for a company and asking if they have
> SAS, the reply is "Yes we use SAS for our database, does it do
> statistics also?"  Also I heard something about SAS is no longer
> considered an acronym, they like having it be just a name and don't
> want the fact that one of the S's used to stand for statistics to
> scare away companies that use it as a database.
>
> Maybe someone more up on SAS can confirm or deny this.

Jim Lemon

2007-Apr-12 10:14 UTC

[R] Reasons to Use R

Charilaos Skiadas wrote:
> A new fortune candidate perhaps?
>
> On Apr 10, 2007, at 6:27 PM, Greg Snow wrote:
>
>> Remember, everything is better than everything else given the right
>> comparison.

Only if we remove the grammatical blip that turns it into an infinite 
regress, i.e.

"Remember, anything is better than everything else given the right 
comparison"

Jim

(Ted Harding)

2007-Apr-12 10:45 UTC

[R] Reasons to Use R

On 12-Apr-07 10:14:21, Jim Lemon wrote:
> Charilaos Skiadas wrote:
>> A new fortune candidate perhaps?
>>
>> On Apr 10, 2007, at 6:27 PM, Greg Snow wrote:
>>
>>> Remember, everything is better than everything else given the
>>> right comparison.
>
> Only if we remove the grammatical blip that turns it into an infinite
> regress, i.e.
>
> "Remember, anything is better than everything else given the right
> comparison"
>
> Jim

Oh dear, I would be disappointed with that, Jim.

I was rather enjoying the vision of a "topological sort tree"
(ordered by "better according to some comparison") in which every
single thing had everything else hanging off it, and in turn was
hanging off everything else!

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 12-Apr-07                                       Time: 11:45:05
------------------------------ XFMail ------------------------------

Jim Lemon

2007-Apr-13 09:29 UTC

[R] Reasons to Use R

(Ted Harding) wrote:
> On 12-Apr-07 10:14:21, Jim Lemon wrote:
>> Charilaos Skiadas wrote:
>>> A new fortune candidate perhaps?
>>>
>>> On Apr 10, 2007, at 6:27 PM, Greg Snow wrote:
>>>
>>>> Remember, everything is better than everything else given the
>>>> right comparison.
>>
>> Only if we remove the grammatical blip that turns it into an infinite
>> regress, i.e.
>>
>> "Remember, anything is better than everything else given the right
>> comparison"
>>
>> Jim
>
> Oh dear, I would be disappointed with that, Jim.
>
> I was rather enjoying the vision of a "topological sort tree"
> (ordered by "better according to some comparison") in which every
> single thing had everything else hanging off it, and in turn was
> hanging off everything else!

Sorry, Ted, I think Benoit Mandelbrot beat you to it.

Jim
