I would like to know whether merge() is the only function available for concatenating data tables. Is it normal for it to run slowly?
In S-Plus, and presumably also in R, merge() is slow on large data frames. When speed becomes an issue, my colleagues use other languages to handle this kind of operation on large data sets.

Hope this helps,
Spencer Graves

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
The official language of the list is English.

On Wed, Apr 16, 2003 at 03:39:11PM +0200, Erwan BARRET wrote:
> I would like to know whether merge() is the only function available for concatenating data tables.

No, for concatenation there are also rbind(), cbind(), and quite a few others.

> Is it normal for it to run slowly?

That depends.

Dirk

--
Wishful thinking can dominate much of the work of a profession for a decade, but not indefinitely.
   -- Robert Shiller, on Efficient Markets models, 2002
"Concatener"? Translated literally, this suggests the function rbind(). The function merge() does something much more specialized. If rbind() will do what you want, it might be much faster.

- tom blackwell - u michigan medical school - ann arbor -
The function merge() is like the "join" operation in relational databases: it is much more powerful than mere concatenation, and therefore often much slower, especially on large tables.

To merely concatenate tables, use rbind() (to concatenate by rows) or cbind() (to concatenate by columns).

If you do need the power of merge() but it is too slow for your purposes, you may be able to write a special-purpose function in R that does only what you need, and does it much more quickly. Such is the nature of the S language: it is very powerful, but the powerful general-purpose functions can often be quite slow in particular cases.

Hope this helps, and apologies if I have not completely understood your question.

-- Tony Plate
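To make the distinction concrete, here is a minimal sketch in R. The data frames `a`, `b`, and `lookup` are made-up illustrations, not anything from the original question. rbind() and cbind() pair rows or columns by position, while merge() pairs rows by matching key values.

```r
# Hypothetical example data: two tables with the same columns
a <- data.frame(id = 1:3, x = c(10, 20, 30))
b <- data.frame(id = 4:5, x = c(40, 50))

# Concatenate by rows: both tables must have the same column names
rows <- rbind(a, b)

# Concatenate by columns: rows are paired by position, not by any key
cols <- cbind(a, y = c("u", "v", "w"))

# Join on a key: rows are paired wherever the values of "id" match,
# regardless of the order in which they appear
lookup <- data.frame(id = c(3L, 1L), label = c("three", "one"))
joined <- merge(a, lookup, by = "id")
```

Here `joined` keeps only the rows of `a` whose `id` also appears in `lookup`, which is the extra work that concatenation never has to do.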
For writing "special-purpose" merge functions, as Tony Plate suggested, you will probably want to know about match(). A lot of the essential work of merge() is done by match(), but merge() also includes a lot of idiot-proofing. Comparing the length of the code for these two functions will give you some idea of the overhead of all that idiot-proofing.

If you write your own merge function using match() and cbind(), or match() and data.frame(), it will probably be much faster, but less idiot-proof. For example, if your "by" variable is a factor, you need to be very careful; you probably want to convert it to character before merging.

James A. Rogers, Ph.D. <rogers at cantatapharm.com>
Statistical Scientist
Cantata Pharmaceuticals
300 Technology Square, 5th floor
Cambridge, MA 02139
617.225.9009 x312
Fax 617.225.9010
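As a rough sketch of that idea (not a replacement for merge()): the helper `lookup_join` and the tables `orders` and `prices` below are hypothetical names invented for illustration. The sketch assumes the key is unique in the second table and keeps every row of the first table, filling unmatched rows with NA.

```r
# Hypothetical helper: append the non-key columns of `y` to `x` by looking up `key`.
# Assumes the key values are unique in `y`; rows of `x` with no match get NA.
lookup_join <- function(x, y, key) {
  # Convert keys to character first so that factors are compared by label
  idx <- match(as.character(x[[key]]), as.character(y[[key]]))
  extra <- y[idx, setdiff(names(y), key), drop = FALSE]
  rownames(extra) <- NULL
  cbind(x, extra)
}

# Made-up example data
orders <- data.frame(id = c("b", "a", "c", "a"), qty = c(2, 5, 1, 7))
prices <- data.frame(id = c("a", "b"), price = c(1.5, 4.0))

res <- lookup_join(orders, prices, "id")
```

Unlike merge(), this keeps the row order of the first table and performs no checks on duplicate keys or clashing column names, which is where any speed gain would come from. Whether it is actually faster on your data is worth timing with system.time().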