Piotr Chmielowski
2009-Aug-28  14:35 UTC
[R] Question: how to index (subset) a data frame without memory overhead
1. Suppose one has a big data frame m with dim(m) = c(8610, 3521). If only a subset of m, say m[1:8600, ], is now needed, how can it be selected without creating a large memory overhead? The natural solution, m <- m[1:8600, ], seems to use roughly 2 times more memory in addition to the memory needed to hold m, making the total memory required over 3 times object.size(m), as reported by memory.size(max = TRUE). This is understandable, since arguments are passed by value. However, is there a natural way around this memory overhead?

2. Similarly, if one has another data frame n with dim(n) = c(10, 3521), doing m <- rbind(m, n) needs the same amount of memory overhead. Is there any way around that?

I tried package ref, but could not solve the particular problems above. Any help would be appreciated.

Piotr
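For anyone who wants to reproduce the two operations described above, the following is a minimal sketch rather than a solution. It assumes an all-numeric data frame and uses scaled-down, hypothetical dimensions (861 x 352 instead of 8610 x 3521) so that it runs quickly. Since memory.size() is only available on Windows, the sketch reads the peak from the "max used" column of gc() instead; the names m_sub, m_big and the peak_* variables are illustrative only.

```r
# Scaled-down stand-ins for the data frames in the question
# (the real ones are 8610 x 3521 and 10 x 3521; these sizes are assumptions).
set.seed(1)
m <- as.data.frame(matrix(rnorm(861 * 352), nrow = 861, ncol = 352))
n <- as.data.frame(matrix(rnorm(10 * 352), nrow = 10, ncol = 352))

size_m <- object.size(m)

# gc(reset = TRUE) resets the "max used" counters, so the next gc() call
# reports the peak memory reached since this point.
invisible(gc(reset = TRUE))
m_sub <- m[1:860, ]            # row subset, as in question 1
peak_after_subset <- gc()

invisible(gc(reset = TRUE))
m_big <- rbind(m, n)           # append rows, as in question 2
peak_after_rbind <- gc()

print(size_m, units = "Mb")    # size of the original data frame
print(dim(m_sub))
print(dim(m_big))
print(peak_after_subset)       # compare "max used" against size_m
print(peak_after_rbind)
```

Comparing the "max used" figures with object.size(m) should show how much extra memory each operation needs at its peak; the exact ratio will depend on the R version and platform.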