Utkarsh Singhal
2010-Jan-05 20:16 UTC
[R] why is object.size is more for constant numeric vector?
Hi All,

I ran the following lines in R:

print(object.size(a <- rep(1,10^6)),units="Mb")
print(object.size(a <- rep(3.542,10^6)),units="Mb")

print(object.size(b <- rep("x",10^6)),units="Mb")
print(object.size(b <- rep("xyzxyz xyz",10^6)),units="Mb")
print(object.size(b <- 1:10^6),units="Mb")
print(object.size(b <- rep(1:10,each=10^5)),units="Mb")
print(object.size(b <- rep(TRUE,10^6)),units="Mb")

The object size from the first two lines is 7.6 MB, but from the last five it is 3.8 MB, although the length of the vector is the same.

Apparently, a vector of a given length is twice as large when it is a numeric constant as when it is not.

Why is that? Is my observation wrong? Or is there some catch with 'object.size'?

Thanks in advance.
Regards
Utkarsh
Gabor Grothendieck
2010-Jan-05 22:12 UTC
[R] why is object.size is more for constant numeric vector?
Note this:

> class(rep(1, 3))
[1] "numeric"
> class(1:3)
[1] "integer"
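The same check can be extended to the vectors from the original question. The following is only a sketch: it uses short vectors of length 10 or 20 instead of 10^6 (the storage type does not depend on the length), and the name `vecs` is introduced here for illustration. `typeof()` is used rather than `class()` because it reports the internal storage type directly.

```r
# Shortened versions of the vectors from the question.
vecs <- list(
  rep(1, 10),           # numeric constant
  rep(3.542, 10),       # numeric constant
  rep("x", 10),         # character
  1:10,                 # integer sequence
  rep(1:10, each = 2),  # repeated integer sequence
  rep(TRUE, 10)         # logical
)

# Internal storage type of each vector.
types <- vapply(vecs, typeof, character(1))
print(types)
# [1] "double"    "double"    "character" "integer"   "integer"   "logical"
```

Only the first two vectors are stored as doubles, and those are the two that reported the larger size.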
Prof Brian Ripley
2010-Jan-06 06:31 UTC
[R] why is object.size is more for constant numeric vector?
On Wed, 6 Jan 2010, Utkarsh Singhal wrote:

> Why is it so? Is my observation wrong? Or, is there some catch with
> 'object.size'?

Your observation is faulty. The first two are type "double", and a C double takes 8 bytes. The last three are type "integer" or "logical" with values stored in C int, 4 bytes each.

Character strings are harder to compute storage for as identical strings share storage. On a 32-bit machine identical strings take an extra 4 bytes per string, on a 64-bit machine an extra 8 bytes.

If you look at the values in bytes you will see that 'twice' is an approximation.

So

- the storage needed depends on the type of the vector as well as the length.
- for character vectors it depends on the architecture and the content.

Please do consult the R manuals rather than expect others to read them for you: this is all in the 'R Internals' manual.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
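The per-element figures in the reply above can be checked roughly as follows. This is a sketch, not taken from the thread: the helper `bytes_per_element` is a hypothetical name, and it takes the difference between two vector lengths so that the fixed per-vector header cancels out. `rep(1L, n)` is used for the integer case as an assumption-free stand-in for `1:n`.

```r
# Hypothetical helper: bytes added per extra element, measured as the
# difference in object.size() between vectors of length 2n and n.
bytes_per_element <- function(make, n = 1e6) {
  big   <- as.numeric(object.size(make(2 * n)))
  small <- as.numeric(object.size(make(n)))
  (big - small) / n
}

dbl <- bytes_per_element(function(n) rep(1, n))     # double
int <- bytes_per_element(function(n) rep(1L, n))    # integer
lgl <- bytes_per_element(function(n) rep(TRUE, n))  # logical
chr <- bytes_per_element(function(n) rep("x", n))   # identical strings

print(c(double = dbl, integer = int, logical = lgl, character = chr))

# Pointer size on this machine: 4 on 32-bit builds, 8 on 64-bit builds.
print(.Machine$sizeof.pointer)
```

The double, integer and logical figures should come out as 8, 4 and 4 bytes, and the character figure should match `.Machine$sizeof.pointer`, since identical strings are stored once and each element only adds a pointer. Printing `object.size()` in bytes rather than Mb also shows the small fixed header that keeps the ratio between the double and integer vectors from being exactly 2.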