Hi,

I just found a strange increase in the reference count and I'm wondering if there is any reason for it. Here is the code:

> a=c(1,2,3)
> .Internal(inspect(a))
@0x000000001bf0b9b0 14 REALSXP g0c3 [NAM(1)] (len=3, tl=0) 1,2,3
> is.vector(a)
[1] TRUE
> .Internal(inspect(a))
@0x000000001bf0b9b0 14 REALSXP g0c3 [NAM(7)] (len=3, tl=0) 1,2,3

The variable *a* initially has a reference count of 1. After calling *is.vector*, the count goes to 7, which I believe is the highest value allowed in R. I also tried other R functions: *is.atomic*, *is.integer* and *is.numeric* do not increase the reference count, but *typeof* does. Is it intentional?

Best,
Jiefei
On 12/07/2019 1:22 p.m., King Jiefei wrote:
> The variable *a* initially has one reference number, after calling
> *is.vector* function, the reference number goes to 7, which I believe is
> the highest number that is allowed in R. I also tried the other R
> functions, *is.atomic, is.integer* and *is.numeric* do not increase the
> reference number, but *typeof *will do. Is it intentional?

is.vector() is a closure that calls .Internal. is.atomic(), is.integer() and is.numeric() are all primitives.

Generally speaking, closures that call .Internal are easier to implement (e.g. is.vector can use the regular mechanism to set a default for its second argument), but less efficient in CPU time. From its help page, it appears that the logic for is.vector() is a lot more complex than for the others, so that implementation does make sense.

So why does NAMED go to 7? Initially, the vector is bound to a. Within is.vector, it is bound to the local variable x. At this point there are two names bound to the same object, so it has to be considered immutable. There's really no difference between any of the values of 2 or more in the memory manager. (But see http://developer.r-project.org/Refcnt.html for some plans. That document is from about 5 years ago; I don't know the current state.)

Duncan Murdoch
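The closure/primitive split described above can be checked from the console. The following is a minimal sketch using only base R; the names fns and kinds are illustrative and do not come from the thread.

```r
# Collect the functions mentioned in the thread.
fns <- list(is.vector  = is.vector,
            typeof     = typeof,
            is.atomic  = is.atomic,
            is.integer = is.integer,
            is.numeric = is.numeric)

# Label each one by how it is implemented: a primitive is dispatched
# directly in C, while a closure is an ordinary R function whose
# arguments are bound to local variables in a new environment.
kinds <- vapply(fns,
                function(f) if (is.primitive(f)) "primitive" else "closure",
                character(1))
print(kinds)
```

The two functions reported to raise the count (is.vector and typeof) should come out as closures, and the three that leave it alone as primitives, which matches the pattern in the original question.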
Hi Jiefei and Duncan,

I suspect what is likely happening is that one of ENSURE_NAMEDMAX or MARK_NOT_MUTABLE is being hit for x. These used to set NAMED to 3, but now set it to 7 (i.e., the previous and current NAMEDMAX values, respectively).

Because these are macros rather than C functions, it's not easy to figure out why one of them is being invoked from do_isvector (a cursory exploration didn't reveal what was going on, at least to me), and I don't have the time to dig super deeply into this right now, but perhaps Luke or Tomas know why this is happening off the top of their heads.

Sorry I can't be of more help.

~G

On Fri, Jul 12, 2019 at 11:47 AM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> So why does NAMED go to 7? Initially, the vector is bound to a. Within
> is.vector, it is bound to the local variable x. At this point there are
> two names bound to the same object, so it has to be considered
> immutable. There's really no difference between any of the values of 2
> or more in the memory manager. (But see
> http://developer.r-project.org/Refcnt.html for some plans. That
> document is from about 5 years ago; I don't know the current state.)
If you would like more details, I wrote about this recently:

https://www.brodieg.com/2019/02/18/an-unofficial-reference-for-internal-inspect/

Basically, as soon as you hit a closure, R assumes that the argument might still have a surviving reference to it even after the closure evaluation ends, because the closure environment still exists. This is why the named count is set to the maximum value.

Note that between the time I wrote the above and now, the maximum reference count changed from 3 to 7. At some point I believe there was a plan to add true(er?) reference counting in 3.6, but that was not quite ready in time.

On Friday, July 12, 2019, 2:47:41 PM EDT, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> is.vector() is a closure that calls .Internal. is.atomic(),
> is.integer() and is.numeric() are all primitives.
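To see the closure effect without involving is.vector at all, one can pass the vector through any closure that evaluates its argument. The following is a rough sketch, not a definitive test: touch is a hypothetical helper invented here for illustration, and the text printed by .Internal(inspect()) depends on the R version and build (the thread reports NAM(1) and NAM(7) on the poster's installation; other versions may label or count the field differently), so no expected output is shown.

```r
# Hypothetical closure (not from the thread): it evaluates its argument
# x and returns the length, so x is bound inside the call's environment.
touch <- function(x) length(x)

a <- c(1, 2, 3)
.Internal(inspect(a))   # header right after assignment

is.atomic(a)            # primitive: no closure environment is created
.Internal(inspect(a))   # compare the NAM field with the line above

n <- touch(a)           # closure: a's value is also bound to local x
.Internal(inspect(a))   # compare again after the closure call
```

Comparing the three inspect lines on a given installation shows whether the primitive call and the closure call treat the count differently, which is the distinction the replies above describe.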