Thanks a lot for answering. Before I get into it, please note that everything below carries the big caption "Thanks for trying to help me at all".

1) Yeah, those examples - quite hard to satisfy everyone's needs ;-) While some people complained that my past examples regarding this issue were not informative enough, others didn't like the more elaborate version (as seems to be the case for you). I simply tried to make it as easy as possible for people to see what's actually going on, so they wouldn't have to write their own code for things like reading the actual memory consumed by the Rterm process. If you prefer plain vanilla, though, I guess this would be it:

    library(XML)  # provides xmlParse() and free()

    memoryLeak <- function(
      x = system.file("exampleData", "mtcars.xml", package = "XML"),
      n = 5000,
      free_doc = FALSE,
      rm_doc = FALSE,
      use_gc = FALSE
    ) {
      lapply(1:n, function(ii) {
        doc <- xmlParse(x)
        if (free_doc) free(doc)  # release the underlying C-level document
        if (rm_doc) rm(doc)      # drop the R reference
        if (use_gc) gc()         # force a garbage collection
        NULL
      })
    }

2) If I knew my way around OSX or Linux, I would be happy to go with your suggestions - but as I don't, unfortunately that's out of reach for me. But IMO, a deeper level of cross-platform expertise should **not** be a general prerequisite before you can ask for help - even at r-devel (as opposed to r-help). However, AFAIK from past conversations with Duncan, the problem is indeed Windows-specific, as on all his non-Windows infrastructure (definitely Linux, possibly OSX) everything went fine.

3) The same goes for the level of expertise in C. After all, R is not C. I totally agree that the more programming languages one knows, the better. But again: I don't think that knowing your way around C should be a prerequisite for asking for help when an *R function* interfacing C causes trouble. Requiring this would sort of oppose R's nature/paradigm of being an awesome "top-level" interfacing language. But I'll try to narrow the problem down on a C-level if I can help you with that.

4) Both Duncan and Hadley have suggested that libxml2 is indeed causing the problem. So trying to link against another build would possibly be a great way to start! How would I go about that?

Thanks if you should take the time to look into this further!
Janko

On Mon, Dec 15, 2014 at 4:54 AM, Jeroen Ooms <jeroenooms at gmail.com> wrote:
> On Thu, Dec 11, 2014 at 12:13 PM, Janko Thyson <janko.thyson at gmail.com> wrote:
>> I'd so much appreciate if someone could have a look at this. If I can be of
>> any help whatsoever, please let me know!
>
> Your current code uses various functions from XML and rvest so it is not a
> *minimal* reproducible example. Even if you are unfamiliar with C, you
> should be able to investigate exactly which function in the XML package you
> think has issues. Once you have found the problematic R function, inspect the
> source code or use debug() to see if you can narrow it down even further,
> preferably to a particular call to C.
>
> Moreover you should create a reproducible example that allows us (and you)
> to test if this problem appears on other systems such as OSX or Linux.
> Development and debugging on Windows is very painful so your Windows-only
> example is not too helpful. Making people use Windows is not a good
> strategy for getting help.
>
> If the "leak" does not appear on other systems, it is likely a problem in
> the libxml2 Windows library on CRAN. In that case we can try to link
> against another build. On the other hand, if the problem does appear across
> systems, and you have provided a minimal reproducible example that
> pinpoints the problematic C function, we can help you review/debug the C
> code to see if/where some allocated object is not properly freed.
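Jeroen's request for an example that can also be run on OSX or Linux mainly affects the memory reading, since `tasklist` exists only on Windows. The following is a minimal sketch of a probe that works on both kinds of system. The name `process_memory_mb` is hypothetical and not part of the thread. The sketch assumes that `ps -o rss= -p <pid>` is available on Unix-alikes and that the fifth CSV column of `tasklist` holds the memory usage in KB. The Windows branch mirrors the call used in this thread but has not been verified here.

```r
# Hypothetical helper (not from the thread): resident memory of a process in MB.
# Returns NA_real_ if the system tool is missing or its output is unexpected.
process_memory_mb <- function(pid = Sys.getpid()) {
  kb <- tryCatch(suppressWarnings({
    if (.Platform$OS.type == "windows") {
      # Assumption: column 5 of the CSV output is "Mem Usage", e.g. "87,148 K"
      filter <- shQuote(sprintf("PID eq %d", pid), type = "cmd")
      csv <- system2("tasklist", c("/FI", filter, "/FO", "csv"), stdout = TRUE)
      mem <- read.csv(text = csv, stringsAsFactors = FALSE)[1, 5]
      as.numeric(gsub("[^0-9]", "", mem))  # keep digits only, value is in KB
    } else {
      # Assumption: ps reports the resident set size in KB
      rss <- system2("ps", c("-o", "rss=", "-p", pid), stdout = TRUE)
      as.numeric(trimws(rss[1]))
    }
  }), error = function(e) NA_real_)
  if (length(kb) != 1L || is.na(kb)) return(NA_real_)
  kb / 1024
}

process_memory_mb()
```

Falling back to `NA` instead of raising an error keeps a longer test run alive on systems where neither tool is installed.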
Sorry guys, didn't see your responses before sending mine.

Thanks Jeroen!! I'll test your version today and get back to you.

Sent from my smartphone

On 15.12.2014 at 12:12, "Janko Thyson" <janko.thyson at gmail.com> wrote:
> Thanks a lot for answering. Before I get into it, please note that
> everything below bears the big capture "Thanks for trying to help me at
> all".
@Jeroen: nope, seems like the problem unfortunately persists:

    require("XML")

    # Memory of a process as reported by Windows' tasklist, in MB
    getTaskMemoryByPid <- function(
      pid = Sys.getpid()
    ) {
      cmd <- sprintf("tasklist /FI \"pid eq %s\" /FO csv", pid)
      mem <- read.csv(text = shell(cmd, intern = TRUE), stringsAsFactors = FALSE)[, 5]
      # Strip thousands separator, whitespace and the "K" unit, then convert KB to MB
      mem <- as.numeric(gsub("\\.|\\s|K", "", mem)) / 1000
      mem
    }

    # Memory as seen by R (memory.size(), Windows only) and by the OS
    getCurrentMemoryStatus <- function() {
      mem_os <- getTaskMemoryByPid()
      mem_r <- memory.size()
      list(r = mem_r, os = mem_os, ratio = mem_os / mem_r)
    }

    memoryLeak <- function(
      x = system.file("exampleData", "mtcars.xml", package = "XML"),
      n = 5000,
      free_doc = FALSE,
      rm_doc = FALSE,
      use_gc = FALSE
    ) {
      lapply(1:n, function(ii) {
        doc <- xmlParse(x)
        if (free_doc) free(doc)
        if (rm_doc) rm(doc)
        if (use_gc) gc()
        NULL
      })
    }

    mem_1 <- getCurrentMemoryStatus()
    memoryLeak(n = 50000, free_doc = TRUE, rm_doc = TRUE)
    mem_2 <- getCurrentMemoryStatus()

    > rbind(data.frame(mem_1), data.frame(mem_2))
          r      os    ratio
    1 63.65  87.148 1.369175
    2 97.63 122.160 1.251255

On Mon, Dec 15, 2014 at 12:25 PM, Janko Thyson <janko.thyson at gmail.com> wrote:
> Sorry guys, didn't see your responses before sending mine.
>
> Thanks jeroen!! I'll test your version today and get back to you.

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
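The before/after comparison in the message above takes only two readings, which makes it hard to tell a one-time allocation from steady growth. As a rough sketch, the workload could instead be run in batches with a reading after each one. The names `measure_growth` and `r_heap_mb` are hypothetical and not from the thread. The default probe here uses `gc()`, which only sees memory managed by R, so for the libxml2 case an OS-level probe such as `getTaskMemoryByPid` from the thread would have to be passed in.

```r
# Hypothetical probe: memory currently used by R's own heap, in MB
# (column 2 of gc() is the "(Mb)" figure for Ncells and Vcells).
r_heap_mb <- function() sum(gc()[, 2])

# Hypothetical harness: run `work` in batches and read `probe` after each batch.
measure_growth <- function(work, probe = r_heap_mb,
                           batches = 5L, n_per_batch = 1000L) {
  readings <- numeric(batches + 1L)
  readings[1] <- probe()                 # baseline before any work
  for (b in seq_len(batches)) {
    for (i in seq_len(n_per_batch)) work()
    gc()                                 # let R release what it can first
    readings[b + 1L] <- probe()
  }
  data.frame(
    iterations = (0:batches) * n_per_batch,
    memory     = readings,
    growth     = readings - readings[1]
  )
}

# Runs anywhere: a workload that allocates nothing lasting
measure_growth(function() NULL, batches = 3L, n_per_batch = 100L)

# With the XML package and the helper from the thread (Windows only):
# x <- system.file("exampleData", "mtcars.xml", package = "XML")
# measure_growth(function() free(xmlParse(x)), probe = getTaskMemoryByPid)
```

If the `growth` column keeps rising roughly in step with `iterations`, that points towards memory not being released per call; a value that levels off after the first batch would suggest a one-time allocation instead.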