Hello,
I wrote the function below and have the problem that the "text" handler
returns only a trimmed version (686 chars as far as I can see) of the
content under the "fetchPeaks" condition.
Any hunches why that might be?
Thanks for any pointers, Joh
xmlEventParse(fileName,
  list(
    startElement = function(name, attrs) {
      if (name == "scan") {
        # A new scan starts: write out the previous MS2 scan if it is complete.
        if (.GlobalEnv$ms2Scan == TRUE && .GlobalEnv$scanDone == TRUE) {
          cat(.GlobalEnv$scanNum, "\n")
          MakeSpektrumEntry()
        }
        # Reset the state flags for the scan that is starting now.
        .GlobalEnv$scanDone <- FALSE
        .GlobalEnv$fetchPrecMz <- FALSE
        .GlobalEnv$fetchPeaks <- FALSE
        .GlobalEnv$ms2Scan <- FALSE
        if (attrs[["msLevel"]] == "2") {
          .GlobalEnv$ms2Scan <- TRUE
          .GlobalEnv$scanNum <- as.integer(attrs[["num"]])
        }
      } else if (name == "precursorMz" && .GlobalEnv$ms2Scan == TRUE) {
        .GlobalEnv$fetchPrecMz <- TRUE
      } else if (name == "peaks" && .GlobalEnv$ms2Scan == TRUE) {
        .GlobalEnv$fetchPeaks <- TRUE
      }
    },
    text = function(text) {
      # Store the text that follows a <precursorMz> or <peaks> start tag.
      if (.GlobalEnv$fetchPrecMz == TRUE) {
        .GlobalEnv$precursorMz <- as.numeric(text)
        .GlobalEnv$fetchPrecMz <- FALSE
      }
      if (.GlobalEnv$fetchPeaks == TRUE) {
        .GlobalEnv$peaks <- text
        .GlobalEnv$fetchPeaks <- FALSE
        .GlobalEnv$scanDone <- TRUE
      }
    }
  )
)
> sessionInfo() 
R version 2.9.0 beta (2009-04-03 r48277) 
x86_64-pc-linux-gnu 
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=en_US.UTF-8;LC_ADDRESS=en_US.UTF-8;LC_TELEPHONE=en_US.UTF-8;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=en_US.UTF-8
attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods  
[8] base     
other attached packages:
 [1] caMassClass_1.6 MASS_7.2-46     digest_0.3.1    caTools_1.9    
 [5] bitops_1.0-4.1  rpart_3.1-43    nnet_7.2-46     e1071_1.5-19   
 [9] class_7.2-46    PROcess_1.19.1  Icens_1.15.2    survival_2.35-4
[13] RCurl_0.94-1    XML_2.3-0       rkward_0.5.0   
loaded via a namespace (and not attached):
[1] tools_2.9.0
Hi Johannes,

I would "guess" that the trimming of the text occurs because
you do not specify trim = FALSE in the call to xmlEventParse().
If you specify this, you might well get the results you expect.
If not, can you post the actual file you are reading so we can
reproduce your results?

D.

Johannes Graumann wrote:
> Hello,
>
> I wrote the function below and have the problem, that the "text" bit returns
> only a trimmed version (686 chars as far as I can see) of the content under
> the "fetchPeaks" condition.
> Any hunches why that might be?
>
> Thanks for pointer, Joh

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
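
For readers following along, the suggestion to pass trim = FALSE could look roughly like the sketch below. It is only an illustration under assumptions: the inline XML string, the variable names doc and collected, and the extra "..." argument on the handler are made up for this example and do not come from the original script. The call is guarded so that nothing runs if the XML package is not installed.

```r
# Sketch only: pass trim = FALSE so that leading and trailing whitespace
# in text nodes is handed to the text handler unchanged.
collected <- character()

if (requireNamespace("XML", quietly = TRUE)) {
  # Hypothetical stand-in for the real data file.
  doc <- "<scan><peaks>  AAAA BBBB  </peaks></scan>"

  invisible(XML::xmlEventParse(
    doc,
    handlers = list(
      # Append every piece of text the parser reports.
      text = function(text, ...) {
        collected <<- c(collected, text)
      }
    ),
    asText = TRUE,  # 'doc' holds the XML itself, not a file name
    trim = FALSE    # do not strip surrounding whitespace
  ))
}
```

Joining the pieces with paste(collected, collapse = "") gives the text as the handler received it, which makes it easy to compare the result with and without trim = FALSE.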
Hi Duncan,

Thanks for your thoughts. "trim=FALSE" does not fix my issue, so I
attach pared-down versions of my script and data file.

Thanks for any further hints.

Joh

Duncan Temple Lang wrote:
> Hi Johannes
>
> I would "guess" that the trimming of the text occurs because
> you do not specify trim = FALSE in the call to xmlEventParse().
> If you specify this, you might well get the results you expect.
> If not, can you post the actual file you are reading so we can
> reproduce your results.
>
> D.
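
One possibility worth checking, and this is an assumption that is not confirmed anywhere in this thread: event-driven (SAX-style) parsers may report the character data of a single element in several pieces. A handler that stores the first piece and then clears its flag would keep only the beginning of a long "peaks" string. The sketch below accumulates the pieces until the element ends. It uses plain R functions to simulate the parser callbacks; makePeakCollector, getPeaks and the simulated calls are hypothetical names for illustration, not part of the XML package.

```r
# Hypothetical helper: builds handlers that append every text chunk seen
# between the start and the end of a <peaks> element.
makePeakCollector <- function() {
  inPeaks <- FALSE        # TRUE while we are inside <peaks> ... </peaks>
  chunks <- character()   # text pieces collected so far
  peaks <- NULL           # complete text, set when </peaks> is reached

  list(
    startElement = function(name, attrs, ...) {
      if (name == "peaks") {
        inPeaks <<- TRUE
        chunks <<- character()
      }
    },
    text = function(text, ...) {
      # Keep appending instead of overwriting with the latest chunk.
      if (inPeaks) chunks <<- c(chunks, text)
    },
    endElement = function(name, ...) {
      if (name == "peaks") {
        peaks <<- paste(chunks, collapse = "")
        inPeaks <<- FALSE
      }
    },
    getPeaks = function() peaks
  )
}

# Simulate a parser that splits one text node into three chunks.
h <- makePeakCollector()
h$startElement("peaks", NULL)
for (chunk in c("AAAA", "BBBB", "CCCC")) h$text(chunk)
h$endElement("peaks")
result <- h$getPeaks()
```

If this turns out to be the cause, the same idea could be tried with the real parser by passing only the startElement, text and endElement entries as the handlers list to xmlEventParse() and reading the result with getPeaks() afterwards. The closure also avoids the repeated .GlobalEnv assignments, though that is a design preference and not required for the fix.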