thr3ads.net - R help - [R] R crashing during batch file formatting [Oct 2006]

Jon Minton

2006-Oct-31 11:43 UTC

[R] R crashing during batch file formatting

Hi R users:

 

I have the British Household Panel Survey (BHPS) in .tab format. I want to
feed it through the Amelia package (which will be an ‘interesting’ job in
itself).

But first I need to convert the various types of missing value (from about
-9 to -1) to a more generic ‘NA’ code.

 

I’ve written the following function to do this:

 

BHPS.converter <- function(from="D:/Data/BHPS/UKDA-5151-tab/tab/",
                           to="D:/BHPS/NA/", ext="tab") {

    from.files <- dir(from, pattern=paste(".", ext, "$", sep=""))
    existing.to.files <- dir(to, pattern=paste(".", ext, "$", sep=""))

    still.to.do.index <- 1:length(from.files)
    still.to.do.index <- still.to.do.index[-match(existing.to.files, from.files)]
    obs.to.do <- length(still.to.do.index)

    for (i in 1:obs.to.do) {
        temp.table <- read.delim(paste(from, from.files[still.to.do.index[i]], sep=""))
        print(paste("read:", from.files[still.to.do.index[i]]))

        temp.table[temp.table < 0] <- NA

        write.table(temp.table,
                    file=paste(to, from.files[still.to.do.index[i]], sep=""))
        print(paste("written:", from.files[still.to.do.index[i]]))
    }

    rm(i, from.files, existing.to.files, still.to.do.index,
       obs.to.do, temp.table)
}

 

It checks for existing files in the ‘to’ directory (where files which have
been modified with R- -> NA) because when I tried to do this conversion
operation previously it got about ½ way through then crashed.

 

The problem is that it crashes *this time* too, without displaying a prompt
to say it’s read a single file. 

 

The file it gets stuck on is about 75 MB in size.

I am using a dual-core 3.2 GHz Pentium D processor with 2 GB memory (& 2 GB
virtual memory), and (unfortunately) Windows XP.

 

Questions:

1) Any general tips on how to increase the amount of memory available to
process the file?

2) Can you see a more efficient way of doing what I’m doing?

3) What’s the best way of coding for multiple forms of NA? – the BHPS code
‘-8’ (meaning ‘inapplicable’, not routed for this respondent) should really
be distinguished from other forms of nonresponse...

 

 

Thanks,

 

Jon

 

 

p.s. Apologies if this is slightly too vague/long winded...

 

 

Jon Minton

 

 



Duncan Murdoch

2006-Oct-31 12:04 UTC

On 10/31/2006 6:43 AM, Jon Minton wrote:
> ....
> It checks for existing files in the ‘to’ directory (where files which have
> been modified with R- -> NA) because when I tried to do this conversion
> operation previously it got about ½ way through then crashed.
>
> The problem is that it crashes *this time* too, without displaying a prompt
> to say it’s read a single file.

When you say "crash", do you mean it displays an R error (like "unable
to allocate vector of length ....") or a real crash with a Windows popup?

Which version are you using?  There were some fixes to the memory 
management after the 2.3.1 release, but I haven't heard of any problems 
in 2.4.0 before this.

Duncan Murdoch
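
One way to tell the two cases apart is to wrap the read in tryCatch() and log what happens. An R-level error is caught and written to the log, whereas a hard crash of the R process leaves the log with a "start:" line and nothing after it. This is only a minimal sketch: the helper name read.logged and the log file name are made up for illustration and do not come from the thread.

```r
# Hypothetical helper (not from the thread): read one tab-delimited file and
# append the outcome to a plain-text log.
read.logged <- function(path, log.file = "conversion.log") {
  # Record the R version, since the reply asks for it.
  cat("version:", R.version.string, "\n", file = log.file, append = TRUE)
  cat("start:", path, "\n", file = log.file, append = TRUE)
  result <- tryCatch(
    read.delim(path),
    error = function(e) {
      # An R-level error (for example a failed allocation) ends up here.
      cat("R error:", conditionMessage(e), "\n", file = log.file, append = TRUE)
      NULL
    }
  )
  if (!is.null(result)) {
    cat("done:", path, "\n", file = log.file, append = TRUE)
  }
  result
}
```

If the last line of the log is "start: ..." with no matching "done:" or "R error:" line, the process most likely died outside R's own error handling.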

Petr Pikal

2006-Oct-31 12:15 UTC

Hi

you should probably provide more information (OS, R version).
I cannot help you much with the crash, but here is an opinion.
I would try to do the conversion interactively before I transferred it to 
a function.
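
Working through one file at a time might look like the sketch below. It is an illustration, not a confirmed fix for the crash. The helper names files.to.do and convert.one are made up here, and replacing negatives column by column (rather than building one logical matrix for the whole table) is only an assumption about what keeps memory use down. The sketch uses setdiff() to find the unconverted files, because indexing with -match(...) yields an empty vector when the 'to' directory contains no files yet.

```r
# Hypothetical helpers, not part of the original post.

# Names of files in 'from' that have no counterpart in 'to' yet.
files.to.do <- function(from, to, ext = "tab") {
  pat <- paste("\\.", ext, "$", sep = "")
  setdiff(dir(from, pattern = pat), dir(to, pattern = pat))
}

# Convert one file: negative numeric codes become NA, one column at a time.
convert.one <- function(file.name, from, to) {
  tab <- read.delim(file.path(from, file.name))
  for (j in seq_along(tab)) {
    if (is.numeric(tab[[j]])) {
      tab[[j]][which(tab[[j]] < 0)] <- NA
    }
  }
  # Write tab-delimited output so the result can be read back with read.delim().
  write.table(tab, file = file.path(to, file.name),
              sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(tab)
}
```

Interactively, one would call convert.one() on a single small file first, inspect the result, and only then loop over files.to.do(from, to). Note that write.table() separates fields with a space by default, so the explicit sep = "\t" is what keeps the output in .tab format.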

However, if you want different types of NA and your data is numeric, 
you probably could make a distinction by using -Inf, Inf, NaN and NA, 
but then you need to be careful when doing analysis, as these values 
can be treated differently.
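
As a rough illustration of that idea, the sketch below maps the BHPS code -8 to NaN and every other negative code to NA. The function name recode.missing and the choice of NaN for 'inapplicable' are assumptions made for the example, not something the thread settles. Note that is.na() is TRUE for both NaN and NA, so only is.nan() tells them apart, and that the distinction is available only in double vectors (not integer or character ones).

```r
# Hypothetical helper: keep 'inapplicable' distinguishable from other
# kinds of missingness by storing it as NaN instead of NA.
recode.missing <- function(x, inapplicable = -8) {
  x <- as.numeric(x)                 # NaN needs a double vector
  inapp <- which(x == inapplicable)  # remember positions before overwriting
  x[which(x < 0)] <- NA              # all negative codes become NA ...
  x[inapp] <- NaN                    # ... except the 'inapplicable' code
  x
}

x <- recode.missing(c(3, -8, -1, 7, -9))
is.nan(x)  # TRUE only where the code was -8
is.na(x)   # TRUE for every recoded value, NaN included
```

Whether the NaN/NA distinction survives later steps depends on the functions applied to the data, which is the caution raised above.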

HTH
Petr



Petr Pikal
petr.pikal at precheza.cz
