How to programmatically (i.e., with no or minimal hand-coding) copy a netCDF file? (Without calling

> system("cp whatever wherever")

:-)

Why I ask: I need to "do surgery" on a large netCDF file (technically an I/O API file, which uses netCDF). My group believes a data-assimilation error caused a data variable to be corrupted in a certain way, so I'm going to "decorrupt" it so we can compare values with the raw data. Thanks to help from this list,

* I understand that, generally, R wants value semantics (though mechanisms like package=proto allow reference semantics). Therefore, ISTM, rather than attempting to modify a (copy of a) file in-place, I should instead [copy the bits I want to keep from the source file to the new/target file, read the one data variable I want to modify from the source file, write the one modified datavar to the target file]. (Please correct me if wrong!)

* I have a routine that (I believe) does the desired modification of the one datavar.

And from reading `help(*ncdf)` and http://www.image.ucar.edu/Software/Netcdf/ I believe (ICBW :-) I understand the definition, get, and put steps that are involved in reading and writing netCDF. However,

* the file I'm working with is large and complex

* the examples above hand-craft their output files

So I'm wondering, can anyone point me to, or provide, code that copies a netCDF file both

* completely: all coordinate variables, all data variables and their attributes, and all global attributes, such that

  $ diff -wB <( ncdump -h source.nc ) <( ncdump -h target.nc ) | wc -l
  0

* programmatically: no or minimal hand-coding of, e.g., attribute names and values, missing-value value.

? If not, can this be done in principle, or are there steps that must (at least currently) necessarily be hand-coded?

TIA, Tom Roche <Tom_Roche at pobox.com>
David William Pierce
2012-Jan-06 03:49 UTC
[R] [ncdf] programmatically copying a netCDF file
On Thu, Jan 5, 2012 at 3:29 PM, Tom Roche <Tom_Roche@pobox.com> wrote:

> How to programmatically (i.e., with no or minimal hand-coding) copy
> a netCDF file? (Without calling
>
> > system("cp whatever wherever")
>
> [...]
>
> So I'm wondering, can anyone point me to, or provide, code that copies
> a netCDF file both
>
> * completely: all coordinate variables, all data variables and their
>   attributes, and all global attributes, such that
>
>   $ diff -wB <( ncdump -h source.nc ) <( ncdump -h target.nc ) | wc -l
>   0
>
> * programmatically: no or minimal hand-coding of, e.g., attribute
>   names and values, missing-value value.
>
> ? If not, can this be done in principle, or are there steps that must
> (at least currently) necessarily be hand-coded?

Hi Tom,

yes, this can be done in principle, although it would be a pain. Mostly because a netcdf file is a surprisingly complicated object, so asking for an R script that copies one "programmatically" is actually asking quite a lot. You might think that it should be as easy as "grab THIS var ... then grab THAT var ... then write them both to the output file." But that fails, because vars have dims, and in classic netcdf files the dims are identified by name. What if one var has a lon dim with 360 entries, and the other has a lon dim with 128 entries? Putting them both into the same file will give an error.

What I'd suggest is keeping focused on your goal. Since "cp file1.nc file2.nc" accomplishes a complete copy of your netcdf file in a few seconds of typing, simply copying the file generally isn't the point of an R script.
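The dimension clash described above can be illustrated without any netCDF library. What follows is only a minimal sketch in base R, assuming each variable is described by a plain list holding its dimension names and lengths. The `vars` structure and the `find_dim_conflicts` helper are hypothetical names made up for illustration; they are not part of the ncdf package.

```r
# Hypothetical description of two variables; not read from a real file.
vars <- list(
  temperature = list(dims = c(lon = 360, lat = 180)),
  pressure    = list(dims = c(lon = 128, lat = 64))
)

# Return the names of dimensions that appear with more than one length.
find_dim_conflicts <- function(vars) {
  seen <- list()              # dimension name -> first length seen
  conflicts <- character(0)   # names that were seen with differing lengths
  for (v in vars) {
    for (dname in names(v$dims)) {
      len <- v$dims[[dname]]
      if (is.null(seen[[dname]])) {
        seen[[dname]] <- len
      } else if (seen[[dname]] != len) {
        conflicts <- union(conflicts, dname)
      }
    }
  }
  conflicts
}

conflicts <- find_dim_conflicts(vars)
print(conflicts)
```

A check along these lines, run before defining the output file, would flag same-named dimensions that cannot share one definition.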
If I wanted to copy a var from an existing file to a new file, manipulating it along the way, I'd do something like this (untested code off the top of my head):

    library(ncdf)

    varname  = 'temperature'
    file_in  = 'data1.nc'
    file_out = 'data2.nc'

    # Get var to copy
    ncid_in = open.ncdf( file_in )
    var     = ncid_in$var[[varname]]

    # Make new output dims that are copies of input dims
    ndims   = var$ndims
    dim_out = list()
    for( idim in 1:ndims ) {
        dim_in = var$dim[[idim]]
        dim_out[[idim]] = dim.def.ncdf( dim_in$name, dim_in$units,
            dim_in$vals, unlim=dim_in$unlim )
    }

    # Make output var that is copy of input var
    var_out  = var.def.ncdf( var$name, var$units, dim_out, var$missval )
    ncid_out = create.ncdf( file_out, var_out )

    # Loop over timesteps to avoid running out of memory
    # (assumes the last dim is the time dim)
    sz = var$varsize
    nt = sz[ndims]
    for( it in 1:nt ) {
        # Goal of following lines is to construct a 'start' array that is
        # (1,1,...,it) and a 'count' array that is (nx,ny,...,1)
        start = c( rep(1, ndims-1), it )            # Get just this timestep
        count = c( sz[seq_len(ndims-1)], 1 )        # Get just this timestep

        data = get.var.ncdf( ncid_in, var, start=start, count=count )

        # ... Manipulate data here ...

        put.var.ncdf( ncid_out, var_out, data, start=start, count=count )
        sync.ncdf( ncid_out )   # always a good idea to keep your file sync'ed
    }

    close.ncdf( ncid_out )
    close.ncdf( ncid_in )

Hope that gets you started,

--Dave

--
David W. Pierce
Division of Climate, Atmospheric Science, and Physical Oceanography
Scripps Institution of Oceanography, La Jolla, California, USA
(858) 534-8276 (voice) / (858) 534-8561 (fax)
dpierce@ucsd.edu
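The start/count bookkeeping in the loop above can be checked on its own, without opening a file. This is a minimal sketch in base R, assuming (as the loop does) that the last dimension is the one being stepped through. `slab_index` is a hypothetical helper name, not an ncdf function.

```r
# Build the 'start' and 'count' vectors that select one slab along the
# last dimension of a variable whose dimension lengths are 'varsize'.
slab_index <- function(varsize, it) {
  ndims <- length(varsize)
  lead  <- seq_len(ndims - 1)   # indices of all dims except the last
  list(
    start = c(rep(1, ndims - 1), it),   # (1, 1, ..., it)
    count = c(varsize[lead], 1)         # (nx, ny, ..., 1)
  )
}

# Example: a 360 x 180 x 12 variable, third step along the last dim.
idx <- slab_index(c(360, 180, 12), 3)
print(idx$start)
print(idx$count)
```

The two vectors would then be passed as the `start` and `count` arguments of `get.var.ncdf` and `put.var.ncdf`. Using `seq_len` keeps the helper valid for a one-dimensional variable, where there are no leading dimensions.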