raphael.felber at agroscope.admin.ch
2017-Aug-14 12:29 UTC
[R] ncdf4: Why are NAs converted to _FillValue when saving?
Dear all,

I'm a newbie regarding netCDF data, and today I realized that I may not understand some of its basics. I want to create a *.nc file containing three variables for Switzerland. All data outside the country are NA. The third variable is calculated from the first two.

Basically this works without problems. I copy the file with the data of the first variable, open the copy with 'write=TRUE' (nc1 <- nc_open()), read the data into 'var1', open the other file (nc2 <- nc_open()), read its data into 'var2', put this variable into the file (nc1), and calculate the third variable from var1 and var2.

So far everything is fine. But I found that when I write 'var2' to nc1, all NAs in this variable are converted to the _FillValue. Clearly, I expect all NAs to be converted to the _FillValue in the file, but I do not expect the NAs in 'var2' itself (i.e. the data in the R session) to change as well. Since I use these data for further calculations, the NAs should remain.

Is that a bug or intended? Below you find a minimal example (adapted from the code in the ncdf4 manual) of this, in my eyes, strange behavior.

Thanks for any explanation.

Kind regards
Raphael

Minimal working example (adapted from the ncdf4 manual):

library(ncdf4)

#----------------
# Make dimensions
#----------------
xvals <- 1:360
yvals <- -90:90
nx <- length(xvals)
ny <- length(yvals)
xdim <- ncdim_def('Lon','degreesE', xvals )
ydim <- ncdim_def('Lat','degreesE', yvals )
tdim <- ncdim_def('Time','days since 1900-01-01', 0, unlim=TRUE )

#---------
# Make var
#---------
mv <- 1.e30 # missing value
var_temp <- ncvar_def('Temperature','K', list(xdim,ydim,tdim), mv )

#---------------------
# Make new output file
#---------------------
output_fname <- 'test_real3d.nc'
ncid_new <- nc_create( output_fname, list(var_temp))

#-------------------------------
# Put some test data in the file
#-------------------------------
data_temp <- array(0.,dim=c(nx,ny,1))
for( j in 1:ny )
for( i in 1:nx )
    data_temp[i,j,1] <- sin(i/10)*sin(j/10)

# add some NAs
data_temp[1:10, 1:5, 1] <- NA

# copy data
data_temp2 <- data_temp

# show what we have
data_temp[1:12, 1:7, 1]
data_temp2[1:12, 1:7, 1]

# write to netCDF connection
ncvar_put( ncid_new, var_temp, data_temp, start=c(1,1,1), count=c(nx,ny,1))

# show what we have now
data_temp[1:12, 1:7, 1]
data_temp2[1:12, 1:7, 1]

# Why are there no more NAs in data_temp? -> ncvar_put changed NAs to the _FillValue
# But why are the NAs in data_temp2 also changed to the _FillValue?

#--------------------------
# Close
#--------------------------
nc_close( ncid_new )

--
Raphael Felber, Dr. sc.
Wissenschaftlicher Mitarbeiter, Klima und Lufthygiene
Eidgenössisches Departement für Wirtschaft, Bildung und Forschung WBF
Agroscope
Forschungsbereich Agrarökologie und Umwelt
Reckenholzstrasse 191, 8046 Zürich
Tel. 058 468 75 11
Fax 058 468 72 01
raphael.felber at agroscope.admin.ch
www.agroscope.ch

David W. Pierce
2017-Aug-14 15:28 UTC
[R] ncdf4: Why are NAs converted to _FillValue when saving?
On Mon, Aug 14, 2017 at 5:29 AM, <raphael.felber at agroscope.admin.ch> wrote:

> So far everything is fine. But I figured out that when I write the data
> 'var2' to nc1, all NAs in this variable are converted to the
> _FillValue-value. Clearly, I expect that all NAs are converted to the
> _FillValue in the file, but I do not expect that also the NAs in 'var2'
> (i.e. the data which can be called in the R-console) is changed. Since I
> use this data for further calculations, the NAs should remain.
>
> Is that a bug or intended? Below you find a minimal example (adapted from
> the code in the netcdf4 manual) of the, in my eye, strange behavior.

Hi Raphael,

I'm going to claim that this is more of an R question than an ncdf4 question per se. For example, you will notice that if you multiply data_temp2 by 1.0 (leaving the values unchanged) or add zero to it, then the behavior is what you are expecting. The same holds if you multiply data_temp by 1.0 or add zero to it.

It would seem that R does the equivalent of assigning another pointer to the data stored in data_temp rather than copying it, until either data_temp or data_temp2 is operated upon, at which point a copy is made. I personally did not realize this was part of R's magic.

Regards,

--Dave

--
David W. Pierce
Division of Climate, Atmospheric Science, and Physical Oceanography
Scripps Institution of Oceanography, La Jolla, California, USA
(858) 534-8276 (voice) / (858) 534-8561 (fax)
dpierce at ucsd.edu

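The pointer-sharing behavior Dave describes is usually called copy-on-modify in R. A minimal base-R sketch of it follows, with no ncdf4 involved. The variable names are made up for illustration, and the address comparison relies on tracemem(), which is only available when R was built with memory profiling, so that part is guarded with capabilities("profmem").

```r
# Two names, one vector: plain assignment does not copy the data.
x <- c(0.5, NA, 2)
y <- x

# Optional check (assumes an R build with memory profiling):
# tracemem() returns the address of the object as a string.
if (capabilities("profmem")) {
  shared_before <- identical(tracemem(x), tracemem(y))
}

# Arithmetic such as multiplying by 1.0 yields a new vector,
# so y no longer refers to the memory that x refers to.
y <- y * 1.0

if (capabilities("profmem")) {
  shared_after <- identical(tracemem(x), tracemem(y))
  untracemem(x)
  untracemem(y)
}

# An ordinary R-level change to y does not reach x.
y[2] <- -999.99
x_unchanged <- is.na(x[2])
```

Ordinary R code never sees the sharing, because R copies before modifying. Compiled code that writes into a vector's memory directly bypasses that safeguard, which would be consistent with the NAs disappearing from both data_temp and data_temp2 in the example.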
raphael.felber at agroscope.admin.ch
2017-Aug-15 07:53 UTC
[R] ncdf4: Why are NAs converted to _FillValue when saving?
Dear Dave,

Thanks a lot for your answer. I agree that it is more an R issue than a package issue, but it's the first time I have encountered such a problem.

For my R version (v3.4.1) on x86_64-w64-mingw32, the second part of your answer only holds for data_temp2: if I manipulate data_temp2 in any way before calling ncvar_put(..., data_temp), then data_temp2 keeps its NAs. However, this doesn't hold for data_temp: after calling ncvar_put(..., data_temp), the NAs in data_temp are converted to the _FillValue (-999.99). For clarification I added two examples below.

Regards
Raphael

Examples:

> # *************************************
> # without data manipulation
> # *************************************
>
> # copy data
> data_temp2 <- data_temp
>
> # show what we have
> data_temp[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> # write to netCDF connection
> ncvar_put( ncid_new, var_temp, data_temp )
>
> # show what we have
> data_temp[1:5, 1:5, 1]
              [,1]          [,2]         [,3]       [,4]       [,5]
[1,] -999.99000000 -999.99000000 -999.9900000 0.03887696 0.04786269
[2,] -999.99000000 -999.99000000 -999.9900000 0.07736548 0.09524715
[3,] -999.99000000 -999.99000000 -999.9900000 0.11508099 0.14167993
[4,] -999.99000000 -999.99000000 -999.9900000 0.15164665 0.18669710
[5,]    0.04786269    0.09524715    0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
              [,1]          [,2]         [,3]       [,4]       [,5]
[1,] -999.99000000 -999.99000000 -999.9900000 0.03887696 0.04786269
[2,] -999.99000000 -999.99000000 -999.9900000 0.07736548 0.09524715
[3,] -999.99000000 -999.99000000 -999.9900000 0.11508099 0.14167993
[4,] -999.99000000 -999.99000000 -999.9900000 0.15164665 0.18669710
[5,]    0.04786269    0.09524715    0.1416799 0.18669710 0.22984885

> # *************************************
> # with data manipulation
> # *************************************
>
> # show what we have
> data_temp[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> # do some manipulations
> data_temp <- data_temp * 1.0
> data_temp2 <- data_temp2 * 1.0
>
> # write to netCDF connection
> ncvar_put( ncid_new, var_temp, data_temp )
>
> # show what we have
> data_temp[1:5, 1:5, 1]
              [,1]          [,2]         [,3]       [,4]       [,5]
[1,] -999.99000000 -999.99000000 -999.9900000 0.03887696 0.04786269
[2,] -999.99000000 -999.99000000 -999.9900000 0.07736548 0.09524715
[3,] -999.99000000 -999.99000000 -999.9900000 0.11508099 0.14167993
[4,] -999.99000000 -999.99000000 -999.9900000 0.15164665 0.18669710
[5,]    0.04786269    0.09524715    0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> # *************************************
> # RESULT
> # with manipulation of data_temp2 the variable is copied and NAs remain NAs
> # but manipulation of data_temp doesn't help

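One possible workaround follows from the result above: since the object handed to ncvar_put() is the one that loses its NAs, pass a temporary copy (for instance the value of data_temp * 1.0) rather than the object itself. The sketch below is only an illustration under stated assumptions. The grid size, the temporary file, the "degreesN" unit and prec = "double" are choices made for this example and are not taken from the thread, and whether ncvar_put() alters its argument at all may depend on the installed ncdf4 version.

```r
library(ncdf4)

# Small hypothetical grid; names and sizes are for illustration only.
xdim <- ncdim_def("Lon", "degreesE", 1:6)
ydim <- ncdim_def("Lat", "degreesN", 1:4)
mv   <- -999.99                      # fill value, as in the output above
var_temp <- ncvar_def("Temperature", "K", list(xdim, ydim), mv,
                      prec = "double")

out_file <- tempfile(fileext = ".nc")
nc <- nc_create(out_file, list(var_temp))

# Test data with a block of NAs.
data_temp <- outer(1:6, 1:4, function(i, j) sin(i / 10) * sin(j / 10))
data_temp[1:2, 1:2] <- NA

# Pass an expression that evaluates to a fresh vector, not data_temp itself,
# so any in-place replacement of NAs happens on the temporary.
ncvar_put(nc, var_temp, data_temp * 1.0)
nc_close(nc)

# The in-memory array should still contain its NAs.
na_kept <- all(is.na(data_temp[1:2, 1:2]))

# Reading the file back: ncvar_get() converts the fill value to NA again.
nc <- nc_open(out_file)
read_back <- ncvar_get(nc, "Temperature")
nc_close(nc)
na_read <- all(is.na(read_back[1:2, 1:2]))
```

The cost is one extra copy of the array per write. For very large arrays it may be preferable to restore the NAs afterwards with something like data_temp[data_temp == mv] <- NA, provided mv cannot occur as a real data value.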