Roy Mendelssohn - NOAA Federal
2016-Jul-07 23:27 UTC
[R] netcdf data precision or least significant digit
Hi Ismail:

Can you point me to a particular netcdf file you are working with? I would like to play with it for a while. I am pretty certain the scale factor is 0.01 and that what you are seeing is rounding error (or, more precisely, problems with the representation of floating point numbers), but I would like to see if there is a way around this.

Thanks,

-Roy

> On Jul 7, 2016, at 4:16 PM, Ismail SEZEN <sezenismail at gmail.com> wrote:
>
> Thank you very much Jeff. I think I am far from having explained myself. Perhaps this is the wrong list for this question, but I sent it in the hope that someone here has a deep understanding of netcdf data and uses R. Let me tell the story more simply. Assume that you read a numeric vector of data from a netcdf file:
>
> data <- c(9.1999979, 8.7999979, 7.9999979, 3.0999980, 6.1000018, 10.1000017, 10.4000017, 9.2000017)
>
> You know that the values above are model output, and you also know that, physically, the first and last values must be equal, but somehow they are not.
>
> And now you want to fit a "periodic" spline to the values above.
>
> spline(1:8, data, method = "periodic")
>
> Voila! The spline method throws a warning message: "spline: first and last y values differ - using y[1] for both". Then I went on digging and discovered 2 attributes in the netcdf file: "precision = 2" and "least_significant_digit = 1". I also found their definitions at [1].
>
> precision -- number of places to right of decimal point that are significant, based on packing used. Type is short.
> least_significant_digit -- power of ten of the smallest decimal place in unpacked data that is a reliable value. Type is short.
>
> Please do not condemn me, English is not my main language :). At this point, as a scientist, what would you do according to the explanations above? I think I did not exactly understand the difference between precision and least_significant_digit. One says "significant" and the other says "reliable". Should I round the numbers to 2 decimal places or to 1 decimal place?
>
> Thanks,
>
> 1- http://www.esrl.noaa.gov/psd/data/gridded/conventions/cdc_netcdf_standard.shtml
>
>> On 08 Jul 2016, at 01:29, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>>
>> Correction:
>>
>> ?options (not par)
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On July 7, 2016 3:26:06 PM PDT, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>>> Same as with any floating point numeric computation environment... you
>>> don't. There is always uncertainty in any floating point number... it
>>> is just larger in this data than you might be used to.
>>>
>>> Once you get to the stage where you want to output values, read up on
>>>
>>> ?round
>>> ?par (digits)
>>>
>>> and don't worry about the incidental display of extra digits prior to
>>> presentation (output).
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On July 7, 2016 12:50:54 AM PDT, Ismail SEZEN <sezenismail at gmail.com>
>>> wrote:
>>>> Hello,
>>>>
>>>> I use the ncdf4 and ncdf4.helpers packages to get wind data from ncep/ncar
>>>> reanalysis netcdf files. But the data is in the form of (9.199998,
>>>> 8.799998, 7.999998, 3.099998, -6.8000018, ...). I'm aware of the precision
>>>> and least_significant_digit attributes of the ncdf4 object [1]. For uwnd
>>>> data, precision = 2 and least_significant_digit = 1. My question is:
>>>> should I round the data to 2 decimal places or to 1 decimal place?
>>>>
>>>> The same issue applies to some header info.
>>>>
>>>> Output of ncdf4 object:
>>>>
>>>>
>>>> Output of ncdump on terminal:
>>>>
>>>>
>>>> For instance, ncdump's scale factor is 0.01f but the ncdf4 object's
>>>> scale_factor is 0.00999999977648258. You can notice the same issue for
>>>> actual_range and add_offset. A similar issue also exists for the data.
>>>> How can I truncate those extra insignificant decimal places or round
>>>> the numbers to the significant decimal places?
>>>>
>>>> 1 -
>>>> http://www.esrl.noaa.gov/psd/data/gridded/conventions/cdc_netcdf_standard.shtml
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.

**********************
"The contents of this message do not reflect any position of the U.S. Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new address and phone***
110 Shaffer Road
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: Roy.Mendelssohn at noaa.gov www: http://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
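
The floating point explanation above can be checked directly in R. The following is a minimal sketch, not something posted in the thread: it assumes that scale_factor is stored in the file as a 4-byte float (ncdump prints it as 0.01f) and that R widens it to a double on reading. The helper name as_float32 is made up for illustration.

```r
# Hypothetical helper: round-trip a double through 4-byte (single precision)
# storage, mimicking how a float attribute such as scale_factor = 0.01f
# is stored on disk and then widened to an R double.
as_float32 <- function(x) {
  readBin(writeBin(x, raw(), size = 4), what = "numeric",
          n = length(x), size = 4)
}

scale_factor <- as_float32(0.01)

# With enough digits the stored value differs from 0.01; the thread
# reports it as 0.00999999977648258.
print(scale_factor, digits = 15)

# The difference is far below the stated precision of 2 decimal places.
abs(scale_factor - 0.01)

# Rounding (or printing with fewer digits) recovers the intended value.
round(scale_factor, 2)
format(scale_factor, digits = 3)
```

Nothing is lost or corrupted in the file under this reading: 0.01 simply has no exact single precision representation, and the extra digits show up once the value is displayed as a double.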
Thank you Roy.

I use NCEP/NCAR Reanalysis 2 data [1], more precisely the u-wind data for the year 2015 [2]. I am also pretty sure that attributes like scale_factor and add_offset should be exact values such as 0.01 or 187.65, but somehow (I hope this issue is not caused by me) they are not, and neither are the data. Let me also note that I have already contacted the author of the ncdf4 package and sent an email to ESRL, but no luck yet.

For vector data, the u components of wind speed at opposite longitudes at the pole must be equal in absolute value. For instance, at "2015-01-01 00 GMT", u-wind at longitude=0 and latitude=90 is 9.1999979 m/s and u-wind at longitude=180 and latitude=90 is -9.2000017 m/s. The minus sign comes from the positive north direction. Physically, their absolute values must be equal.

1- http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis2.html
2- ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis2.dailyavgs/pressure/uwnd.2015.nc
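
The pole example above can be tested with a tolerance instead of exact equality. This is only a sketch: the two values are the ones quoted in the message, while the variable names and the choice of tolerance (half a unit in the second decimal place, following precision = 2) are assumptions made for illustration.

```r
# Values quoted in the message for 2015-01-01 00 GMT at latitude 90.
u_lon0   <-  9.1999979   # longitude 0
u_lon180 <- -9.2000017   # longitude 180

# Exact comparison of the absolute values fails because of the
# single precision noise in the unpacked data.
exact_equal <- abs(u_lon0) == abs(u_lon180)

# Assumed tolerance: half a unit in the last significant decimal place,
# taking precision = 2 from the file attributes.
tol <- 0.5 * 10^(-2)
nearly_equal <- abs(abs(u_lon0) - abs(u_lon180)) < tol

exact_equal    # FALSE
nearly_equal   # TRUE
```

The two values differ by about 4e-6, several orders of magnitude below anything the packing can represent, so treating them as equal is consistent with the file's own metadata.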
Roy Mendelssohn - NOAA Federal
2016-Jul-08 00:21 UTC
[R] netcdf data precision or least significant digit
After looking at the file and extracting the data into, say, the variable uwind, if I do:

    str(uwind)

I see what I expect, but if I just do:

    uwind

I see what you are seeing. Try:

    uwindnew <- round(uwind, digits = 2)

and see if that gives you the results you would expect.

HTH,

-Roy

> On Jul 7, 2016, at 4:49 PM, Ismail SEZEN <sezenismail at gmail.com> wrote:
>
> Thank you Roy.
>
> I use NCEP/NCAR Reanalysis 2 data [1]. More precisely, u-wind data of the year 2015 [2].
> [...]
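
Applied to the vector from earlier in the thread, the rounding suggested above would look roughly like the sketch below. The data values and the spline call come from the thread; the variable names data_rounded and fit are made up, and whether 2 is the right number of digits for a given variable should be checked against that variable's precision attribute.

```r
# Example vector posted earlier in the thread.
data <- c(9.1999979, 8.7999979, 7.9999979, 3.0999980,
          6.1000018, 10.1000017, 10.4000017, 9.2000017)

# Round to the 2 decimal places given by the "precision" attribute.
data_rounded <- round(data, digits = 2)

# After rounding, the first and last values coincide, which is the
# condition the periodic spline checks before warning.
data_rounded[1] == data_rounded[length(data_rounded)]

# Periodic spline on the rounded values.
fit <- spline(1:8, data_rounded, method = "periodic")
str(fit)
```

Rounding once, right after reading the data, keeps the rest of the analysis working with values that match the stated precision, rather than relying on how many digits happen to be printed.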