Code Monkey home page Code Monkey logo

Comments (5)

rsignell-usgs avatar rsignell-usgs commented on August 29, 2024 1

Will do @milankl !! Thanks for the great rounding examples also. I feel like I'm starting to understand this stuff! :)

from bitinformation.jl.

milankl avatar milankl commented on August 29, 2024

Do you know whether they rounded in decimal or in binary? Because while rounding in one base also quantizes in the other it doesn't round (as in make the trailing digits/bits shorter/simpler, the actual idea of rounding) in the other. Example:

julia> a = randn(Float32,5)
5-element Vector{Float32}:
 -0.35390654
  1.223104
 -0.04762434
 -1.5507344
  0.89877653

julia> b = round.(a,digits=3)
5-element Vector{Float32}:
 -0.354
  1.223
 -0.048
 -1.551
  0.899

julia> bitstring.(b,:split)
5-element Vector{String}:
 "1 01111101 01101010011111101111101"
 "0 01111111 00111001000101101000100"
 "1 01111010 10001001001101110100110"
 "1 01111111 10001101000011100101011"
 "0 01111110 11001100010010011011101"

While the mantissa bits aren't all zero after some mantissa bits, the resulting array may still be more compressible as the number of possible mantissa bitpatterns still have been greatly reduced.

But to actually answer your question, you can (roughly) translate between significant digits and bits via

nsb(nsd::Integer) = Integer(ceil(log(10)/log(2)*nsd))

which just arises from the idea that with $d$ digits you have a maximum absolute error of $\tfrac{10^{-d}}{2}$, and with $b$ bits you have a maximum absolute error of $\tfrac{2^{-b}}{2}$, equating the two and rounding up rather than down gives you the number of bits you need to have an error that isn't greater than the error from rounding in decimal. There are some caveats that @czender can elaborate more on.

In your case this means that if you already know that your dataset contains only $d$ digits of precision you can definitely round to

$d$ digits $b$ bits
1 4
2 7
3 10
4 14
5 17
6 20
7 24

without losing any information. And in fact, you probably want to because if there's real information in the mantissa bits past those it's artificial from the quantization.

from bitinformation.jl.

milankl avatar milankl commented on August 29, 2024

In fact, we already discussed that here nco/nco#250

from bitinformation.jl.

rsignell-usgs avatar rsignell-usgs commented on August 29, 2024

@pnorton-usgs and I checked, and these variables were processed with NCO using args like ppc ALBEDO:5, which specifies the Number of Significant Digits (as opposed to the Decimal Significant Digits).

So a few values look like this:

import struct

a = ds['ALBEDO'][100,500:510,600].values

a
array([0.2371893 , 0.22657919, 0.22525072, 0.21817589, 0.20056486,
       0.2204647 , 0.22222233, 0.22418547, 0.22438478, 0.22910786],
      dtype=float32)
def binary(num):
    return ''.join('{:0>8b}'.format(c) for c in struct.pack('!f', num))

for v in list(a):
    print(binary(v))
00111110011100101110000111000000
00111110011010000000010001100000
00111110011001101010100000100000
00111110010111110110100110000000
00111110010011010110000011100000
00111110011000011100000110000000
00111110011000111000111001000000
00111110011001011001000011100000
00111110011001011100010100100000
00111110011010101001101101000000

from bitinformation.jl.

milankl avatar milankl commented on August 29, 2024

Just checked what Charlie means by NSD vs DSD, but the former is a relative error vs the latter is an absolute error. Great that you used nco's ppc option, because then rounding is actually done in binary. Depending on the version of nco this should also do granular bitrounding (see here for more on this) meaning that keepbits is somewhat variable from value to value. I guess it's between 16-18 keepbits here, looking at trailing zeros but it's obv impossible to say for sure without knowing the full precision values.

That gives you an upper bound for the bitinformation, but I'd still just run it over and see what the bitinformation analysis says?

from bitinformation.jl.

Related Issues (17)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.