Comments (5)
Will do @milankl !! Thanks for the great rounding examples also. I feel like I'm starting to understand this stuff! :)
from bitinformation.jl.
Do you know whether they rounded in decimal or in binary? Rounding in one base also quantizes the values in the other base, but it doesn't round them there, in the sense of making the trailing digits/bits shorter or simpler, which is the actual idea of rounding. Example:
julia> a = randn(Float32,5)
5-element Vector{Float32}:
-0.35390654
1.223104
-0.04762434
-1.5507344
0.89877653
julia> b = round.(a,digits=3)
5-element Vector{Float32}:
-0.354
1.223
-0.048
-1.551
0.899
julia> bitstring.(b,:split)
5-element Vector{String}:
"1 01111101 01101010011111101111101"
"0 01111111 00111001000101101000100"
"1 01111010 10001001001101110100110"
"1 01111111 10001101000011100101011"
"0 01111110 11001100010010011011101"
While the trailing mantissa bits aren't all zero past some position, the resulting array may still be more compressible, as the number of possible mantissa bitpatterns has been greatly reduced.
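The same effect can be reproduced in Python with only the standard library. This is a minimal sketch, not part of BitInformation.jl: the helper names `bitstring_split` and `trailing_zero_bits` are made up for illustration, and the value -0.375 is chosen only because it is short in binary.

```python
import struct

def bitstring_split(x: float) -> str:
    """Sign, exponent and mantissa bits of x when stored as float32."""
    (as_int,) = struct.unpack("!I", struct.pack("!f", x))
    bits = f"{as_int:032b}"
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"

def trailing_zero_bits(x: float) -> int:
    """Number of zero bits at the end of the float32 mantissa of x."""
    mantissa = bitstring_split(x).split()[2]
    return len(mantissa) - len(mantissa.rstrip("0"))

# Rounded in decimal: short in base 10, but the mantissa is still "full".
decimal_rounded = round(-0.35390654, 3)  # -0.354
print(bitstring_split(decimal_rounded), trailing_zero_bits(decimal_rounded))

# A value that is short in binary (-1.5 * 2^-2) has long runs of zeros.
binary_short = -0.375
print(bitstring_split(binary_short), trailing_zero_bits(binary_short))
```

The first line printed should match the first entry of the Julia output above, with no trailing zero bits, while the binary-short value ends in a long run of zeros.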
But to actually answer your question, you can (roughly) translate between significant digits and bits via
nsb(nsd::Integer) = Integer(ceil(log(10)/log(2)*nsd))
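which just arises from the idea that nsd decimal digits distinguish 10^nsd values, and representing that many values in binary takes log2(10^nsd) = nsd*log(10)/log(2) bits, rounded up to the next integer.
In your case this means that if you already know that your dataset contains only nsd significant digits, you can round to the corresponding nsb significant bits
nsd | nsb |
---|---|
1 | 4 |
2 | 7 |
3 | 10 |
4 | 14 |
5 | 17 |
6 | 20 |
7 | 24 |
without losing any information. And in fact you probably want to, because any information found in the mantissa bits past those is an artifact of the quantization.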
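For readers working in Python, the one-liner above translates directly. This is only a sketch of the same rule of thumb; the function name `nsb` is kept from the Julia version, and the loop simply reproduces the table.

```python
import math

def nsb(nsd: int) -> int:
    """Approximate number of significant bits for nsd significant digits."""
    return math.ceil(math.log(10) / math.log(2) * nsd)

# Reproduce the nsd -> nsb table for 1 to 7 significant digits.
table = {nsd: nsb(nsd) for nsd in range(1, 8)}
for digits, bits in table.items():
    print(digits, bits)
```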
from bitinformation.jl.
In fact, we already discussed that here nco/nco#250
from bitinformation.jl.
@pnorton-usgs and I checked, and these variables were processed with NCO using args like ppc ALBEDO:5, which specifies the Number of Significant Digits (as opposed to the Decimal Significant Digits).
So a few values look like this:
import struct
a = ds['ALBEDO'][100,500:510,600].values
a
array([0.2371893 , 0.22657919, 0.22525072, 0.21817589, 0.20056486,
0.2204647 , 0.22222233, 0.22418547, 0.22438478, 0.22910786],
dtype=float32)
def binary(num):
    # pack as big-endian float32, then print each byte as 8 bits
    return ''.join('{:0>8b}'.format(c) for c in struct.pack('!f', num))
for v in list(a):
    print(binary(v))
00111110011100101110000111000000
00111110011010000000010001100000
00111110011001101010100000100000
00111110010111110110100110000000
00111110010011010110000011100000
00111110011000011100000110000000
00111110011000111000111001000000
00111110011001011001000011100000
00111110011001011100010100100000
00111110011010101001101101000000
from bitinformation.jl.
Just checked what Charlie means by NSD vs DSD: the former is a relative error, whereas the latter is an absolute error. Great that you used nco's ppc option, because then rounding is actually done in binary. Depending on the version of nco this should also do granular bitrounding (see here for more on this), meaning that keepbits varies somewhat from value to value. I guess it's between 16 and 18 keepbits here, judging by the trailing zeros, but it's obviously impossible to say for sure without knowing the full-precision values.
That gives you an upper bound for the bitinformation, but I'd still just run the bitinformation analysis over it and see what it says.
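The 16-18 estimate can be checked by counting trailing zeros in the bitstrings posted above. This is a rough sketch: the helper `last_set_mantissa_bit` is a hypothetical name, and the result is only the smallest keepbits consistent with each rounded value, since the original full-precision values are unknown.

```python
# float32 bitstrings copied from the ALBEDO values printed earlier in the thread
bitstrings = [
    "00111110011100101110000111000000",
    "00111110011010000000010001100000",
    "00111110011001101010100000100000",
    "00111110010111110110100110000000",
    "00111110010011010110000011100000",
    "00111110011000011100000110000000",
    "00111110011000111000111001000000",
    "00111110011001011001000011100000",
    "00111110011001011100010100100000",
    "00111110011010101001101101000000",
]

def last_set_mantissa_bit(bits: str) -> int:
    """Position (1-23) of the last nonzero mantissa bit of a float32 bitstring."""
    mantissa = bits[9:]  # skip 1 sign bit and 8 exponent bits
    return len(mantissa.rstrip("0"))

estimates = [last_set_mantissa_bit(b) for b in bitstrings]
print(estimates)
print(min(estimates), max(estimates))
```

For these ten values the estimates fall between 16 and 18, in line with the guess above.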
from bitinformation.jl.