Comments (3)
Hi @vmand4
Sorry for the late reply -- I was on holiday.
What is "n" here?
If "n" is the dimensionality of your data, then log(n) is proportional to the size of the state space and then the normalization makes some sense if you are comparing apples to oranges -- for example, a collection of binary vectors of length 3 with a collection of binary vectors of length 4. If your continuous data is sampled from bounded domains of varying size, then you could normalize by the entropies of the uniform distributions on those domains. If your continuous data is sampled from unbounded domains, the maximum entropy is infinity, and then such a normalization is impossible.
If "n" is the total number of samples, then this normalization does not make much sense to me. Entropies are computed from probabilities (or some proxy thereof). Your total probability mass should always sum to one, and hence in theory you should not need to normalize by some measure of the number of samples -- neither in the discrete, nor in the continuous case. This is not quite true for the continuous case, as the K-L estimator has a bias that is dependent on the total number of samples. However, that bias does not scale with 1/log(N). People have tried to estimate the bias empirically by subsampling their data, computing the entropy for successively smaller samples, and then extrapolating to infinity. If you have a lot of samples to begin with, that may work for you.
Hope that helps.
Best,
Paul
I am closing this as this is not really an issue with the code base. Feel free to re-open if you have any remaining questions.
Thank you, Paul, for sharing your insights. Very helpful.
Related Issues (13)
- continuous entropy with KNN HOT 21
- Support for computing entropy of a Tensor HOT 9
- Mutual information is greater than information entropy HOT 25
- Does "partial mutual information" here mean "conditional mutual information"?
- Does "partial mutual information" here mean "conditional mutual information"? HOT 2
- Multiplying the euclidian distance by 2. HOT 2
- Error occurred with "get_imin" HOT 1
- categorical values mutual info pls HOT 4
- Unexpected -inf entropy estimations HOT 6
- Transfer Entropy on Different Dimensions? HOT 2
- Process finished with exit code -1073741571 (0xC00000FD) HOT 14
- readme import numpy as np HOT 1