Comments (14)
I am skeptical that this issue has anything to do with my code. Per this list, the error code is a notification on Windows systems of a stack overflow. get_h does not compute the entropy recursively, so I can't see how it would cause a stack overflow. Can you provide a minimal, reproducible example that produces the error in a clean virtual environment (i.e. one containing only the dependencies of this module)?
Also, if you Google around, a lot of people seem to run into this error code when using PyCharm, especially in combination with Qt and/or TensorFlow (or packages built on top of TensorFlow, such as Keras). Are you using any of these programs/packages?
from entropy_estimators.
To be clear, I am not completely ruling out that my code is at fault; I am just saying that I need a lot more evidence to convince me.
Yes, I figured out why get_h doesn't work well. The function contains the line "sum_log_dist = np.sum(log(2*distances))". In my data, some samples have identical values, which causes some elements of distances to be zero. sum_log_dist then evaluates to -inf, and the subsequent code runs into an error.
So far I have no idea how to handle this situation, so I have decided to read the original paper. Could you give me some advice? Thanks!
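The failure mode described above can be reproduced in isolation. A minimal sketch (the distances array is hypothetical, not taken from the library):

```python
import numpy as np

# Hypothetical nearest-neighbour distances: two samples with identical
# coordinates produce a distance of exactly zero.
distances = np.array([0.0, 0.0, 0.5, 1.2])

# log(0) evaluates to -inf, which then poisons the entire sum.
with np.errstate(divide="ignore"):
    sum_log_dist = np.sum(np.log(2 * distances))

print(sum_log_dist)  # -inf
```

Any quantity computed downstream from sum_log_dist inherits the -inf.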
get_h has a min_dist parameter, which when set to a non-zero value should circumvent your issue (distances between points smaller than min_dist are capped at min_dist, such that points with the same coordinates are forced to have non-zero distances to each other). A principled choice for min_dist is half of your measurement precision, typically the minimum non-zero nearest-neighbour distance in your dataset.
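The capping idea can be sketched as follows (a toy illustration with a hypothetical distances array, not the library's actual implementation):

```python
import numpy as np

# Hypothetical nearest-neighbour distances; duplicates yield zeros.
distances = np.array([0.0, 0.0, 0.5, 1.2])

# Principled choice per the advice above: half of the minimum
# non-zero distance (a proxy for the measurement precision).
min_dist = 0.5 * distances[distances > 0].min()  # 0.25

# Cap small distances so duplicate points no longer contribute log(0).
clamped = np.maximum(distances, min_dist)
sum_log_dist = np.sum(np.log(2 * clamped))  # finite now
```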
Thanks for your guidance. While testing get_h and get_h_mvn on my data for feature selection, I found that get_h_mvn works ideally: the calculated entropy values are consistent with intuitive observation of the feature data. In particular, one feature is in fact discrete, and the entropy calculated by get_h_mvn is close to the entropy given by the standard information entropy equation for a discrete variable. However, get_h performs awfully. First, the entropy values it calculates are counterintuitive; I also ranked the features by entropy, and the rankings from get_h and get_h_mvn differ greatly. Second, for one feature whose values are composed of {0.0: 7950, 0.0003636: 1, 0.0263157: 1}, get_h_mvn gets stuck at the line "kdtree = cKDTree(x)"; Python stops and prints "Process finished with exit code -1073741571 (0xC00000FD)".
I'm working on feature selection for my project. In my situation the features are used for clustering, and there is little labeled data. I have tried information entropy and the Laplacian score as feature filters. Do you have experience with feature selection for this scenario?
> entropy calculated from get_h_mvn is close to the entropy from standard information entropy equation for discrete variable

That could be entirely accidental.

> while runing for this feature, get_h_mvn is stuck at this line "kdtree = cKDTree(x)", the python stop and print "Process finished with exit code -1073741571 (0xC00000FD)"
There is no call to cKDTree in get_h_mvn. It uses the (co-)variance to compute the entropy under the assumption that the samples are drawn from a multivariate normal distribution. Are you sure you are calling get_h_mvn?
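For reference, the entropy of a multivariate normal has a closed form in terms of the covariance alone, which is why no k-d tree is needed. A sketch of the standard Gaussian differential entropy formula (not necessarily the library's exact code path; the data here is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((10_000, 2))  # samples assumed ~ N(0, I)

# Differential entropy of a d-dimensional Gaussian:
#   H = 0.5 * log((2*pi*e)^d * det(Sigma))
d = x.shape[1]
cov = np.cov(x, rowvar=False)
h = 0.5 * np.log((2 * np.pi * np.e) ** d * np.linalg.det(cov))

# For N(0, I_2) the true value is log(2*pi*e), about 2.84;
# h estimated from the sample covariance should be close.
```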
Oh, no, get_h_mvn works well; it is get_h that gets stuck at cKDTree.
Ok. How many samples are in your dataset and what values are you using for k and min_dist?
I used the default k and selected min_dist as you advised, i.e. the minimum non-zero distance. In fact, these have nothing to do with the aforementioned error. You can run the following code with Python 3.6 and the error will reappear:

```python
import numpy as np
from scipy.spatial import cKDTree

x = [[i] for i in [0] * 7950 + [0.0003636, 0.0263157]]
tree = cKDTree(np.array(x))
```

Process finished with exit code -1073741571 (0xC00000FD)
If there are both continuous and discrete features in my feature set, I need to rank their information entropies for feature filtering. Is it justified to treat all features as continuous variables and evaluate the entropy using the get_h_mvn() function? Or should only the continuous features' entropies be computed with get_h_mvn(), while the discrete ones are calculated with the Shannon entropy equation, and then all the entropies ranked together? Looking forward to your guidance, thanks!
I can't reproduce your error. If I were you, I would investigate your setup and possibly file a bug report on scipy.
```
In [1]: %paste
from scipy.spatial import cKDTree
x=[[i] for i in [0]*7950+[ 0.0003636,0.0263157]]
tree=cKDTree(np.array(x))
## -- End pasted text --

In [2]: tree
Out[2]: <scipy.spatial.ckdtree.cKDTree at 0x7f427495f4a8>
```
Entropy is an extensive property. So no, I don't think you can compare entropy values for discrete variables with entropy values for continuous variables. Even among your continuous features, such a comparison may be nonsensical.
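A quick illustration of why the two quantities are not on the same scale (my own example, not from the library): differential entropy can be negative, while discrete Shannon entropy never is.

```python
import numpy as np

# Differential entropy of Uniform(0, w) is log(w): negative for w < 1.
w = 0.1
h_continuous = np.log(w)  # about -2.30 nats

# Shannon entropy of a fair coin: always non-negative.
p = np.array([0.5, 0.5])
h_discrete = -np.sum(p * np.log(p))  # log(2), about 0.69 nats
```

Ranking these two numbers against each other says nothing meaningful about which variable is "more informative".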
Since I haven't heard from you for a week, I will close this issue for now. Feel free to re-open if necessary.