Hi, I don't know the meaning of the parameter 'w_threshold' in "from_pandas", becaus...

Question about w_threshold about causalnex HOT 8 CLOSED

1021808202 commented on June 4, 2024

Question about w_threshold

from causalnex.

Comments (8)

qbphilip commented on June 4, 2024

Hello,

Even with strong L1 regularisation, the output of from_pandas or from_pandas_lasso will never be exactly sparse. There will always be values that are small but different from 0.

w_threshold removes all edges with absolute weights below the value.
If you want to automate the selection, you could use cross-validation to pick the "best-performing" threshold.
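As a rough illustration of what the threshold does, the sketch below applies the same rule to a plain dictionary of edge weights. The edge names and weights are made up, and apply_threshold is a hypothetical helper, not a CausalNex function; it only mirrors the behaviour described above (edges whose absolute weight is below the value are dropped).

```python
def apply_threshold(weights, w_threshold):
    """Keep only edges whose absolute weight is at least w_threshold."""
    return {edge: w for edge, w in weights.items() if abs(w) >= w_threshold}


# Made-up weights: a dense result with a few tiny, non-zero entries.
weights = {
    ("a", "b"): 0.9,
    ("b", "c"): -0.6,   # negative weights are compared by absolute value
    ("c", "a"): 0.02,
    ("a", "c"): 0.001,
}

kept = apply_threshold(weights, w_threshold=0.5)
print(sorted(kept))
```

Only the two strong edges survive here; the near-zero ones are the kind of residue that L1 regularisation alone does not remove.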

The least aggressive approach is to use the smallest threshold that yields a valid DAG (as required by the acyclicity constraint of NOTEARS):

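import networkx as nx
from causalnex.structure.notears import from_pandas

# df is your pandas DataFrame; w_threshold=0 keeps every learned edge
sm = from_pandas(df, w_threshold=0)
thresh = 0
step = 0.01
# raise the threshold step by step until the remaining graph is acyclic
while not nx.algorithms.is_directed_acyclic_graph(sm):
    sm.remove_edges_below_threshold(thresh)
    thresh += step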

A more efficient variant loops over the actual edge weights instead of fixed steps:

sm = from_pandas(df, w_threshold=0)
# weights can be negative, so compare absolute values
all_weights = [abs(w) for _, _, w in sm.edges(data='weight')]
sorted_weights = sorted(all_weights)
for thresh in sorted_weights:
    if nx.algorithms.is_directed_acyclic_graph(sm):
        break
    sm.remove_edges_below_threshold(thresh)
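For readers who want to try the idea without fitting a model first, here is a minimal standard-library sketch of the same search on a toy edge dictionary. It is not CausalNex code: is_dag and smallest_dag_threshold are hypothetical helpers, the weights are invented, and graphlib requires Python 3.9 or later. Edges are kept when their absolute weight is at least the threshold.

```python
from graphlib import CycleError, TopologicalSorter


def is_dag(edges):
    """Return True if the directed edges (u, v) contain no cycle."""
    sorter = TopologicalSorter()
    for u, v in edges:
        sorter.add(v, u)  # u must come before v
    try:
        sorter.prepare()  # raises CycleError if a cycle exists
    except CycleError:
        return False
    return True


def smallest_dag_threshold(weights):
    """Prune by increasing threshold until the graph is acyclic."""
    kept = dict(weights)
    thresh = 0.0
    # Use the observed absolute weights as candidate thresholds, smallest first.
    for candidate in sorted({abs(w) for w in weights.values()}):
        if is_dag(kept):
            break
        thresh = candidate
        kept = {e: w for e, w in kept.items() if abs(w) >= thresh}
    return thresh, kept


# Invented weights containing the cycle a -> b -> c -> a.
weights = {("a", "b"): 0.9, ("b", "c"): -0.6, ("c", "a"): 0.02, ("a", "c"): 0.3}
thresh, kept = smallest_dag_threshold(weights)
print(thresh, sorted(kept))
```

In this toy case the weak edge c -> a is the only one removed, which is what "least aggressive" means here: no edge is pruned beyond what is needed to break the cycle.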


1021808202 commented on June 4, 2024

Thank you for your careful explanation, I have benefited a lot from it.


1021808202 commented on June 4, 2024

By the way, could you please show me how to use cross-validation to pick the "best-performing" threshold rather than the "worst-performing" threshold. I believe that many people would like to see it in the docs.
Thanks again.


1021808202 commented on June 4, 2024

As you said, we can get a DAG with a minimal w_threshold, but I need a better StructureModel. For example, the first CausalNex tutorial in the docs, 'whether a student will pass or fail an exam', sets w_threshold to 0.8. So I want to know which value of w_threshold is good for my dataset.


SteveLerQB commented on June 4, 2024

Hi @1021808202,

I suggest treating w_threshold as a hyperparameter and using tools like hyperopt, along with a specified range of w_threshold values, to find the best w_threshold to use. Thanks 🙂


ziyuwzf commented on June 4, 2024

As you said, we can get a DAG with a minimal w_threshold, but I need a better StructureModel. For example, the first CausalNex tutorial in the docs, 'whether a student will pass or fail an exam', sets w_threshold to 0.8. So I want to know which value of w_threshold is good for my dataset.

I have the same problem.


ziyuwzf commented on June 4, 2024

Hi @1021808202,

I suggest treating w_threshold as a hyperparameter and using tools like hyperopt, along with a specified range of w_threshold values, to find the best w_threshold to use. Thanks

I have the same problem:
As you said, we can get a DAG with a minimal w_threshold, but I need a better StructureModel. For example, the first CausalNex tutorial in the docs, 'whether a student will pass or fail an exam', sets w_threshold to 0.8. So I want to know which value of w_threshold is good for my dataset.


oentaryorj commented on June 4, 2024

The so-called "good" or "correct" graph structure should be validated against domain knowledge. In this case, you may want to define a structure quality metric based on what you know about the data/domain, and perform a grid search (or use hyperopt as above) over w_threshold until you find a structure that optimises this metric. One way to implement this is to use our DAGRegressor or DAGClassifier interface together with scikit-learn's GridSearchCV, providing your own custom scoring function.

Integrating GridSearchCV or hyperopt into CausalNex would be beyond the scope of this project, however. As such, I propose we close this issue for now. Nevertheless, feel free to raise a new issue if you still have difficulties in tweaking w_threshold.
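The grid-search idea above can be sketched in plain Python, under loud assumptions: the metric, the edge names, the weights and both helper functions are hypothetical, and the "domain knowledge" is reduced to a set of expected edges and a set of forbidden ones. With real data the weights would come from the learned structure, and the score could instead be a cross-validated predictive metric.

```python
def structure_score(edges, expected, forbidden):
    """Hypothetical metric: reward expected edges, penalise forbidden ones."""
    return len(edges & expected) - len(edges & forbidden)


def grid_search_threshold(weights, candidates, expected, forbidden):
    """Return (threshold, score) for the best-scoring candidate threshold."""
    best_thresh, best_score = None, float("-inf")
    for thresh in candidates:
        # Same pruning rule as w_threshold: keep edges with |w| >= thresh.
        edges = {e for e, w in weights.items() if abs(w) >= thresh}
        score = structure_score(edges, expected, forbidden)
        # Strict '>' means ties keep the earlier candidate, i.e. the smaller
        # threshold when candidates are given in ascending order.
        if score > best_score:
            best_thresh, best_score = thresh, score
    return best_thresh, best_score


# Invented weights and domain knowledge, for illustration only.
weights = {
    ("study", "grade"): 0.85,
    ("sleep", "grade"): 0.4,
    ("grade", "study"): 0.05,
    ("age", "sleep"): 0.1,
}
expected = {("study", "grade"), ("sleep", "grade")}
forbidden = {("grade", "study")}

best_thresh, best_score = grid_search_threshold(
    weights, [0.0, 0.1, 0.3, 0.5, 0.8], expected, forbidden
)
print(best_thresh, best_score)
```

The point of the sketch is that the "best" threshold depends entirely on the metric you supply: too low a value keeps the forbidden edge, too high a value drops an expected one.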
