
Comments (8)

qbphilip commented on June 4, 2024

Hello,

Even with strong L1 regularisation, the output of from_pandas or from_pandas_lasso will never be exactly sparse: there will always be weights that are small but non-zero.

w_threshold removes all edges with absolute weights below the value.
If you want to automate the selection, you could use cross-validation to pick the "best-performing" threshold.

The least aggressive approach is to use the smallest threshold that yields a valid DAG (as required by the NOTEARS constraint):

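import networkx as nx
from causalnex.structure.notears import from_pandas

sm = from_pandas(df, w_threshold=0)  # keep every learned edge
thresh = 0.0
step = 0.01
# Raise the threshold until the remaining edges form a DAG;
# thresh ends as the value that was last applied.
while not nx.algorithms.is_directed_acyclic_graph(sm):
    thresh += step
    sm.remove_edges_below_threshold(thresh)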

A more efficient version loops over the actual edge weights instead of fixed steps:

sm = from_pandas(df, w_threshold=0)
# Candidate thresholds: the absolute edge weights, smallest first
sorted_weights = sorted(abs(w) for _, _, w in sm.edges(data='weight'))
for thresh in sorted_weights:
    # Stop as soon as the remaining edges form a DAG
    if nx.algorithms.is_directed_acyclic_graph(sm):
        break
    sm.remove_edges_below_threshold(thresh)

from causalnex.

1021808202 commented on June 4, 2024

Thank you for the careful explanation; I have benefited a lot from it.

1021808202 commented on June 4, 2024

By the way, could you please show how to use cross-validation to pick the "best-performing" threshold rather than the "worst-performing" one? I believe many people would like to see this in the docs.
Thanks again.

1021808202 commented on June 4, 2024

As you said, we can get a DAG with a minimal w_threshold, but I need a better StructureModel. For example, the first CausalNex tutorial in the docs ('whether a student will pass or fail an exam') sets w_threshold to 0.8. So I want to know which value of w_threshold is good for my dataset.

SteveLerQB commented on June 4, 2024

Hi @1021808202,

I suggest treating w_threshold as a hyperparameter and using a tool like hyperopt, with a specified range of w_threshold values, to find the best one. Thanks 🙂

ziyuwzf commented on June 4, 2024

"As you said, we can get a DAG with a minimal w_threshold, but I need a better StructureModel. For example, the first CausalNex tutorial in the docs ('whether a student will pass or fail an exam') sets w_threshold to 0.8. So I want to know which value of w_threshold is good for my dataset."

I have the same problem.

ziyuwzf commented on June 4, 2024

"Hi @1021808202,

I suggest treating w_threshold as a hyperparameter and using a tool like hyperopt, with a specified range of w_threshold values, to find the best one. Thanks"

I have the same problem:
"As you said, we can get a DAG with a minimal w_threshold, but I need a better StructureModel. For example, the first CausalNex tutorial in the docs ('whether a student will pass or fail an exam') sets w_threshold to 0.8. So I want to know which value of w_threshold is good for my dataset."

oentaryorj commented on June 4, 2024

The so-called "good" or "correct" graph structure should be validated against domain knowledge. In this case, you may want to define a structure quality metric based on what you know about the data/domain, and perform a grid search (or use hyperopt as above) over w_threshold until you find a structure that optimises this metric. One way to implement this is to use our DAGRegressor or DAGClassifier interface together with scikit-learn's GridSearchCV and your own custom scoring function.
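To make the grid-search idea concrete, here is a minimal sketch in plain Python that does not call CausalNex at all. It assumes you have already extracted the learned edge weights into a dict, and that a domain expert can list some edges they expect to see. The names prune, is_dag, f1_score and best_threshold, the toy weights, and the choice of F1 as the quality metric are all hypothetical illustrations, not part of the CausalNex API.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Edge = Tuple[str, str]


def prune(weights: Dict[Edge, float], threshold: float) -> Set[Edge]:
    """Keep edges whose absolute weight is at least `threshold`."""
    return {edge for edge, w in weights.items() if abs(w) >= threshold}


def is_dag(edges: Set[Edge]) -> bool:
    """Kahn's algorithm: acyclic iff every node can be removed in turn."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    for _, target in edges:
        indegree[target] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for source, target in edges:
            if source == node:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    return removed == len(nodes)


def f1_score(edges: Set[Edge], expected: Set[Edge]) -> float:
    """Hypothetical quality metric: F1 against the expert's expected edges."""
    hits = len(edges & expected)
    if hits == 0:
        return 0.0
    precision = hits / len(edges)
    recall = hits / len(expected)
    return 2 * precision * recall / (precision + recall)


def best_threshold(
    weights: Dict[Edge, float], expected: Set[Edge], grid: Iterable[float]
) -> Optional[float]:
    """Return the grid value whose pruned graph is a DAG with the highest F1."""
    best, best_score = None, -1.0
    for threshold in grid:
        edges = prune(weights, threshold)
        if not is_dag(edges):
            continue  # only valid DAGs are candidates
        score = f1_score(edges, expected)
        if score > best_score:  # ties keep the earlier grid value
            best, best_score = threshold, score
    return best


# Toy data (made up for illustration): learned weights and expected edges.
weights = {
    ("a", "b"): 0.9,
    ("b", "c"): 0.7,
    ("a", "c"): 0.3,
    ("c", "a"): 0.05,
    ("b", "a"): -0.02,
}
expected = {("a", "b"), ("b", "c")}
chosen = best_threshold(weights, expected, [0.0, 0.1, 0.5, 0.8])
print(chosen)
```

In this sketch a threshold that leaves a cycle is never selected, and ties go to the earlier grid value, so an ascending grid favours the less aggressive threshold. The F1 metric is only one option; any score that encodes what you know about the domain could be swapped in.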

Integrating GridSearchCV or hyperopt into CausalNex would be beyond the scope of this project, however. As such, I propose we close this issue for now. Nevertheless, feel free to raise a new issue if you still have difficulties in tweaking w_threshold.
