
Comments (2)

isaksamsten commented on September 28, 2024

Hi!

Thanks for your question!

You can get the leaf node index using the apply method of ShapeletTreeClassifier (each estimator in a fitted ShapeletForestClassifier is one).

For example:

import numpy as np
from wildboar.ensemble import ShapeletForestClassifier
from wildboar.datasets import load_gun_point

X, y = load_gun_point()
f = ShapeletForestClassifier()
f.fit(X, y)

# Get the index of the leaf node where each sample ends up,
# e.g. leaves[0] contains the leaf node index for the first sample
leaves = f.estimators_[0].apply(X)

# count which leaf has the most samples in it
np.unique(leaves, return_counts=True)
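
To go from the counts to the single most populated leaf, one option is to combine the two arrays returned by np.unique with np.argmax. This is only a sketch: the small hand-made `leaves` array stands in for the output of apply, so the numbers are illustrative and do not come from the gun point data.

```python
import numpy as np

# Hypothetical stand-in for `f.estimators_[0].apply(X)`:
# one leaf node index per sample.
leaves = np.array([3, 5, 3, 7, 3, 5])

# Unique leaf indices and the number of samples that landed in each.
leaf_ids, counts = np.unique(leaves, return_counts=True)

# The leaf index with the highest sample count.
largest_leaf = leaf_ids[np.argmax(counts)]
print(largest_leaf, counts.max())  # 3 3
```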

You can also request it for the forest:

leaves = f.apply(X)
leaves[0, 0] # the index of the leaf of the first sample in the first tree
leaves[:, 0] # the index of the leaves for all samples in the first tree

While writing this, I realise this might not be what you are asking for? Are you asking for information about the majority class in each leaf?

You can get the probability distribution over labels in a tree from the tree_.value property:

f.estimators_[0].tree_.value

You'll note that some of these values are nan, inf, etc. Those values belong to branch nodes, so to get the values for leaf nodes only, we have to figure out which node indices refer to leaves. We can do that using tree_.left (or tree_.right): an entry of -1 indicates that the index refers to a leaf.

tree = f.estimators_[0]
leaf_proba = tree.tree_.value[tree.tree_.left == -1]

This will give us a matrix of shape (n_leaf_nodes, n_labels) with the distribution of labels that ended up in each leaf during training.

To get the class label that a leaf will assign, we take the entry of classes_ at the index of the maximal column of the value array:

leaf_nodes = tree.tree_.left == -1
labels = f.classes_.take(np.argmax(tree.tree_.value[leaf_nodes], axis=1))

We can use zip to get the node_index and the assigned label:

list(zip(np.nonzero(leaf_nodes)[0], labels))

This would return something like:

[(3, 2.0),
 (5, 2.0),
 (7, 1.0),
 (8, 2.0),
 (10, 1.0),
 (12, 2.0),
 (17, 2.0),
 (19, 2.0),
 (20, 1.0),
 (21, 2.0),
 (22, 1.0),
 (25, 1.0),
 (27, 2.0),
 (30, 1.0),
 (31, 2.0),
 (32, 2.0),
 (34, 1.0),
 (37, 1.0),
 (38, 2.0),
 (39, 2.0),
 (41, 2.0),
 (43, 1.0),
 (44, 2.0)]
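
The same steps can be traced on hand-made arrays, which makes it easier to see what each one does. In this sketch `left`, `value` and `classes` are hypothetical stand-ins for tree.tree_.left, tree.tree_.value and f.classes_; the tree shape and the numbers are made up for illustration.

```python
import numpy as np

# Hypothetical 5-node tree: nodes 0 and 1 are branches, nodes 2, 3 and 4 are leaves.
# A value of -1 in `left` marks a leaf, as described above.
left = np.array([1, 3, -1, -1, -1])

# One row per node; branch rows hold no usable distribution.
value = np.array([
    [np.nan, np.nan],  # branch node 0
    [np.nan, np.nan],  # branch node 1
    [0.9, 0.1],        # leaf node 2
    [0.2, 0.8],        # leaf node 3
    [0.6, 0.4],        # leaf node 4
])
classes = np.array([1.0, 2.0])

# Boolean mask selecting the leaf rows.
leaf_nodes = left == -1

# Most probable class per leaf, mapped back to the class labels.
labels = classes.take(np.argmax(value[leaf_nodes], axis=1))

# Pair each leaf node index with its assigned label.
result = list(zip(np.nonzero(leaf_nodes)[0].tolist(), labels.tolist()))
print(result)  # [(2, 1.0), (3, 2.0), (4, 1.0)]
```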

Hope this helps!

I will close the issue (since it's not an issue :))

We can continue the discussion under "Discussions"

Cheers,
Isak


lorebon commented on September 28, 2024

Thank you very much!
