Code Monkey home page Code Monkey logo

scikit-multilearn

PyPI version License

scikit-multilearn is a Python module capable of performing multi-label learning tasks. It is built on-top of various scientific Python packages (numpy, scipy) and follows a similar API to that of scikit-learn.

Features

  • Native Python implementation. A native Python implementation for a variety of multi-label classification algorithms. To see the list of all supported classifiers, check this link.

  • Interface to Meka. A Meka wrapper class is implemented for reference purposes and integration. This provides access to all methods available in MEKA, MULAN, and WEKA — the reference standard in the field.

  • Builds upon giants! Team-up with the power of numpy and scikit. You can use scikit-learn's base classifiers as scikit-multilearn's classifiers. In addition, the two packages follow a similar API.

Installation & Dependencies

To install scikit-multilearn, simply type the following command:

$ pip install scikit-multilearn

This will install the latest release from the Python package index. If you wish to install the bleeding-edge version, then clone this repository and run setup.py:

$ git clone https://github.com/scikit-multilearn/scikit-multilearn.git
$ cd scikit-multilearn
$ python setup.py

In most cases requirements are installed when you install using pip install scikit-multilearn or run python setup.py install. There are also optional dependencies pip install scikit-multilearn[gpl,keras,meka] installs the GPL-incurring igraph for for igraph library based clusterers, keras for the keras classifiers and requirements for the meka bridge respectively.

To install openNE, run:

pip install 'openne @ git+https://github.com/thunlp/OpenNE.git@master#subdirectory=src'

Note that installing the GPL licensed graphtool, for graphtool based clusters, is complicated, and must be done manually, please see: graphtool install instructions

Basic Usage

Before proceeding to classification, this library assumes that you have a dataset with the following matrices:

  • x_train, x_test: training and test feature matrices of size (n_samples, n_features)
  • y_train, y_test: training and test label matrices of size (n_samples, n_labels)

Suppose we wanted to use a problem-transformation method called Binary Relevance, which treats each label as a separate single-label classification problem, to a Support-vector machine (SVM) classifier, we simply perform the following tasks:

# Import BinaryRelevance from skmultilearn
from skmultilearn.problem_transform import BinaryRelevance

# Import SVC classifier from sklearn
from sklearn.svm import SVC

# Setup the classifier
classifier = BinaryRelevance(classifier=SVC(), require_dense=[False,True])

# Train
classifier.fit(X_train, y_train)

# Predict
y_pred = classifier.predict(X_test)

More examples and use-cases can be seen in the documentation. For using the MEKA wrapper, check this link.

Contributing

This project is open for contributions. Here are some of the ways for you to contribute:

  • Bug reports/fix
  • Features requests
  • Use-case demonstrations
  • Documentation updates

In case you want to implement your own multi-label classifier, please read our Developer's Guide to help you integrate your implementation in our API.

To make a contribution, just fork this repository, push the changes in your fork, open up an issue, and make a Pull Request!

We're also available in Slack! Just go to our slack group.

Cite

If you used scikit-multilearn in your research or project, please cite our work:

@ARTICLE{2017arXiv170201460S,
   author = {{Szyma{\'n}ski}, P. and {Kajdanowicz}, T.},
   title = "{A scikit-based Python environment for performing multi-label classification}",
   journal = {ArXiv e-prints},
   archivePrefix = "arXiv",
   eprint = {1702.01460},
   year = 2017,
   month = feb
}

scikit-multilearn's Projects

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.