Code Monkey home page Code Monkey logo

xddun / nmslib Goto Github PK

View Code? Open in Web Editor NEW

This project forked from nmslib/nmslib

0.0 0.0 0.0 96.87 MB

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

License: Apache License 2.0

Shell 2.80% C++ 64.22% Python 10.22% Perl 5.65% C 0.51% Java 0.42% R 0.04% TeX 10.09% Makefile 0.59% Thrift 0.14% CMake 0.63% Batchfile 0.60% Jupyter Notebook 3.51% Roff 0.58%

nmslib's Introduction

Pypi version Downloads Downloads Build Status Windows Build Status Join the chat at https://gitter.im/nmslib/Lobby

Non-Metric Space Library (NMSLIB)

Important Notes

  • NMSLIB is generic but fast, see the results of ANN benchmarks.
  • A standalone implementation of our fastest method HNSW also exists as a header-only library.
  • All the documentation (including using Python bindings and the query server, description of methods and spaces, building the library, etc) can be found on this page.
  • For generic questions/inquiries, please, use the Gitter chat: GitHub issues page is for bugs and feature requests.

Objectives

Non-Metric Space Library (NMSLIB) is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The core-library does not have any third-party dependencies. It has been gaining popularity recently. In particular, it has become a part of Amazon Elasticsearch Service.

The goal of the project is to create an effective and comprehensive toolkit for searching in generic and non-metric spaces. Even though the library contains a variety of metric-space access methods, our main focus is on generic and approximate search methods, in particular, on methods for non-metric spaces. NMSLIB is possibly the first library with a principled support for non-metric space searching.

NMSLIB is an extendible library, which means that is possible to add new search methods and distance functions. NMSLIB can be used directly in C++ and Python (via Python bindings). In addition, it is also possible to build a query server, which can be used from Java (or other languages supported by Apache Thrift (version 0.12). Java has a native client, i.e., it works on many platforms without requiring a C++ library to be installed.

Authors: Bilegsaikhan Naidan, Leonid Boytsov, Yury Malkov, David Novak. With contributions from Ben Frederickson, Lawrence Cayton, Wei Dong, Avrelin Nikita, Dmitry Yashunin, Bob Poekert, @orgoro, @gregfriedland, Scott Gigante, Maxim Andreev, Daniel Lemire, Nathan Kurz, Alexander Ponomarenko.

Brief History

NMSLIB started as a personal project of Bilegsaikhan Naidan, who created the initial code base, the Python bindings, and participated in earlier evaluations. The most successful class of methods--neighborhood/proximity graphs--is represented by the Hierarchical Navigable Small World Graph (HNSW) due to Malkov and Yashunin (see the publications below). Other most useful methods, include a modification of the VP-tree due to Boytsov and Naidan (2013), a Neighborhood APProximation index (NAPP) proposed by Tellez et al. (2013) and improved by David Novak, as well as a vanilla uncompressed inverted file.

Credits and Citing

If you find this library useful, feel free to cite our SISAP paper [BibTex] as well as other papers listed in the end. One crucial contribution to cite is the fast Hierarchical Navigable World graph (HNSW) method [BibTex]. Please, also check out the stand-alone HNSW implementation by Yury Malkov, which is released as a header-only HNSWLib library.

License

The code is released under the Apache License Version 2.0 http://www.apache.org/licenses/. Older versions of the library include additional components, which have different licenses (but this does not apply to NMLISB 2.x):

Older versions of the library included the following components:

  • The LSHKIT, which is embedded in our library, is distributed under the GNU General Public License, see http://www.gnu.org/licenses/.
  • The k-NN graph construction algorithm NN-Descent due to Dong et al. 2011 (see the links below), which is also embedded in our library, seems to be covered by a free-to-use license, similar to Apache 2.
  • FALCONN library's licence is MIT.

Funding

Leonid Boytsov was supported by the Open Advancement of Question Answering Systems (OAQA) group and the following NSF grant #1618159: "Matching and Ranking via Proximity Graphs: Applications to Question Answering and Beyond". Bileg was supported by the iAd Center.

Related Publications

Most important related papers are listed below in the chronological order:

nmslib's People

Contributors

searchivarius avatar benfred avatar jmazanec15 avatar gregf-atomwise avatar yurymalkov avatar bileg avatar scottgigante avatar sjhewitt avatar bejvisek avatar orgoro avatar jvkersch avatar bobpoekert avatar andrusha97 avatar amoussawi avatar 8w9ag avatar jjjamie avatar pabs3 avatar janaknat avatar huonw avatar bhavaygg avatar brendanchambers avatar cadovvl avatar doug-friedman avatar geofft avatar gokceneraslan avatar jschmitz28 avatar deneutoy avatar mdeff avatar neggert avatar gitter-badger avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.