Code Monkey home page Code Monkey logo

lcsvec's Introduction

Python 3.8 GitHub CI GitHub license Downloads Code style

LCSvec

Longest Common Subsequence/Substring (LCS) solving package for vectors.

Why LCSvec

While looking for fast implementations solving the Longest Common Subsequence and Longest Common Substring (i.e. Contiguous Subsequence) (LCCS) problems, I only found pieces of code for strings, while I needed something working for vectors of integers. Yet, string LCS implementations 1) work at the character level, which might not be suitable for certain use-cases where one wants to work at word or sentence level; 2) only work with strings, other modalities will not work and might not be designed to be converted to bytes.

LCSvec aims to solve this gap by providing user-friendly and fast implementations of the LCS and LCCS problems for vectors. It works with numpy, pytorch, tensorflow and jax arrays/tensors! The code is written in C++ and the methods are bind with Python with nanobind for optimal performances.

Example

You can install the package with pip: pip install lcsvec. Here we will use numpy, the usage for other deep learning libraries is the same.

from lcsvec import lccs, lccs_length, lcs, lcs_length
import numpy as np

seq1 = np.arange(0, 12)
seq2 = np.array([8, 0, 1, 2, 8, 2, 3, 4, 5, 6], dtype=np.int64)

lcs_ = lcs(seq1, seq2)  # [0, 1, 2, 3, 4, 5, 6]
lcs_len = lcs_length(seq1, seq2)  # 7, more efficient than calling len(lcs(seq1, seq2))

lccs = lccs(seq1, seq2)  # [2, 3, 4, 5, 6]
lccs_len = lccs_length(seq1, seq2)  # 5, more efficient than calling len(lccs(seq1, seq2))

TODOs

  • implement lccs with suffix tree;
  • batch methods, i.e. supporting 2D arrays;
  • batch methods with any number of dimensions (nD array) and add a dim argument;
  • make it work with an unlimited number of sequences, and set dim and pad_token as kwargs only;

lcsvec's People

Contributors

natooz avatar dependabot[bot] avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.