Code Monkey home page Code Monkey logo

cpp-fstlib's Introduction

cpp-fstlib

C++11 header-only FST (finite state transducer) library. We can use it as Trie data structure. This library uses the algorithm "Minimal Acyclic Subsequential Transducers".

Usage

  1. Prepare entries sorted by key string
  2. Build a FST database
  3. Try 'exact match search' or 'common prefix search'

NOTE: Key type should be std::string. Value type can be either uint32_t or std::string. If value is not provided, value type will be uint32_t and the value starts with 0.

Exact match search

std::vector<std::pair<std::string, uint32_t>> input = {
  { "apr", 30 },
  { "aug", 31 },
  { "dec", 31 },
  { "feb", 28 },
  { "feb", 29 },
  { "jan", 31 },
  { "jul", 31 },
  { "jun", 30 },
};

std::vector<char> t = fst::build(input);

assert(fst::exact_match_search<uint32_t>(t.data(), t.size(), "apr")[0] == 30);
assert(fst::exact_match_search<uint32_t>(t.data(), t.size(), "ap").empty());
assert(fst::exact_match_search<uint32_t>(t.data(), t.size(), "apr_").empty());
assert(fst::exact_match_search<uint32_t>(t.data(), t.size(), "feb")[0] == 28);
assert(fst::exact_match_search<uint32_t>(t.data(), t.size(), "feb")[1] == 29);

Common prefix search

auto t = fst::build<std::string>([](auto add_entry) {
  add_entry("a",       "one");
  add_entry("and",     "two");
  add_entry("android", "three");
});

auto ret = fst::common_prefix_search<std::string>(t.data(), t.size(), "android phone");

assert(ret[0].length == 1);
assert(ret[0].outputs[0] == "one");

assert(ret[1].length == 3);
assert(ret[1].outputs[0] == "two");

assert(ret[2].length == 7);
assert(ret[2].outputs[0] == "three");

Auto Index Dictionary

auto t = fst::build({
  "a",
  "and",
  "android",
});

auto ret = fst::common_prefix_search<uint32_t>(t.data(), t.size(), "android phone");

assert(ret[0].length == 1);
assert(ret[0].outputs[0] == 0);

assert(ret[1].length == 3);
assert(ret[1].outputs[0] == 1);

assert(ret[2].length == 7);
assert(ret[2].outputs[0] == 2);

API

build

std::vector<char> build(
  const std::vector<std::string>& input)

std::vector<char> build(
  const std::vector<std::pair<std::string, std::string>>& input)

std::vector<char> build(
  std::function<void (
    std::function<void (const std::string& str)> add_entry)> input);

std::vector<char> build(
  std::function<void (
    std::function<void (const std::string& str, const std::string& value)> add_entry)> input);

exact_match_search

std::vector<std::string> exact_match_search(
  const char* byte_code,
  size_t      byte_code_size,
  const char* str)

bool exact_match_search(
  const char*  byte_code,
  size_t       byte_code_size,
  const char*  str,
  std::string& output)

common_prefix_search

struct CommonPrefixSearchResult {
  size_t                   length;
  std::vector<std::string> outputs;
};

std::vector<CommonPrefixSearchResult> common_prefix_search(
  const char* byte_code,
  size_t      byte_code_size,
  const char* str)

Tested compilers

  • Visual Studio 2017
  • Visual Studio 2015
  • Clang 4
  • Clang 3.8
  • Clang 3.5
  • g++ 5.4

License

MIT license (© 2015 Yuji Hirose)

cpp-fstlib's People

Contributors

yhirose avatar zpalmtree avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.