keyvi's People

Contributors

amit-cliqz, ankit-cliqz, david-cliqz, dependabot[bot], gmossessian, hendrikmuhs, ivanychev, jsdelivrbot, michael-a-cliqz, narekgharibyan, simonalger, staticmukesh, subu-cliqz, vvucetic, y3ti

keyvi's Issues

Investigate PyPy wheel upload issue from Travis-CI (OS X)

+cd python

++ls -A wheelhouse

+'[' -n keyvi-0.3.1-pp260-pypy_41-macosx_10_11_x86_64.whl ']'

+twine upload --config-file ../travis/pypirc -u [secure] -p [secure] wheelhouse/keyvi-0.3.1-pp260-pypy_41-macosx_10_11_x86_64.whl

dyld: lazy symbol binding failed: Symbol not found: _clock_gettime

  Referenced from: /Users/travis/.pyenv/versions/pypy2.7-6.0.0/bin//libpypy-c.dylib

  Expected in: flat namespace



dyld: Symbol not found: _clock_gettime

  Referenced from: /Users/travis/.pyenv/versions/pypy2.7-6.0.0/bin//libpypy-c.dylib

  Expected in: flat namespace



travis/upload_packages.sh: line 9: 16406 Trace/BPT trap: 5       twine upload --config-file ../travis/pypirc -u $PYPI_USERNAME -p $PYPI_PASSWORD wheelhouse/*

Script failed with status 133

full log: https://travis-ci.org/KeyviDev/keyvi/jobs/422959161

IntDictionaryCompiler should throw error for negative values

The interface of the IntDictionaryCompiler.Add allows all long values:

Add(self, libcpp_utf8_string, long_int)

But doesn't work with negative integers:

In [5]: compiler = IntDictionaryCompiler()
In [5]: compiler.Add('a', 1)
In [6]: compiler.Add('b', -11)
In [7]: compiler.Compile(); compiler.WriteToFile(path)

In [8]: dictionary = Dictionary(path)

In [9]: dictionary.get('a').GetValue()
Out[9]: 1

In [10]: dictionary.get('b').GetValue()
Out[10]: 18446744073709551605

Instead the compiler should've thrown an error on seeing negative values.
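
The value 18446744073709551605 is exactly what -11 looks like when its 64-bit two's complement bit pattern is read back as an unsigned integer. The sketch below shows that arithmetic and a caller-side guard. Both function names are hypothetical and not part of keyvi, and the unsigned 64-bit range is an assumption inferred from the output above.

```python
UINT64_MODULUS = 2 ** 64


def as_unsigned_64(value: int) -> int:
    """Reinterpret an int as an unsigned 64-bit integer (two's complement wrap)."""
    return value % UINT64_MODULUS


def checked_value(value: int) -> int:
    """Hypothetical guard: reject values an unsigned 64-bit store cannot hold."""
    if not 0 <= value < UINT64_MODULUS:
        raise ValueError(f"value {value} is outside the unsigned 64-bit range")
    return value


print(as_unsigned_64(-11))  # 18446744073709551605, the value reported above
```

Until the compiler rejects such input itself, a guard like this could be applied to each value before it is passed to Add.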

Renovate python codebase

The python code base should get some updates on tooling:

  • use pyproject.toml, get rid of requirements.txt and renovate setup.py accordingly
  • introduce ruff linting

core dump with multiple key update and compile

The following Python code produces a coredump

import keyvi
c = keyvi.JsonDictionaryCompiler()
c.Add('a', '1')
c.Add('a', '1')
c.Compile()
c.Add('a', '1')

Ubuntu 18.04, Python3 virtual environment, keyvi installed with pip.

Implement new StringValueStore supporting compression

Followup of #69

The StringValueStore should be re-implemented to reuse the same techniques as today's JsonValueStore, which supports compression and uses length prefixes instead of zero-termination.

To ensure backwards compatibility, rename StringValueStore to StringValueStoreDeprecated and add a new StringValueStore using a new enum value for the type. While the writer and mergers can be deleted, the reader must be kept to support existing keyvi files at least until the next bigger release.

The new StringValueStore and the JsonValueStore should only differ by 2 operations: encoding and decoding json, everything else should be almost identical.
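
To illustrate why length prefixes are preferable to zero-termination, here is a minimal sketch of a length-prefixed record. The variable-length integer encoding and all function names are assumptions made for illustration; the issue does not specify the on-disk layout used by JsonValueStore.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a little-endian base-128 varint."""
    if n < 0:
        raise ValueError("varint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, offset: int = 0):
    """Return (value, offset just past the varint)."""
    result = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


def write_value(value: bytes) -> bytes:
    """Length-prefixed record: varint length followed by the raw bytes."""
    return encode_varint(len(value)) + value


def read_value(buf: bytes, offset: int = 0) -> bytes:
    length, start = decode_varint(buf, offset)
    return buf[start:start + length]
```

Unlike a zero-terminated string, a record written this way can contain zero bytes, which matters once values are stored in compressed form.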

keyvi binary not compiled for m1 mac

Ok, so now I'm trying to get keyvi working on my m1 mac.
It pip installs fine, but as soon as I try to import it, I get the following:

Traceback (most recent call last):
  File "[redacted]", line 2, in <module>
    import keyvi.compiler
  File "/Users/ysaxon/.pyenv/versions/3.9.10/lib/python3.9/site-packages/keyvi/__init__.py", line 23, in <module>
    from keyvi._core import MatchIterator, Match, loading_strategy_types
ImportError: dlopen(/Users/ysaxon/.pyenv/versions/3.9.10/lib/python3.9/site-packages/keyvi/_core.cpython-39-darwin.so, 0x0002): tried: '/Users/ysaxon/.pyenv/versions/3.9.10/lib/python3.9/site-packages/keyvi/_core.cpython-39-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e')), '/usr/local/lib/_core.cpython-39-darwin.so' (no such file), '/usr/lib/_core.cpython-39-darwin.so' (no such file)

The key line in there seems to be the following
_core.cpython-39-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))

I'm working on compiling the core library myself right now, so if you don't have access to an M1 mac, let me know if there's anything special I should do to build a .so that you could include in the pip repo.

free(): invalid pointer when using GetFuzzy()

Script to reproduce the problem:

import numpy as np
import keyvi
from keyvi.dictionary import Dictionary 
from keyvi.compiler import CompletionDictionaryCompiler                                                                             

d = Dictionary('terms.kv')                                                                               
keys = list(d.GetAllKeys())                                                                                                         


while True: 
  key = np.random.choice(keys) 
  matches = list(d.GetFuzzy(key, 3)) 
  for m in matches: 
    m.GetMatchedString() 
    m.GetValue() 

For me, it exits after ~10 seconds with:

free(): invalid pointer
[1]    22023 abort (core dumped)  ipython

Let me know if there's anything else I could help with!

How to use this library for C++

Thanks for this awesome library.

I want to use the c_api of your library in my project. For example my sample code looks like this:

#include "keyvi/c_api/c_api.h"

...

auto* dict = keyvi_create_dictionary("/path/to/compiled.kv");
auto* match = keyvi_dictionary_get(dict, "my_key");
std::string val(keyvi_match_get_value_as_string(match));
cout << val;

However, this does not compile and fails with errors such as unknown symbols. Could you please suggest how to compile and run this code?

index can crash if running out of file handles

With the default number of file handles (1024) on Ubuntu 16.04, the indexer can crash because of a corrupted file.

This seems to originate from an unhandled error on a file stream somewhere in the value store/memory map manager. The error gets ignored, the output file gets corrupted, and the indexer crashes shortly afterwards when it tries to read the file.

Increasing the number of file handles works around the problem.

A fix should

  • improve error handling, properly detect if we run out of file handles
  • throttle indexer to avoid problems if file handles get short
  • provide proper error messages
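
The error-handling pattern asked for above can be sketched in Python. This is only an illustration of the idea, not keyvi code: the helper names are hypothetical, the real fix would live in the C++ file stream handling, and the resource module is available on Unix only.

```python
import errno
import resource


def soft_file_limit() -> int:
    """Return the soft limit on open file descriptors for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def open_checked(path, mode="rb"):
    """Open a file and turn 'too many open files' into an explicit error."""
    try:
        return open(path, mode)
    except OSError as exc:
        if exc.errno in (errno.EMFILE, errno.ENFILE):
            raise RuntimeError(
                f"out of file handles while opening {path} "
                f"(soft limit: {soft_file_limit()})"
            ) from exc
        raise
```

The point is that the failure surfaces at the open call with a clear message, instead of being ignored and showing up later as a corrupted file.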

Allow integer list as a key

Is your feature request related to a problem? Please describe.

It would allow storing arbitrary data/sequences, not just text.

Describe the solution you'd like

The ability to use a list of integers as a key, e.g.:

index.Set([5,77,89,999], "24.9")

If my understanding is correct that the key is FSM-like, then UTF-8 text is a special case of a number list with numbers in 0 .. 255.

Describe alternatives you've considered

Code outside keyvi could convert/encode numbers to characters, but that would be slower and cumbersome; keyvi's idea is that it is fast, isn't it ;)

No errors from `WriteToFile` when output path contains non existent folder

Describe the bug
When you call WriteToFile with an output path whose folder doesn't exist, WriteToFile silently does nothing.

To Reproduce

import os
import keyvi.compiler


c = keyvi.compiler.JsonDictionaryCompiler()
c.Add(b'1', b'123')
c.Compile()

output_path = '/tmp/thispathdoesntexists/test.kv'
c.WriteToFile(output_path)

assert os.path.exists(output_path)  # fails, because the file doesn't exist

Expected behavior

Exception on WriteToFile
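
Until WriteToFile reports the failure itself, a caller-side check is one possible workaround. The helper below is hypothetical and not part of keyvi; it only validates the target directory before the path is handed on.

```python
import os


def require_writable_parent(output_path: str) -> str:
    """Return output_path unchanged, or raise if its folder cannot be written to."""
    parent = os.path.dirname(os.path.abspath(output_path))
    if not os.path.isdir(parent):
        raise FileNotFoundError(f"output directory does not exist: {parent}")
    if not os.access(parent, os.W_OK):
        raise PermissionError(f"output directory is not writable: {parent}")
    return output_path


# Intended use (assumes a compiled keyvi compiler object named c):
# c.WriteToFile(require_writable_parent('/tmp/thispathdoesntexists/test.kv'))
```

This does not replace a fix in the library, since the directory could still disappear between the check and the write.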

Coverage reporting broken for newer versions of gcc(gcov)

After bumping the Ubuntu image to 20.04 (to fix Rust builds), the coverage report broke: it reported wrong line counts and, as a result, wrong coverage. The image for creating coverage reports has been set back to 16.04.

The cause turned out to be the extended gcov format introduced in gcc 8, which reports coverage for template instances. The parser in cpp-coveralls produces wrong results for it; I created an upstream fix: eddyxu/cpp-coveralls#157.

Fuzzy match produces wrong results

GetFuzzy() produces erroneous matches when the prefix length equals the length of a key string in the index.
For example, with the dictionary
{"a": 1, "apple": 2}
GetFuzzy("app", 0, 1) matches "a"->1 with the score "0.0", and GetMatchedString() returns "app".

To reproduce, execute the code in the attached file below.

Expected behavior:
the query should return an empty result set
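
A brute-force reference can be handy for checking what a fuzzy query ought to return. The sketch below assumes the three arguments mean key, maximum edit distance and minimum exact prefix length, which is an interpretation of the call above and not confirmed by the report; the function names are made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_reference(keys, query, max_edit_distance, min_exact_prefix):
    """Return all keys sharing the exact prefix and within the edit distance."""
    prefix = query[:min_exact_prefix]
    return sorted(
        key for key in keys
        if key.startswith(prefix)
        and levenshtein(key, query) <= max_edit_distance
    )


print(fuzzy_reference({"a": 1, "apple": 2}, "app", 0, 1))  # []
```

Under these assumptions neither "a" nor "apple" is within distance 0 of "app", which agrees with the expected empty result.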

keyvi_bug_20211217.py.zip

GetFuzzyCompletions: Aborted (core dumped)

Describe the bug
Program terminates abnormally when using GetFuzzyCompletions with a multibyte character in position 2.

To Reproduce

$ pip install keyvi==0.5.3

test.py:

from keyvi.compiler import KeyOnlyDictionaryCompiler
from keyvi.dictionary import Dictionary
from keyvi.completion import PrefixCompletion

c = KeyOnlyDictionaryCompiler()
c.Add("mß")
c.Compile()
c.WriteToFile("d.kv")

d = Dictionary("d.kv")
p = PrefixCompletion(d)

p.GetFuzzyCompletions("mß", 1)
$ python test.py 
terminate called after throwing an instance of 'std::invalid_argument'
  what():  Illegal UTF-8 lead byte: 159
Aborted (core dumped)

Expected behavior
Program returns without an error.

Additional context
I am guessing it is related to this https://github.com/KeyviDev/keyvi/blob/master/keyvi/include/keyvi/dictionary/completion/prefix_completion.h#L133
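
The byte 159 in the error message can be traced back to the key itself: "ß" is encoded in UTF-8 as the two bytes 195 and 159, and 159 is a continuation byte that can never start a character. The sketch below shows this and one way to keep a byte offset from splitting a character. Whether the crash really originates from such a split in the linked code is the guess stated above, not something this snippet proves, and the helper names are hypothetical.

```python
key = "mß"
encoded = key.encode("utf-8")
print(list(encoded))  # [109, 195, 159]


def is_continuation_byte(byte: int) -> bool:
    """UTF-8 continuation bytes have the bit pattern 10xxxxxx."""
    return byte & 0xC0 == 0x80


def safe_prefix_end(data: bytes, cut: int) -> int:
    """Move a byte offset back until it no longer splits a multibyte character."""
    while 0 < cut < len(data) and is_continuation_byte(data[cut]):
        cut -= 1
    return cut


print(is_continuation_byte(159))    # True
print(safe_prefix_end(encoded, 2))  # 1, so the prefix is just b"m"
```

Cutting the encoded key after two bytes leaves 159 as the first byte of the remainder, which matches the "Illegal UTF-8 lead byte: 159" message.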

Use list of numbers as a key or values

These two helpers convert a list of numbers to a byte string that can be used as a key or value, and back:

    import struct

    # list of numbers => byte string
    def nums2str(nums, itype='H'):  # B: uint8, H: uint16, I: uint32, Q: uint64
        return struct.pack(f">{len(nums)}{itype}", *nums)

    # byte string => tuple of numbers; wrap in list() if a list is needed
    def str2nums(b, itype='H'):
        size = struct.calcsize(f">{itype}")  # bytes per number
        if isinstance(b, str):  # got a str back instead of bytes
            b = b.encode('utf-8')
        return struct.unpack(f">{len(b) // size}{itype}", b)

    index.Set(nums2str([454, 7889]), nums2str([1, 2, 3, 4]))

    m = index.Get(nums2str([454, 7889]))
    res = str2nums(m.GetValue())

You can add this to the docs.

The str branch assumes the value came back as a string; if it came back as a byte string, b is used directly.

Replace boost property tree with rapidjson

The code uses rapidjson for json parsing but also the json version of boost property tree in other places. It should be possible to replace all boost property trees with rapidjson.

(FWIW: The reason is historical, property trees have been used before all the json support was added)

Intermittent test failure index unit tests

Happened for #79 (but likely not caused by its changes):

terminate called after throwing an instance of 'msgpack::v1::insufficient_bytes'
what(): insufficient bytes
unknown location(0): fatal error in "indexwithdeletedkeys": signal: SIGABRT (application abort requested)
/home/travis/build/KeyviDev/keyvi/keyvi/tests/keyvi/index/read_only_index_test.cpp(147): last checkpoint

It seems like the deleted keys which are stored in a msgpack buffer were somehow truncated

Fetch all the keys with a specific length

Because keyvi is based on an FST, it should be possible to get the key-value pairs based on the length of the key, regardless of the content.

Combined with the capability to store sequences (https://gist.github.com/vsraptor/8beb0c04fe5914c50d6d307393b34893), this opens many new areas of use.

Describe the solution you'd like

kmatch = keyvi.index.get_keys_by_len(5)
[ k for k in kmatch ]

kvmatch = keyvi.index.get_kv_by_len(5)
{ k:v for k,v in kvmatch }
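
The idea behind such a lookup can be sketched with a toy trie: a traversal that stops at a fixed depth only ever visits states reachable in that many steps. This is a plain Python illustration with made-up function names, not keyvi's automaton, and it counts characters, whereas keyvi operates on bytes.

```python
def build_trie(keys):
    """Build a nested-dict trie; the None entry marks the end of a key."""
    root = {}
    for key in keys:
        node = root
        for symbol in key:
            node = node.setdefault(symbol, {})
        node[None] = True
    return root


def keys_of_length(trie, length):
    """Yield every key with exactly `length` symbols via a depth-limited walk."""
    stack = [(trie, "")]
    while stack:
        node, prefix = stack.pop()
        if len(prefix) == length:
            if None in node:
                yield prefix
            continue  # never descend deeper than the requested length
        for symbol, child in node.items():
            if symbol is not None:
                stack.append((child, prefix + symbol))


trie = build_trie(["a", "apple", "tuv i", "hello", "tuv in"])
print(sorted(keys_of_length(trie, 5)))  # ['apple', 'hello', 'tuv i']
```

The walk still has to visit every branch up to the requested depth, so the cost depends on how many prefixes of that length exist, not on the number of matching keys alone.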

Investigate alternative hashing algorithms

Compile time: hashing, which is used for minimization and value de-duplication, is an important and expensive part of the runtime. For states the algorithm is based on Bob Jenkins hashing; for values it is a common string hash.

Files:

https://github.com/KeyviDev/keyvi/blob/master/keyvi/include/keyvi/dictionary/fsa/internal/unpacked_state.h#L151
https://github.com/KeyviDev/keyvi/blob/master/keyvi/include/keyvi/dictionary/fsa/internal/value_store_persistence.h#L83

The idea is to try different alternatives to improve performance, state hashing should have the bigger impact.

Some links:

https://github.com/rurban/smhasher
https://github.com/Cyan4973/xxHash
https://github.com/RedSpah/xxhash_cpp/blob/master/xxhash/xxhash.hpp
https://github.com/leo-yuriev/t1ha

Advice to build a dictionary of already sorted keys

I have looked through the documentation and issues, and it looks like keyvi itself sorts the data, for example when I read #181. In my case I have two numpy arrays, one with 5 billion keys and another with the corresponding values. We can sort our 5 billion keys in around 30 minutes using argsort. Therefore I wonder what the quickest way to build the keyvi dictionary from this would be. Does keyvi, for example, already notice that the data is sorted (and therefore not need to check this again)?

Segmentation fault (core dumped) when putting items in Index

Hello,

first of all, I really appreciate your work and the tool you have created. Currently I am trying to create a PoC using keyvi, but I am facing the following issue.
I decided to put some data in one batch in order to create as few files as possible. I am using Python, so I cannot verify the issue by calling the C++ code directly, but the error returned, Segmentation fault (core dumped), seems to be related to the C++ part.
Basically, I am trying to create an Index with 100300 items. Interestingly, when I create an Index with e.g. 200300 items, the issue does not occur. You can find the repro script below.
Any help will be appreciated.

from keyvi._core import Index
def run_repro():
    index = Index('repro')
    data = [(str(i), 'test') for i in range(100300)]
    index.MSet(data)
    index.Flush()
if __name__ == '__main__':
    run_repro()

best regards,
Kamil

keyvi-server reloaded

The index is only embedded for now; compared to the experimental keyvi-server branch, a network stack is missing.

After looking into different alternatives, it seems to me that gRPC would be a good choice.

Depending on the amount of dependencies this could live in a separate repository.

Replace TPIE sorting by partitioning and merging fsa's

#180 made me think.

For some reason my vision was to replace the sorting code. I realized this might not be necessary, we have everything we need.

I compared the suggestion I gave in #180 on larger data sets:

  • creating a keyvi file from scratch, utilizing TPIE
  • creating x small keyvi files (using the "small data compilers" which don't use TPIE sort) and running the merger on them

I ran different cases, in summary the merge approach was roughly 20% slower. Note, I did not optimize anything (I used simple python scripts). My merge approach had to copy more data, an improved implementation would avoid that.

The idea is as follows

  1. create an in-memory sorter
  2. if the in-memory sort buffer hits the threshold, sort the data, create an fsa, persist it, free buffers
  3. go to 1.
  4. after all data has been processed, sort, create, persist the final chunk
  5. merge the fsa's and create the final keyvi file
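
The five steps above can be sketched in plain Python, with sorted lists standing in for the persisted fsa's and a k-way merge standing in for the keyvi merger. The function names are hypothetical, and the rule that the most recently added value wins for duplicate keys is an assumption made for this illustration.

```python
import heapq
from itertools import groupby


def sorted_chunks(pairs, chunk_size):
    """Steps 1-4: buffer pairs and emit each full buffer as one sorted run."""
    buffer = []
    for sequence, (key, value) in enumerate(pairs):
        # The sequence number records insertion order across all runs.
        buffer.append((key, sequence, value))
        if len(buffer) >= chunk_size:
            yield sorted(buffer)
            buffer = []  # "free buffers" before collecting the next chunk
    if buffer:
        yield sorted(buffer)  # the final, possibly smaller chunk


def merge_runs(runs):
    """Step 5: k-way merge of the sorted runs into one sorted stream."""
    merged = heapq.merge(*runs)
    for key, group in groupby(merged, key=lambda item: item[0]):
        # Entries of one key arrive ordered by sequence; keep the last one.
        value = list(group)[-1][2]
        yield key, value


pairs = [("b", 1), ("a", 2), ("c", 3), ("a", 4), ("d", 5)]
print(list(merge_runs(sorted_chunks(pairs, chunk_size=2))))
# [('a', 4), ('b', 1), ('c', 3), ('d', 5)]
```

Only one chunk has to be held in memory while sorting; the merge step then reads the runs sequentially, which is what makes the approach viable without an external sorter.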

No coveralls data on recent builds

Coveralls data is pushed but reported as empty at Coveralls: https://coveralls.io/jobs/42985161

Started seeing this in #109, but it also happens for #110, so it looks like an upstream problem, maybe due to cpp-coveralls, which released a new version: https://github.com/eddyxu/cpp-coveralls

Otherwise it looks like we sent at least some data:

+ coveralls-merge keyvi.cov_report python.cov_report
Reporting on files:
keyvi/include/keyvi/compression/predictive_compression.h
keyvi/include/keyvi/dictionary/fsa/internal/int_value_store.h
keyvi/include/keyvi/util/vint.h
keyvi/include/keyvi/util/os_utils.h
keyvi/include/keyvi/dictionary/fsa/internal/unpacked_state_stack.h
keyvi/include/keyvi/index/internal/merge_job.h
keyvi/include/keyvi/dictionary/fsa/internal/int_inner_weights_value_store.h
keyvi/include/keyvi/dictionary/dictionary_compiler.h
keyvi/include/keyvi/stringdistance/needleman_wunsch.h
keyvi/include/keyvi/index/internal/merge_policy.h
keyvi/include/keyvi/dictionary/sort/in_memory_sorter.h
keyvi/include/keyvi/dictionary/fsa/generator_adapter.h
keyvi/include/keyvi/dictionary/fsa/internal/value_store_persistence.h
keyvi/include/keyvi/dictionary/fsa/internal/ivalue_store.h
keyvi/include/keyvi/dictionary/match_iterator.h
keyvi/include/keyvi/dictionary/sort/tpie_sorter.h

Improve json string -> msgpack

On compiler side when encoding json values to msgpack we use a rapidjson document as intermediate format to finally turn it into msgpack. That's actually unnecessary, it would be possible to write a rapidjson handler to directly write msgpack.

Code: https://github.com/KeyviDev/keyvi/blob/master/keyvi/include/keyvi/util/json_value.h

Impact: Saves 1 allocation and a couple of operations per value, but overall I am not sure it makes a big difference. It certainly simplifies the code.

Impact should be benchmarked, at least with a micro-benchmark.

race condition in lazy loading of index

Found in keyvi-server, but the root cause seems to be an issue in the lazy loading of segments in the index: it is not thread-safe and requires a guard.

Repro:

  • start keyviserver (with redis support -r)
  • redis-benchmark -t get,set,mset -p 7586 -n 100000

stacktrace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055bfd8f0695b in keyvi::dictionary::fsa::Automata::ResolvePointer (this=0x7f23a0034900, starting_state=17, c=107 'k')
    at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/keyvi/keyvi/include/keyvi/dictionary/fsa/automata.h:406
406         uint16_t pt = le16toh(transitions_compact_[starting_state + c]);
[Current thread is 1 (Thread 0x7f23c3fff700 (LWP 18143))]
(gdb) bt
#0  0x000055bfd8f0695b in keyvi::dictionary::fsa::Automata::ResolvePointer (this=0x7f23a0034900, starting_state=17, c=107 'k')
    at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/keyvi/keyvi/include/keyvi/dictionary/fsa/automata.h:406
#1  0x000055bfd8f34e9b in keyvi::dictionary::fsa::Automata::TryWalkTransition (this=0x7f23a0034900, starting_state=17, c=107 'k')
    at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/keyvi/keyvi/include/keyvi/dictionary/fsa/automata.h:138
#2  0x000055bfd8f35414 in keyvi::dictionary::Dictionary::operator[] (this=0x7f23b0020840, key="key:__rand_int__")
    at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/keyvi/keyvi/include/keyvi/dictionary/dictionary.h:103
#3  0x000055bfd8f377e2 in keyvi::index::internal::BaseIndexReader<keyvi::index::internal::IndexWriterWorker, keyvi::index::internal::Segment>::operator[] (this=0x55bfd9fac560, key="key:__rand_int__")
    at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/keyvi/keyvi/include/keyvi/index/internal/base_index_reader.h:56
#4  0x000055bfd8f44b7b in keyvi_server::service::redis::RedisServiceImpl::Get (this=0x55bfd9fb18f0, key="key:__rand_int__", value=0x7f239edf9ae0)
    at /home/hendrik/work/git-personal/keyvi-server/src/keyvi_server/service/redis/redis_service_impl.cpp:39
#5  0x000055bfd8eb2f86 in keyvi_server::service::redis::CommandHandler::GetCommandHandler::Run (this=0x55bfda00ede0, args=std::vector of length 2, capacity 2 = {...}, output=0x7f239edf9b50)
    at /home/hendrik/work/git-personal/keyvi-server/src/keyvi_server/service/redis/command_handler.h:55
#6  0x00007f23d5dceac8 in brpc::policy::ConsumeCommand (ctx=ctx@entry=0x7f23ac04e320, commands=std::vector of length 2, capacity 2 = {...}, flush_batched=flush_batched@entry=true, 
    appender=appender@entry=0x7f239edf9cc0) at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/brpc/src/brpc/policy/redis_protocol.cpp:99
#7  0x00007f23d5dcf624 in brpc::policy::ParseRedisMessage (source=0x7f23c81105c0, socket=0x7f23c8110540, read_eof=<optimized out>, arg=<optimized out>)
    at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/brpc/src/brpc/policy/redis_protocol.cpp:181
#8  0x00007f23d5d70496 in brpc::InputMessenger::CutInputMessage (this=this@entry=0x55bfda0623b0, m=m@entry=0x7f23c8110540, index=index@entry=0x7f239edf9e90, read_eof=read_eof@entry=false)
    at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/brpc/src/brpc/input_messenger.cpp:71
#9  0x00007f23d5d70e13 in brpc::InputMessenger::OnNewMessages (m=0x7f23c8110540) at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/brpc/src/brpc/input_messenger.cpp:234
#10 0x00007f23d5e387ad in brpc::Socket::ProcessEvent (arg=0x7f23c8110540) at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/brpc/src/brpc/socket.cpp:1017
#11 0x00007f23d5cb223f in bthread::TaskGroup::task_runner (skip_remained=<optimized out>) at /home/hendrik/work/git-personal/keyvi-server/src/3rdparty/brpc/src/bthread/task_group.cpp:296
#12 0x00007f23d5c96c21 in bthread_make_fcontext () from /home/hendrik/work/git-personal/keyvi-server/build/src/3rdparty/brpc/output/lib/libbrpc.so

The crash happens due to a corrupted automata instance.

I already have a fix, doing some more tests before opening a PR.
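
For readers unfamiliar with the pattern, a guard around lazy loading can be sketched as follows. This is a Python illustration with a hypothetical class, not the actual fix; in the C++ code a mutex or a call-once primitive would play the role of the lock.

```python
import threading


class LazySegment:
    """Hypothetical stand-in for a segment that loads its dictionary on first use."""

    def __init__(self, loader):
        self._loader = loader
        self._lock = threading.Lock()
        self._dictionary = None
        self.load_count = 0

    def dictionary(self):
        # The lock ensures only one thread runs the loader and that no
        # thread ever observes a half-initialized object.
        with self._lock:
            if self._dictionary is None:
                self._dictionary = self._loader()
                self.load_count += 1
            return self._dictionary


segment = LazySegment(lambda: {"key": "value"})
threads = [threading.Thread(target=segment.dictionary) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(segment.load_count)  # 1
```

Without the lock, two readers arriving at the same time could both see the unloaded state and one of them could use the object while the other is still constructing it.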

Fuzzy completion doesn't return all expected results

@hendrikmuhs as you asked, I am opening an issue for FuzzyCompletions with some examples where it does not return the expected results.

Input data:

# -*- coding: utf-8 -*-


import keyvi

c = keyvi.CompletionDictionaryCompiler({"memory_limit_mb": "10"})
c.Add("turkei news", 23698)
c.Add("turkei side", 18838)
c.Add("turkei urlaub", 23424)
c.Add("turkisch anfänger", 20788)
c.Add("turkisch für", 21655)
c.Add("turkisch für anfänger", 20735)
c.Add("turkçe dublaj", 28575)
c.Add("turkçe dublaj izle", 16391)
c.Add("turkçe izle", 19946)
c.Add("tuv akademie", 9557)
c.Add("tuv hessen", 7744)
c.Add("tuv i", 331)
c.Add("tuv in", 10188)
c.Add("tuv ib", 10189)
c.Add("tuv kosten", 11387)
c.Add("tuv nord", 46052)
c.Add("tuv sood", 46057)
c.Add("tus rhein", 462)
c.Add("tus rheinland", 39131)
c.Add("tus öffnungszeiten", 15999)

c.Compile()
c.WriteToFile('fuzzy.kv')

d = keyvi.Dictionary('fuzzy.kv')
c = keyvi.PrefixCompletion(d)

This returns an empty iterator; expected to see everything starting with tuv:

for m in c.GetFuzzyCompletions('tuv', 0):
    print(m.GetMatchedString())

This returns only tus rhein; expected to have tus rheinland as well:

for m in c.GetFuzzyCompletions('tuv rhein', 1):
    print(m.GetMatchedString())

This returns an empty iterator; expected to see everything starting with turkçe:

for m in c.GetFuzzyCompletions('turkçe', 1):
    print(m.GetMatchedString())

error on pip installing: spinsort,block_indirect_sort, not a member of boost sort

So I did manage to solve this myself, but just wanted to post in case anyone runs into the same problem.

I was trying to pip install keyvi but kept getting

    In file included from /tmp/pip-build-tzs1qyas/keyvi/src/cpp/keyvi/include/keyvi/dictionary/dictionary_types.h:28:0,
                     from _core_p.cpp:652:
    /tmp/pip-build-tzs1qyas/keyvi/src/cpp/keyvi/include/keyvi/dictionary/dictionary_compiler.h: In member function ‘void keyvi::dictionary::DictionaryCompiler<ValueStoreType>::Sort()’:
    /tmp/pip-build-tzs1qyas/keyvi/src/cpp/keyvi/include/keyvi/dictionary/dictionary_compiler.h:184:20: error: ‘block_indirect_sort’ is not a member of ‘boost::sort’
           boost::sort::block_indirect_sort(key_values_.begin(), key_values_.end());
                        ^~~~~~~~~~~~~~~~~~~
    In file included from /tmp/pip-build-tzs1qyas/keyvi/src/cpp/keyvi/include/keyvi/dictionary/dictionary_types.h:29:0,
                     from _core_p.cpp:652:
    /tmp/pip-build-tzs1qyas/keyvi/src/cpp/keyvi/include/keyvi/dictionary/dictionary_index_compiler.h: In member function ‘void keyvi::dictionary::DictionaryIndexCompiler<ValueStoreType>::Sort()’:
    /tmp/pip-build-tzs1qyas/keyvi/src/cpp/keyvi/include/keyvi/dictionary/dictionary_index_compiler.h:215:20: error: ‘spinsort’ is not a member of ‘boost::sort’
           boost::sort::spinsort(key_values_.begin(), key_values_.end());

I had just apt installed libboost-all-dev (1.65.1.0ubuntu1), so that wasn't the problem.
I gave up and downloaded the git repo, and ran pip3 install -r requirements.txt, only to run into a problem with the cryptography lib.

            =============================DEBUG ASSISTANCE==========================
            If you are seeing an error here please try the following to
            successfully install cryptography:

            Upgrade to the latest pip and try again. This will fix errors for most
            users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
            =============================DEBUG ASSISTANCE==========================

So I did pip3 install pip, then reran pip3 install -r requirements.txt, and then found that pip install keyvi worked just fine afterwards!

I can't be 100% positive whether it was upgrading pip or installing the requirements manually that did it (and I can't easily go back and check), but I assume the former.

You might consider putting such a DEBUG ASSISTANCE message into the keyvi library if you believe that's the ultimate cause.
