
postgres-word2vec's People

Contributors

guenthermi

postgres-word2vec's Issues

Loading GloVe data?

Kudos on the project!

I stumbled upon GloVe and immediately wanted to know if I could get this type of data loaded into Postgres and found this repo.

I was wondering what data sets you're using and if GloVe or similar are on your radar.

Thanks,
Jeff

No function found for top_k_in_pq

SELECT * FROM top_k_in_pq('Godfather', 5, ARRAY(SELECT title FROM movies));
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
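
One way to narrow down a "No function matches" hint is to list the signatures that are actually installed under that name and compare them with the call. The following is a minimal sketch, not part of the project: `overloads_query` is a hypothetical helper that only builds a statement against the standard `pg_proc` catalog, and running it assumes an open psycopg2 cursor.

```python
def overloads_query(function_name):
    """Return (sql, params) listing every installed signature of a function.

    Hypothetical helper: it only builds the statement and does not
    touch the database.
    """
    sql = (
        "SELECT p.oid::regprocedure AS signature "
        "FROM pg_proc p "
        "WHERE p.proname = %s "
        "ORDER BY p.oid"
    )
    return sql, (function_name,)


sql, params = overloads_query("top_k_in_pq")
# With a live psycopg2 cursor (assumed, not shown here):
#     cur.execute(sql, params)
#     for (signature,) in cur.fetchall():
#         print(signature)
```

If the query returns no rows, the function is not installed in the connected database at all; if it returns rows, the argument types in the call can be compared against each listed signature.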

install Issue

Hi there,
I managed to install all dependencies and load your extension in a Docker container. However, when I reach the last step in the process (Statistics), I get this error:

SELECT create_statistics('google_vecs_norm', 'word', 'coarse_quantization_ivpq');
ERROR:  function get_vecs_name_ivpq_quantization() does not exist
LINE 1: SELECT get_vecs_name_ivpq_quantization()
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
QUERY:  SELECT get_vecs_name_ivpq_quantization()
CONTEXT:  PL/pgSQL function create_statistics(character varying,character varying,character varying) line 10 at EXECUTE

Could you please help me or point me to a solution?
Thanks very much.

Using knn search when the input is python list (representing the embeddings)

I have problems with sending the embedding to the knn_in_pq function.
I tried:
embedding = [0.12109375, 0.056640625, ..., -0.2421875]
embedding = np.array(embedding)
cursor.execute("SELECT * FROM knn_in_pq((%s), 2, ARRAY(SELECT event_name FROM events));", (embedding,))
it throws:
function knn_in_pq(record, integer, character varying[]) does not exist
LINE 1: SELECT * FROM knn_in_pq(((0.12109375, 0.056640625, -0.242187...

when I try with the list:
embedding = [0.12109375, 0.056640625, ..., -0.2421875]
cursor.execute("SELECT * FROM knn_in_pq((%s), 2, ARRAY(SELECT event_name FROM events));", (embedding,))
it throws:
function knn_in_pq(numeric[], integer, character varying[]) does not exist
LINE 1: SELECT * FROM knn_in_pq((ARRAY[0.12109375,0.056640625, -0.24...

From the \df command I can see that the knn_in_pq function is overloaded. Here are the available signatures:
`
public | knn_in_pq | TABLE(word character varying, similarity real) | query_vector anyarray, k integer, input_set integer[] | normal

public | knn_in_pq | TABLE(word character varying, similarity real) | query_vector bytea, k integer, input_set character varying[] | normal

public | knn_in_pq | TABLE(word character varying, similarity real) | token character varying, k integer, input_set character varying[] | normal

public | knn_in_pq | TABLE(word character varying, similarity real) | token character varying, k integer, input_set integer[]

`
I think the embedding needs to be cast to query_vector bytea somehow, but I do not know how to do that. Can you help me with this?
If I send a word instead of the vector, the function works properly, but I need to send a vector and get the k events closest to it.
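
The second error above shows that a Python list arrives as `ARRAY[...]`, which PostgreSQL types as `numeric[]`, and none of the listed overloads accepts that type. One possible fix is sketched below under stated assumptions. The loader script `vec2database.py`, quoted in a later issue, calls `vec_to_bytea(...::float4[])`, so if that function is installed, the same cast may produce the `bytea` value that the `query_vector bytea` overload expects. `build_knn_query` is a hypothetical helper, and whether the resulting bytes match the format the index expects has not been verified here.

```python
def build_knn_query(embedding, k, set_sql="SELECT event_name FROM events"):
    """Build (sql, params) for a vector query against knn_in_pq.

    Assumes vec_to_bytea(float4[]) exists in the database. set_sql is
    trusted SQL text written by the developer, never user input.
    """
    # Convert to plain Python floats so the parameter is an ordinary
    # list, which psycopg2 sends as ARRAY[...] (see the error above).
    vector = [float(x) for x in embedding]
    sql = (
        "SELECT * FROM knn_in_pq("
        "vec_to_bytea(%s::float4[]), %s, ARRAY(" + set_sql + "))"
    )
    return sql, (vector, k)


# Illustrative three-element vector; a real embedding has the full dimension.
sql, params = build_knn_query([0.12109375, 0.056640625, -0.2421875], 2)
# With a live psycopg2 cursor (assumed, not shown here):
#     cursor.execute(sql, params)
#     rows = cursor.fetchall()
```

The explicit `::float4[]` cast matters because an untyped `ARRAY[...]` literal of decimals defaults to `numeric[]`, which is exactly the type named in the error message.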

psycopg2.ProgrammingError: function vec_to_bytea(real[]) does not exist

(base) admin@ifood-Latitude-5490:~/dev/search/postgres-word2vec/index_creation$ python3 vec2database.py config/vecs_config.json
INFO [2019-06-07 16:39:06] : Exexuted DROP TABLE on google_vecs

INFO [2019-06-07 16:39:06] : Created new table google_vecs

Traceback (most recent call last):
File "vec2database.py", line 136, in <module>
main(len(sys.argv), sys.argv)
File "vec2database.py", line 124, in main
insert_vectors(vec_config.get_value('vec_file_path'), con, cur, vec_config.get_value('table_name'), db_config.get_value('batch_size'), vec_config.get_value('normalized'), logger)
File "vec2database.py", line 78, in insert_vectors
cur.executemany("INSERT INTO "+ table_name + " (word,vector) VALUES (%(word)s, vec_to_bytea(%(vector)s::float4[]))", tuple(values))
psycopg2.ProgrammingError: function vec_to_bytea(real[]) does not exist
LINE 1: ...RT INTO google_vecs (word,vector) VALUES ('', vec_to_byt...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.

linker error on macos 10.14, postgres.app 2.25

sudo make install gives:
Undefined symbols for architecture x86_64:
"_addToTargetList", referenced from:
_ivpq_search_in in ivpq_search_in.o
"_computePQDistanceInt16", referenced from:
_pq_search_in_batch in freddy.o
_ivpq_search_in in ivpq_search_in.o
"_initTargetLists", referenced from:
_ivpq_search_in in ivpq_search_in.o
"_reorderTopKPV", referenced from:
_ivpq_search_in in ivpq_search_in.o
"_updateTopKPVFast", referenced from:
_ivpq_search_in in ivpq_search_in.o
ld: symbol(s) not found for architecture x86_64

I can get around most of them by removing the 'inline' keyword.

However, the addToTargetList function seems to be missing.
Any ideas?

addToTargetList(targetLists, queryVectorsIndex, TARGET_LISTS_SIZE,
