mlnick / elasticsearch-vector-scoring
Score documents with pure dot product / cosine similarity with ES
License: Apache License 2.0
Hello,
first of all kudos for the amazing idea for this plugin.
I am naively wondering whether it might be feasible to retrieve similar vectors (of texts) by k-NN based on the Euclidean distance. The indexed vectors are the output of a NN language model, which I guess can also be factorized.
Greetings.
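One way to relate the Euclidean question to what the plugin offers: for unit-length vectors, the squared Euclidean distance equals 2 - 2 * cosine similarity, so ranking by descending cosine gives the same order as ranking by ascending Euclidean distance on the normalized vectors. The following is a minimal sketch of that identity in plain Python, not code from the plugin; the function names and sample vectors are made up for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # Scale a vector to unit length.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_cosine(query, docs):
    # Highest cosine similarity first.
    q = normalize(query)
    return sorted(docs, key=lambda name: -dot(q, normalize(docs[name])))

def rank_by_euclidean_normalized(query, docs):
    # Smallest Euclidean distance between unit vectors first.
    q = normalize(query)
    return sorted(docs, key=lambda name: euclidean(q, normalize(docs[name])))

# Hypothetical sample data.
docs = {
    "a": [1.2, 0.1, 0.4, -0.2, 0.3],
    "b": [0.1, 2.3, -1.6, 0.7, -1.3],
    "c": [-1.0, 0.5, 0.2, 0.9, 0.0],
}
query = [1.0, 0.0, 0.5, 0.0, 0.2]

cosine_order = rank_by_cosine(query, docs)
euclidean_order = rank_by_euclidean_normalized(query, docs)
print(cosine_order, euclidean_order)
```

This only holds after normalization. If the raw vector lengths carry meaning, Euclidean distance on unnormalized vectors can rank differently from cosine similarity.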
Hello,
first of all, thanks for the great work. I'm currently looking into it and got the demo working as posted on the IBM repository.
Going through it, I have one question. If a new user comes in and I collect some implicit ratings from her/him, at some point I will want to make recommendations to her/him. If (s)he was not included in the training data, no user vector exists for her/him. Is there any way I can get recommendations for her/him without retraining the whole model?
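A common workaround for this cold-start case is "folding in": keep the trained item factors fixed and solve a small regularized least-squares problem for the new user's vector only. The sketch below is an assumption-laden illustration, not the method used by the demo or the plugin. It uses plain ridge regression, whereas implicit-feedback ALS also weights each observation by a confidence term that is omitted here. All names (`fold_in_user`, `item_factors`, `reg`) are hypothetical.

```python
def solve(a, b):
    # Solve the linear system a x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fold_in_user(item_factors, ratings, reg=0.1):
    # item_factors: item id -> factor vector from the trained model (kept fixed).
    # ratings: item id -> rating observed for the new user.
    # Solves (Y^T Y + reg * I) x = Y^T r over the rated items.
    k = len(next(iter(item_factors.values())))
    a = [[reg if i == j else 0.0 for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for item, rating in ratings.items():
        y = item_factors[item]
        for i in range(k):
            b[i] += rating * y[i]
            for j in range(k):
                a[i][j] += y[i] * y[j]
    return solve(a, b)

# Hypothetical two-factor model with two items; the new user rated only "m1".
item_factors = {"m1": [1.0, 0.0], "m2": [0.0, 1.0]}
new_user_vector = fold_in_user(item_factors, {"m1": 1.0}, reg=1.0)
print(new_user_vector)
```

The resulting vector could then be used as the query vector when scoring item documents, assuming the items were indexed with factors from the same model.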
Thanks! I hope this implementation sees an update to work with ES 5.5+
I'm trying to score and retrieve relevant documents in Elasticsearch 5.4. I tried out the demo in the README and everything works fine. Now I want to convert a string to a vector and save it in Elasticsearch.
curl -s -XPUT 'http://localhost:9200/test/movies/1?pretty' -d '
{
"@model_factor":"0|1.2 1|0.1 2|0.4 3|-0.2 4|0.3",
"name": "Test 1"
}'
Can I ask how you obtained the vectors in @model_factor above? Could you share something on converting a string to vectors?
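Judging from the example document above, the field value is a space-separated list of `index|value` pairs. The sketch below only shows how an already computed vector could be serialized into that format. It does not cover how to derive a vector from text, which depends on whatever model you use. The function name is hypothetical.

```python
import json

def to_payload_string(vector):
    # Serialize a list of floats as "0|v0 1|v1 ...", matching the example document.
    return " ".join(f"{i}|{v}" for i, v in enumerate(vector))

# The vector from the example document above.
vector = [1.2, 0.1, 0.4, -0.2, 0.3]
doc = {"@model_factor": to_payload_string(vector), "name": "Test 1"}
print(json.dumps(doc))
```

The JSON printed here can be used as the body of the `curl -XPUT` request shown in the question.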
It's actually fairly easy to upgrade to 5.6: just replace the ES version number in the pom file with the actual ES 5.6.x version.
Can we expect a 6.3 upgrade?
I recommend doing an index refresh before you run the final scoring query in your example. My finding is that if you don't, you may not get all the test docs on your first pass: it takes a very short time for ES to make newly indexed documents searchable.
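As a rough sketch of that suggestion, the refresh can be triggered through the index `_refresh` endpoint before querying. The host and index name below are assumptions taken from the `curl` example earlier on this page, and the helper names are made up.

```python
import urllib.request

def build_refresh_request(base_url, index):
    # POST /<index>/_refresh makes recently indexed documents searchable.
    return urllib.request.Request(f"{base_url}/{index}/_refresh", method="POST")

def refresh_index(base_url, index):
    # Sends the request; requires a running Elasticsearch node.
    with urllib.request.urlopen(build_refresh_request(base_url, index)) as resp:
        return resp.status

request = build_refresh_request("http://localhost:9200", "test")
print(request.get_method(), request.full_url)
```

Call `refresh_index("http://localhost:9200", "test")` after indexing the test documents and before running the scoring query.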
This is a really cool tool. I am using Elasticsearch 5.6.1; you said the plugin works with 5.3+, but I run into the following problem:
$ ./elasticsearch-plugin install file:///home/es/elasticsearch-vector-scoring/target/releases/elasticsearch-vector-scoring-5.3.0.zip
-> Downloading file:///home/es/elasticsearch-vector-scoring/target/releases/elasticsearch-vector-scoring-5.3.0.zip
[=================================================] 100%
Exception in thread "main" java.lang.IllegalArgumentException: plugin [elasticsearch-vector-scoring] is incompatible with version [5.6.1]; was designed for version [5.3.0]
at org.elasticsearch.plugins.PluginInfo.readFromProperties(PluginInfo.java:146)
at org.elasticsearch.plugins.InstallPluginCommand.verify(InstallPluginCommand.java:474)
at org.elasticsearch.plugins.InstallPluginCommand.install(InstallPluginCommand.java:543)
at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:217)
at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:201)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:67)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134)
at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:69)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134)
at org.elasticsearch.cli.Command.main(Command.java:90)
at org.elasticsearch.plugins.PluginCli.main(PluginCli.java:47)
How can I deal with this problem? Thx~
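The error above shows the installer rejecting the plugin because the version it was built for differs from the running Elasticsearch version. A quick way to see what a given zip declares, before trying to install it, is to read its `plugin-descriptor.properties`. The sketch below assumes the descriptor carries an `elasticsearch.version` key; the helper names and the in-memory demo archive are hypothetical.

```python
import io
import zipfile

DESCRIPTOR = "plugin-descriptor.properties"

def read_descriptor(zip_source):
    # Return (path inside the zip, parsed key/value pairs) for the plugin descriptor.
    with zipfile.ZipFile(zip_source) as zf:
        for name in zf.namelist():
            if name.endswith(DESCRIPTOR):
                props = {}
                for line in zf.read(name).decode("utf-8").splitlines():
                    line = line.strip()
                    if line and not line.startswith("#") and "=" in line:
                        key, value = line.split("=", 1)
                        props[key.strip()] = value.strip()
                return name, props
    raise FileNotFoundError(DESCRIPTOR + " not found in zip")

def check_plugin(zip_source, running_version):
    # Compare the declared target version with the running Elasticsearch version.
    path, props = read_descriptor(zip_source)
    built_for = props.get("elasticsearch.version")
    return {
        "descriptor_path": path,
        "built_for": built_for,
        "matches": built_for == running_version,
    }

# Demo with an in-memory zip standing in for a real plugin archive.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr(
        "elasticsearch/" + DESCRIPTOR,
        "name=elasticsearch-vector-scoring\nelasticsearch.version=5.3.0\n",
    )
report = check_plugin(demo, "5.6.1")
print(report)
```

The reported `descriptor_path` also shows whether the descriptor sits under an `elasticsearch/` directory or at the zip root, which is what the directory-related installer errors reported elsewhere on this page are about. If the versions differ, rebuilding the plugin against the running version, as suggested above for 5.6, is the likely fix.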
Thanks for this cool plugin. Can you update it for Elasticsearch 5.x, or share any tips on how to update it?
When I install the elasticsearch-vector-scoring plugin, an error occurs:
PS C:\Program Files\elasticsearch-6.4.3> ./bin/elasticsearch-plugin install https://github.com/MLnick/elasticsearch-vector-scoring/releases/download/v5.3.0/elasticsearch-vector-scoring-5.3.0.zip
-> Downloading https://github.com/MLnick/elasticsearch-vector-scoring/releases/download/v5.3.0/elasticsearch-vector-scoring-5.3.0.zip
[=================================================] 100%
ERROR: This plugin was built with an older plugin structure. Contact the plugin author to remove the intermediate "elasticsearch" directory within the plugin zip.
I am using Windows 10 with elasticsearch-6.4.3.zip.
I look forward to your reply. Thank you!
Hi Nick,
I have installed this plugin in Elasticsearch 5.4.0 and enabled inline scripts in elasticsearch.yml. The indexing phase works fine, but I receive the following error when I issue a query:
"error" : {
  "root_cause" : [
    {
      "type" : "general_script_exception",
      "reason" : "Failed to compile inline script [payload_vector_score] using lang [native]"
    }
  ],
  "type" : "search_phase_execution_exception",
  "reason" : "all shards failed",
  "phase" : "query",
  "grouped" : true,
  "failed_shards" : [
    {
      "shard" : 0,
      "index" : "test",
      "node" : "1zCItkTwQX2ujhLc4TH0Hg",
      "reason" : {
        "type" : "query_shard_exception",
        "reason" : "script_score: the script could not be loaded",
        "index_uuid" : "T9hH9ZZaTZm21pLfv81eZQ",
        "index" : "test",
        "caused_by" : {
          "type" : "general_script_exception",
          "reason" : "Failed to compile inline script [payload_vector_score] using lang [native]",
          "caused_by" : {
            "type" : "illegal_argument_exception",
            "reason" : "Native script [payload_vector_score] not found"
          }
        }
      }
    }
  ]
},
"status" : 400
}
I wonder whether you have any idea on the potential reason behind this error.
Thanks.
Hi, it would be very helpful if you could distribute a compiled version of this plugin so that others can just install it with a URL.
NOTE: Unable to verify checksum for downloaded plugin (unable to find .sha1 or .md5 file to verify)
ERROR: Could not find plugin descriptor 'plugin-descriptor.properties' in plugin zip
Hello, when I use Java code to run this query, I get the following errors:
[indices:data/read/search[phase/query]]]; nested: QueryShardException[script_score: the script could not be loaded]; nested: NotSerializableExceptionWrapper[class_cast_exception: [F cannot be cast to java.util.List]; }
The following is the part related to the "payload_vector_score":
Caused by: NotSerializableExceptionWrapper[class_cast_exception: [F cannot be cast to java.util.List]
at com.github.mlnick.elasticsearch.script.PayloadVectorScoreScript.<init>(PayloadVectorScoreScript.java:98)
at com.github.mlnick.elasticsearch.script.PayloadVectorScoreScript.<init>(PayloadVectorScoreScript.java:37)
at com.github.mlnick.elasticsearch.script.PayloadVectorScoreScript$Factory.newScript(PayloadVectorScoreScript.java:55)
at org.elasticsearch.script.NativeScriptEngineService.search(NativeScriptEngineService.java:75)
at org.elasticsearch.script.ScriptService.search(ScriptService.java:499)
at org.elasticsearch.script.ScriptService.search(ScriptService.java:491)
at org.elasticsearch.index.query.QueryShardContext.getSearchScript(QueryShardContext.java:345)
at org.elasticsearch.index.query.functionscore.ScriptScoreFunctionBuilder.doToFunction(ScriptScoreFunctionBuilder.java:97)
... 19 more
The full error output is as follows:
Exception in thread "main" Failed to execute phase [query], all shards failed; shardFailures {[R-SMBGOkSdOwmsCBUtjP3A][contentrecom][0]: RemoteTransportException[[node1][127.0.0.1:9301][indices:data/read/search[phase/query]]]; nested: QueryShardException[script_score: the script could not be loaded]; nested: NotSerializableExceptionWrapper[class_cast_exception: [F cannot be cast to java.util.List]; }{[JlpGnU20RHGAHDM9hd9kzw][contentrecom][1]: RemoteTransportException[[master][127.0.0.1:9300][indices:data/read/search[phase/query]]]; nested: QueryShardException[script_score: the script could not be loaded]; nested: ClassCastException[[F cannot be cast to java.util.List]; }{[JlpGnU20RHGAHDM9hd9kzw][contentrecom][2]: RemoteTransportException[[master][127.0.0.1:9300][indices:data/read/search[phase/query]]]; nested: QueryShardException[script_score: the script could not be loaded]; nested: ClassCastException[[F cannot be cast to java.util.List]; }{[JlpGnU20RHGAHDM9hd9kzw][contentrecom][3]: RemoteTransportException[[master][127.0.0.1:9300][indices:data/read/search[phase/query]]]; nested: QueryShardException[script_score: the script could not be loaded]; nested: ClassCastException[[F cannot be cast to java.util.List]; }{[JlpGnU20RHGAHDM9hd9kzw][contentrecom][4]: RemoteTransportException[[master][127.0.0.1:9300][indices:data/read/search[phase/query]]]; nested: QueryShardException[script_score: the script could not be loaded]; nested: ClassCastException[[F cannot be cast to java.util.List]; }
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onInitialPhaseResult(AbstractSearchAsyncAction.java:223)
at org.elasticsearch.action.search.AbstractSearchAsyncAction.access$100(AbstractSearchAsyncAction.java:58)
at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:148)
at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:51)
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1032)
at org.elasticsearch.transport.TcpTransport.lambda$handleException$17(TcpTransport.java:1411)
at org.elasticsearch.common.util.concurrent.EsExecutors$1.execute(EsExecutors.java:109)
at org.elasticsearch.transport.TcpTransport.handleException(TcpTransport.java:1409)
at org.elasticsearch.transport.TcpTransport.handlerResponseError(TcpTransport.java:1401)
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1345)
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:280)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:396)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at java.lang.Thread.run(Thread.java:748)
Caused by: RemoteTransportException[[node1][127.0.0.1:9301][indices:data/read/search[phase/query]]]; nested: QueryShardException[script_score: the script could not be loaded]; nested: NotSerializableExceptionWrapper[class_cast_exception: [F cannot be cast to java.util.List];
Caused by: [contentrecom/NBkpXATTSfKlMlpfzgXVSQ] QueryShardException[script_score: the script could not be loaded]; nested: NotSerializableExceptionWrapper[class_cast_exception: [F cannot be cast to java.util.List];
at org.elasticsearch.index.query.functionscore.ScriptScoreFunctionBuilder.doToFunction(ScriptScoreFunctionBuilder.java:100)
at org.elasticsearch.index.query.functionscore.ScoreFunctionBuilder.toFunction(ScoreFunctionBuilder.java:137)
at org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder.doToQuery(FunctionScoreQueryBuilder.java:304)
at org.elasticsearch.index.query.AbstractQueryBuilder.toQuery(AbstractQueryBuilder.java:96)
at org.elasticsearch.index.query.QueryShardContext.lambda$toQuery$1(QueryShardContext.java:313)
at org.elasticsearch.index.query.QueryShardContext.toQuery(QueryShardContext.java:325)
at org.elasticsearch.index.query.QueryShardContext.toQuery(QueryShardContext.java:312)
at org.elasticsearch.search.SearchService.parseSource(SearchService.java:599)
at org.elasticsearch.search.SearchService.createContext(SearchService.java:468)
at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:444)
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:252)
at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:331)
at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:328)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69)
at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1488)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:613)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: NotSerializableExceptionWrapper[class_cast_exception: [F cannot be cast to java.util.List]
at com.github.mlnick.elasticsearch.script.PayloadVectorScoreScript.<init>(PayloadVectorScoreScript.java:98)
at com.github.mlnick.elasticsearch.script.PayloadVectorScoreScript.<init>(PayloadVectorScoreScript.java:37)
at com.github.mlnick.elasticsearch.script.PayloadVectorScoreScript$Factory.newScript(PayloadVectorScoreScript.java:55)
at org.elasticsearch.script.NativeScriptEngineService.search(NativeScriptEngineService.java:75)
at org.elasticsearch.script.ScriptService.search(ScriptService.java:499)
at org.elasticsearch.script.ScriptService.search(ScriptService.java:491)
at org.elasticsearch.index.query.QueryShardContext.getSearchScript(QueryShardContext.java:345)
at org.elasticsearch.index.query.functionscore.ScriptScoreFunctionBuilder.doToFunction(ScriptScoreFunctionBuilder.java:97)
... 19 more
Could you suggest how to solve this problem? Note that when I use your method, the RESTful way, it works well.
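In the stack trace, `[F` is the JVM descriptor for `float[]`, so the script received a float array where it expected a `java.util.List`. Over REST, a JSON array is typically parsed into a list, which would explain why that path works. From a Java client, passing the query vector as a `List` of numbers rather than a `float[]` is the likely fix. The sketch below builds the REST request body to make the expected shape explicit. The script name and `lang` come from the error messages on this page, while the parameter names `field`, `vector` and `cosine` are assumptions to check against the plugin README.

```python
import json

def build_score_query(vector, field="@model_factor", cosine=True):
    # Build a function_score query that delegates scoring to the native script.
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "script_score": {
                    "script": {
                        "inline": "payload_vector_score",
                        "lang": "native",
                        "params": {
                            "field": field,          # assumed parameter name
                            "vector": list(vector),  # must serialize as a JSON array
                            "cosine": cosine,        # assumed parameter name
                        },
                    }
                },
                "boost_mode": "replace",
            }
        }
    }

# Hypothetical query vector; a tuple is converted to a list before serialization.
body = build_score_query((0.1, 2.3, -1.6, 0.7, -1.3))
print(json.dumps(body, indent=2))
```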
When I use the most recent 5.4.x version (5.4.3), I get an error message saying that the plugin only runs with 5.4.0, which is strange since 5.4.0 and 5.4.3 should be API compatible.
Please either fix this in the code or note it in the README file.
Steps to reproduce
Dockerfile:
FROM docker.elastic.co/elasticsearch/elasticsearch:5.4.3
RUN elasticsearch-plugin install https://github.com/MLnick/elasticsearch-vector-scoring/releases/download/v5.4.0/elasticsearch-vector-scoring-5.4.0.zip
Build command:
docker build -t es_with_vector-scoring .
This results in the following error:
Sending build context to Docker daemon 2.048kB
Step 1/2 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.4.3
---> 2ae8547160a7
Step 2/2 : RUN elasticsearch-plugin install https://github.com/MLnick/elasticsearch-vector-scoring/releases/download/v5.4.0/elasticsearch-vector-scoring-5.4.0.zip
---> Running in ca31562f69ae
-> Downloading https://github.com/MLnick/elasticsearch-vector-scoring/releases/download/v5.4.0/elasticsearch-vector-scoring-5.4.0.zip
[=================================================] 100%
Exception in thread "main" java.lang.IllegalArgumentException: plugin [elasticsearch-vector-scoring] is incompatible with version [5.4.3]; was designed for version [5.4.0]
at org.elasticsearch.plugins.PluginInfo.readFromProperties(PluginInfo.java:146)
at org.elasticsearch.plugins.InstallPluginCommand.verify(InstallPluginCommand.java:428)
at org.elasticsearch.plugins.InstallPluginCommand.install(InstallPluginCommand.java:495)
at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:215)
at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:199)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:67)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122)
at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:69)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122)
at org.elasticsearch.cli.Command.main(Command.java:88)
at org.elasticsearch.plugins.PluginCli.main(PluginCli.java:47)
The command '/bin/sh -c elasticsearch-plugin install https://github.com/MLnick/elasticsearch-vector-scoring/releases/download/v5.4.0/elasticsearch-vector-scoring-5.4.0.zip' returned a non-zero code: 1
I am using Elasticsearch 5.6.2. Whenever I try to install the plugin using this step from the README:
Install plugin in Elasticsearch: ELASTIC_HOME/bin/elasticsearch-plugin install file:///PROJECT_HOME/target/releases/elasticsearch-vector-scoring-5.3.0.zip (stop ES first).
I get an error: ERROR: elasticsearch directory is missing in the plugin zip
I am new to this. Is there something wrong with my installation step, or with the zip file?