elastica's Introduction

Elastica: elasticsearch PHP Client

Join the chat at https://gitter.im/ruflin/Elastica

All documentation for Elastica can be found at Elastica.io. If you have questions, don't hesitate to ask them on Stack Overflow with the tag "Elastica" or in our Gitter channel. All library issues should go to the issue tracker on GitHub.

Compatibility

This release is compatible with all Elasticsearch 8.0 releases and onwards.

The test suite is run against the most recent minor versions of Elasticsearch, currently 8.8.0 and 8.9.0-SNAPSHOT.

Contributing

Contributions are always welcome. For details on how to contribute, check the CONTRIBUTING file.

Versions & Dependencies

This project tries to follow Elasticsearch in terms of End of Life and maintenance since 5.x. It is generally recommended to use the latest point release of the relevant branch.

Elastica branch   ElasticSearch   elasticsearch-php   PHP
8.x               8.x             ^8.4                >=8.0 <8.4
7.x               7.x             ^7.0                ^7.2 || ^8.0
6.x               6.x             ^6.0                ^7.0 || ^8.0

Unmaintained versions:

Elastica version   ElasticSearch   elasticsearch-php   PHP
5.x                5.x             ^5.0                >=5.6
3.x                2.4.0           no                  >=5.4
2.x                1.7.2           no                  >=5.3.3

elastica's People

Contributors

caphrim007, christeredvartsen, comulinux, damienalexandre, deguif, ewgra, f21, fabian, franmomu, im-denisenko, jappievw, jdeniau, jlinn, jsonblob, juneym, krzaczek, lavoiesl, leabaertschi, massimilianobraglia, munkie, mwiercinski, ornicar, p365labs, rayward, rmruano, ruflin, thepanz, tobion, webdevshub, xyu


elastica's Issues

Sort by _geo_distance showing wrong order

Here is the complete data (index, mapping and data) for the tests:
https://gist.github.com/2483009

How is it possible that sorting by _geo_distance (arc) ASC returns the results in the following order?


Pos Name        Location                        Distance
---|----------|-------------------------------|-------------------
1   far         56.9440017845,24.117343883      1.7865229796151898
2   average     56.9624382022,24.1378033877     1.4899987260417218
3   closest     56.958344835,24.110908139       0.243505338088285
4   farthest    56.9428606076,24.077425299      2.9042299224497112

Elastica_Request, $this->_query is not used

In Elastica_Request, $this->_query is not used.
Is that intended?

I want to delete child documents, with this command:

$this->_elasticaTypeAvailability->request('', Elastica_Request::DELETE, array(), array('parent' => $idProduct));

It doesn't work because the $query parameter is not used :P

error bulk request

When I try it, I receive this error:

Message: Error in one or more bulk request actions

Elastica_Script

It doesn't work :P

My code:

$scriptFields = array(
    'foobar' => new Elastica_Script('doc[\'brand\'].value')
);
$query->setScriptFields($scriptFields);

JSON query:

{
    "query": {
        "match_all": {

        }
    },
    "fields": [
        "name"
    ],
    "script_fields": {
        "foobar": {

        }
    }
}

Can't use a server other than localhost

Hi,

I am trying to use this bundle with an external server, but when I put the IP in the configuration instead of localhost, nothing changes. Does anyone have an idea?

Allow building client by specifying URL

Some elasticsearch servers may not be rooted on /.

It would be useful in these cases to allow building a client with a URL instead of a host/port combination:

$elasticaClient = new Elastica_Client(array(
        'url' => 'http://mydomain.org:7865/somePath/'
));
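A minimal sketch of how such a URL could be split into connection settings with PHP's built-in parse_url(). The function name urlToClientConfig and the returned key names are assumptions made for illustration; they are not Elastica's actual configuration format.

```php
<?php

/**
 * Split a server URL into connection settings.
 * The key names below are hypothetical, not Elastica's real config keys.
 */
function urlToClientConfig(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        throw new InvalidArgumentException("Cannot parse server URL: $url");
    }

    return [
        'scheme' => $parts['scheme'] ?? 'http',
        'host'   => $parts['host'],
        // 9200 is Elasticsearch's default HTTP port.
        'port'   => $parts['port'] ?? 9200,
        // Keep the path so servers that are not rooted on / still work.
        'path'   => $parts['path'] ?? '/',
    ];
}

$config = urlToClientConfig('http://mydomain.org:7865/somePath/');
```

Keeping the path as its own entry is what lets a transport prepend it to every request, which is the case the host/port pair cannot express.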

Allow filters to be cacheable

Filters can become cacheable by setting the _cache parameter to true.
Some filters require a different format for the filter in order to supply the _cache attribute. Shay just updated the documentation about the longer format for the and-filter. Copied below as an example:

Short notation

{
    "filtered" : {
        "query" : {
            "term" : { "name.first" : "shay" }
        },
        "filter" : {
            "and" : [
                {
                    "range" : { 
                        "postDate" : { 
                            "from" : "2010-03-01",
                            "to" : "2010-04-01"
                        }
                    }
                },
                {
                    "prefix" : { "name.second" : "ba" }
                }
            ]
        }
    }
}

Long notation

{
    "filtered" : {
        "query" : {
            "term" : { "name.first" : "shay" }
        },
        "filter" : {
            "and" : {
                "filters": [
                    {
                        "range" : { 
                            "postDate" : { 
                                "from" : "2010-03-01",
                                "to" : "2010-04-01"
                            }
                        }
                    },
                    {
                        "prefix" : { "name.second" : "ba" }
                    }
                ],
                "_cache" : true
            }
        }
    }
}

On the mailing list, Shay advised using the longer format in libraries.

Proposed changes

  1. Use the long notation for all filters.
  2. Rename the current setCached function to setCache(bool).
  3. Only implement the setCache on those filters which support caching.
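The proposal could look roughly like the following sketch, which always emits the long notation and adds _cache only when it was set explicitly. buildAndFilter is a hypothetical helper working on plain arrays, not an existing Elastica method.

```php
<?php

/**
 * Build an "and" filter in the long notation.
 * $filters is a list of already-built filter arrays.
 * $cache stays null when caching was not requested, so no _cache key is sent.
 */
function buildAndFilter(array $filters, ?bool $cache = null): array
{
    $body = ['filters' => array_values($filters)];
    if ($cache !== null) {
        $body['_cache'] = $cache;
    }

    return ['and' => $body];
}

$filter = buildAndFilter([
    ['range'  => ['postDate' => ['from' => '2010-03-01', 'to' => '2010-04-01']]],
    ['prefix' => ['name.second' => 'ba']],
], true);
```

Passing the result to json_encode() yields the same structure as the long notation above.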

Query escaping

Currently, if a user includes special characters such as [ in a query, the search fails with an elasticsearch error.

Do you think it is the responsibility of Elastica to escape the queries? I'm unsure about how/where to do that.
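One possible shape for such escaping, as a sketch only: a helper that puts a backslash in front of every character the Lucene query syntax reserves. The helper name escapeQueryTerm is hypothetical, and the character list is my understanding of the reserved set, so treat it as an assumption to check against the Elasticsearch documentation.

```php
<?php

/**
 * Backslash-escape characters that the query parser treats as syntax.
 * Hypothetical helper; the reserved-character list is an assumption.
 */
function escapeQueryTerm(string $term): string
{
    $special = ['\\', '+', '-', '!', '(', ')', ':', '^', '[', ']',
                '"', '{', '}', '~', '*', '?', '|', '&', '/'];

    $escaped = '';
    // Splitting by byte is safe here: all reserved characters are ASCII,
    // and UTF-8 continuation bytes never collide with ASCII.
    foreach (str_split($term) as $char) {
        if (in_array($char, $special, true)) {
            $escaped .= '\\';
        }
        $escaped .= $char;
    }

    return $escaped;
}

$safe = escapeQueryTerm('price [10');
```

Escaping like this only makes sense for raw user input; a query that deliberately uses operators such as ranges or wildcards must not be passed through it.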

Support for connections to multiple servers in cluster

Both the Perl API: http://search.cpan.org/~drtech/ElasticSearch-0.37/lib/ElasticSearch.pm#new() and the Ruby rubberband API: https://github.com/grantr/rubberband/blob/master/lib/elasticsearch/client/retrying_client.rb support connections to multiple elasticsearch nodes in a cluster, including round-robin load balancing, auto retrying, and auto discovery of nodes after initial connection. Would you consider adding similar features to Elastica? It would eliminate a single point of failure in a clustered environment in the event the node that the Elastica client is hitting goes down.

Thank you!

Test, bootstrap in all test files

In all test classes, there is:

require_once dirname(__FILE__) . '/../../../bootstrap.php';

Is it really useful?
"bootstrap.php" is already loaded in "phpunit.xml.dist"
You can run a test on a single class with this command: "phpunit MyTest.php"

Incorrect error text in Elastica_Query_Bool->_addQuery(...)

The protected function Elastica_Query_Bool->_addQuery(...) throws an exception with an incorrect error text.

The function expects an array or an Elastica_Query_Abstract but the warning text reads:
'Invalid parameter. Has to be array or instance of Elastica_Query'

Type::addDocument doesn't support parent id with '0' value

I tried to add a document with '0' as parent id.

The 'parent' parameter is not included in the http request, because the test is wrong:

if ($doc->getParent())

The "not set" value should be NULL and tested with

$doc->getParent() !== NULL
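The difference can be reproduced in isolation. In the sketch below, ExampleDocument and buildParentParams are hypothetical stand-ins, not Elastica classes; the point is only that the string '0' is falsy in PHP, so a truthiness test drops it while a strict null comparison keeps it.

```php
<?php

/** Hypothetical stand-in for a document that may have a parent id. */
final class ExampleDocument
{
    private $parent;

    public function __construct($parent = null)
    {
        $this->parent = $parent;
    }

    public function getParent()
    {
        return $this->parent;
    }
}

/** Build request parameters; only a null parent counts as "not set". */
function buildParentParams(ExampleDocument $doc): array
{
    $params = [];
    if ($doc->getParent() !== null) {
        $params['parent'] = $doc->getParent();
    }

    return $params;
}

$withZero = buildParentParams(new ExampleDocument('0'));
$withNone = buildParentParams(new ExampleDocument());
```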

Elastica_Transport_Https

This transport is setting the following curl option:

curl_setopt($connection, CURLOPT_SSL_VERIFYPEER, false);

This makes connections insecure because the peer's certificate is not verified. It allows connections to servers with invalid or expired certificates.

You should set this to true.

Elastica fails when connecting to a cluster node that is not responding

I might be doing something wrong, but here is the issue:

I create the client like this:

<?php
$client = new Elastica_Client(array(
    'servers' => array(
        array('host' => 'localhost', 'port' => 9200),
        array('host' => 'localhost', 'port' => 9201),
        array('host' => 'localhost', 'port' => 9202),
    ),
));

If I take down the elasticsearch node on port 9200 the client will sometimes fail with:

Elastica_Exception_Client: Couldnt connect to host, ElasticSearch down?

Does Elastica support automatically trying to connect to any other server in the pool if one host is down? I have tried setting the roundRobin configuration option to true but that does not seem to help.
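The behaviour being asked for can be sketched independently of Elastica: try each server in the pool in turn and only fail once all of them have failed. requestWithFailover and the $send callable are hypothetical names; the transport is simulated so the example runs without a cluster.

```php
<?php

/**
 * Try each server in turn and return the first successful response.
 * $send is a hypothetical callable that performs the request against one
 * server and throws a RuntimeException on connection failure.
 */
function requestWithFailover(array $servers, callable $send)
{
    $lastError = null;
    foreach ($servers as $server) {
        try {
            return $send($server);
        } catch (RuntimeException $e) {
            // Remember the failure and move on to the next server.
            $lastError = $e;
        }
    }

    throw new RuntimeException('No server in the pool responded', 0, $lastError);
}

// Simulated transport: the node on port 9200 is down.
$send = function (array $server) {
    if ($server['port'] === 9200) {
        throw new RuntimeException('Could not connect to host');
    }

    return 'response from port ' . $server['port'];
};

$result = requestWithFailover([
    ['host' => 'localhost', 'port' => 9200],
    ['host' => 'localhost', 'port' => 9201],
    ['host' => 'localhost', 'port' => 9202],
], $send);
```

Round-robin selection only spreads the load; without a retry step like this one, every request that happens to land on the dead node still fails.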

Can't set the min_score property for an Elastica_Query_Bool object

Hello,

First, I want to thank you for the efforts you're putting into this excellent piece of software!

Now to my problem - apparently I'm not able to set the min_score property for an Elastica_Query_Bool object.
Perusing the code I can see there's a method Elastica_Query::setMinScore(), but unfortunately Elastica_Query_Bool doesn't derive from Elastica_Query, but from Elastica_Query_Abstract, so the setMinScore() method is not available to it.

I've tried to abuse the interface by directly calling $boolQuery->setParam('min_score', 1), but then I get an exception back from ES.
The thing is, when I run a request like this against an ElasticSearch node:

{
  "min_score": 1,
  "query": {
    "bool": {
      "should": [
        {
          // query object here
        },
        {
          // query object here
        },
        {
          // query object here
        },
        {
          // query object here
        }
      ],
      "minimum_number_should_match": 1
    }
  }
}

... the results are filtered by min_score just fine.

It looks like by (ab)using $boolQuery->setParam('min_score', 1) the effect is that Elastica sets the min_score property on the query level, and not on the request level, hence the ElasticSearch exception.

I'd appreciate any advice on how I'm to achieve my goals, namely:

  • Execute a Bool query
  • Filter the results by min_score

Currently the only alternatives I'm aware of are to either request the ES node's REST interface directly, or still use Elastica, but filter the results manually on the application level by referring to the score of each returned record.
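For illustration, the request body that works can be assembled as a plain PHP array, with min_score next to the query rather than inside it. buildSearchRequest is a hypothetical helper and the two term clauses are placeholders; how to send such a body through Elastica is not shown here.

```php
<?php

/**
 * Wrap a query body in a search request with a request-level min_score.
 */
function buildSearchRequest(array $query, float $minScore): array
{
    return [
        'min_score' => $minScore,
        'query'     => $query,
    ];
}

$boolQuery = [
    'bool' => [
        'should' => [
            ['term' => ['title' => 'foo']],   // placeholder clause
            ['term' => ['title' => 'bar']],   // placeholder clause
        ],
        'minimum_number_should_match' => 1,
    ],
];

$request = buildSearchRequest($boolQuery, 1.0);
```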

Thanks in advance for your feedback!

Kind regards,
Nasko

Add percolate support

It would be very useful to have this feature.
It is needed in both cases: when adding a document and when only running a query.

ElasticSearch and Doctrine Extensions translatable

Hi,
I am using Doctrine Extensions Translatable and Elastic Search (FOQElasticaBundle) on Symfony2

So far it has worked well for the default locale, but when I try to insert a translation it runs into an error:

ElasticSearchParseException[Failed to derive xcontent from (offset=0, length=2): [91, 93]] 

Elastica_Exception_Response: ElasticSearchParseException[Failed to derive xcontent from (offset=0, length=2): [91, 93]] (uncaught exception) at /var/www/site.com/Symfony/vendor/elastica/lib/Elastica/Transport/Http.php line 103 

The second locale is en:

        $article = $em->getRepository('WebsiteSharedBundle:News')->find(1);
        $article->setTitle('News 1 in EN');
        $article->setContent('News 1 in EN');
        $article->setTranslatableLocale('en_us'); 
        $em->persist($article);
        $em->flush();       

I have posted this error in FOQElasticaBundle as well. Any help is much appreciated.

Allow execution of pre-built queries

Elastica should allow a pre-built query to be executed and the results returned as an Elastica_ResultSet.

For example:

{
  "query" : {
    "filtered" : {
      "filter" : {
        "range" : {
          "due" : {
            "gte" : "2011-07-18 00:00:00",
            "lt" : "2011-07-25 00:00:00"
          }
        }
      },
      "query" : {
        "text_phrase" : {
          "title" : "Call back request"
        }
      }
    }
  },
  "sort" : {
    "due" : { 
      "reverse" : true
    }
  },
  "fields" : [
    "created", "assigned_to"
  ]
}

Should return the "created" and "assigned_to" fields for any documents with a title of "Call back request" whose due date is in the week starting 18/07/2011.

No way to delete a type?

I can see methods to delete indexes (Elastica_Index::delete) and delete entries (Elastica_Type::deleteById) but no way to delete a specific type (Elastica_Type::delete?). Is this omitted from Elastica because it is not possible with ElasticSearch, or has it been overlooked?

Thanks!

Requesting index via alias "overwrites" index name

I am not sure if this can be considered a bug or a feature, so let me explain my situation first: I have to reindex my documents every now and then due to changes in the mapping. To do that I create a new index called "index_+currenttimestamp" and assign it a new alias called "temporary". Then I load all documents from the production index (using the alias "production") and put them into the temporary index. Once all the documents are indexed I want to "hotswap" and change the alias from temporary to production and vice versa. Here's the problem:

I use the alias of an index due to the "index_timestamp" naming scheme when using an index by calling $client->getIndex('temporary'). Using this method gives you the right index but the _name property of the index object now holds the alias and not the original name (so "temporary" instead of "index_sometimestamp"). So $client->getIndex('temporary')->addAlias('someAlias',true) will cause an error because the index "temporary" simply doesn't exist and ElasticSearch needs the original name to add an alias.

A possible solution could be not to use the name passed to the getIndex($name) method but to somehow retrieve the original name via an API call, but I am not quite sure what's the best way to do that.

Empty array not passed along to ElasticSearch properly.

Building a query that returns a highlight:

$esQ = new Elastica_Query();
$esQ->setRawQuery(array("query"=>array("bool" => array("should" => array(array("field" => array("content"=>array("query"=>$_GET['q']))),
                                                                         array("field" => array("title"=>array("query"=>$_GET['q']))))))));
$esQ->setHighlight(array("fields"=>array("content"=>array())));

Because of the empty array for "content" when setting the highlight fields, no highlight response is returned from the server. However, if I change the last line to:

$esQ->setHighlight(array("fields"=>array("content"=>array('fragment_size'=>'100'))));

Then I get the correct response. The empty array seems to be appropriate syntax for ElasticSearch based on the first example here: http://www.elasticsearch.org/guide/reference/api/search/highlighting.html

Presumably this is an easy fix, but I decided to submit it rather than fix it, since it may be a problem in other parts of Elastica as well.
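A likely cause, offered as an assumption rather than a confirmed diagnosis of Elastica's code: PHP's json_encode() serialises an empty array as a JSON list, not as a JSON object. The sketch below shows the difference and one common workaround, an empty stdClass.

```php
<?php

// An empty PHP array has no keys, so json_encode() emits a JSON list.
$asArray = json_encode(['fields' => ['content' => []]]);

// An empty stdClass is encoded as a JSON object instead.
$asObject = json_encode(['fields' => ['content' => new stdClass()]]);

echo $asArray, "\n";   // {"fields":{"content":[]}}
echo $asObject, "\n";  // {"fields":{"content":{}}}
```

Once the array has at least one string key, as with 'fragment_size', json_encode() produces an object, which would explain why the second variant works.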

Thanks for the great project btw.

Should integrate nicely with Heroku ElasticSearch addon…

Hi @ruflin,

The ElasticSearch addon now in alpha at Heroku provides the ELASTICSEARCH_URL environment variable to specify where the index lives. Is there a way to facilitate integration for Heroku apps that use this value?

I notice that the initializer for Elastica_Client can accept an associative array to configure the server endpoint(s). Perhaps it might help to accept a URL as a single unparsed string instead?

The simplest syntax I can think of:

$client = new Elastica_Client(getenv('ELASTICSEARCH_URL'))

Though this is pretty reasonable too…

$client = new Elastica_Client(array('server' => getenv('ELASTICSEARCH_URL')))

Just thinking out loud here. What do you think?

Scope support

It's not really an issue, but I don't see any code related to "scope" in Elastica.
Do you think it's a problem, or is setParam('_scope', 'myscope') sufficient?

Numeric Range Filter

Numeric Range Filter is not available in Elastica.
Should I use the Range Filter instead?

AND as default operator

What is the best way to set up a search box to default to AND (instead of OR) for multiple keyword entries (similar to Google)?

Documentation for queries

I'd really appreciate some documentation on how to perform actual queries with Elastica.

There are a lot of classes under Elastica_Query - for a new user it can be somewhat hard to figure out how it all ties together. I'm still trying to figure out how to perform a text query with the AND operator, for example.

Fix Default Limit in Search() Methods

I fixed the default limit code in the search methods for Index, Search, and Type.

However, I think the bigger question is ... should we limit in the search methods now that limit=0 does not return all results?

Docs: search over multiple indexes/types?

(Alert: I have never used Lucene or ElasticSearch, currently working it out for myself, so apologies if this is incorrect or stupid)

The inline docs state:

Search over different indices and types is not supported yet

However the ElasticSearch guide states:

The search API can be applied to multiple types within an index, and across multiple indices. For example, we can search on all documents across all types within the twitter index

Are the Elastica docs incorrect, and ElasticSearch does indeed support searching across multiple indexes/types from a single query? If so, does/will Elastica also support this?

using getDocument() throws PHP error due to undefined index

Using getDocument() with a document id that doesn't exist in ElasticSearch throws a PHP error caused by the undefined index key ['_source'].

To avoid this one of the following scenarios could be used:

  1. Returning null when there isn't a document with that id
  2. Throwing an Exception when trying to get a non existing document
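Both scenarios can be sketched on a decoded response array. The function names are hypothetical and the response shape (a top-level _source key that is absent for a missing document) is an assumption based on the error described above.

```php
<?php

/**
 * Extract the document source from a decoded GET response.
 * Returns null when the response carries no _source (scenario 1).
 */
function extractSource(array $response): ?array
{
    if (!isset($response['_source']) || !is_array($response['_source'])) {
        return null;
    }

    return $response['_source'];
}

/** Variant for scenario 2: throw instead of returning null. */
function extractSourceOrFail(array $response): array
{
    $source = extractSource($response);
    if ($source === null) {
        throw new OutOfBoundsException('Document not found');
    }

    return $source;
}

$found   = extractSource(['_id' => '1', '_source' => ['name' => 'ruflin']]);
$missing = extractSource(['_id' => '2']);
```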

Text query support

I'd say the text query is essential for taking search input on a website.

Source

The text family of queries does not go through a “query parsing” process. It does not support field name prefixes, wildcard characters, or other “advance” features. For this reason, chances of it failing are very small / non existent, and it provides an excellent behavior when it comes to just analyze and run that text as a query behavior (which is usually what a text search box does). Also, the phrase_prefix can provide a great “as you type” behavior to automatically load search results.

I'm putting this as a reminder for myself, but if someone beats me to the punch, great.

Updates to Various Getters and Setters (return $this)

General fixes with Getters and Setters

  • Elastica_Document
    • add() - needs return $this;
    • addFile() - needs return $this;
    • addGeoPoint() - needs return $this;
    • setData() - needs return $this;
  • Elastica_Filter_GeoDistance
    • setLongitude() - needs return $this;
  • Elastica_Filter_Or
    • addFilter() - needs return $this;
  • Elastica_Filter_Range
    • addField() - needs return $this;

These changes require updating the DocBlock @return values, and the corresponding unit tests need to be updated or added. Where a method returns $this, the tests should check the return value with assertInstanceOf().
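As a small illustration of the pattern, here is a sketch of a class whose setters return $this so that calls can be chained. FluentDocument is a hypothetical class written for this example, not Elastica_Document.

```php
<?php

/** Hypothetical document class with chainable setters. */
final class FluentDocument
{
    private $data = [];

    /** @return $this */
    public function add(string $key, $value): self
    {
        $this->data[$key] = $value;

        return $this;
    }

    /** @return $this */
    public function setData(array $data): self
    {
        $this->data = $data;

        return $this;
    }

    public function getData(): array
    {
        return $this->data;
    }
}

// Each call returns the same object, so the calls can be chained.
$doc = (new FluentDocument())
    ->setData(['name' => 'ruflin'])
    ->add('lang', 'php');
```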

I have this code done and will issue a pull request this weekend.

Remove closing PHP tag

There is a closing php tag ?> in Elastica_Query_Ids that needs to be removed.

Fix will be in pull request for ticket #65

undefined variable: minimium in Elastica/Query/Terms

Just stumbled upon this classic typo while using the minimum match function of the terms query. In line 66:

public function setMinimumMatch($minimum) {
    return $this->setParam('minimum_match', (int) $minimium);
}

Query Wildcard adding NULL Key

Query/Wildcard.php calls setValue($key, $value, $boost) inside __construct, which adds a null key if you don't pass $key, and that results in an exception. We need either a check around that line or to remove it.
Thanx!!

setSort example?

Hey Elastica team/developers/enthusiasts!

I'm pulling my hair out trying to find out how to use the setSort method. I want to set the order to 'desc' - and I've tried several things with no success. Can anyone provide any insight on this?

Thanks!

Documentation Update - CodeSniffer

Change

To run the codesniffer you can use the command ant codesniffer. This will show you more details on where your code does not follow the coding guidelines.

To

To run the codesniffer you can use the command ant phpcs. This will show you more details on where your code does not follow the coding guidelines.
