es-delete-by-query's People

Contributors

al-serebrov, dependabot[bot]

es-delete-by-query's Issues

Remove connection from function

It's better to pass the Elasticsearch connection in as an argument than to connect inside the function. When the code is used as a library, the caller probably already has a connection.
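A minimal sketch of the idea, assuming a hypothetical function body (the project's actual implementation may differ): the function receives an already-constructed client and only calls methods on it. `FakeClient` is an illustrative stand-in so the example runs without a server; with the real library the caller would pass an `elasticsearch.Elasticsearch` instance instead.

```python
def delete_by_query(es, index, doc_type, query):
    """Delete documents matching `query` using the caller-supplied client `es`.

    Hypothetical sketch: runs one search and deletes each hit by id.
    A real implementation would page through results (e.g. with scroll).
    """
    page = es.search(index=index, doc_type=doc_type, body=query)
    deleted = 0
    for hit in page["hits"]["hits"]:
        es.delete(index=index, doc_type=doc_type, id=hit["_id"])
        deleted += 1
    return deleted


class FakeClient:
    """Illustrative stand-in for an Elasticsearch client; records deletes."""

    def __init__(self, ids):
        self.ids = list(ids)
        self.deleted = []

    def search(self, index, doc_type, body):
        # Mimic the shape of a search response: hits nested under "hits".
        return {"hits": {"hits": [{"_id": i} for i in self.ids]}}

    def delete(self, index, doc_type, id):
        self.deleted.append(id)
```

Because the function never creates its own connection, the same call works with a shared production client or with a test double like the one above.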

Compatibility with Elasticsearch v1.x server

To make this work with Elasticsearch 1.4.2, I had to make a few edits. I'm documenting them here for others, or so the information can be added to the README in a PR.

Thanks for this great project! ❤️

Sniff options not supported

The sniff options defined at cli_delete_by_query.py#L75-L80 aren't supported in Elasticsearch 1.x.

Related issue: elastic/elasticsearch-py#358

Solution: comment out those sniff options.

Error:

Traceback (most recent call last):
  File "cli_delete_by_query.py", line 80, in <module>
    sniffer_timeout=60
  File "/home/deployer/es-delete-by-query/venv/lib/python3.4/site-packages/elasticsearch/client/__init__.py", line 188, in __init__
    self.transport = transport_class(_normalize_hosts(hosts), **kwargs)
  File "/home/deployer/es-delete-by-query/venv/lib/python3.4/site-packages/elasticsearch/transport.py", line 122, in __init__
    self.sniff_hosts(True)
  File "/home/deployer/es-delete-by-query/venv/lib/python3.4/site-packages/elasticsearch/transport.py", line 237, in sniff_hosts
    hosts = list(filter(None, (self._get_host_info(n) for n in node_info)))
  File "/home/deployer/es-delete-by-query/venv/lib/python3.4/site-packages/elasticsearch/transport.py", line 237, in <genexpr>
    hosts = list(filter(None, (self._get_host_info(n) for n in node_info)))
  File "/home/deployer/es-delete-by-query/venv/lib/python3.4/site-packages/elasticsearch/transport.py", line 221, in _get_host_info
    host['port'] = int(host['port'])
ValueError: invalid literal for int() with base 10: '9200]'
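As an alternative to commenting the options out by hand, the sniff settings could be made optional. The sketch below is an assumption about how that might look, not the project's code: `build_client_kwargs` and its `enable_sniffing` flag are hypothetical names, and only `sniffer_timeout=60` is confirmed by the traceback above. `sniff_on_start` and `sniff_on_connection_fail` are standard elasticsearch-py transport options and are assumed to be the other settings involved.

```python
def build_client_kwargs(hosts, enable_sniffing=True):
    """Build keyword arguments for the Elasticsearch client constructor.

    Hypothetical helper: leaves out all sniff options when
    `enable_sniffing` is False, e.g. for an Elasticsearch 1.x server.
    """
    kwargs = {"hosts": hosts}
    if enable_sniffing:
        kwargs.update(
            sniff_on_start=True,            # assumed; triggers sniff_hosts() on init
            sniff_on_connection_fail=True,  # assumed
            sniffer_timeout=60,             # value shown in the traceback
        )
    return kwargs
```

The result would be passed on as `Elasticsearch(**build_client_kwargs(hosts, enable_sniffing=False))` when targeting a 1.x server.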

Incompatible scroll/scan search APIs

Related issue: elastic/elasticsearch-net#326 (comment). That comment clued me in by mentioning the equivalent scroll/scan API call for an Elasticsearch v1.x server.

API docs for reference: https://www.elastic.co/guide/en/elasticsearch/reference/1.4/search-request-scroll.html

Solution: use v1.x of the elasticsearch Python library (pip install elasticsearch==1.9.0). https://pypi.org/project/elasticsearch/1.9.0/ specifies "For Elasticsearch 1.0 and later, use the major version 1 (1.x.y) of the library."

For reference, elasticsearch==6.2.0, which is what the current requirements.txt specifies, worked against an Elasticsearch 2.3.1 server without issues.

Error:

update_validated - ERROR - Elasticsearch error: ElasticsearchIllegalArgumentException[Failed to decode scrollId]; nested: IOException[Bad Base64 input character decimal 123 in array position 0];
Traceback (most recent call last):
  File "cli_delete_by_query.py", line 99, in <module>
    query=delete_query
  File "/home/deployer/es-delete-by-query/delete_by_query.py", line 92, in delete_by_query
    raise ex
  File "/home/deployer/es-delete-by-query/delete_by_query.py", line 68, in delete_by_query
    page = es.scroll(scroll_id=sid, scroll='2m')
  File "/home/deployer/es-delete-by-query/venv/lib/python3.4/site-packages/elasticsearch/client/utils.py", line 76, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/deployer/es-delete-by-query/venv/lib/python3.4/site-packages/elasticsearch/client/__init__.py", line 1011, in scroll
    params=params, body=body)
  File "/home/deployer/es-delete-by-query/venv/lib/python3.4/site-packages/elasticsearch/transport.py", line 314, in perform_request
    status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
  File "/home/deployer/es-delete-by-query/venv/lib/python3.4/site-packages/elasticsearch/connection/http_urllib3.py", line 180, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/deployer/es-delete-by-query/venv/lib/python3.4/site-packages/elasticsearch/connection/base.py", line 125, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
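Character code 123 in the error above is `{`, which suggests (though this is an inference, not something confirmed here) that the 1.x server received a JSON body where it expected a bare scroll id. The sketch below shows one way to derive a pip requirement from the server version, following the PyPI guidance quoted above of matching the library's major version to the server's. `pip_requirement` is a hypothetical helper, and the 6.2.0 client working against a 2.3.1 server shows the rule is guidance rather than a hard constraint.

```python
def pip_requirement(server_version):
    """Return a pip requirement pinning the client to the server's major version.

    Hypothetical helper; `server_version` is a string such as "1.4.2".
    """
    major = int(server_version.split(".")[0])
    return "elasticsearch>={0}.0.0,<{1}.0.0".format(major, major + 1)
```

For the server in this issue, `pip_requirement("1.4.2")` yields `elasticsearch>=1.0.0,<2.0.0`, a range that includes the 1.9.0 release used in the solution.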

Change structure and names

The function name should be delete_by_query or delete_docs_by_query.

Move the function into a separate library file so that it can be used both as a command-line utility (the current code) and as a library called programmatically:

from delete_by_query import delete_by_query
delete_by_query(es_connection, index, doc_type, query)
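The command-line side of that split could then be a thin wrapper around the library call. This is a sketch under assumptions: the flag names (`--index`, `--doc-type`, `--query`) and the `main` signature are hypothetical, and the delete function is passed in as a parameter so the example runs without the real module or a server.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse hypothetical command-line flags for the delete utility."""
    parser = argparse.ArgumentParser(
        description="Delete documents matching a query."
    )
    parser.add_argument("--index", required=True)
    parser.add_argument("--doc-type", required=True)
    # The query is given as a JSON string and decoded into a dict.
    parser.add_argument("--query", required=True, type=json.loads)
    return parser.parse_args(argv)


def main(es_connection, delete_func, argv=None):
    """Parse arguments and delegate to the library function."""
    args = parse_args(argv)
    return delete_func(es_connection, args.index, args.doc_type, args.query)
```

In the real script, `delete_func` would be the `delete_by_query` imported above, and `es_connection` would be created once at startup.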
