jadonn / elasticsearch-file-importer
Python command line script for importing data from CSV files, log files, and JSON files into Elasticsearch.
License: MIT License
As the title says, to make issue #2 easier to work on, I suggest separating the CLI from elasticsearch-file-importer.py.
That is, the following Python code could move to a new Python file called es-importer.py:
import argparse

# process_report, process_log, and process_json are the handlers defined in
# elasticsearch-file-importer.py; they must be imported or defined before use.

if __name__ == '__main__':
    PARSER = argparse.ArgumentParser(
        description='Read data from a variety of file formats and post the data to Elasticsearch'
    )
    SUBPARSERS = PARSER.add_subparsers(
        title="data_type",
        description="Supported data formats",
        help="Choose one of the supported data formats."
    )
    # CSV import
    CSV_PARSER = SUBPARSERS.add_parser("CSV", help="Import a CSV file into Elasticsearch")
    CSV_PARSER.add_argument('csvFile', help='Path to the CSV file to read')
    CSV_PARSER.add_argument('esIndex', help='Name of the Elasticsearch index mapping')
    CSV_PARSER.add_argument('--stopWordsFile', help='Path to a file of stopwords')
    CSV_PARSER.set_defaults(func=process_report)
    # Log import
    LOG_PARSER = SUBPARSERS.add_parser("Logs", help="Import a log into Elasticsearch")
    LOG_PARSER.add_argument("logFile", help="Path to the log file to read")
    LOG_PARSER.add_argument("formatFile", help="Path to file containing log format regex string.")
    LOG_PARSER.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    LOG_PARSER.set_defaults(func=process_log)
    # JSON import
    JSON_PARSER = SUBPARSERS.add_parser("JSON", help="Import a JSON file into Elasticsearch")
    JSON_PARSER.add_argument("jsonFile", help="Path to JSON file to read")
    JSON_PARSER.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    JSON_PARSER.set_defaults(func=process_json)
    # Dispatch to the handler registered by the chosen subcommand
    ARGS = PARSER.parse_args()
    ARGS.func(ARGS)
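One possible way to structure the separated CLI is sketched below. It wraps the parser in a function so it can be exercised without touching the command line. The names `build_parser`, `main`, and the `handlers` mapping are hypothetical and not part of the project; the handlers stand in for `process_report`, `process_log`, and `process_json`. The sketch assumes Python 3, where a missing subcommand must be made an explicit error.

```python
import argparse


def build_parser(handlers):
    """Build the CLI parser. `handlers` maps a subcommand name to a callable
    (hypothetical stand-ins for process_report, process_log, process_json)."""
    parser = argparse.ArgumentParser(
        description="Read data from a variety of file formats and post it to Elasticsearch"
    )
    subparsers = parser.add_subparsers(title="data_type", dest="data_type")
    # Without this, Python 3 accepts an empty command line and args.func is missing.
    subparsers.required = True

    csv_parser = subparsers.add_parser("CSV", help="Import a CSV file into Elasticsearch")
    csv_parser.add_argument("csvFile", help="Path to the CSV file to read")
    csv_parser.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    csv_parser.add_argument("--stopWordsFile", help="Path to a file of stopwords")
    csv_parser.set_defaults(func=handlers["CSV"])

    log_parser = subparsers.add_parser("Logs", help="Import a log into Elasticsearch")
    log_parser.add_argument("logFile", help="Path to the log file to read")
    log_parser.add_argument("formatFile", help="Path to file containing log format regex string")
    log_parser.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    log_parser.set_defaults(func=handlers["Logs"])

    json_parser = subparsers.add_parser("JSON", help="Import a JSON file into Elasticsearch")
    json_parser.add_argument("jsonFile", help="Path to JSON file to read")
    json_parser.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    json_parser.set_defaults(func=handlers["JSON"])
    return parser


def main(argv, handlers):
    """Parse `argv` (a list of strings) and dispatch to the selected handler."""
    args = build_parser(handlers).parse_args(argv)
    return args.func(args)
```

With this shape, a test can call `main(["CSV", "data.csv", "my-index"], handlers)` with stub handlers and inspect what they received, without running a real import.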
To maintain consistent code quality, the project should have a lint configuration using Pylint and follow PEP 8.
The project does not have a requirements.txt file; adding one would make installing the script's dependencies easier.
Search Guard and other security plugins, like Open Distro for Elasticsearch's security plugin, require authentication and HTTPS for connections to Elasticsearch. The script does not pass authentication information or support SSL certificate verification for self-signed SSL certificates, which is not an uncommon setup for Search Guard or other security plugins for Elasticsearch.
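As a rough illustration of what such support could involve, the sketch below uses only the Python 3 standard library to prepare an HTTP Basic authentication header and an SSL context that trusts a self-signed CA certificate. It does not reflect how the script currently talks to Elasticsearch; the function names, the URL, and the credentials are all hypothetical placeholders.

```python
import base64
import ssl
import urllib.request


def build_ssl_context(ca_cert_path=None, verify=True):
    """Return an SSLContext. If `ca_cert_path` is given, certificates signed
    by that CA (e.g. a self-signed setup) are trusted."""
    context = ssl.create_default_context(cafile=ca_cert_path)
    if not verify:
        # Disables verification entirely; only sensible for local testing.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def build_request(url, username, password, body=None):
    """Return a urllib Request carrying HTTP Basic credentials.
    `body` is the raw JSON payload as bytes, or None for a GET."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": "Basic " + token,
        "Content-Type": "application/json",
    }
    method = "POST" if body is not None else "GET"
    return urllib.request.Request(url, data=body, headers=headers, method=method)
```

A request built this way could then be sent with `urllib.request.urlopen(request, context=context)`. In the CLI, the certificate path and credentials would presumably arrive as extra command line arguments.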
As the title says, I think it's time for elasticsearch-file-importer to support Python 3.x.
As the title says, I think this file importer needs tests to verify the functionality of the Python script.
We could also consider a Travis CI build to run the tests automatically on upcoming commits.
There was originally not a requirements.txt file or other files for installing dependencies. There is one now, and the requirements.txt should say how to install the dependencies.