MysqlToElasticsearch

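Imports data from MySQL into the Elasticsearch search engine. It is intended for sharded setups (multiple databases and tables) in which every table has exactly the same schema.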

Usage

All configuration lives in /common/config.py. Once the configuration is filled in, the program is ready to run.

Python requirements

python 2.7.*
MySQLdb
On Windows, MySQLdb has to be installed from a binary package. There is no official 64-bit build (see http://www.codegood.com/archives/129).
elasticsearch
pip install elasticsearch

Configuration example

Databases to import

DATABASES = [{
    "es_colony": ["http://192.168.1.188:9200"],
    "db_host": "192.168.1.201",
    "db_user": "root",
    "db_pass": "root",
    "db_port": 3306,
    "db_name": ["cf"],
    "db_charset": "utf8",
    "index": "test",
    "doc_type": "website",
    "doc_field":['ip', 'port', 'site', 'url', 'banner', 'os', 'server', 'script', 'charset', 'title'],
    # Use streaming database reads to keep memory usage as low as possible
    "sql": "SELECT `IP` AS `ip`,`Port` AS `port`,`URL` AS `site`,`URL` AS `url`,`Banner` AS `banner`,`OS` AS `os`,"
           "`Server` AS `server`,`Script_Type` AS `script`,`Charset` AS `charset`,`Title` AS `title` FROM website"}]
    # ,{
    # "es_colony": ["http://192.168.1.188:9200"],
    # "db_host": "192.168.1.201",
    # "db_user": "root",
    # "db_pass": "root",
    # "db_port": 3306,
    # "db_name": ["cf"],
    # "db_charset": "utf8",
    # "index": "test",
    # "doc_type": "website",
    # "doc_field":['ip', 'port', 'site', 'url', 'banner', 'os', 'server', 'script', 'charset', 'title'],
    # # Use streaming database reads to keep memory usage as low as possible
    # "sql": "SELECT `IP` AS `ip`,`Port` AS `port`,`URL` AS `site`,`URL` AS `url`,`Banner` AS `banner`,`OS` AS `os`,"
    #        "`Server` AS `server`,`Script_Type` AS `script`,`Charset` AS `charset`,`Title` AS `title` FROM website"}] 
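
The column aliases in `sql` match the names in `doc_field` one to one, so each row can be turned into a document by pairing the two. The following is a minimal sketch of that mapping, not the project's actual code: `iter_actions`, `FakeCursor` and `fetch_size` are hypothetical names introduced for illustration. With MySQLdb, a server-side cursor such as `MySQLdb.cursors.SSCursor` is one common way to get the streaming read mentioned in the comment; whether the project uses it is an assumption.

```python
def iter_actions(cursor, index, doc_type, doc_field, fetch_size=1000):
    """Yield one bulk-style action per row, reading rows in small chunks.

    `cursor` is any DB-API cursor that has already executed the `sql` query.
    Only `fetch_size` rows are held in memory at a time.
    """
    while True:
        rows = cursor.fetchmany(fetch_size)
        if not rows:
            break
        for row in rows:
            yield {
                "_index": index,
                "_type": doc_type,
                # Pair each configured field name with the column at the same position.
                "_source": dict(zip(doc_field, row)),
            }


class FakeCursor(object):
    """Stand-in for a database cursor so the sketch runs without MySQL."""

    def __init__(self, rows):
        self._rows = list(rows)

    def fetchmany(self, size):
        chunk, self._rows = self._rows[:size], self._rows[size:]
        return chunk


cursor = FakeCursor([("10.0.0.1", 80), ("10.0.0.2", 443), ("10.0.0.3", 8080)])
actions = list(iter_actions(cursor, "test", "website", ["ip", "port"], fetch_size=2))
print(actions[0]["_source"])
```

Dicts of this shape (`_index`, `_type`, `_source`) are what the `elasticsearch.helpers.bulk` helper accepts in client versions that still support document types.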

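Make sure that es_colony and db_name are lists and that db_port is an int. The remaining connection and index settings are strings (doc_field is also a list, as in the sample above). If the database configuration is wrong, the program reports an error and stops immediately.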
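
The type rules above can be checked before a run starts. This is a hedged sketch rather than part of the project: `EXPECTED_TYPES` and `validate_database` are hypothetical names, and the expected type of each key is read off the sample configuration.

```python
# Expected type per key, taken from the sample configuration (an assumption,
# not something the project itself is known to enforce).
EXPECTED_TYPES = {
    "es_colony": list,
    "db_name": list,
    "doc_field": list,
    "db_port": int,
    "db_host": str,
    "db_user": str,
    "db_pass": str,
    "db_charset": str,
    "index": str,
    "doc_type": str,
    "sql": str,
}


def validate_database(cfg):
    """Return a list of problems found in one DATABASES entry (empty if none)."""
    problems = []
    for key, expected in sorted(EXPECTED_TYPES.items()):
        if key not in cfg:
            problems.append("missing key: %s" % key)
        elif not isinstance(cfg[key], expected):
            problems.append("%s should be %s, got %s"
                            % (key, expected.__name__, type(cfg[key]).__name__))
    return problems


sample = {
    "es_colony": ["http://192.168.1.188:9200"],
    "db_host": "192.168.1.201",
    "db_user": "root",
    "db_pass": "root",
    "db_port": 3306,
    "db_name": ["cf"],
    "db_charset": "utf8",
    "index": "test",
    "doc_type": "website",
    "doc_field": ["ip", "port"],
    "sql": "SELECT `IP` AS `ip`,`Port` AS `port` FROM website",
}

print(validate_database(sample))                        # no problems
print(validate_database(dict(sample, db_port="3306")))  # port given as a string
```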

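Notes

Adjust the bulk upload limit and the queue size in common.py to suit your server's resources. On a lower-performance server, make both parameters smaller, for example 1000|20000.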
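
To show how the two parameters interact, here is a minimal producer/consumer sketch. It is not the project's implementation: `BULK_LIMIT`, `QUEUE_SIZE`, `run` and `send_bulk` are hypothetical names, and reading the example values as 1000 for the bulk limit and 20000 for the queue size is an assumption. The queue size caps how many rows wait in memory; the bulk limit caps how many documents go into one upload.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7

BULK_LIMIT = 1000    # assumed: maximum documents per bulk upload
QUEUE_SIZE = 20000   # assumed: maximum documents buffered between reader and uploader
_DONE = object()     # sentinel that tells the consumer the producer has finished


def producer(q, docs):
    for doc in docs:
        q.put(doc)   # blocks while the queue is full, which bounds memory use
    q.put(_DONE)


def consumer(q, send_bulk, bulk_limit):
    batch = []
    while True:
        item = q.get()
        if item is _DONE:
            break
        batch.append(item)
        if len(batch) >= bulk_limit:
            send_bulk(batch)
            batch = []
    if batch:
        send_bulk(batch)   # flush the final partial batch


def run(docs, send_bulk, bulk_limit=BULK_LIMIT, queue_size=QUEUE_SIZE):
    q = queue.Queue(maxsize=queue_size)
    reader = threading.Thread(target=producer, args=(q, docs))
    reader.start()
    consumer(q, send_bulk, bulk_limit)
    reader.join()


sizes = []
run(({"n": i} for i in range(2500)), lambda batch: sizes.append(len(batch)))
print(sizes)  # [1000, 1000, 500]
```

In a real run, `send_bulk` would hand each batch to Elasticsearch. Here it only records batch sizes so the sketch runs without a server.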

Contributors

lxiaogirl, lxsec
