SberRight

A SberMegaMarket parser that is controlled through a Telegram bot and can store the collected data in PostgreSQL.

Introduction

A parser controlled through a Telegram bot. It makes it easy to collect data and handle captchas from the Telegram app by sending commands to the bot.

Prerequisites

    python = "^3.11.0"
    poetry = "^1.2.0"

Getting Started

To deploy this bot on render.com or a similar platform, set the PYTHON_VERSION and POETRY_VERSION environment variables, or install Poetry yourself. Support for requirements.txt as an alternative will be added in the future.

Installation

  1. Clone the repository:

    git clone https://github.com/malvere/SberRight.git
    cd SberRight
  2. Install dependencies

    poetry install
  3. Run bot

    python main.py

Setting Up Environment Variables

Set BOT_TOKEN to the token obtained from @BotFather in Telegram. Optionally set DB_URL to store data in PostgreSQL; otherwise, a .csv file will be generated.
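The fallback described above can be sketched as a small settings loader. This is an illustrative sketch only, not the project's actual code: the `Settings` class and `load_settings` function are hypothetical names, and only the BOT_TOKEN and DB_URL variable names come from this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two variables named in this README."""

    bot_token: str
    db_url: Optional[str]  # None means "fall back to a .csv file"

    @property
    def storage(self) -> str:
        # PostgreSQL is used only when DB_URL is provided.
        return "postgresql" if self.db_url else "csv"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read BOT_TOKEN (required) and DB_URL (optional) from the environment."""
    token = env.get("BOT_TOKEN", "").strip()
    if not token:
        raise RuntimeError("BOT_TOKEN is not set; obtain one from @BotFather")
    # Treat an empty DB_URL the same as a missing one.
    db_url = env.get("DB_URL", "").strip() or None
    return Settings(bot_token=token, db_url=db_url)
```

Calling `load_settings()` with only BOT_TOKEN set yields `storage == "csv"`; adding a DB_URL switches it to `"postgresql"`. Failing early on a missing token keeps the error close to its cause.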

Usage

  1. Send /init to the bot. This command launches a Playwright instance and starts the scraping process.

  2. If a captcha is found, the bot sends you a screenshot of it. Solve it and send the answer back to the bot with the /captcha <decyphered_text> command.

  3. If the captcha is entered successfully, the bot triggers a Golang script that parses the pages' HTML content. The Golang script source can be found here.
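The command flow above can be sketched as a small dispatcher. This is a minimal sketch under stated assumptions, not the bot's real handlers: `parse_command`, `handle_message`, and the returned action names are hypothetical, and no Telegram library is used so the example stays self-contained.

```python
from typing import Optional, Tuple


def parse_command(text: str) -> Tuple[Optional[str], str]:
    """Split a message into (command, argument); command is None for plain text."""
    text = text.strip()
    if not text.startswith("/"):
        return None, text
    head, _, arg = text.partition(" ")
    # Telegram may append "@botname" to commands in group chats.
    command = head[1:].split("@", 1)[0].lower()
    return command, arg.strip()


def handle_message(text: str, waiting_for_captcha: bool) -> str:
    """Return a hypothetical action name for the incoming message."""
    command, arg = parse_command(text)
    if command == "init":
        return "start_scrape"  # step 1: launch the browser and start scraping
    if command == "captcha":
        if not waiting_for_captcha:
            return "no_captcha_pending"
        if not arg:
            return "captcha_text_missing"
        return "submit_captcha"  # step 2: pass the solved text to the page
    return "ignore"
```

For example, `handle_message("/captcha x7k", True)` returns `"submit_captcha"`, while the same message with no pending captcha returns `"no_captcha_pending"`. Keeping parsing separate from dispatch makes both parts easy to test without a live bot.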

Contributing

Feel free to contribute to this project.

License

MIT

Happy scraping!
