Code Monkey home page Code Monkey logo

subscrape's Introduction

subscrape

This is a growing library of Python facilities to scrape Substrate and Moonbeam-based chains. Substrate chains are scraped using Subscan while EVM-based chains use Moonscan.io and Blockscout.io.

Roadmap: https://app.subsocial.network/@alice_und_bob/subscrape-releasing-v1-0-and-roadmap-for-v2-0-32075

The basic workflow of bin/scrape.py considers the configuration presented in data/scrape_config.json to traverse through the given chains and perform the operations for each chain. For Substrate, scraping extrinsics and transfers is supported.

Data is stored locally using SubscanDB and can be used to use the data and transform it. The application works in a way that subsequent runs will only fetch deltas.

bin/transfers_all_chains.py takes addresses and chains from data/transfers_config.json to create a CSV of all transfers associated with that accounts.

Links

Documentation

Limitations

Error handling is not very sophisticated, so if the scrape is interrupted by an uncaught exception, the delta might be incomplete and subsequent runs might miss some data. To remedy the issue, the delta must be deleted and the scraper run again.

Usage

Installation

virtualenv venv venv\scripts\activate pip install -Ur .\PipRequirements.txt

API Keys

If you have a Subscan API key, drop it in a file named config/subscan-key. If you have a Moonscan.io API key, note that they are network-specific, so place it either in a file named config/moonscan-moonriver-key or config/moonscan-moonbeam-key. Blockscout does not need an API key.

Example applications

Here are several specific examples of how the subscrape library has been used in the past. There are several example application scripts in the /bin/ folder.

  • Prepare your taxable transactions - Use scrape.py and specify a list of accounts in your scrape_config.json (using operation account_transactions) to extract all transactions for an account. Multiple accounts and chains can be specified in the config file. Results are currently dumped to a JSON file in the data folder but later we can export this to a CSV file so you can import it into your favorite tax software which would match up transactions to calculate your capital gains.
  • Determine all accounts that have used a DEX - Use scrape.py and operation transactions in your scrape_config.json to extract a list of all addresses which have called a particular smart contract operation, such as making a trade on solarbeam.io. (In sample_scrape_config.json, see the Moonriver config to extract all transactions with contract 0xaa30ef758139ae4a7f798112902bf6d65612045f's method 0xe8e33700.)
  • Extract a list of all RMRK Skybreach land deeds which have been burned, and what Moonriver address they were redeemed to - See sample_rmrk_burns.py for an example script that looks for system remark calls which append a memo with an EVM (0x) address.
  • Extract all transfers across chains for an address - copy config/sample_transfers_config.json to config/transfers_config.json, customize as needed, and then run bin/transfers_all_chains.py to extract a list of all transfers for an address across all chains, which will be dumped into data/transfers.csv.

Using scrape.py as an application

  • copy config/sample_scrape_config.json to config/scrape_config.json
  • configure to your desire. See configuration
  • run scrape.py
  • corresponding files will be created in data/

Consuming scrape.py as a helper

  • from scrape import scrape
  • create a dict analogouos to config/sample_scrape_config.json
  • call it inline via scrape(config)

subscrape's People

Contributors

tomen avatar spazcoin avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.