Harvester

Puppeteer-based tool for collecting different types of data:

  • links
  • screenshots
  • code snippets

Describe tasks in config files, and the tool lets you run them from a web interface:

Harvester web interface

The results are printed on the page:

Harvester web interface

Available tasks

Links

Useful for large, old sites with an unclear structure. You may find something unexpected.

Config example: tasks/urls.example.js
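
The real config format is defined in tasks/urls.example.js and is not reproduced here. As a rough illustration of what collecting links involves, the sketch below resolves raw href values against the page URL, keeps only links on the same origin, and removes duplicates. The function name collectInternalLinks and its parameters are assumptions made for this example, not part of Harvester.

```javascript
// Hypothetical helper, not Harvester's actual code.
// Resolves raw href values against the page URL, keeps only links on the
// same origin, drops fragments, and removes duplicates.
function collectInternalLinks(pageUrl, hrefs) {
  const base = new URL(pageUrl);
  const seen = new Set();

  for (const href of hrefs) {
    let url;
    try {
      url = new URL(href, base);
    } catch {
      continue; // skip values that cannot be parsed as a URL
    }
    if (url.origin !== base.origin) continue; // external link or non-http scheme
    url.hash = ''; // "#section" variants point at the same page
    seen.add(url.href);
  }

  return [...seen].sort();
}

const links = collectInternalLinks('https://example.com/blog/', [
  '/about',
  'post-1#comments',
  'post-1',
  'https://other.example.org/',
  'mailto:me@example.com',
]);
console.log(links);
```

Only the two pages on example.com survive here: the external link and the mailto: address have a different origin, and the two post-1 variants collapse into one entry once the fragment is dropped.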

Screenshots

The tool can take screenshots of given pages at given dimensions and with device emulation. You can run the task twice to compare the result with the previous one.

Config example: tasks/screens.example.js
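
For orientation, here is a minimal sketch of how screenshots at several dimensions could be taken with Puppeteer. It is not Harvester's implementation or config format: planScreenshots, captureScreenshots, the file naming scheme, and the viewport list are all hypothetical. The `page` argument is assumed to be a Puppeteer Page created by the caller; setViewport, goto, and screenshot are the Puppeteer Page methods used, and the `isMobile` viewport flag stands in for device emulation.

```javascript
// Hypothetical sketch, not Harvester's actual code or config format.

// Builds one job per (url, viewport) pair with a predictable file name,
// so a second run can be compared against the first.
function planScreenshots(urls, viewports) {
  const jobs = [];
  for (const url of urls) {
    const { hostname, pathname } = new URL(url);
    // Turn "example.com/docs/intro" into "example.com_docs_intro"
    const slug = (hostname + pathname)
      .replace(/[^a-z0-9.]+/gi, '_')
      .replace(/_+$/, '');
    for (const viewport of viewports) {
      jobs.push({
        url,
        viewport,
        path: `${slug}-${viewport.width}x${viewport.height}.png`,
      });
    }
  }
  return jobs;
}

// `page` is assumed to be a Puppeteer Page created by the caller.
async function captureScreenshots(page, jobs) {
  for (const job of jobs) {
    await page.setViewport(job.viewport);
    await page.goto(job.url, { waitUntil: 'networkidle2' });
    await page.screenshot({ path: job.path, fullPage: true });
  }
}

const jobs = planScreenshots(
  ['https://example.com/docs/intro'],
  [
    { width: 1280, height: 800 },
    { width: 375, height: 667, isMobile: true },
  ],
);
console.log(jobs.map((job) => job.path));
```

Keeping the planning step separate from the browser calls makes the file names deterministic, which is what allows two runs to be matched up and compared.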

Snippets

Useful if you need to download all your demos from an external service.

Config example: tasks/snippets.example.js

Usage

  1. Clone:

git clone git@github.com:yoksel/harvester.git --depth 1 && cd harvester

  2. Run npm i

  3. Rename credits-example.js to credits.js and fill it in with real logins and passwords. This lets the tool log in and visit a site as a logged-in user.

  4. Take the example file you need from tasks, rename it without "example" (screens.example.js -> screens.js), and fill it in with real data.

  5. Run npm start and open localhost:3007

You'll see a page which lets you start and stop tasks, view the collected data, and download it as an archive.

credits.js and the task files are listed in .gitignore and will not be committed. Don't push your passwords to a public repository.
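
To illustrate the idea behind credits.js, the sketch below shows one possible way login data could be stored per site and looked up before a page is visited. The object shape, the field names login and password, and the helper findCredentials are assumptions made for this example; the real structure is whatever credits-example.js defines.

```javascript
// Hypothetical shape; the real field names are defined in credits-example.js.
const credits = {
  'example.com': { login: 'demo-user', password: 'demo-password' },
};

// Returns the credentials registered for the URL's host, or null if the
// site has no entry and should be visited anonymously.
function findCredentials(allCredits, pageUrl) {
  const { hostname } = new URL(pageUrl);
  return Object.prototype.hasOwnProperty.call(allCredits, hostname)
    ? allCredits[hostname]
    : null;
}

console.log(findCredentials(credits, 'https://example.com/account') !== null);
console.log(findCredentials(credits, 'https://unknown.example.org/') !== null);
```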

Previews

Collected links

Links task result

Collected links with screenshots

Links task result with screens

Full view of the screenshot

Full view

Full view of the screenshot with diff

Full view with diff


The tool is in development. If you find a bug, please file an issue.
