
👷 WIP Cloudflare Worker Scraper

Deploy to Cloudflare Workers

index.js contains the main Cloudflare Workers script.
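
For orientation, here is a minimal sketch of the general shape such a Workers entry point can take, wired to the mode/site/url fields described in the testing section below. It is only an illustration with assumed field names and placeholder logic, not the actual contents of index.js.

// Illustrative sketch only; the real index.js may be structured differently.
addEventListener('fetch', (event) => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  // Expect a JSON body such as { "mode": "parse", "url": "https://example.com" }.
  const { mode, site, url } = await request.json();

  if (mode === 'parse') {
    // Parse mode: fetch the given URL and return its plain HTML.
    const page = await fetch(url);
    return new Response(await page.text(), {
      headers: { 'Content-Type': 'text/html' },
    });
  }

  // Scrape mode: apply the preset selectors for the requested site
  // (e.g. eBay or Amazon search results) and return the extracted data as JSON.
  const items = []; // placeholder for the preset-specific extraction
  return new Response(JSON.stringify({ site, items }), {
    headers: { 'Content-Type': 'application/json' },
  });
}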

Running the Cloudflare Workers script

To run the Cloudflare Workers script, you first need to create a Cloudflare account with Workers enabled.

You will also need the Workers paid plan, which is about $5 a month; it unlocks the extra CPU time needed for scraping.

After getting a paid plan, you will need a CLI tool to deploy your Cloudflare Workers script. Here we use wrangler, which lets us generate, configure, build, preview, and publish Workers scripts. You can install wrangler with npm or yarn:

npm install -g @cloudflare/wrangler
yarn global add @cloudflare/wrangler

To verify the installation, run wrangler --version.

Next, log in to your Cloudflare account through wrangler so it can obtain an API token to manage your Cloudflare Worker. Run wrangler login and you should be given the option to log in via your browser; after logging in, you will be asked to authorize the API token for wrangler.
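
As with the install step above, this is a single terminal command:

wrangler login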

Now you can git clone my repo and run wrangler dev to start the script in development mode, which will give you logs in your terminal:

wrangler dev

There are more wrangler commands and features, which you can read about below:

- Wrangler GitHub repo

- Further wrangler documentation with examples

Testing the Cloudflare Workers script

To test the Cloudflare Workers script, I suggest using something like Postman, an easy-to-use API development tool that lets you send requests for testing purposes.

We will start by creating a new request by clicking the + symbol.

Then set up the request by entering the URL http://127.0.0.1:8787 (the localhost address that wrangler dev listens on). Also set the header Content-Type: application/json so that the script can process the values sent in the JSON body.

Then, in the body section, the JSON fields control what kind of request you make (see the example request after this list):

- Mode (scrape uses our presets for supported sites; parse just grabs the site and outputs its plain HTML)

- Site (ebay scrapes eBay search titles/prices/item links; ebay_extend does the same but also gets each item's condition, seller name, and seller profile link; amazon scrapes Amazon search titles/prices)

- Url (you can put in any URL and its plain HTML will be returned)
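
If you prefer testing from code instead of Postman, a request like the following can be sent with fetch from the browser console or Node 18+. The lowercase field names, the testWorker helper name, and the example values are assumptions for illustration; adjust them to whatever the script actually expects.

async function testWorker() {
  const response = await fetch('http://127.0.0.1:8787', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      mode: 'parse',              // or 'scrape' to use the site presets
      // site: 'ebay',            // with mode 'scrape': 'ebay', 'ebay_extend', or 'amazon'
      url: 'https://example.com', // in parse mode, any URL; its plain HTML is returned
    }),
  });
  console.log(await response.text());
}

testWorker();

In parse mode the response should be the page's plain HTML; in scrape mode it should be whatever the site preset extracts.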
