Upwork Scraper

Upwork Scraper is an Apify actor for extracting data from Upwork. It lets you extract information about freelancers and agencies without logging in. It is built on top of the Apify SDK, and you can run it both on the Apify platform and locally.

Input

| Field | Type | Description | Default value |
| --- | --- | --- | --- |
| startUrls | array | List of Request objects that will be deeply crawled. | |
| useBuiltInSearch | boolean | When set to true (checked), startUrls will be ignored and the actor will perform a search based on the fields below. | false |
| search | string | Keyword that will be used in Upwork's search engine. | |
| category | string | You can provide a category_uid to filter your search. | |
| englishLevel | string | One of the following options, used to filter by English level. "0" -> Any level; "1" -> Basic; "2" -> Conversational; "3" -> Fluent; "4" -> Native or bilingual | "0" |
| hourlyRate | string | One of the following options, used to filter by hourly rate. "" -> Any; "0-10" -> between $0 and $10; "10-30" -> between $10 and $30; "30-60" -> between $30 and $60; "60" -> above $60 | "" |
| maxItems | number | How many search results should be saved. | 100 |
| extendOutputFunction | string | A JavaScript function passed as plain text that can return custom information. More on this under Extend output function. | |
| proxy | object | Proxy configuration of the run. | {"useApifyProxy": true } |
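As a rough illustration of how the fields above fit together, the sketch below assembles an input object for a built-in search run. The helper name `buildSearchInput` and the search keyword are hypothetical and not part of the actor; only the field names, allowed filter values, and defaults come from the input table.

```javascript
// Allowed filter values, copied from the input table.
const ENGLISH_LEVELS = ['0', '1', '2', '3', '4'];
const HOURLY_RATES = ['', '0-10', '10-30', '30-60', '60'];

// Hypothetical helper (not part of the actor): builds an input object for a
// built-in search and rejects filter values that the table does not list.
function buildSearchInput({ search, category, englishLevel = '0', hourlyRate = '', maxItems = 100 }) {
    if (!ENGLISH_LEVELS.includes(englishLevel)) {
        throw new Error(`Unsupported englishLevel: ${englishLevel}`);
    }
    if (!HOURLY_RATES.includes(hourlyRate)) {
        throw new Error(`Unsupported hourlyRate: ${hourlyRate}`);
    }
    const input = {
        useBuiltInSearch: true, // startUrls are ignored in this mode
        search,
        englishLevel,
        hourlyRate,
        maxItems,
        proxy: { useApifyProxy: true },
    };
    if (category) input.category = category; // optional category_uid
    return input;
}

// Example: fluent English, $10 to $30 per hour, at most 50 results.
const input = buildSearchInput({
    search: 'technical writer',
    englishLevel: '3',
    hourlyRate: '10-30',
    maxItems: 50,
});
console.log(JSON.stringify(input, null, 2));
```

Note that the filter values are strings, not numbers, so `englishLevel: 3` would be rejected by this check.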

Supported startUrls

Output

Output is stored in a dataset.

{
  "name": "John Doe",
  "location": {
    "country": "United States",
    "city": "Goldfield",
    "state": "IA",
    "countryTimezone": "America/Chicago",
    "worldRegion": "Goldfield, United States (America/C)",
    "timezoneOffset": -18000,
    "countryCodeIso2": "US",
    "countryCodeIso3": "USA",
    "countryCode": "USA"
  },
  "title": "Fast, Friendly, Reliable!",
  "description": "I believe highly in perfection in my work.  I have written short articles, reviews, as well as blog posts for different companies using WordPress and have done website testing as well. I am a gifted technical writer and article spinner.  I have also been a ghostwriter for multiple clients on a variety of both fiction and non-fiction writing.  I also do data entry on a daily basis into Excel books and am responsible for payroll at my full time job.  I have excellent communication skills and work as an administrative assistant on a full time basis.  I understand the need for quality work and communication to get the job done right!",
  "jobSuccess": 0,
  "hourlyRate": {
    "currencyCode": "USD",
    "amount": 5
  },
  "earned": 477.48,
  "numberOfJobs": 10,
  "hoursWorked": 0.5,
  "profileUrl": "https://www.upwork.com/o/profiles/users/~XXXXX/"
}

Compute units consumption

Estimated ~0.06 CU per 100 requests
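
For budgeting, the estimate above can be scaled linearly. The snippet below is only a back-of-the-envelope sketch: the function name is hypothetical, and it assumes consumption grows proportionally with the number of requests, which real runs may not follow exactly.

```javascript
// Rate taken from the estimate above: ~0.06 CU per 100 requests.
const CU_PER_100_REQUESTS = 0.06;

// Hypothetical helper: linear extrapolation of compute unit usage.
function estimateComputeUnits(requestCount) {
    return (requestCount * CU_PER_100_REQUESTS) / 100;
}

console.log(estimateComputeUnits(1000).toFixed(2)); // "0.60"
```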

Extend output function

You can use this function to update the result output of this actor. You can choose what data from the page you want to scrape. The output of this function will be merged with the result output.

The return value of this function has to be an object!

You can return fields to achieve three different things:

  • Add a new field - Return object with a field that is not in the result output
  • Change a field - Return an existing field with a new value
  • Remove a field - Return an existing field with the value undefined

async () => {
    return {
        pageTitle: document.querySelector('title').innerText,
    };
}

This example will add the title of the page to the final object:

{
  "name": "John Doe",
  "location": {
    "country": "United States",
    "city": "Goldfield",
    "state": "IA",
    "countryTimezone": "America/Chicago",
    "worldRegion": "Goldfield, United States (America/C)",
    "timezoneOffset": -18000,
    "countryCodeIso2": "US",
    "countryCodeIso3": "USA",
    "countryCode": "USA"
  },
  "title": "Fast, Friendly, Reliable!",
  "description": "I believe highly in perfection in my work.  I have written short articles, reviews, as well as blog posts for different companies using WordPress and have done website testing as well. I am a gifted technical writer and article spinner.  I have also been a ghostwriter for multiple clients on a variety of both fiction and non-fiction writing.  I also do data entry on a daily basis into Excel books and am responsible for payroll at my full time job.  I have excellent communication skills and work as an administrative assistant on a full time basis.  I understand the need for quality work and communication to get the job done right!",
  "jobSuccess": 0,
  "hourlyRate": {
    "currencyCode": "USD",
    "amount": 5
  },
  "earned": 477.48,
  "numberOfJobs": 10,
  "hoursWorked": 0.5,
  "profileUrl": "https://www.upwork.com/o/profiles/users/~XXXXX/",
  "pageTitle": "John Doe - Fast, Friendly, Reliable! - Upwork"
}
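
The add, change, and remove rules described in this section can be pictured as a shallow merge. The sketch below is an assumption about how such a merge might behave, not the actor's actual implementation; `mergeOutput` is a hypothetical name and the values passed in are illustrative only.

```javascript
// Hypothetical merge helper; the actor's real implementation may differ.
function mergeOutput(result, extension) {
    // The extend output function has to return an object.
    if (extension === null || typeof extension !== 'object' || Array.isArray(extension)) {
        throw new Error('Extend output function must return an object');
    }
    // Fields from the extension override fields from the scraped result.
    const merged = { ...result, ...extension };
    // A field explicitly set to undefined is dropped from the output.
    for (const [key, value] of Object.entries(merged)) {
        if (value === undefined) delete merged[key];
    }
    return merged;
}

const result = { name: 'John Doe', jobSuccess: 0, earned: 477.48 };
const merged = mergeOutput(result, {
    pageTitle: 'John Doe - Fast, Friendly, Reliable! - Upwork', // add a new field
    jobSuccess: 100, // change an existing field (illustrative value)
    earned: undefined, // remove an existing field
});
console.log(merged);
```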

Upwork stores the full data of the freelancer in a JavaScript variable. You can access that variable to get more information from the freelancer profile. The code below adds the full profile variable to the output. You can use the extend output function to extract only the data that you need.

async () => {
    return { profileResponse: window.PROFILE_RESPONSE }
}
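
To return only part of that variable rather than the whole object, something along these lines might work. This is a sketch under assumptions: the structure of window.PROFILE_RESPONSE is not documented here, so the paths `profile.skills` and `profile.languages` are hypothetical placeholders that you would replace after inspecting the variable on a real profile page.

```javascript
// Sketch of an extend output function that returns only selected fields.
// The paths used below are hypothetical, not a documented Upwork structure.
const extendOutputFunction = async () => {
    // Read a nested value; yields undefined if any step along the path is missing.
    const get = (obj, path) => path
        .split('.')
        .reduce((acc, key) => (acc == null ? undefined : acc[key]), obj);

    const profile = window.PROFILE_RESPONSE;
    return {
        skills: get(profile, 'profile.skills'),
        languages: get(profile, 'profile.languages'),
    };
};
```

Only the function itself (everything after the `=`) would be pasted into the extendOutputFunction input field; the `const` assignment is here just so the snippet stands on its own. The helper is defined inside the function because the function is passed as plain text and cannot rely on outside code.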

Contributors

gustavotr
