selenium-twitter-scraper

Setup

  1. Install dependencies
pip install -r requirements.txt

Authentication Options

Using Environment Variable

  1. Rename .env.example to .env.

  2. Open .env and set the environment variables:

TWITTER_USERNAME=# Your Twitter Handle (e.g. @username)
TWITTER_PASSWORD=# Your Twitter Password

Authentication in Terminal

  • Pass a username and password on the command line.
python scraper --user=@elonmusk --password=password123

No Authentication Provided

  • If no username and password are specified, the program prompts for them.
Twitter Username: @username
Password: password123

Authentication Sequence Priority

1. Authentication provided in the terminal.
2. Authentication provided in environment variables.
3. Interactive prompt, if neither of the above is provided.
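
The priority order above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: the function name `resolve_credentials` and its `env` and `ask` parameters are hypothetical, and resolving each field separately is an assumption.

```python
import getpass
import os


def resolve_credentials(cli_user=None, cli_password=None, env=None, ask=None):
    """Return (username, password) following the documented priority order."""
    env = os.environ if env is None else env

    # 1. Values passed on the command line win.
    # 2. Otherwise fall back to the variables defined in .env.
    user = cli_user or env.get("TWITTER_USERNAME")
    password = cli_password or env.get("TWITTER_PASSWORD")

    # 3. Prompt for anything that is still missing.
    #    `ask` is a hypothetical hook so the prompt can be replaced in tests.
    if not user:
        user = (ask or input)("Twitter Username: ")
    if not password:
        password = (ask or getpass.getpass)("Password: ")
    return user, password
```

With both sources present, `resolve_credentials("@cli", "secret")` returns the command-line values and never reads the environment values or prompts.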

Usage

  • Show Help
python scraper --help
  • Basic usage
python scraper
  • Set the maximum number of tweets (defaults to 50).
python scraper --tweets=500   # Scrape 500 Tweets
  • Options and Arguments
usage: python scraper [option] ... [arg] ...

authentication options  description
--user                  : Your Twitter account handle.
                          e.g.
                          --user=@username

--password              : Your Twitter account password.
                          e.g.
                          --password=password123

options:                description
-t, --tweets            : Number of tweets to scrape (default: 50).
                          e.g.
                            -t 500
                            --tweets=500

-u, --username          : Twitter username.
                          Scrape tweets from a user's profile.
                          e.g.
                            -u elonmusk
                            --username=@elonmusk

-ht, --hashtag          : Twitter hashtag.
                          Scrape tweets from a hashtag.
                          e.g.
                            -ht javascript
                            --hashtag=javascript

-q, --query             : Twitter query or search.
                          Scrape tweets from a query or search.
                          e.g.
                            -q "Philippine Marites"
                            --query="Jak Roberto anti selos"

-a, --add               : Additional data to scrape and
                          save in the .csv file.

                          values:
                          pd - tweet poster's id, followers, and following count

                          e.g.
                            -a "pd"
                            --add="pd"

                          NOTE: Values must be separated by commas.

--latest                : Scrape the latest tweets (default: True).
                          Note: Only for hashtag-based
                          and query-based scraping.
                          usage:
                            python scraper -t 500 -ht=python --latest

--top                   : Scrape the top tweets (default: False).
                          Note: Only for hashtag-based
                          and query-based scraping.
                          usage:
                            python scraper -t 500 -ht=python --top
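
For readers who want to see how such an option table maps onto code, here is a rough argparse sketch of the flags listed above. It is an assumption about how a parser like this could be written, not the scraper's real implementation; `build_parser` and `parse` are hypothetical names, and letting `--top` switch off the default `--latest` behaviour is a guess.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="scraper")
    # Authentication options
    parser.add_argument("--user", help="Twitter account handle used to log in")
    parser.add_argument("--password", help="Twitter account password")
    # Scraping options
    parser.add_argument("-t", "--tweets", type=int, default=50,
                        help="number of tweets to scrape")
    parser.add_argument("-u", "--username", help="scrape a user's profile")
    parser.add_argument("-ht", "--hashtag", help="scrape a hashtag")
    parser.add_argument("-q", "--query", help="scrape a query or search")
    parser.add_argument("-a", "--add", default="",
                        help="comma-separated additional data, e.g. pd")
    parser.add_argument("--latest", action="store_true")
    parser.add_argument("--top", action="store_true")
    return parser


def parse(argv):
    args = build_parser().parse_args(argv)
    # Assumption: "latest" is the default ordering unless --top is given.
    args.latest = not args.top
    return args


args = parse(["-t", "500", "-ht", "python", "--top"])
print(args.tweets, args.hashtag, args.top, args.latest)
```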

Sample Scraping Commands

  • Custom Limit Scraping
python scraper -t 500
  • User Profile Scraping
python scraper -t 100 -u elonmusk
  • Hashtag Scraping

    • Latest

      python scraper -t 100 -ht python --latest
    • Top

      python scraper -t 100 -ht python --top
  • Query or Search Scraping (also works with Twitter's advanced search)

    • Latest

      python scraper -t 100 -q "Jak Roberto Anti Selos" --latest
    • Top

      python scraper -t 100 -q "International News" --top
  • Advanced Search Scraping

    • For tweets mentioning @elonmusk:

      python scraper --query="(@elonmusk)"
    • For tweets that mention @elonmusk with at least 1000 replies, from January 1, 2020 to August 31, 2023:

      python scraper --query="(@elonmusk) min_replies:1000 until:2023-08-31 since:2020-01-01"
    • For more advanced searches, set up the query in Twitter's Advanced Search and copy the resulting query string into the --query option.
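
If you build such queries often, a small helper can assemble the string for you. The sketch below only covers the operators shown in the examples above (`min_replies`, `until`, `since`); `build_query` is a hypothetical helper and is not part of the scraper.

```python
from datetime import date


def build_query(mention=None, min_replies=None, since=None, until=None):
    """Assemble an advanced-search string from the operators used above."""
    parts = []
    if mention:
        # Accept the handle with or without a leading "@".
        parts.append(f"(@{mention.lstrip('@')})")
    if min_replies is not None:
        parts.append(f"min_replies:{min_replies}")
    if until:
        parts.append(f"until:{until.isoformat()}")
    if since:
        parts.append(f"since:{since.isoformat()}")
    return " ".join(parts)


query = build_query("elonmusk", 1000, date(2020, 1, 1), date(2023, 8, 31))
print(f'python scraper --query="{query}"')
```

The printed command matches the second example above, so the result can be pasted straight into a terminal.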

  • Scrape Additional Data

python scraper --add="pd"
Value   Description
pd      Tweet poster's id, followers, and following count.
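
Since `--add` takes comma-separated values, a validation step along these lines may be useful. This is a sketch under the assumption that `pd` is the only accepted value, as documented here; `parse_add` and `KNOWN_VALUES` are hypothetical names.

```python
KNOWN_VALUES = {"pd"}  # the only value documented in this README


def parse_add(raw):
    """Split a comma-separated --add string and reject unknown values."""
    values = [v.strip() for v in raw.split(",") if v.strip()]
    unknown = [v for v in values if v not in KNOWN_VALUES]
    if unknown:
        raise ValueError(f"unknown --add value(s): {', '.join(unknown)}")
    return values


print(parse_add("pd"))
```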
