This project forked from harismuneer/ultimate-social-scrapers

🤖 A bot which scrapes almost everything about a Facebook user's profile including all public posts/statuses available on the user's timeline, uploaded photos, tagged photos, videos, friends list and their profile photos (including Followers, Following, Work Friends, College Friends etc).

License: MIT License


Ultimate Facebook Scraper (UFS)

Tooling that automates your social media interactions to collect posts, photos, videos, friends, followers and much more on Facebook.


Contributors

Developers from the following organizations have so far joined the quest and contributed to UFS.

Microsoft MIT Harvard NUCES
UCLA ACM LUMS


Features

A bot which scrapes almost everything about a user's Facebook profile including:

  • uploaded photos
  • tagged photos
  • videos
  • friends list and their profile photos (including Followers, Following, Work Friends, College Friends etc)
  • and all public posts/statuses available on the user's timeline.

Data is scraped in an organized format to be used for educational/research purposes by researchers. This scraper does not use Facebook's Graph API, so the Graph API's rate limits do not apply.

This tool is being used by thousands of developers weekly and we are pretty amazed at this response! Thank you guys! 🎉

For citing/referencing this tool for your research, check the 'Citation' section below.

Note

This tool uses xpaths of 'divs' to extract data. Since Facebook updates its site frequently, the 'divs' get changed. Consequently, we have to update the divs accordingly to correctly scrape data.

The developers of this tool have devoted time and effort to developing and maintaining it for a long time. In order to keep this amazing tool alive, we need support from you geeks.

If you find that data is not being scraped from profiles, Facebook has most likely updated its site. The code is intuitive and easy to understand, so you can update the relevant xpaths in the code and open a pull request. Much appreciated!
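
To illustrate why a site update breaks extraction, here is a minimal sketch that uses Python's standard-library ElementTree rather than the tool's own browser-driven code. The markup, the class names `post` and `story`, and the helper `extract_posts` are hypothetical stand-ins, not Facebook's real structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for a profile page.
OLD_PAGE = """
<html><body>
  <div class="timeline">
    <div class="post">First status</div>
    <div class="post">Second status</div>
  </div>
</body></html>
""".strip()

# The same page after a hypothetical redesign renames the class.
NEW_PAGE = OLD_PAGE.replace('class="post"', 'class="story"')

# XPath written against the old markup (assumed for illustration).
POST_XPATH = ".//div[@class='post']"


def extract_posts(page, xpath):
    """Return the text of every element matched by the XPath expression."""
    root = ET.fromstring(page)
    return [element.text for element in root.findall(xpath)]


# The old XPath works on the old page but silently matches nothing
# on the redesigned page until it is updated.
old_posts = extract_posts(OLD_PAGE, POST_XPATH)
new_posts = extract_posts(NEW_PAGE, POST_XPATH)
fixed_posts = extract_posts(NEW_PAGE, ".//div[@class='story']")
```

Note that the stale XPath does not raise an error; it simply returns an empty list, which is why missing data is usually the first visible symptom.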

Sample

Screenshot


Usage

Installation

You will need to:

$ git clone https://github.com/harismuneer/Ultimate-Facebook-Scraper.git
$ cd Ultimate-Facebook-Scraper

# Set up a virtual env
$ python3 -m venv venv
$ source venv/bin/activate

# Install Python requirements
$ pip install -e .

The code is multi-platform and is tested on both Windows and Linux. The tool uses the latest version of the Chrome web driver (ChromeDriver). A copy of the web driver is included with the code, but if that version doesn't work, replace it with the latest one that matches your platform and your Google Chrome version.
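
Chrome and ChromeDriver generally need matching major versions. The sketch below compares version strings of the kind printed by `google-chrome --version` and `chromedriver --version`. The helper names and the sample version numbers are illustrative assumptions, and nothing here is part of the tool itself.

```python
import re


def major_version(version_output):
    """Extract the major version number from a '--version' output line."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_compatible(chrome_output, driver_output):
    """Return True when the browser and driver share a major version."""
    return major_version(chrome_output) == major_version(driver_output)


# Sample strings only; substitute the output from your own machine.
chrome = "Google Chrome 114.0.5735.90"
matching_driver = "ChromeDriver 114.0.5735.90"
stale_driver = "ChromeDriver 113.0.5672.63"

matches = versions_compatible(chrome, matching_driver)
mismatches = versions_compatible(chrome, stale_driver)
```

If the check fails for your installation, replacing the bundled driver as described above is the likely fix.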

How to Run

  • Fill your Facebook credentials into credentials.yaml
  • Edit the input.txt file and add as many profile links as you want in the following format, with each link on a new line:

Make sure each link ends with the username or ID number and nothing else. Make sure it's in the format mentioned above.
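
A quick way to catch malformed lines before running the scraper is sketched below. It assumes that a valid link has exactly one path segment (the username or ID) and no query string or fragment; that rule, the helper names, and the example URLs are assumptions for illustration and are not taken from the tool.

```python
from urllib.parse import urlparse


def is_clean_profile_link(link):
    """Check that a link has a single path segment and no query or fragment."""
    parsed = urlparse(link.strip())
    segments = [part for part in parsed.path.split("/") if part]
    return (
        parsed.scheme in ("http", "https")
        and bool(parsed.netloc)
        and len(segments) == 1
        and not parsed.query
        and not parsed.fragment
    )


def read_profile_links(text):
    """Split input.txt-style text into clean links and rejected lines."""
    clean, rejected = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if is_clean_profile_link(line):
            clean.append(line)
        else:
            rejected.append(line)
    return clean, rejected


# Hypothetical example content for input.txt.
sample = """
https://www.facebook.com/some.username
https://www.facebook.com/100000000000001
https://www.facebook.com/some.username/photos
https://www.facebook.com/some.username?ref=bookmarks
"""

clean_links, rejected_links = read_profile_links(sample)
```

Here the first two links pass, while the last two are rejected because of the extra path segment and the query string.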

Note: There are two modes for downloading friends' profile pictures and the user's photos: Large Size and Small Size. You can change the following variables in scraper/scraper.py. By default they are set to Small Size because it is very quick, while Large Size mode takes longer, depending on the number of pictures to download.

# Whether to download the thumbnail (small size) or the full image.
# True is very quick; False opens each photo to download it,
# which takes much more time.
friends_small_size = True
photos_small_size = True

Run the ultimate-facebook-scraper command! 🚀


Citation

If you use this tool for your research, then kindly cite it. Click the above badge for more information regarding the complete citation for this tool and different citation formats like IEEE, APA etc.


Important Message

This tool is for research purposes only and is used by many researchers and open source intelligence (OSINT) analysts. The developers of this tool are not responsible for any misuse of data collected using it.

This tool will not work if your account is set up with 2FA. You must disable it before using the tool.


Authors

You can get in touch with us on our LinkedIn Profiles:

Haris Muneer

LinkedIn Link

You can also follow my GitHub Profile to stay updated about my latest projects: GitHub Follow

Hassaan Elahi

LinkedIn Link

You can also follow my GitHub Profile to stay updated about my latest projects: GitHub Follow

If you liked the repo then kindly support it by giving it a star ⭐!

Contributions Welcome

If you find any bug in the code or have any improvements in mind then feel free to generate a pull request.

Note: We use Black to format Python files. Please run it on your changes so that your pull request is valid 😉

Issues

GitHub Issues

If you face any issue, you can create a new issue in the Issues Tab and I will be glad to help you out.

License

MIT

Copyright (c) 2018-present, harismuneer, Hassaan-Elahi
