Code Monkey home page Code Monkey logo

ctfdscraper's Introduction

CTFdScraper

CTFdScraper is a simple scraper for automating challenges gathering from a CTFd platform. Generally, it's utilizing the CTFd endpoint on /api/v1/challengesor /chalsfor older version.

Getting Started

Prerequisites

In order to run this properly, you need a python2 or python3 environment installed. For a certain circumstance, termux user need to install several packages before installing lxml.

# Termux-only (lxml)
$ pkg install clang libxml2 libxslt
$ pip install cython

Installation

After fulfilling the prerequisites, you can directly clone this repository or download the source from our releases.

$ git clone https://github.com/hanasuru/CTFdScraper

After that, go install its dependencies by using pip.

$ cd CTFdScraper/
$ pip install -r requirements.txt

Usage

For instances, you can check helper section by passing -h or --help.

$ python ctfd.py --help
usage: ctfd.py [-h] [--data data] [--proxy proxy] [--path path]
               [--worker worker] [--scheme scheme] [--enable-cloud]
               [--override] [--no-download] [--export]
               [user] [passwd] [url]

Simple CTFd-based scraper for challenges gathering

positional arguments:
  user             Username/email
  passwd           Password
  url              CTFd platform url

optional arguments:
  -h, --help       show this help message and exit
  --data data      Populate from challs.json
  --proxy proxy    Request behind proxy server
  --path path      Target directory, default: CTF
  --worker worker  Number of threads, default: 10
  --scheme scheme  URL scheme, default: https
  --enable-cloud   Permit file download from a cloud drive, default=False
  --override       Override existed chall file
  --no-download    Don't download chall file
  --export         Export challenges directory as zip, default=False
                                                      

Collect challenges by credentials

By default, you need to define username, password, and CTFd url respectively (use HTTPS by default).

$ python ctfd.py user passwd https://some-domain

Enable Cloud-files download

By default, this option are disabled due to file size issues where user accidentally spend most of internet quotas from downloading a large file. Otherwise, you can enable this opting by passing --enable-cloud argument

$ python ctfd.py user passwd https://some-domain --enable-cloud

Currently, only Google-drive and Dropbox that are supported.

As a note, the review that shows size output may be wrong (false positive) due to the usage of request.get(url, stream=True) in order to get the Content-Length header

Override existed challenges file

During CTF competition, the organizer may update the current binary/file. Unfortunately, by default CTFdScraper couldn't override the existed challenges file. If you insist to modify the current binary/file, you need to pass --override argument

$ python ctfd.py user passwd https://some-domain --override

Collect challenges by existed challs.json

Alternatively, you can also populate only the challenges from an existed challs.json. Additionally, you can also combine with --enable-cloud & --override to download files from a cloud drive. Furthermore, check the examples.

# Populate challenges only from existed data
$ python ctfd.py --data challs.json

# Populate challenges alongwith files
$ python ctfd.py user passwd https://some-domain --data chall.json

# Populate challenges alongwith interal file (if public) and external file
$ python ctfd.py --data chall.json --enable-cloud --override

Enable request behind proxy

To be able to conduct a requests behind proxy you need to pass --proxy proxy-server argument or manually add export http_proxy="http://host:port".

$ python ctfd.py user passwd https://some-domain --proxy 127.0.0.1:8080

# or set up an environtment variable
$ export http_proxy = "http://127.0.0.1:8080"
$ export https_proxy = "http://127.0.0.1:8080"

Wrap up the CTFd folder into a Zip file

For archiving purpose, you can save the CTFd folder as a Zip file by passing the --export argument

$ python ctfd.py user passwd https://some-domain --export

Lastly, the collected challenge will be saved to CTF/${ctf_name} directory. You can customize the path by passing --path pathnameargument.

Examples

Challenges Hierarchy

$ tree
.
└── Online Playground CTF for Beginner
    ├── challs.json
    ├── Cryptography
    │   ├── Base64
    │   │   └── README.md
    │   └── Single-Byte XOR Cipher
    │       └── README.md
    ├── Forensic
    │   ├── Data Exfil
    │   │   └── README.md
    │   └── Volatility 4
    │       └── README.md
    ├── Pwn
    │   ├── cariuang
    │   │   └── README.md
    │   └── vault
    │       └── README.md
    ├── Reverse
    │   ├── IFEST-password
    │   │   └── README.md
    │   └── Password
    │       └── README.md
    └── Web
        ├── babyPHP
        │   └── README.md
        └── Optimus Prime
            └── README.md

Challenge README.md

# Optimus Prime [50 pts]

**Category:** Web
**Solves:** 23

## Description
>Optimus Prime is coming

http://18.139.8.91:3015/

author: Arkavidia5

**Hint**
* -

## Solution

### Flag

Demo

For instances, here is the demonstration of CTFdScraper. For simplicity, the svg file was removed

Scrape using credentials

▶ python ctfd.py test 12345 http://playgroundctf.xyz            
✔ Login Success
✔ Found 43 new challenges
✔ Found 24 files (0.5 MB downloaded)

[Summary]
✔ Web                           (3)
✔ Reverse                       (3)
✔ Cryptography                  (9)
✔ Forensic                      (10)
✔ Intro                         (7)
✔ Survey                        (1)
✔ Pwn                           (5)
✔ Hacktoday 2019 - Penyisihan   (5)

[Finished in 3.44 second]

Scrape using cloud drive support

▶ python ctfd.py test 12345 http://playgroundctf.xyz --enable-cloud
✔ Login Success
✔ Loaded 43 challs from challs.json
⚠ There are no new challenges
✔ Found 26 files (143.0 MB downloaded)

[Summary]
✔ Web                           (3)
✔ Reverse                       (3)
✔ Cryptography                  (9)
✔ Forensic                      (10)
✔ Intro                         (7)
✔ Survey                        (1)
✔ Pwn                           (5)
✔ Hacktoday 2019 - Penyisihan   (5)

[Finished in 9.56 second]                               

Scrape using existed chall.json

▶ python ctfd.py --data challs.json                                                       
✔ Loaded 43 challs from challs.json
✔ Found 24 files (0.0 MB downloaded)

[Summary]
✔ Web                           (3)
✔ Reverse                       (3)
✔ Cryptography                  (9)
✔ Forensic                      (10)
✔ Intro                         (7)
✔ Survey                        (1)
✔ Pwn                           (5)
✔ Hacktoday 2019 - Penyisihan   (5)

[Finished in 0.09 second]
                                         

Authors

  • hanasuru - Initial work

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License - see the LICENSE file for details

Credits

  • CTFd, a Capture The Flag framework focusing on ease of use and customizability.
  • markdown2png, a module to render John Gruber's markdown format to PNG (or other image formats) by using Pillow and Markdown dependencies.

ctfdscraper's People

Contributors

cacadosman avatar hanasuru avatar rajebdev avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.