Rotate-Captcha-Crack

Predict the rotation angle of a given picture with a CNN. This project can be used for cracking rotate-captchas.

Test result: [image: test_result]

Three models are implemented, as shown in the table below.

| Name | Backbone | Cross-Domain Loss (less is better) | Params | MACs |
| --- | --- | --- | --- | --- |
| RotNet | ResNet50 | 75.6512° | 24.246M | 4.09G |
| RotNetR | RegNetY 3.2GFLOPs | 15.1818° | 18.117M | 3.18G |
| RCCNet_v0_5 | RegNetY 3.2GFLOPs | 56.8515° | 20.212M | 3.18G |

RotNet is a PyTorch implementation of d4nst/RotNet. RotNetR is based on RotNet, with RegNet as its backbone and 128 classes. Its average prediction error is 15.1818°, obtained after 64 epochs of training (3 hours) on the Google Street View dataset.
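To make the class number concrete, here is a minimal sketch of how a predicted class index could be turned back into an angle. It assumes the 128 classes divide the full circle evenly, which the text above does not state explicitly; the helper name `class_to_angle` is hypothetical and not taken from the project.

```python
NUM_CLASSES = 128  # class number stated above for RotNetR


def class_to_angle(class_idx: int, num_classes: int = NUM_CLASSES) -> float:
    """Map a class index to degrees, ASSUMING evenly spaced classes."""
    if not 0 <= class_idx < num_classes:
        raise ValueError(f"class_idx must be in [0, {num_classes})")
    return class_idx * 360.0 / num_classes


# Under this assumption each class covers 360 / 128 degrees.
angle_step = class_to_angle(1)
```

Under this assumption, the angular resolution of the classifier is bounded by the width of one class, so some quantization error remains even for a perfect prediction.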

The Cross-Domain Test uses Google Street View and Landscape-Dataset for training, and captcha pictures from Baidu (thanks to @xiangbei1997) for testing.

The captcha picture used in the demo above comes from RotateCaptchaBreak.

Try it!

Prepare

  • A device supporting CUDA 10+ (at least 4 GB of memory for training)

  • Python>=3.8,<3.13

  • PyTorch>=1.11

  • Clone the repository.

git clone https://github.com/Starry-OvO/rotate-captcha-crack.git --depth=1
cd ./rotate-captcha-crack
  • Install all required dependencies.

This project strongly recommends rye for package management. If you already have rye installed, run the following commands:

rye pin 3.12
rye sync

Or, if you prefer conda, the following steps create a virtual env under the working directory. You can also use a named env.

conda create -p .conda
conda activate ./.conda
conda install matplotlib tqdm tomli
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia

Or, if you prefer to use pip directly:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -e .

Download the Pretrained Models

Download the *.zip files in Release and unzip them all into the ./models directory.

The resulting directory structure looks like ./models/RotNetR/230228_20_07_25_000/best.pth

Model names change frequently because the project is still in beta. If a FileNotFoundError occurs, try rolling back to the corresponding tag first.
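Before running anything, it can help to check that the unzipped models landed where expected. The sketch below looks for a `best.pth` under `./models/<model name>/<run directory>/`. The helper `find_latest_checkpoint` is hypothetical (it is not part of the project), and it assumes run directory names are timestamps that sort chronologically as plain strings.

```python
from pathlib import Path
from typing import Optional


def find_latest_checkpoint(models_dir: Path, model_name: str) -> Optional[Path]:
    """Return best.pth from the last run directory in name order, or None.

    ASSUMPTION: run directories are named like 230228_20_07_25_000, so
    sorting the names as strings also sorts them by time.
    """
    model_root = models_dir / model_name
    if not model_root.is_dir():
        return None
    runs = sorted(
        p for p in model_root.iterdir()
        if p.is_dir() and (p / "best.pth").is_file()
    )
    return runs[-1] / "best.pth" if runs else None


# Usage: prints the checkpoint path, or None if nothing was unzipped yet.
# print(find_latest_checkpoint(Path("./models"), "RotNetR"))
```

A `None` result here points at the same problem a FileNotFoundError would: the model name or directory layout does not match the checked-out version of the code.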

Test the Rotation Effect with a Single Captcha Picture

If no GUI is available, change the debugging behavior from showing images to saving them.

rye run python test_captcha.py

If you do not have rye, just drop the rye run prefix.

Use the HTTP Server

  • Install extra dependencies

With rye:

rye sync --features=server

or with conda:

conda install aiohttp httpx[cli]

or with pip:

pip install -e ".[server]"
  • Launch server
rye run python server.py
  • Send images from another shell
rye run httpx -m POST http://127.0.0.1:4396 -f img ./test.jpg
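The same request can be sent from Python without extra dependencies. This is a minimal sketch that mirrors the command above: a multipart POST to http://127.0.0.1:4396 with the file in a form field named `img`. The helper names `build_multipart` and `send_image` are hypothetical, and the response is returned as raw text because its format is not described here.

```python
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def send_image(path: str, url: str = "http://127.0.0.1:4396") -> str:
    """POST the image in the 'img' field and return the raw response text."""
    file_path = Path(path)
    body, content_type = build_multipart("img", file_path.name, file_path.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode()


# Usage (requires the server to be running):
# print(send_image("./test.jpg"))
```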

Train Your Own Model

Prepare Datasets

  • For this project I'm using Google Street View and Landscape-Dataset for training. You can collect some photos and put them in one directory. There is no size or shape requirement.

  • Modify the dataset_root variable in train.py so that it points to the directory containing the images.

  • No manual labeling is required. All the cropping, rotation and resizing are done right after the image is loaded.
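Labels come for free because the rotation applied to each image is known at load time. The sketch below shows only that bookkeeping: choosing a centered square crop for an arbitrarily sized image and drawing a random rotation label. Both helpers are hypothetical, and the square crop and evenly spaced classes are assumptions rather than the project's actual preprocessing.

```python
import random
from typing import Tuple


def center_square_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Largest centered square, as (left, top, right, bottom) in pixels."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def random_rotation_label(num_classes: int, rng: random.Random) -> Tuple[int, float]:
    """Draw a class index and the rotation in degrees it stands for.

    ASSUMPTION: classes are evenly spaced over the full circle.
    """
    class_idx = rng.randrange(num_classes)
    return class_idx, class_idx * 360.0 / num_classes


# Example: a 640x480 photo and one self-generated training label.
box = center_square_box(640, 480)
label, degrees = random_rotation_label(128, random.Random(0))
```

The image would then be cropped to `box`, rotated by `degrees`, and resized, with `label` as the training target.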

Train

rye run python train_RotNetR.py

Validate the Model on the Test Set

rye run python test_RotNetR.py

Details of Design

Most rotate-captcha cracking methods are based on d4nst/RotNet, with ResNet50 as its backbone. RotNet treats angle prediction as a classification task with 360 classes and uses cross-entropy to compute the loss.

Yet CrossEntropyLoss over one-hot labels yields a uniform metric distance between any two angles (e.g. $\mathrm{dist}(1°, 2°) = \mathrm{dist}(1°, 180°)$), which clearly defies common sense. Arbitrary-Oriented Object Detection with Circular Smooth Label (ECCV'20) introduces an interesting trick: by smoothing the one-hot label, e.g. [0,1,0,0] -> [0.1,0.8,0.1,0], CSL provides a loss measurement closer to our intuition, such that $\mathrm{dist}(1°,180°) \gt \mathrm{dist}(1°,3°)$.
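The smoothing idea can be sketched in plain Python. The code below builds a soft label with a Gaussian window over the circular class distance and compares the cross-entropy of a near miss against a far miss. The Gaussian window, the normalization to a sum of 1, the value of `sigma`, and all function names are assumptions made for illustration; they are not taken from the project or copied from the CSL paper.

```python
import math
from typing import List


def circular_smooth_label(target: int, num_classes: int, sigma: float = 1.0) -> List[float]:
    """Soft label that decays with circular distance from the target class."""
    weights = []
    for i in range(num_classes):
        d = abs(i - target)
        d = min(d, num_classes - d)  # wrap around: last class neighbors class 0
        weights.append(math.exp(-d * d / (2.0 * sigma * sigma)))
    total = sum(weights)
    return [w / total for w in weights]  # normalized so the label sums to 1


def soft_cross_entropy(probs: List[float], soft_label: List[float]) -> float:
    """Cross-entropy between predicted probabilities and a soft label."""
    return -sum(t * math.log(max(p, 1e-12)) for p, t in zip(probs, soft_label))


def peaked_prediction(center: int, num_classes: int, peak: float = 0.9) -> List[float]:
    """Toy prediction: most probability mass on one class, the rest uniform."""
    rest = (1.0 - peak) / (num_classes - 1)
    return [peak if i == center else rest for i in range(num_classes)]


soft = circular_smooth_label(target=1, num_classes=8)
loss_near = soft_cross_entropy(peaked_prediction(2, 8), soft)  # off by one class
loss_far = soft_cross_entropy(peaked_prediction(5, 8), soft)   # opposite side
```

With a one-hot label both misses would cost the same, since neither puts its peak on the target class. With the smoothed label the near miss is penalized less than the far one.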

Meanwhile, the angle_error_regression proposed by d4nst/RotNet is less effective, because when dealing with outliers its gradient leads to non-convergence. It is better to use SmoothL1Loss for regression.
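As a rough illustration of the regression alternative, the sketch below computes a Smooth L1 penalty on a wrapped angle difference. It assumes angles are expressed as a fraction of a full turn in [0, 1) and that the difference is wrapped to the shortest way around the circle; whether the project wraps the difference this way is an assumption. The `smooth_l1` function follows the standard piecewise definition (quadratic below `beta`, linear above it).

```python
def circular_diff(pred: float, target: float) -> float:
    """Shortest signed difference between two angle factors, in [-0.5, 0.5)."""
    return (pred - target + 0.5) % 1.0 - 0.5


def smooth_l1(diff: float, beta: float = 1.0) -> float:
    """Quadratic near zero, linear for large errors, so outliers pull less."""
    a = abs(diff)
    if a < beta:
        return 0.5 * a * a / beta
    return a - 0.5 * beta


# 0.95 and 0.05 of a turn are 36 degrees apart across the wrap-around point,
# not 324 degrees as the naive difference would suggest.
wrapped = circular_diff(0.95, 0.05)
loss = smooth_l1(wrapped, beta=0.2)
```

Because the wrapped difference never exceeds half a turn in magnitude, `beta` has to be below 0.5 for the linear branch to be reached at all in this setting.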
