cygnusx1's Introduction

🕳️CygnusX1

Code by 🧑‍💻Trong-Dat Ngo.

Overview

🕳️CygnusX1 is a multithreaded tool 🛠️ for searching and downloading images from popular search engines 🔎. It is straightforward to set up and run!

Key features

  • 🥰 No prior knowledge is required to set up and run.
  • 🚀 Download images using a customizable number of threads.
  • ⛏️ Crawl all possible images (search results and recommendations).
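
To illustrate what "a customizable number of threads" means in practice, here is a minimal sketch of a thread-pooled downloader built on Python's standard library. It is not 🕳️CygnusX1's actual implementation: the names `download_all` and `fetch_bytes` and the file naming scheme are hypothetical, and the default values simply mirror the documented defaults of `--out_dir` and `--workers`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Iterable, List
from urllib.request import urlopen


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download one URL and return its raw bytes (standard library only)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(
    urls: Iterable[str],
    out_dir: str = "./IMAGES",
    workers: int = 2,
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> List[Path]:
    """Download every URL with at most `workers` threads; return saved paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    def save(index: int, url: str) -> Path:
        # Hypothetical naming scheme: one numbered file per input URL.
        path = target / f"image_{index:05d}.bin"
        path.write_bytes(fetch(url))
        return path

    saved: List[Path] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(save, i, url) for i, url in enumerate(urls)]
        for future in as_completed(futures):
            try:
                saved.append(future.result())
            except Exception as error:
                # One failed download should not stop the remaining ones.
                print(f"skipped: {error}")
    return sorted(saved)
```

The `fetch` parameter is injectable so the pool logic can be exercised without network access. Raising `workers` increases the number of downloads in flight at once, which mostly helps because the work is I/O-bound.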

Demo

Installation

This repository is tested on Python 3.6+ and selenium 3.141.0+, and it works on macOS, Windows, and Linux.

You should set up and run 🕳️CygnusX1 in a virtual environment. If you're unfamiliar with Python virtual environments, check out the user guide here.

First, create a virtual environment with the version of Python you're going to use and activate it. (You can skip this step if you prefer to install directly into the system Python environment.)

python3 -m venv venv
source venv/bin/activate

Pip Installation

Install 🕳️CygnusX1 with pip:

pip install CygnusX1

Manual Installation

Download 🕳️CygnusX1 from GitHub and change into the cloned directory:

git clone https://github.com/dat821168/CygnusX1.git
cd CygnusX1

Then install the dependencies listed in requirements.txt:

pip install -r requirements.txt

Run

Use the cygnusx1 command-line tool:

cygnusx1 --keywords "keyword 1, keyword 2" --workers 8 --use_suggestions --headless

Or use run.py to start the script:

python run.py --keywords "keyword 1, keyword 2" --workers 8 --use_suggestions --headless

Argument details:

  • --keywords: The keywords/keyphrases you want to search for. Separate multiple keywords with commas.
  • --out_dir: Directory where results are saved. Default = './IMAGES'.
  • --workers: The maximum number of workers used to crawl images. Default = 2.
  • --use_suggestions: Also crawl search engine suggestions/recommendations. Default = False.
  • --headless: Hide the browser during scraping. Default = False.
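
For readers who want to see how these flags fit together, the following is a sketch of an equivalent `argparse` parser. It is an illustration only, not the parser shipped with 🕳️CygnusX1: the helper names `build_parser` and `split_keywords` are hypothetical, and treating `--keywords` as required is an assumption.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="cygnusx1")
    # Assumption: a search needs at least one keyword, so the flag is required.
    parser.add_argument("--keywords", required=True,
                        help="comma-separated keywords/keyphrases to search for")
    parser.add_argument("--out_dir", default="./IMAGES",
                        help="directory where results are saved")
    parser.add_argument("--workers", type=int, default=2,
                        help="maximum number of crawl workers")
    # Boolean switches: absent means False, present means True.
    parser.add_argument("--use_suggestions", action="store_true",
                        help="also crawl search engine suggestions")
    parser.add_argument("--headless", action="store_true",
                        help="hide the browser during scraping")
    return parser


def split_keywords(raw: str) -> List[str]:
    """Split a comma-separated string into trimmed, non-empty keywords."""
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(split_keywords(args.keywords), args.out_dir, args.workers)
```

With this sketch, `--keywords "keyword 1, keyword 2"` yields two separate search terms, and omitting `--use_suggestions` or `--headless` leaves them at their documented default of False.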

Future Releases

References

cygnusx1's People

Contributors

datnnt1997, duckweeds7, hoangperry, tasinttttttt

cygnusx1's Issues

Chrome privacy settings popup prevents scraping

Google search doesn't work by default because the automation would first have to act on Google's privacy popup.

This could be a regional issue.

My current workaround is to launch the script NOT in headless mode and dismiss the Google search privacy notice by hand.
It works, but it's not ideal when running multiple queries.
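
One possible way to automate that manual step is sketched below. This is not part of 🕳️CygnusX1 and is untested against the real dialog: the helper `dismiss_consent` is hypothetical, and the button labels are assumptions, since the actual consent dialog varies by region and language. The function only relies on `find_elements(by, value)` and `click()`, which Selenium WebDriver objects and elements provide.

```python
from typing import Any, Iterable

# Assumed button labels; the real consent dialog differs by region and language.
CONSENT_LABELS = ("Accept all", "I agree", "Reject all")


def dismiss_consent(driver: Any, labels: Iterable[str] = CONSENT_LABELS) -> bool:
    """Click the first button whose text contains one of `labels`.

    `driver` is expected to be a Selenium WebDriver. Returns True if a
    button was clicked and False if no matching button was found.
    """
    for label in labels:
        xpath = f'//button[contains(normalize-space(.), "{label}")]'
        # "xpath" is the locator strategy string behind Selenium's By.XPATH.
        for button in driver.find_elements("xpath", xpath):
            try:
                button.click()
                return True
            except Exception:
                # The button may be hidden or stale; try the next candidate.
                continue
    return False
```

A crawler could call `dismiss_consent(driver)` once after loading the search page and before scraping. If the dialog is rendered inside an iframe, the driver would have to switch into that frame first, which this sketch does not handle.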
