
chinesedarkwebcrawler's Introduction

Chinese dark web crawler

Requirements

 Python 2.7
 selenium
 Tor Browser
 geckodriver.exe

How to run

  • No special libraries are needed; install anything that is missing with pip.

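Page crawling

darkweb.py: crawls and saves pages, and records the image IDs found on them
Example: python darkweb.py keyword pagenum
keyword must be one of: 'sex', 'data', 'service', 'material', 'virtual_source', 'teach', 'cvv', 'other', 'basic', 'private'
pagenum is the number of pages to crawl; choose any value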
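
The two command-line arguments described above can be validated before any crawling starts. The following is only a minimal sketch: `VALID_KEYWORDS` and `parse_args` are hypothetical names introduced for illustration, and darkweb.py itself may read its arguments differently (for example directly from `sys.argv`).

```python
import argparse

# Categories copied from the list above. How darkweb.py handles them
# internally is not shown in this README, so this is an assumption.
VALID_KEYWORDS = ('sex', 'data', 'service', 'material', 'virtual_source',
                  'teach', 'cvv', 'other', 'basic', 'private')


def parse_args(argv=None):
    """Parse and validate `keyword` and `pagenum` (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog='darkweb.py')
    # argparse rejects any keyword that is not in VALID_KEYWORDS.
    parser.add_argument('keyword', choices=VALID_KEYWORDS)
    parser.add_argument('pagenum', type=int)
    args = parser.parse_args(argv)
    if args.pagenum < 1:
        parser.error('pagenum must be a positive integer')
    return args
```

Calling `parse_args(['basic', '3'])` yields an object with `keyword == 'basic'` and `pagenum == 3`; an unknown keyword makes argparse print a usage message and exit.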

Image crawling

get_darkweb_pic_auto.py: periodically downloads images using the saved image IDs
python get_darkweb_pic_auto.py
Set the time interval yourself
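
The README does not show how the interval is applied inside get_darkweb_pic_auto.py. As a rough sketch of one common approach, a fixed-interval loop could look like the following; `run_periodically`, `max_runs`, and the injectable `sleep` parameter are assumptions made for illustration and are not taken from the script.

```python
import time


def run_periodically(task, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call `task` repeatedly, waiting `interval_seconds` between calls.

    `max_runs=None` means run forever. `sleep` can be replaced in tests
    so that no real waiting happens.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        task()
        runs += 1
        # Skip the final sleep when no further run will follow.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

In this sketch, changing the interval means changing the `interval_seconds` argument, for example `run_periodically(download_images, 3600)` for an hourly run, where `download_images` stands in for whatever function performs the download.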

Front-end display

Start nginx with the provided nginx.conf
python manage.py runserver 127.0.0.1:port
Update the port number in the configuration file to match

Preview

[Screenshot 1]

[Screenshot 2]

PS

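If you want to use the front-end display, swap in a better-looking front end. I have not had time to improve it, so it is quite ugly. If you do replace it, please send me the update.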

chinesedarkwebcrawler's People

Contributors

c4o

chinesedarkwebcrawler's Issues

DeepWeb

C4o, do you have any Chinese deep web URLs you could share with me?

ImportError: No module named web.settings

Running python manage.py runserver 127.0.0.1 8080 fails with the error ImportError: No module named web.settings.
I looked up some fixes online, such as adding the following statements to wsgi.py:
import os
import sys
path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
if path not in sys.path:
    sys.path.append(path)
os.environ['DJANGO_SETTINGS_MODULE'] = 'web.settings'
But the problem persists.
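
An ImportError for `web.settings` generally means Python cannot find a package named `web` containing `settings.py` on its import path; under Python 2.7 the package directory also needs an `__init__.py`. The sketch below checks those conditions on disk. It assumes the settings package sits in a `web` directory next to manage.py, which this thread does not confirm, and `diagnose_settings` is a hypothetical helper, not part of the project.

```python
import os


def diagnose_settings(project_dir, package='web'):
    """Return a list of problems that would stop `<package>.settings`
    from being imported when `project_dir` is on sys.path."""
    pkg_dir = os.path.join(project_dir, package)
    problems = []
    if not os.path.isdir(pkg_dir):
        problems.append('missing package directory: %s' % pkg_dir)
        return problems
    # Python 2.7 only treats a directory as a package if __init__.py exists.
    for name in ('__init__.py', 'settings.py'):
        file_path = os.path.join(pkg_dir, name)
        if not os.path.isfile(file_path):
            problems.append('missing file: %s' % file_path)
    return problems
```

Running `diagnose_settings(os.getcwd())` from the directory that holds manage.py returns an empty list when the expected layout is present; any entries point at what is missing.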

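Will the author keep updating this?

The market site's entry point and tags all seem to have changed QAQ. The main problem is that the entry page now has a CAPTCHA.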
