
pre-cache's Introduction

Pre-cache

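A website pre-caching script. It fetches every URL listed in a sitemap so that the pages are cached ahead of time. It works with sites served through a CDN and with sites that keep a local static cache.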

Usage:

Run with Docker (recommended)

# No dependency on the local environment
docker run --rm --net=host -ti jagerzhang/pre-cache:latest \
    --sitemap=https://zhang.ge/sitemap.xml \
    --cacheheader=cf-cache-status

Run the script directly

Set up the environment:

git clone https://github.com/jagerzhang/Pre-cache.git
cd Pre-cache
yum install -y python-pip
pip install --upgrade pip -i https://mirrors.tencent.com/pypi/simple/
pip install -r requirements.txt -i https://mirrors.tencent.com/pypi/simple/

Print the help message:

python pre_cache.py --help

usage: pre_cache.py [-h] -s SITEMAP [-S SIZE] [-t TIMEOUT] [-H HOST]
                    [-c CACHEHEADER] [-U USERAGENT] [-v VERIFY]

Website pre-caching script; supports sites that use a CDN or a local static cache.

optional arguments:
  -h, --help            show this help message and exit
  -s SITEMAP, --sitemap SITEMAP
                        URL of the site's sitemap
  -S SIZE, --size SIZE  number of concurrent requests, default 20
  -t TIMEOUT, --timeout TIMEOUT
                        timeout for a single request, default 10s
  -H HOST, --host HOST  real host to request, e.g. 127.0.0.1:8080
  -c CACHEHEADER, --cacheheader CACHEHEADER
                        cache status header, e.g. x-cache
  -U USERAGENT, --useragent USERAGENT
                        User-Agent string, default
                        Pre-cache/python-requests/__version__
  -v VERIFY, --verify VERIFY
                        whether to verify SSL certificates, off by default
  -d, --debug           show debug information, off by default
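
The `--size` and `--timeout` options describe a bounded pool of concurrent requests. The README does not say which concurrency mechanism the script uses, so the following is only a sketch of the idea using Python 3's standard thread pool. `warm_all` and the injected `fetch` callable are hypothetical names, not part of `pre_cache.py`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def warm_all(urls, fetch, size=20):
    """Call fetch(url) for every URL with at most `size` calls in flight.

    `fetch` is supplied by the caller (hypothetical); it would normally
    issue one HTTP request with its own per-request timeout.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=size) as pool:
        # Map each future back to the URL it was submitted for.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # One failing URL should not abort the whole run.
                results[url] = exc
    return results
```

With the `requests` library, `fetch` could be as small as `lambda u: requests.get(u, timeout=10).status_code`.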

Quick start:

python pre_cache.py \
   --sitemap=https://zhang.ge/sitemap.xml \
   --cacheheader=cf-cache-status
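
The `--cacheheader` value names the response header that reports cache status. How the script interprets that header is not documented here, so the sketch below only illustrates one plausible check. The helper names and the rule "a value containing `hit` means cached" are assumptions.

```python
def cache_status(headers, cache_header="cf-cache-status"):
    """Return the cache header's value in lower case, or None if absent."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get(cache_header.lower())
    return value.strip().lower() if value is not None else None


def is_cache_hit(headers, cache_header="cf-cache-status"):
    """Assumption: any value containing 'hit' counts as served from cache."""
    status = cache_status(headers, cache_header)
    return status is not None and "hit" in status
```

A response that reports `MISS` on the first pass and `HIT` on a second pass suggests the page has been warmed.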

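Specify the real host:

# Specifying an IP plus the Host domain bypasses the CDN and requests the origin server directly, which warms the origin's local cache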
python pre_cache.py \
   --sitemap=https://zhang.ge/sitemap.xml \
   --host=127.0.0.1:8443 \
   --cacheheader=x-cache-redis
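
Bypassing the CDN this way amounts to sending the request to the given address while keeping the original domain in the `Host` header. The sketch below shows that rewrite with the standard library only. `rewrite_for_origin` is a hypothetical helper, not the script's own code.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_for_origin(url, host):
    """Return (target_url, headers) that send `url` straight to `host`.

    The scheme, path and query are kept; only the network location
    changes, and the original domain moves into the Host header.
    """
    parts = urlsplit(url)
    target = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )
    return target, {"Host": parts.netloc}
```

With `requests` this could be used as `requests.get(target, headers=headers, verify=False)`. The origin's certificate will generally not match a bare IP address, which fits the script's default of not verifying SSL.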

Set the User-Agent:

# A custom User-Agent lets requests pass as a browser or another client, so the CDN does not block them
python pre_cache.py \
   --sitemap=https://zhang.ge/sitemap.xml \
   --cacheheader=cf-cache-status \
   --useragent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"

Use from Python code:

from pre_cache import PreCache
pre = PreCache(sitemap="https://zhang.ge/sitemap.xml",
               host=None,
               size=10,
               timeout=10,
               cache_header="cf-cache-status",
               user_agent="Pre-cache/python-requests/2.22.0",
               verify=False)
pre.start()

pre-cache's People

Contributors

jagerzhang

pre-cache's Issues

File "/opt/pre_cache.py", line 236, in <module>

Sitemap: https://www.progogame.com/sitemap.xml
Concurrency: 20
Timeout: 10s
Cache header: cf-cache-status
User-Agent: Pre-cache/python-requests/2.22.0
Pre-caching started:

Traceback (most recent call last):
  File "/opt/pre_cache.py", line 236, in <module>
    pre.start()
  File "/opt/pre_cache.py", line 122, in start
    urls = self.get_urls()
  File "/opt/pre_cache.py", line 92, in get_urls
    for u in xmltodict.parse(sitemap)["urlset"]["url"]:
KeyError: 'urlset'
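
The traceback shows that the parsed document had no `urlset` root. The issue does not say why. Two plausible causes are a sitemap index file (root element `sitemapindex`, which lists further sitemaps) or a response that was not sitemap XML at all. The sketch below, which swaps in the standard library's `xml.etree.ElementTree` for `xmltodict`, shows one way to tell the cases apart. `parse_sitemap` is a hypothetical helper, not a fix taken from the repository.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    # ElementTree reports namespaced tags as "{namespace}name"; keep the name.
    return tag.rsplit("}", 1)[-1]


def parse_sitemap(xml_text):
    """Return (kind, locs), where kind is 'urlset' or 'sitemapindex'.

    For 'urlset', locs are page URLs. For 'sitemapindex', locs are the
    URLs of child sitemaps that still need to be fetched and parsed.
    """
    root = ET.fromstring(xml_text)
    kind = _local(root.tag)
    if kind not in ("urlset", "sitemapindex"):
        raise ValueError("unexpected sitemap root element: %r" % kind)
    locs = []
    for entry in root:  # each <url> or <sitemap> entry
        for child in entry:  # direct children only, skips nested image locs
            if _local(child.tag) == "loc" and child.text and child.text.strip():
                locs.append(child.text.strip())
    return kind, locs
```

A caller would fetch each child sitemap when `kind` is `sitemapindex`, and report a clear error instead of a `KeyError` when the root is something else.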
