hacksman / spider_world
🕷spider world with me
Traceback (most recent call last):
File "video_download_run.py", line 38, in <module>
douyin_crawl.grab_user_media(sys.argv[-1], "USER_LIKE")
File "../www_douyin_com/spiders/douyin_crawl.py", line 110, in grab_user_media
hasmore, max_cursor = self.grab_video(user_id, action, content)
File "../www_douyin_com/spiders/douyin_crawl.py", line 141, in grab_video
for per_video in video_infos:
TypeError: 'NoneType' object is not iterable
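Only the Douyin ID provided in the demo works; every other ID fails. Could the author share a group chat or some contact information?
Traceback (most recent call last):
File "video_download_run.py", line 32, in <module>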
douyin_crawl.grab_user_media(sys.argv[-1], "USER_POST")
File "../www_douyin_com/spiders/douyin_crawl.py", line 110, in grab_user_media
hasmore, max_cursor = self.grab_video(user_id, action, content)
File "../www_douyin_com/spiders/douyin_crawl.py", line 124, in grab_video
real_url = gen_url(self.token, url, query_params)
File "../www_douyin_com/common/utils.py", line 57, in gen_url
resp = requests.post(URL.api_sign(token), json={"url": url}).json()
File "/home/vts/anaconda3/lib/python3.6/site-packages/requests/models.py", line 892, in json
return complexjson.loads(self.text, **kwargs)
File "/home/vts/anaconda3/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/home/vts/anaconda3/lib/python3.6/json/decoder.py", line 342, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 5 (char 4)
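The `JSONDecodeError` above is raised when the signing service returns a body that is not valid JSON, for example a plain-text error message. A minimal sketch of a more defensive parse, assuming the caller already holds the response text; `parse_sign_response` is a hypothetical helper and its return-`None` convention is an assumption, not part of the repository:

```python
import json


def parse_sign_response(text):
    """Return the decoded JSON object, or None if the body is unusable.

    Hypothetical helper: the name and the None-on-failure convention are
    assumptions made for illustration.
    """
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        # The body was not JSON (e.g. a plain-text or HTML error page).
        return None
    # Only a dict can carry the keyed fields the caller looks up later.
    return payload if isinstance(payload, dict) else None
```

A caller could then log the raw body and stop cleanly when the result is `None`, instead of letting the exception escape from `.json()`.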
Traceback (most recent call last):
File "video_download_run.py", line 13, in <module>
douyin_crawl = DouyinCrawl()
File "..\www_douyin_com\spiders\douyin_crawl.py", line 71, in __init__
self.common_params = common_params(self)
File "..\www_douyin_com\common\utils.py", line 75, in common_params
device_info = getDevice(self)
File "..\www_douyin_com\common\utils.py", line 44, in getDevice
device_info = resp['data']
KeyError: 'data'
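A `KeyError: 'data'` like this one means the service answered with a JSON object that has no `data` field; a body such as `{'message': 'Internal Server Error'}` is reported elsewhere in this thread. The sketch below shows one way to surface that body in the error instead of a bare `KeyError`. `extract_data` is a hypothetical helper, and raising `RuntimeError` is an assumption about how the caller would want to fail:

```python
def extract_data(resp):
    """Return resp['data'], or raise an error that shows what came back.

    Hypothetical helper: not part of the repository's utils module.
    """
    if isinstance(resp, dict) and "data" in resp:
        return resp["data"]
    # Include the whole payload so the server-side message is visible.
    raise RuntimeError("response has no 'data' field: %r" % (resp,))
```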
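Why is the ID I get from scanning a six-character mix of digits and letters?
And then I get:
2018-11-29 21:31:44,886 - utils.py[line:104] INFO - 请输入正确的用户id, 用户id为10,11,12或13位纯数字...
(The log message reads: "Please enter a valid user id; a user id is a 10-, 11-, 12-, or 13-digit number...")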
Traceback (most recent call last):
File "video_download_run.py", line 32, in <module>
douyin_crawl.grab_user_media(sys.argv[-1], "USER_POST")
File "../www_douyin_com/common/utils.py", line 105, in wrapper
raise Exception
Exception
if not re.findall('^\d{11}$', user_id) or not re.findall('^\d{12}$', user_id):
    self.logger.info("请输入正确的用户id, 用户id为11或12位纯数字...")
    return
This check should be changed to:
if not (re.findall('^\d{11}$', user_id) or re.findall('^\d{12}$', user_id)):
    self.logger.info("请输入正确的用户id, 用户id为11或12位纯数字...")
    return
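The difference between the two conditions can be checked directly. The sketch below uses `re.fullmatch` with a single `\d{11,12}` pattern, which is an equivalent formulation rather than the repository's code; `is_valid_user_id` and `buggy_rejects` are hypothetical names introduced for illustration:

```python
import re


def is_valid_user_id(user_id):
    """True when user_id consists of exactly 11 or 12 digits."""
    return re.fullmatch(r"\d{11,12}", user_id) is not None


def buggy_rejects(user_id):
    """Reproduce the original condition, which rejects every input.

    No string is both 11 and 12 characters long, so at least one of the
    two `not` terms is always true.
    """
    return bool(not re.findall(r"^\d{11}$", user_id)
                or not re.findall(r"^\d{12}$", user_id))
```

With these definitions, `buggy_rejects("66076741938")` is true even though the ID has 11 digits, while `is_valid_user_id("66076741938")` accepts it.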
PS D:\SourceCode\douyin\spider_world\www_douyin_com> python .\video_download_run.py -upost 66076741938
Traceback (most recent call last):
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 171, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw)
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\util\connection.py", line 56, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\socket.py", line 748, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 600, in urlopen
chunked=chunked)
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 343, in _make_request
self._validate_conn(conn)
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 849, in _validate_conn
conn.connect()
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 314, in connect
conn = self._new_conn()
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 180, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x036E0D10>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\adapters.py", line 445, in send
timeout=timeout
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\util\retry.py", line 398, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.appsign.vip', port=2688): Max retries exceeded with url: /douyin/device/new/version/2.7.0 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x036E0D10>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File ".\video_download_run.py", line 13, in <module>
douyin_crawl = DouyinCrawl()
File "..\www_douyin_com\spiders\douyin_crawl.py", line 69, in __init__
self.common_params = common_params()
File "..\www_douyin_com\common\utils.py", line 74, in common_params
device_info = getDevice()
File "..\www_douyin_com\common\utils.py", line 41, in getDevice
resp = requests.get(API + "/douyin/device/new/version/2.7.0").json()
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 512, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 622, in send
r = adapter.send(request, **kwargs)
File "C:\Users\quran\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\adapters.py", line 513, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.appsign.vip', port=2688): Max retries exceeded with url: /douyin/device/new/version/2.7.0 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x036E0D10>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
Traceback (most recent call last):
File "video_download_run.py", line 32, in <module>
douyin_crawl.grab_user_media(sys.argv[-1], "USER_POST")
File "../www_douyin_com/spiders/douyin_crawl.py", line 127, in grab_user_media
hasmore, max_cursor = self.grab_video(user_id, action, content)
File "../www_douyin_com/spiders/douyin_crawl.py", line 166, in grab_video
self.download_user_video(aweme_id, **download_item)
File "../www_douyin_com/spiders/douyin_crawl.py", line 233, in download_user_video
video_content = self.download_video(aweme_id)
File "../www_douyin_com/spiders/douyin_crawl.py", line 277, in download_video
sign = getSign(self.__get_token(), query_params)
File "../www_douyin_com/common/utils.py", line 62, in getSign
sign = resp['data']
KeyError: 'data'
python3 douyin_crawl.py
Traceback (most recent call last):
File "douyin_crawl.py", line 342, in <module>
douyin.grab_comment_main(aweme_id, 0)
File "douyin_crawl.py", line 157, in grab_comment_main
has_more = self.__grab_comment(aweme_id, upvote_bound)
File "douyin_crawl.py", line 216, in __grab_comment
hasmore = int(comment_content.get("hasmore"))
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
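I have confirmed that I changed the token and the video ID. I looked around and many similar projects stopped working in May this year; is this project still functioning?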
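The `TypeError` in the traceback above comes from calling `int()` on `None`, which is what `dict.get` returns when the response has no `hasmore` key. A minimal sketch of a guard, assuming the response is a plain dict; `read_hasmore` is a hypothetical helper, and treating a missing key as "no more pages" is an assumption:

```python
def read_hasmore(comment_content):
    """Return the `hasmore` flag as an int, or 0 when it is absent.

    Hypothetical helper: defaulting to 0 (stop paging) is an assumption,
    not the repository's behavior.
    """
    if not isinstance(comment_content, dict):
        return 0
    value = comment_content.get("hasmore")
    if value is None:
        return 0
    return int(value)
```

A missing `hasmore` often means the request itself was rejected, so logging the full response at that point may help diagnose the cause.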
Traceback (most recent call last):
File "video_download_run.py", line 13, in <module>
douyin_crawl = DouyinCrawl()
TypeError: __init__() missing 1 required positional argument: 'token'
resp = requests.get(API + "/douyin/device/new/version/2.7.0").json()
resp comes back as {'message': 'Internal Server Error'}. Has the service gone offline on your side?
Running
python3 douyin_crawl.py
fails with:
Traceback (most recent call last):
File "douyin_crawl.py", line 12, in <module>
from www_douyin_com.common.utils import *
ModuleNotFoundError: No module named 'www_douyin_com'
Is this a problem with my setup? I'm a Python beginner, so any pointers are appreciated :)
Python version: 3.7.0
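`ModuleNotFoundError: No module named 'www_douyin_com'` usually means the directory that contains the `www_douyin_com` package is not on `sys.path`, which is the case when a script inside the package is run directly. One possible workaround is sketched below, under the assumption that the script lives at `www_douyin_com/spiders/douyin_crawl.py` as the tracebacks in this thread suggest; `ensure_on_path` is a hypothetical helper, not part of the repository:

```python
import sys
from pathlib import Path


def ensure_on_path(script_file, levels_up=2):
    """Put an ancestor directory of script_file on sys.path and return it.

    With levels_up=2, a script at <root>/www_douyin_com/spiders/x.py
    yields <root>, the directory that contains the package.
    """
    root = str(Path(script_file).resolve().parents[levels_up])
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

Calling `ensure_on_path(__file__)` at the top of the script, before the `from www_douyin_com...` import, is one way to apply it. Running from the repository root with `python3 -m www_douyin_com.spiders.douyin_crawl` is another, assuming the directories are importable as packages.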
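As the title says: why can't the getSign method be found?
Hello, the videos this crawler downloads seem to lose quality. They are normally 1280x720, but many of the ones I downloaded have a lower resolution than that.
For example, the supported Python version, how to run it, and so on.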
File "video_download_run.py", line 11, in <module>
from www_douyin_com.spiders.douyin_crawl import DouyinCrawl
File "..\www_douyin_com\spiders\douyin_crawl.py", line 4, in <module>
from backports import csv
ImportError: cannot import name 'csv' from 'backports'
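After following the WeChat official account to get a token, replacing the token value, and running the script, I get: TypeError: __init__() missing 1 required positional argument: 'token'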
python video_download_run.py -m -upost 58065297584
Traceback (most recent call last):
File "video_download_run.py", line 11, in <module>
from www_douyin_com.spiders.douyin_crawl import DouyinCrawl
File "..\www_douyin_com\spiders\douyin_crawl.py", line 4, in <module>
from backports import csv
ModuleNotFoundError: No module named 'backports'
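The response time is too long
Hello, the program often disconnects while fetching the token. How can I work around this?
__init__() missing 1 required positional argument: 'token'
Zhang, the crawler has stopped working again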