rty813 / doc_downloader
Download documents from Docin (豆丁), Taodocs (淘豆), Doc88 (道客巴巴), Book118 (原创力), and Jinchutou (金锄头), and automatically convert them to PDF.
The error output is below:
Traceback (most recent call last):
File "docDownloader.py", line 44, in <module>
File "fire\core.py", line 141, in Fire
File "fire\core.py", line 466, in _Fire
File "fire\core.py", line 681, in _CallAndUpdateTrace
File "docDownloader.py", line 16, in main
File "doc88.py", line 43, in download
File "selenium\webdriver\remote\webelement.py", line 80, in click
File "selenium\webdriver\remote\webelement.py", line 633, in _execute
File "selenium\webdriver\remote\webdriver.py", line 321, in execute
File "selenium\webdriver\remote\errorhandler.py", line 242, in check_response
selenium.common.exceptions.ElementClickInterceptedException: Message: element click intercepted: Element
[4656] Failed to execute script docDownloader
As the title says.
It stays stuck at 0% pending.
http://www.docin.com/p-1745803343.html
It would be nice to have a requirements.txt.
I downloaded the packaged Windows release (7z), extracted it, and tried to download https://www.doc88.com/p-8116099467414.html
The program crashed immediately.
After cloning the project, running it locally fails.
System: Mac, Python version: 2.7.
All the required packages are already installed.
python docDownloader.py
File "docDownloader.py", line 12
SyntaxError: Non-ASCII character '\xe8' in file docDownloader.py on line 12, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
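The SyntaxError above is what Python 2 raises when a source file contains non-ASCII bytes but has no encoding declaration (PEP 263). Python 3 reads source as UTF-8 by default, so running the script with Python 3 is one way around it; the other tracebacks in this thread come from Python 3.8 and 3.9. As a minimal sketch, the check below reports whether a file's first two lines carry a declaration. The helper name `declared_encoding` is made up for illustration and is not part of the project.

```python
import re

# Pattern given in PEP 263 for a coding declaration comment.
_CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source_text):
    """Return the encoding named in the first two lines, or None if there is none."""
    for line in source_text.splitlines()[:2]:
        match = _CODING_RE.match(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    print(declared_encoding("# -*- coding: utf-8 -*-\nprint('ok')\n"))
    print(declared_encoding("print('ok')\n"))
```

Under Python 2, adding `# -*- coding: utf-8 -*-` as the first or second line of docDownloader.py would silence this particular error, though the rest of the code may still assume Python 3.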
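Hello, your Doc88 (道客巴巴) download code has a small problem now, probably because the page format changed. I have modified it for you.
Link: https://pan.baidu.com/s/1nuY2B7ymEVqNQSbooccgNQ
Extraction code: fv8c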
For the URL https://www.docin.com/p-2300126438.html, the following error appears:
PS D:\docDownloader> .\docDownloader.exe
请输入网址(输入exit退出):https://www.docin.com/p-2300126438.html
Traceback (most recent call last):
File "docDownloader.py", line 44, in <module>
File "fire\core.py", line 141, in Fire
File "fire\core.py", line 466, in _Fire
File "fire\core.py", line 681, in _CallAndUpdateTrace
File "docDownloader.py", line 29, in main
File "douding.py", line 13, in download
ValueError: substring not found
[27544] Failed to execute script docDownloader
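`ValueError: substring not found` is the error `str.index` raises when the text it searches for is absent, which would fit a page whose markup no longer contains an expected marker. A minimal sketch of a more forgiving lookup follows; `extract_between` and its marker arguments are hypothetical, since the actual code at douding.py line 13 is not shown here.

```python
def extract_between(text, start_marker, end_marker):
    """Return the text between two markers, or None if either marker is missing."""
    start = text.find(start_marker)  # find() returns -1 instead of raising
    if start == -1:
        return None
    start += len(start_marker)
    end = text.find(end_marker, start)
    if end == -1:
        return None
    return text[start:end]


if __name__ == "__main__":
    print(extract_between("<title>abc</title>", "<title>", "</title>"))
    print(extract_between("no markers here", "<title>", "</title>"))
```

Returning `None` lets the caller print a readable message such as "page layout not recognized" instead of crashing with a traceback.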
Microsoft Windows [版本 6.1.7601]
版权所有 (c) 2009 Microsoft Corporation。保留所有权利。
E:>cd E:\chrome下载\docDownloader1.2.3\docDownloader
E:\chrome下载\docDownloader1.2.3\docDownloader>docdownloader.exe
Traceback (most recent call last):
File "C:\Program Files\Python38\Lib\site-packages\PyInstaller\hooks\rthooks\pyi_rth_multiprocessing.py", line 17, in <module>
File "PyInstaller\loader\pyimod03_importers.py", line 531, in exec_module
File "multiprocessing\__init__.py", line 16, in <module>
File "PyInstaller\loader\pyimod03_importers.py", line 531, in exec_module
File "multiprocessing\context.py", line 6, in <module>
File "PyInstaller\loader\pyimod03_importers.py", line 531, in exec_module
File "multiprocessing\reduction.py", line 16, in <module>
File "PyInstaller\loader\pyimod03_importers.py", line 531, in exec_module
File "socket.py", line 49, in <module>
ImportError: DLL load failed while importing _socket: 参数错误。
[4048] Failed to execute script pyi_rth_multiprocessing
E:\chrome下载\docDownloader1.2.3\docDownloader>
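1. Hangs
After testing several files, I found that every document over 500 pages hangs, usually somewhere between pages 300 and 500, with the progress bar showing 51%|█████ | 470/746 [09:26<09:05, 1.51s/it]
and it never updates again.
2. Error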
51%|█████ | 380/746 [09:26<09:05, 1.49s/it]
Traceback (most recent call last):
File "G:\Programwork\Python\doc_downloader\full_code\doc_downloader-master\docDownloader.py", line 44, in <module>
fire.Fire(main)
File "C:\Users\billeyang\AppData\Local\Programs\Python\Python39\lib\site-packages\fire\core.py", line 138, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "C:\Users\billeyang\AppData\Local\Programs\Python\Python39\lib\site-packages\fire\core.py", line 463, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "C:\Users\billeyang\AppData\Local\Programs\Python\Python39\lib\site-packages\fire\core.py", line 672, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "G:\Programwork\Python\doc_downloader\full_code\doc_downloader-master\docDownloader.py", line 16, in main
doc88.download(url)
File "G:\Programwork\Python\doc_downloader\full_code\doc_downloader-master\doc88.py", line 78, in download
img_data = driver.execute_script(js_cmd)
File "C:\Users\billeyang\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 634, in execute_script
return self.execute(command, {
File "C:\Users\billeyang\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\billeyang\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchWindowException: Message: no such window: target window already closed
from unknown error: web view not found
(Session info: headless chrome=99.0.4844.51)
Process finished with exit code 1
The error says "target window already closed", but I did not touch the Chrome window at all.
Traceback (most recent call last):
File "/Volumes/Downloads/doc_downloader-master 3/doc88.py", line 66, in download
element = driver.find_element_by_id(canvas_id)
File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 360, in find_element_by_id
return self.find_element(by=By.ID, value=id_)
File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 976, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="outer_page_1"]"}
(Session info: headless chrome=96.0.4664.55)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Volumes/Downloads/doc_downloader-master 3/docDownloader.py", line 44, in <module>
fire.Fire(main)
File "/usr/local/lib/python3.9/site-packages/fire/core.py", line 138, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/usr/local/lib/python3.9/site-packages/fire/core.py", line 463, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/usr/local/lib/python3.9/site-packages/fire/core.py", line 672, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/Volumes/Downloads/doc_downloader-master 3/docDownloader.py", line 16, in main
doc88.download(url)
File "/Volumes/Downloads/doc_downloader-master 3/doc88.py", line 69, in download
element = driver.find_element_by_id(canvas_id)
File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 360, in find_element_by_id
return self.find_element(by=By.ID, value=id_)
File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 976, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="outer_page_1"]"}
(Session info: headless chrome=96.0.4664.55)
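A `NoSuchElementException` like the one above can mean either that the page has not finished rendering yet or that the site no longer uses the `outer_page_1` id. For the first case, retrying the lookup for a few seconds is a common approach. The sketch below is a generic polling helper with a made-up name, `wait_for`; it is not the project's code, and it will not help if the element id has really changed.

```python
import time


def wait_for(probe, timeout=10.0, interval=0.5):
    """Call probe() until it returns without raising, or until timeout seconds pass.

    The last exception is re-raised once the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return probe()
        except Exception:
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)
```

With Selenium 4 the call might look like `wait_for(lambda: driver.find_element(By.ID, "outer_page_1"))`. Selenium also ships `WebDriverWait` for the same purpose, and the `find_element_by_id` helper seen in the traceback was removed in Selenium 4.3 in favor of `find_element(By.ID, ...)`.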
I tried the online version and it could not download.
Taodocs (淘豆) has probably been redesigned.
请输入网址(输入exit退出):https://www.doc88.com/p-28761857292280.html
DevTools listening on ws://127.0.0.1:54380/devtools/browser/c41bbe33-bb5c-4220-ad2a-2049ab2a857c
道客巴巴: 《TB10097-2019 铁路房屋建筑设计标准 - 道客巴巴》
Traceback (most recent call last):
File "docDownloader.py", line 44, in <module>
File "fire\core.py", line 141, in Fire
File "fire\core.py", line 466, in _Fire
File "fire\core.py", line 681, in _CallAndUpdateTrace
File "docDownloader.py", line 16, in main
File "doc88.py", line 43, in download
File "selenium\webdriver\remote\webelement.py", line 80, in click
File "selenium\webdriver\remote\webelement.py", line 633, in _execute
File "selenium\webdriver\remote\webdriver.py", line 321, in execute
File "selenium\webdriver\remote\errorhandler.py", line 242, in check_response
selenium.common.exceptions.ElementClickInterceptedException: Message: element click intercepted: Element <div class="surplus-btn" id="continueButton">...</div> is not clickable at point (439, 9). Other element would receive the click: <a href="javascript:;" title="缩小" id="zoomOutButton">...</a>
(Session info: chrome=92.0.4515.159)
[28632] Failed to execute script docDownloader
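The message above says the click aimed at `continueButton` at point (439, 9) would land on `zoomOutButton` instead, which suggests a toolbar is covering the button. One common workaround is to scroll the element into view and trigger the click from JavaScript, which bypasses the hit test. This is a sketch under that assumption, not the project's fix; `click_via_js` is a hypothetical helper, and `driver` is any object exposing Selenium's `execute_script(script, *args)`.

```python
def click_via_js(driver, element):
    """Scroll element to the middle of the viewport, then click it from JavaScript."""
    driver.execute_script("arguments[0].scrollIntoView({block: 'center'});", element)
    driver.execute_script("arguments[0].click();", element)
```

In doc88.py this would replace the direct `.click()` call on the element looked up by the id `continueButton`.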
After installing with pip, which program do I run?
Can pages that cannot be previewed be downloaded? WeChat: nlanguage
Hello, for Docin (豆丁) files in the docin format where head and page are stored separately, will they be unpacked as SWF?
WeChat: nlanguage
For some Docin (豆丁) links, the downloaded JPG files are only 22 bytes.
The content of the image file is: file can not read now!
Download link: https://www.docin.com/p-1643955256.html
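The text "file can not read now!" is exactly 22 bytes long, so these files appear to be a plain-text error response saved under a .jpg name rather than image data. A minimal sketch for catching this before the PDF step is to check the JPEG signature; `looks_like_jpeg` is a hypothetical helper, not something the project provides.

```python
# Every JPEG file starts with the bytes FF D8 FF.
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(data):
    """Return True if the byte string begins with the JPEG signature."""
    return data[:3] == JPEG_MAGIC


if __name__ == "__main__":
    print(looks_like_jpeg(b"\xff\xd8\xff\xe0" + b"\x00" * 16))
    print(looks_like_jpeg(b"file can not read now!"))
```

A downloader could retry or skip a page when this check fails instead of writing the response to disk as an image.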
I would be very grateful if this could be packaged as an exe file.
# Get the number of pages
num_of_pages = driver.find_element_by_id('readshop').find_element_by_class_name(
'mainpart').find_element_by_class_name('shop3').find_elements_by_class_name('text')[-1].get_attribute('innerHTML')
In this code, the shop3 element no longer exists on the page.
所在位置 行:1 字符: 21
~~~~~~~~~~~~~~~~
module 'img2pdf' has no attribute 'conpdf'
D:\docDownloader>docDownloader.exe
请输入网址(输入exit退出):https://max.book118.com/html/2019/1210/6155102204002131.shtm
Traceback (most recent call last):
File "docDownloader.py", line 44, in <module>
File "site-packages\fire\core.py", line 138, in Fire
File "site-packages\fire\core.py", line 463, in _Fire
File "site-packages\fire\core.py", line 672, in _CallAndUpdateTrace
File "docDownloader.py", line 20, in main
File "book118.py", line 22, in download
File "site-packages\selenium\webdriver\chrome\webdriver.py", line 76, in __init__
File "site-packages\selenium\webdriver\remote\webdriver.py", line 157, in __init__
File "site-packages\selenium\webdriver\remote\webdriver.py", line 252, in start_session
File "site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
File "site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot find Chrome binary
[13500] Failed to execute script docDownloader
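"cannot find Chrome binary" means chromedriver started but could not locate a Chrome installation, so the tool needs Chrome installed in addition to the driver. As a rough sketch, the check below looks for a browser executable on PATH before a session is started. The candidate names are assumptions and `find_chrome_binary` is a hypothetical helper; on Windows, Chrome is often installed outside PATH, so an empty result there does not prove Chrome is missing.

```python
import shutil

# Executable names to look for on PATH; this list is an assumption, adjust per system.
DEFAULT_CANDIDATES = ("chrome", "google-chrome", "chromium", "chromium-browser")


def find_chrome_binary(candidates=DEFAULT_CANDIDATES):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    print(find_chrome_binary() or "No Chrome-like browser found on PATH")
```

If Chrome lives in a non-standard location, its path can be given to Selenium through the `binary_location` attribute of the Chrome options object.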
===================================================
Python, Node.js, and R environments are already installed.