
Tesla Service Manual Scraper

This script downloads the Tesla Service Manual into a local docs/ folder for offline access. It works on Windows and macOS (macOS support thanks to CollinHeist).

Setup

  1. Go into secrets.py and fill in tesla_account_email and tesla_account_password with your Tesla account email and password (see the configuration sketch after this list).
  2. Go into scrape.py and enter the index URL of the manual you want saved by changing the service_manual_index and base_url variables. It defaults to the Model S (2012-2020) manual, but it is confirmed to work with the Model 3 as well.
  3. If you have 2FA or other login challenges, consider increasing login_delay in secrets.py to 2 or 3 seconds so you have time to enter your credentials manually.
  4. Set up Python 3. See the tutorial at: https://wiki.python.org/moin/BeginnersGuide/Download
  5. Set up Selenium for Python. To use the required stealth module, you must use the Chromium webdriver. See the tutorial at: https://blog.testproject.io/2019/07/16/installing-selenium-webdriver-using-python-chrome/
  6. Install the required packages with pip (including requests, selenium, selenium-stealth, and beautifulsoup4). On Windows, run the following commands in Command Prompt (CMD):
    1. cd C:\Users\Anson\Desktop\TeslaServiceManualScraper [template, the path should go wherever you saved this readme]
    2. run pip install -r requirements.txt
  7. Before scraping, it is always a good idea to use a VPN of some sort to avoid any issues with your account. I didn't run into any issues personally, but you can never be too safe. It is also worthwhile to open a new account to claim the manuals instead of using a personal account.
  8. Run scrape.py by typing python scrape.py
  9. The browser will automatically restart when it encounters problems with the files or login status.
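
As a reference for steps 1 and 2, here is a minimal sketch of the values involved. The variable names come from the project itself; the example values and the URL are placeholders (assumptions), so copy the real index URL of the manual you want from service.tesla.com:

    # secrets.py (sketch -- the real file also contains the tesla_login helper)
    tesla_account_email = "you@example.com"    # your Tesla account email
    tesla_account_password = "hunter2"         # your Tesla account password
    login_delay = 3                            # seconds to pause for manual 2FA/CAPTCHA entry

    # scrape.py (sketch -- illustrative URL only; use the index URL of your manual)
    base_url = "https://service.tesla.com/docs/ModelS/ServiceManual/en-us/"
    service_manual_index = base_url + "index.html"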

Viewing offline

Option 1: Easy Way

  1. Go into the docs/ folder and open index.html. You will get roughly 99% of the service manual that way, but with no search functionality.

Option 2: HTTP Server (thanks to TheNexusAvenger)

  1. Open CMD on Windows and change the directory to the docs folder, e.g. cd C:\Users\Anson\Desktop\TeslaServiceManualScraper\docs
  2. Run the following command: python -m http.server (Python needs to be installed, of course; a one-line variant is shown after this list)
  3. Use a web browser and navigate to: http://localhost:8000/ to see the full service manual including search.
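
The commands above are the documented route; as a small convenience, on Python 3.7+ the built-in server can also be pointed at the folder directly, so the cd step becomes optional:

    python -m http.server 8000 --directory docs

Either way, browse to http://localhost:8000/ for the full manual including search.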

Tips

  • A full scrape of the Model 3 service manual took over 30 minutes. The script is set up so that you can stop it and continue later.
  • Keep an eye out: Tesla's website seems to log you out after about 250 pages or roughly 20 minutes of continuous refreshing, so it may be worthwhile to run this on the side while keeping an eye on your login status.
  • Total file size of the Model 3 service manual is roughly 2.2GB.
  • On your first run, Tesla might throw a CAPTCHA or land on an error page. Most of the time, simply rerunning the script will work.
  • If you get an "Access Denied" page, Tesla has likely blocked you; the block usually expires after about 10 minutes.
  • If you need to reset for any reason (for example, to scrape another manual), delete or move the docs/ folder and the dict.picle file.
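
For the reset described in the last tip, something like the following from the project folder is all that is needed on Windows (filenames as listed above; adjust the paths if your copy differs):

    rmdir /s /q docs
    del dict.picle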

Contributors

ansonlai, collinheist, hlarsen, rashkash103


Issues

AttributeError: 'WebDriver' object has no attribute 'find_element_by_css_selector'

Chrome 103. Is this a ChromeDriver problem?

C:\Program Files\Python310\TeslaServiceManualScraper-master>py scrape.py
C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py:1: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
from distutils.command.clean import clean

DevTools listening on ws://127.0.0.1:54174/devtools/browser/3f8df9b0-340b-4578-ac26-aa680caabf37
****** SESSION LOADED ******
Traceback (most recent call last):
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 272, in
run()
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 261, in run
driver.get_index()
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 46, in get_index
driver = tesla_login(self.driver)
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\secrets.py", line 10, in tesla_login
driver.find_element_by_css_selector("#form-input-identity").send_keys(tesla_account_email)
AttributeError: 'WebDriver' object has no attribute 'find_element_by_css_selector'
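
For anyone hitting this: it is not a ChromeDriver problem. Recent Selenium 4 releases removed the old find_element_by_* helpers (they were only deprecated earlier in the 4.x line), so scripts written against the old API raise exactly this AttributeError. A minimal sketch of the modern call, assuming you are patching the lookup in secrets.py yourself:

    from selenium.webdriver.common.by import By

    # Selenium 4 style: one find_element() call with an explicit locator strategy
    driver.find_element(By.CSS_SELECTOR, "#form-input-identity").send_keys(tesla_account_email)

Alternatively, pinning an older Selenium release where the helpers still exist (for example 4.2.0 or below) avoids touching the script at all.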

reCAPTCHA issue, and then an "Access Denied" error

[Screenshot: Screen Shot 2022-05-23 at 18 17 42]

It passes the credentials, but there is a reCAPTCHA box that needs checking. After it failed, I tried again, checked the box and completed the human verification, and then I got Access Denied.

Logging in manually with my account works fine though.

New Tip: Search Functionality

Under the Tips section, there is a mention of getting other files for the styling of the website. While trying out a local download of the Model Y service manual, I noticed there is another index.json file that can be downloaded to get searching. The main caveat is that the search functionality requires fetching that file over an HTTP request, which necessitates running the manual with an HTTP server. python -m http.server works just fine from what I can tell. Adding these 2 notes may be beneficial.
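
A quick way to try this, assuming index.json ended up in docs/ alongside index.html (placement depends on what your scrape saved):

    cd docs
    python -m http.server 8000

Then browse to http://localhost:8000/ and use the search box. Opening index.html straight from the filesystem cannot make the HTTP request for index.json, which is why search does not work with Option 1.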

AttributeError: 'NoneType' object has no attribute 'endswith'

images to be downloaded: 7740
visited: 952
upcoming: 3
images to be downloaded: 7752
visited: 953
upcoming: 2
images to be downloaded: 7753
visited: 954
upcoming: 1
images to be downloaded: 7754
visited: 955
upcoming: 0
images to be downloaded: 7759
****** SESSION SAVED ******
Traceback (most recent call last):
File "C:\TeslaServiceManual\ts1\scrape.py", line 270, in
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 265, in run
clean_img_urls()
File "C:\TeslaServiceManual\ts1\scrape.py", line 248, in clean_img_urls
if url.endswith('jpg'):
AttributeError: 'NoneType' object has no attribute 'endswith'

help me.....
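
This crash means the collected image URL list contains a None entry, most likely from an img tag without a usable src. A hedged sketch of a defensive filter, assuming a helper shaped roughly like the clean_img_urls in the traceback (the real function's internals may differ):

    def clean_img_urls(img_urls):
        # Skip missing/empty entries before checking extensions so a None
        # src no longer crashes the run with AttributeError.
        cleaned = []
        for url in img_urls:
            if not url:
                continue
            if url.endswith('jpg') or url.endswith('png'):
                cleaned.append(url)
        return cleaned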

Traceback issues with find_element return self.execute(Command.FIND_ELEMEN

Using Selenium 4.2.0 with the latest scrape.py script. It seems like an Access Denied issue; it kicks me out immediately with the error below. The script gets stuck on the Model S Electrical/Harness section 1710 (a large quantity of images).

C:\Program Files\Python310\TeslaServiceManualScraper-master>py scrape.py
C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py:1: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
from distutils.command.clean import clean

DevTools listening on ws://127.0.0.1:65188/devtools/browser/35abb965-bbe7-49d4-a23f-84559bce2819
****** SESSION LOADED ******
Traceback (most recent call last):
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 272, in
run()
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 261, in run
driver.get_index()
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 46, in get_index
driver = tesla_login(self.driver)
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\secrets.py", line 10, in tesla_login
driver.find_element_by_css_selector("#form-input-identity").send_keys(tesla_account_email)
File "C:\Program Files\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 814, in find_element_by_css_selector
return self.find_element(by=By.CSS_SELECTOR, value=css_selector)
File "C:\Program Files\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 1251, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "C:\Program Files\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 430, in execute
self.error_handler.check_response(response)
File "C:\Program Files\Python310\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 247, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"#form-input-identity"}
(Session info: chrome=103.0.5060.114)
Stacktrace:
Backtrace:
Ordinal0 [0x00DF6463+2188387]
Ordinal0 [0x00D8E461+1762401]
Ordinal0 [0x00CA3D78+802168]
Ordinal0 [0x00CD1880+989312]
Ordinal0 [0x00CD1B1B+989979]
Ordinal0 [0x00CFE912+1173778]
Ordinal0 [0x00CEC824+1099812]
Ordinal0 [0x00CFCC22+1166370]
Ordinal0 [0x00CEC5F6+1099254]
Ordinal0 [0x00CC6BE0+945120]
Ordinal0 [0x00CC7AD6+948950]
GetHandleVerifier [0x010971F2+2712546]
GetHandleVerifier [0x0108886D+2652765]
GetHandleVerifier [0x00E8002A+520730]
GetHandleVerifier [0x00E7EE06+516086]
Ordinal0 [0x00D9468B+1787531]
Ordinal0 [0x00D98E88+1805960]
Ordinal0 [0x00D98F75+1806197]
Ordinal0 [0x00DA1DF1+1842673]
BaseThreadInitThunk [0x76D36739+25]
RtlGetFullPathName_UEx [0x77E88FEF+1215]
RtlGetFullPathName_UEx [0x77E88FBD+1165]
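
The NoSuchElementException means the login form never appeared, which fits the Access Denied theory: the page Selenium landed on simply has no #form-input-identity field. A sketch (not the repo's code) of a more forgiving lookup that waits briefly and reports what actually loaded:

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    try:
        # Wait up to 20 seconds for the email field instead of failing instantly.
        field = WebDriverWait(driver, 20).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "#form-input-identity"))
        )
        field.send_keys(tesla_account_email)
    except Exception:
        # Most likely an Access Denied or CAPTCHA page; the title usually says which.
        print("Login field not found; page title was:", driver.title)
        raise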

Browser sizing issue

[Screenshots: full screen vs. partial screen]
There are some issues with browser sizing: with the browser at full screen, some files are missing, but with only a partially sized browser window I can see the whole picture.
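
If the rendered layout really does depend on window size, forcing a consistent size before scraping may help; a small sketch (the script may already set its own window options, so treat this as an experiment):

    # Set an explicit window size so pages render the same way on every run.
    driver.set_window_size(1920, 1080)
    # Or via Chrome options before the driver is created:
    # options.add_argument("--window-size=1920,1080")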

IndexError (already downgraded to Selenium 4.2.0)

I am getting this same issue. It launches the browser, populates the username, refreshes the page, populates the password, and submits. The page then asks for a passcode (2FA), refreshes, says I don't have access to the page, and kicks me back to the 2FA prompt. I enter the code, click submit (or wait), and the browser closes.

C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master>python scrape.py
C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master\scrape.py:1: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
from distutils.command.clean import clean

DevTools listening on ws://127.0.0.1:53787/devtools/browser/e4778eca-0cef-4c56-9229-57f471a67201
****** SESSION LOADED ******
[7332:23804:0722/094217.385:ERROR:device_event_log_impl.cc(214)] [09:42:17.386] USB: usb_device_handle_win.cc:1048 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
[7332:23804:0722/094217.395:ERROR:device_event_log_impl.cc(214)] [09:42:17.396] USB: usb_device_handle_win.cc:1048 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
Traceback (most recent call last):
File "C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master\scrape.py", line 273, in
run()
File "C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master\scrape.py", line 262, in run
driver.get_index()
File "C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master\scrape.py", line 53, in get_index
window1 = driver.window_handles[1]
IndexError: list index out of range

C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master>
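
The IndexError means only one browser window existed when the script expected a second one (driver.window_handles[1]), typically because the login flow never got far enough to open the manual in a new tab. A hedged sketch of a wait that avoids the hard crash, assuming the surrounding code matches the traceback:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    # Wait up to 30 seconds for the second window instead of indexing blindly.
    WebDriverWait(driver, 30).until(EC.number_of_windows_to_be(2))
    window1 = driver.window_handles[1]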

Unable to scrape, need help

Can I get some help?
I had to make some tweaks to get it to download at all. Now it downloads for a while, but the browser closes at exactly the same spot every time: while downloading the Model S manual, it reaches section 12 and simply closes the browser, then opens back up and I authenticate again, but it starts at the very beginning of the manual. It does not pick up from section 12.

Any help is appreciated.

Changes to the requirements document:
beautifulsoup4==4.10.0
certifi==2020.11.8
chardet==3.0.4
future==0.18.2
idna==2.10
lxml==4.8.0
Pillow==9.1.1
pywin32-ctypes==0.2.0
requests==2.25.1
soupsieve==2.2.1
urllib3==1.26.2
zipp==3.4.0

Scraper errors

Hi, thanks for creating this script. I am receiving the error below. Any idea why?

Also, in case it matters: I noticed that the script does not log in directly to service.tesla.com; rather, it logs into tesla.com. After it eventually gets to the manual I specified, it fails.

DevTools listening on ws://127.0.0.1:57338/devtools/browser/cd79339b-94d9-4d66-af54-37bfc30
****** SESSION LOADED ******
[7452:19956:0129/225725.215:ERROR:device_event_log_impl.cc(215)] [22:57:25.215] USB: usb_device_handle_win.cc:1046 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
[7452:19956:0129/225725.216:ERROR:device_event_log_impl.cc(215)] [22:57:25.216] USB: usb_device_handle_win.cc:1046 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
Traceback (most recent call last):
File "c:\users\j\downloads\ts\scrape.py", line 273, in <module>
run()
File "c:\users\j\downloads\ts\scrape.py", line 262, in run
driver.get_index()
File "c:\users\j\downloads\ts\scrape.py", line 53, in get_index
window1 = driver.window_handles[1]
IndexError: list index out of range

macOS + Python

Installed selenium, selenium-stealth, beautifulsoup4 and also ran chromedriver

Traceback (most recent call last):
File "scrape.py", line 7, in
import requests
ModuleNotFoundError: No module named 'requests'
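
ModuleNotFoundError: No module named 'requests' just means requests was not installed into the Python interpreter that runs scrape.py, which is easy to hit on macOS where several Pythons can coexist. Installing with that same interpreter usually fixes it, for example:

    python3 -m pip install -r requirements.txt
    python3 scrape.py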
