OReilly-DL

This is a fork of lorenzodifuccia's repo.

Download and generate an EPUB of your favourite book from O'Reilly Books Online.
I'm not responsible for the use of this programme, which is for personal and educational purposes only.
Note that I've merged many PRs that work like magic, but I've also made some tweaks without careful consideration, so it may be buggy, and there is little pesticide to mitigate that.

Overview:

EPUB FORMAT:

The EPUB® format provides a means of representing, packaging and encoding structured and semantically enhanced Web content — including HTML, CSS, SVG and other resources — for distribution in a single-file container.

  • META-INF (container.xml)
  <?xml version="1.0" encoding="UTF-8"?>
  <container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
    <rootfiles>
      <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
    </rootfiles>
  </container>
  • mimetype
  application/epub+zip
  • content.opf
  <?xml version="1.0" encoding="UTF-8"?>
  <package xmlns="http://www.idpf.org/2007/opf"
         xmlns:redirect="http://xml.apache.org/xalan/redirect"
         version="3.1"
         unique-identifier="bookid">
   <metadata xmlns:opf="http://www.idpf.org/2007/opf"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:identifier id="bookid">XXXXXXXXXXXXX</dc:identifier>
      ...
      <meta name="cover" content="cover-image"/>
      ...
   </metadata>
   <manifest>
      <item id="css" href="../css" media-type="text/css"/>
      <item id="cover" href="cover.xhtml" media-type="application/xhtml+xml"/>
      <item id="cover-image"
            href="images/cover.jpg"
            media-type="image/jpeg"
            properties="cover-image"/>
      ...
   </manifest>
   <spine toc="ncx">
    <itemref idref="f_0077" />
    ...
   </spine>
   <guide>
      <reference type="cover" title="Cover" href="cover.xhtml"/>
   </guide>
  </package>
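
As a minimal sketch of how the pieces above connect: a reader opens META-INF/container.xml first and follows its rootfile entry to the package document (content.opf here). The helper name `find_package_path` is hypothetical and not part of this programme; only Python's standard library is used.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <container> element shown above.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def find_package_path(container_xml: bytes) -> str:
    """Return the full-path of the first <rootfile> in container.xml.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no <rootfile> element")
    return rootfile.attrib["full-path"]
```

Passing the raw bytes of the container.xml listed above would yield the path stored in its `full-path` attribute.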

An EPUB is basically a zip archive of all the files above plus other metadata. One oddity remains: an archive built with Python's shutil.make_archive (zip format) opens in Apple's Books, whilst one built with the Unix zip utility doesn't. (There has to be some difference between the two. It's peculiar and I'm curious, so I'll have to dig deeper.)
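
For reference, the EPUB container rules ask for the mimetype file to be the first entry in the archive and to be stored uncompressed. The sketch below packages a directory that way with Python's zipfile module. It is an illustration under assumptions, not this programme's packaging code and not a diagnosis of the Books behaviour described above: `build_epub` is a hypothetical name, and the source directory is assumed to contain a mimetype file at its top level.

```python
import zipfile
from pathlib import Path


def build_epub(src_dir: str, out_path: str) -> None:
    """Zip src_dir into out_path with mimetype first and uncompressed.

    Hypothetical helper. out_path should live outside src_dir,
    otherwise the half-written archive would be picked up as content.
    """
    src = Path(src_dir)
    mimetype = src / "mimetype"
    with zipfile.ZipFile(out_path, "w") as zf:
        # First entry: mimetype, stored without compression.
        zf.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        # Every other file, compressed, with paths relative to src_dir.
        for path in sorted(src.rglob("*")):
            if path.is_file() and path != mimetype:
                zf.write(
                    path,
                    path.relative_to(src).as_posix(),
                    compress_type=zipfile.ZIP_DEFLATED,
                )
```

A call such as `build_epub("book_dir", "book.epub")` (both paths are placeholders) would produce the archive. With the Unix zip utility, the equivalent is usually done in two passes, adding mimetype first with compression disabled.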

For more info on EPUB, please check here

Usage:

$ git clone https://github.com/lorenzodifuccia/safaribooks.git
# or, to get this fork:
$ git clone https://github.com/leignshanie/oreilly-dl.git
Cloning into 'oreilly-dl'...

$ cd oreilly-dl
$ pip3 install -r requirements.txt

Programme options:

$ python3 oreilly-dl.py --help
usage: oreilly-dl.py [--cred <EMAIL:PASS>] [--no-cookies] [--no-kindle]
                      [--preserve-log] [--help]
                      <BOOK ID>

Download and generate EPUB of your favourite books from O'Reilly Books.

positional arguments:
  <BOOK ID>            Book digits ID that you want to download.
                       You can find it in the URL (X-es):
                       `https://learning.oreilly.com/library/view/book-
                       name/XXXXXXXXXXXXX/`

optional arguments:
  --cred <EMAIL:PASS>  Credentials used to perform the auth login on Safari
                       Books Online.
                       Es. ` --cred "email@example.com:password01" `.
  --no-cookies         Prevent your session data to be saved into
                       `cookies.json` file.
  --no-kindle          Remove some CSS rules that block overflow on `table`
                       and `pre` elements. Use this option if you're not going
                       to export the EPUB to E-Readers like Amazon Kindle.
  --preserve-log       Leave the `info_XXXXXXXXXXXXX.log` file even if there
                       isn't any error.
  --help               Show this help message.

First-time users have to specify their O'Reilly Books Online account credentials, in the following format:

$ python3 oreilly-dl.py --cred "email@example.com:password" XXXXXXXXXXXXX
  • The Xs stand for the 13-digit ISBN, which is available in the book URL, e.g. https://learning.oreilly.com/library/view/how_to_build_a_harem/6666666666666/ Note: sometimes the ISBN on the book description page doesn't match the one in the URL, so always trust the latter.
  • Replace email:password with your own. Note: use a combination of alphanumeric characters.
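
The two inputs above can be pulled apart with a few lines of Python. This is a rough sketch with hypothetical helper names (`extract_book_id`, `split_credentials`), not the programme's own parsing. It assumes the book ID is the 13-digit path segment of the URL and that the credentials are split at the first colon.

```python
import re
from typing import Tuple

# Assumption: the book ID is a path segment of exactly 13 digits.
BOOK_ID_RE = re.compile(r"/(\d{13})(?:/|$)")


def extract_book_id(url: str) -> str:
    """Return the 13-digit ID found in an O'Reilly book URL."""
    match = BOOK_ID_RE.search(url)
    if match is None:
        raise ValueError(f"no 13-digit book ID found in {url!r}")
    return match.group(1)


def split_credentials(cred: str) -> Tuple[str, str]:
    """Split an EMAIL:PASS string at the first colon only."""
    email, sep, password = cred.partition(":")
    if not sep or not email or not password:
        raise ValueError("expected credentials in the form EMAIL:PASS")
    return email, password
```

Splitting at the first colon keeps any later colons inside the password. Sticking to alphanumeric characters, as advised above, avoids the question altogether.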

Afterwards, you're free to omit --cred, since your session data is saved in cookies.json (unless you ran with --no-cookies):

$ python3 oreilly-dl.py XXXXXXXXXXXXX

Cheers!
