epub's Introduction

epub is a node.js module to parse EPUB electronic book files.

NB! Only ebooks in UTF-8 are currently supported!

Installation

npm install epub

Or, if you want a pure-JS version (useful if used in a Node-Webkit app for example):

npm install epub --no-optional

Usage

import EPub from 'epub'
const epub = new EPub(pathToFile, imageWebRoot, chapterWebRoot)

Where

  • pathToFile is the file path to an EPUB file
  • imageWebRoot is the prefix for image URLs. If it is /images/, the actual URL (inside chapter HTML <img> tags) will be /images/IMG_ID/IMG_FILENAME; IMG_ID can be used to fetch the image from the ebook with getImage. Default: /images/
  • chapterWebRoot is the prefix for chapter URLs. If it is /chapters/, the actual URL (inside chapter HTML <a> links) will be /chapters/CHAPTER_ID/CHAPTER_FILENAME; CHAPTER_ID can be used to fetch the chapter from the ebook with getChapter. Default: /links/
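
To illustrate the image URL scheme described in the list, here is a minimal sketch of how a server route might recover IMG_ID from a rewritten image URL. The helper name parseImageUrl is hypothetical and not part of the epub module; it only assumes the imageWebRoot + IMG_ID + "/" + IMG_FILENAME layout.

```javascript
// Hypothetical helper (not part of the epub module): split a rewritten
// image URL of the form <imageWebRoot><IMG_ID>/<IMG_FILENAME> into parts.
function parseImageUrl(url, imageWebRoot = '/images/') {
  if (!url.startsWith(imageWebRoot)) {
    return null; // not an image URL built with this web root
  }
  const rest = url.slice(imageWebRoot.length);
  const slash = rest.indexOf('/');
  if (slash <= 0 || slash === rest.length - 1) {
    return null; // IMG_ID or IMG_FILENAME is missing
  }
  return {
    id: rest.slice(0, slash),        // the value to pass to getImage
    filename: rest.slice(slash + 1), // everything after the id
  };
}
```

The returned id is what getImage expects; the file name part is informational only.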

Before the contents of the ebook can be read, it must be opened (EPub is an EventEmitter).

epub.on('end', function() {
  // epub is initialized now
  console.log(epub.metadata.title)

  epub.getChapter('chapter_id', (err, text) => {})
})

epub.parse()
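
If Promises are preferred over event listeners, the open sequence can be wrapped in a small helper. This is a sketch, not part of the epub module: openEpub is a hypothetical name, and it assumes the instance emits 'end' on success and 'error' on failure once parse() has been called.

```javascript
// Hypothetical helper (not part of the epub module): wrap the event-based
// open sequence in a Promise. Listeners are attached before parse() runs
// so that neither event can be missed.
function openEpub(epub) {
  return new Promise((resolve, reject) => {
    epub.once('end', () => resolve(epub));
    epub.once('error', reject);
    epub.parse();
  });
}

// Usage (assumes an EPub instance constructed as shown earlier):
// openEpub(new EPub(pathToFile))
//   .then(book => console.log(book.metadata.title))
//   .catch(err => console.error('Could not open ebook:', err.message));
```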

metadata

Property of the epub object that holds several metadata fields about the book.

epub.metadata

Available fields:

  • creator Author of the book (if multiple authors, then the first on the list) (Lewis Carroll)
  • creatorFileAs Author name on file (Carroll, Lewis)
  • title Title of the book (Alice's Adventures in Wonderland)
  • language Language code (en or en-us etc.)
  • subject Topic of the book (Fantasy)
  • date Creation date of the file (2006-08-12)
  • description

flow

flow is a property of the epub object and holds the actual list of chapters (the TOC is only an indication and can link to a # URL inside a chapter file).

epub.flow.forEach(chapter => {
    console.log(chapter.id)
})

The chapter id is needed to load a chapter with getChapter.
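
Combining flow with getChapter, every chapter can be loaded in reading order. The following is a minimal sketch; loadAllChapters is a hypothetical helper, and it relies only on the flow entries' id property and the getChapter(chapter_id, callback) signature.

```javascript
// Hypothetical helper: load the text of every chapter listed in epub.flow,
// one at a time and in reading order.
async function loadAllChapters(epub) {
  const chapters = [];
  for (const chapter of epub.flow) {
    // Adapt the callback-style getChapter to a Promise for this one call.
    const text = await new Promise((resolve, reject) => {
      epub.getChapter(chapter.id, (err, data) => (err ? reject(err) : resolve(data)));
    });
    chapters.push({ id: chapter.id, text });
  }
  return chapters;
}
```

Loading sequentially keeps the results in flow order and stops at the first chapter that fails.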

toc

toc is a property of the epub object and holds a list of titles/URLs for the TOC. The actual chapter and its ID need to be determined from the href property.
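
A sketch of that href lookup is given here. It rests on assumptions this README does not state: that both TOC items and flow entries expose an href string, and that a TOC href may carry a "#fragment" suffix. findChapterForTocItem is a hypothetical name.

```javascript
// Hypothetical helper: find the flow entry that a TOC item points to.
// Assumption: tocItem.href and each flow entry's href are comparable
// strings once any "#fragment" is removed from the TOC href.
function findChapterForTocItem(epub, tocItem) {
  const file = String(tocItem.href || '').split('#')[0];
  if (!file) {
    return null; // nothing to match against
  }
  return epub.flow.find(chapter => chapter.href === file) || null;
}
```

When a match is found, its id can be passed to getChapter.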

getChapter(chapter_id, callback)

Load chapter text from the ebook.

epub.getChapter('chapter1', (error, text) => {})

getChapterRaw(chapter_id, callback)

Load raw chapter text from the ebook.

getImage(image_id, callback)

Load image (as a Buffer value) from the ebook.

epub.getImage('image1', (error, img, mimeType) => {})
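
Because the image arrives as a Buffer together with its MIME type, it can be embedded directly in HTML as a data: URI. This is a minimal sketch; toDataUri is a hypothetical helper, not part of the epub module.

```javascript
// Hypothetical helper: turn the Buffer and MIME type passed to the
// getImage callback into a data: URI usable as an <img> src.
function toDataUri(buffer, mimeType) {
  return `data:${mimeType};base64,${buffer.toString('base64')}`;
}

// Usage with the callback signature shown above:
// epub.getImage('image1', (error, img, mimeType) => {
//   if (error) throw error;
//   const src = toDataUri(img, mimeType);
// });
```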

getFile(file_id, callback)

Load any file (as a Buffer value) from the ebook.

epub.getFile('css1', (error, data, mimeType) => {})
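
getFile expects a file id rather than a path. To find candidate ids, the manifest can be filtered by media type. The sketch below assumes, without confirmation from this README, that the manifest is an object keyed by file id whose entries carry a 'media-type' string; idsByMediaType is a hypothetical helper.

```javascript
// Hypothetical helper: collect the ids of manifest entries with a given
// media type. Assumption: `manifest` maps file ids to objects that have
// a 'media-type' property.
function idsByMediaType(manifest, mediaType) {
  return Object.keys(manifest).filter(
    id => manifest[id] && manifest[id]['media-type'] === mediaType
  );
}

// Usage (assumes the instance exposes such a manifest):
// idsByMediaType(epub.manifest, 'text/css').forEach(id => {
//   epub.getFile(id, (error, data, mimeType) => {});
// });
```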

epub's People

Contributors

abrady0, andris9, christianroy, craftsmaker, graimon, json02, julien-c, mawi12345, nemanjan00, orthros, smarticles101, wxclover, xamgore, xavdid

epub's Issues

TypeScript definition errors

  • The type definition epub.d.ts contains /// <reference path="../node/node.d.ts" />, but that won't exist if installing normally via npm. /// <reference types="node" /> would work better. Note there should also be a dependency on the npm package @types/node or that won't work either.
  • Instead of declare module "epub" { ... }, it would be better to write everything directly at the top-level (by simply removing that wrapper) so this file can be imported.

how to open css?

epub.getFile(epub.manifest.style.href, function (err, data, mimeType) {
  if (err) throw err;
  console.log(data);
});

returns file not found. What am I doing wrong?

I've also tried messing around with unzipping it, but I keep failing on that end too.

Move zipfile to peerDependencies

npm install also installs optional dependencies, zipfile in case of epub package. I suggest moving it to peerDependencies.

At the moment, on macOS Catalina with Node 13/14, node-gyp is called to compile zipfile, fails, and prints a huge error stack in the log every time I call npm install. It can be fixed with npm set optional false, but...

No rootfiles found

I'm running your project's example.js and I'm getting this error. Do I have to make a directory for images and content to be put on, or what? I don't really get what is the role of the rootfiles

TOC title properties are inaccurately defined as empty strings

The following ternary (lines 529-530 of epub.js) defines title as an empty string even when the given TOC title is truthy:

title = branch[i].navLabel && branch[i].navLabel.text || branch[i].navLabel===branch[i].navLabel ? '' : (branch[i].navLabel && branch[i].navLabel.text || branch[i].navLabel || "").trim();

Unless I'm mistaken, the solution is a simple matter of switching the expressions like so:

title = branch[i].navLabel && branch[i].navLabel.text || branch[i].navLabel===branch[i].navLabel ? (branch[i].navLabel && branch[i].navLabel.text || branch[i].navLabel || '').trim() : '';
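
For reference, && binds tighter than ||, which in turn binds tighter than ?:, so the condition reads as (navLabel && navLabel.text) || (navLabel === navLabel), and the second operand is true for any value other than NaN. One way to sidestep the ternary altogether is sketched here; tocTitle is a hypothetical name and this is not the library's code.

```javascript
// Hypothetical rewrite (not the library's code): derive a TOC title from a
// navLabel that may be undefined, a plain string, or an object with `text`.
function tocTitle(navLabel) {
  const raw = (navLabel && navLabel.text) || navLabel || '';
  // Only strings can be trimmed; anything else falls back to an empty title.
  return typeof raw === 'string' ? raw.trim() : '';
}
```

Falling back to an empty string for non-string values also avoids calling trim() on an object.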

readFile hangs forever when loading any file

I don't know if I'm doing something wrong, but when I try to load a file:

var EPub = require("epub");
var epub = new EPub('EPUBFILE', '/images/IMG_ID/IMG_FILENAME', '/chapters/');

epub.on("end", function(){
    // epub is now usable

    epub.getFile("coverstyle", function(error, data, mimeType){
        console.log(error);
        console.log(data);
    });

    res.send('done')
});

Nothing works, it hangs forever.

Extracting images of the books

Hi.

In a nutshell, I need to get the images of the books inside the contents.

Given the src attribute, I want to get the image as base64, wrap it in an image/* data URL, and set the src of the image to the new URL.

How can I do that?

Error parsing certain ePub files

Hi, I've been creating some ePub files with calibre, and have been reading their metadata using your ePub node module. It works great, but with a couple of files it throws exceptions (see stacktrace below).

The files aren't public domain, so I can't attach them, but I can email them to you if you'd like.

Here is the stacktrace:

/Users/jonathon/Documents/node/ebooks/node_modules/epub/node_modules/xml2js/lib/xml2js.js:216
throw ex;
^
TypeError: Cannot read property 'length' of undefined
at EPub.<anonymous> (/Users/jonathon/Documents/node/ebooks/populate.js:54:33)
at EPub.EventEmitter.emit (events.js:92:17)
at EPub.<anonymous> (/Users/jonathon/Documents/node/ebooks/node_modules/epub/epub.js:445:18)
at Parser.EventEmitter.emit (events.js:95:17)
at Object.saxParser.onclosetag (/Users/jonathon/Documents/node/ebooks/node_modules/epub/node_modules/xml2js/lib/xml2js.js:183:24)
at emit (/Users/jonathon/Documents/node/ebooks/node_modules/epub/node_modules/xml2js/node_modules/sax/lib/sax.js:615:33)
at emitNode (/Users/jonathon/Documents/node/ebooks/node_modules/epub/node_modules/xml2js/node_modules/sax/lib/sax.js:620:3)
at closeTag (/Users/jonathon/Documents/node/ebooks/node_modules/epub/node_modules/xml2js/node_modules/sax/lib/sax.js:861:5)
at Object.write (/Users/jonathon/Documents/node/ebooks/node_modules/epub/node_modules/xml2js/node_modules/sax/lib/sax.js:1294:29)
at Parser.exports.Parser.Parser.parseString (/Users/jonathon/Documents/node/ebooks/node_modules/epub/node_modules/xml2js/lib/xml2js.js:211:31)
at Parser.parseString (/Users/jonathon/Documents/node/ebooks/node_modules/epub/node_modules/xml2js/lib/xml2js.js:6:61)

Support for other eBook formats

Hi,

This project is incredibly useful and very easy to use.

I was gonna try to wrap the Calibre command line tool ebook-meta but having a pure Node.js implementation is way better.

In your opinion, how hard would it be to support other ebook formats?
I am particularly interested in the metadata and not so much in the content itself.

Thanks

New release

The latest release is from 2015 and contains breaking bugs that have since been fixed (for example, the crash when navLabel text isn't a string). Could a new release be made, so npm users don't run into these issues?

Fixup 0862fd2

npm install epub --no-optional downloads the tag v0.1.5.

please publish a version with 87388fd (Fixup)

thx, great work

require .js

Hi,

I have tried your file, but it shows me an error saying require is undefined in example.js. Can you give me some ideas on how to overcome this issue? Thanks in advance.

Not loading images

Greetings, I want to congratulate you, because your project is very useful. I just want to know why the images are not loaded in the getChapter function.

Upgrade xml2js?

Hi,

Would you be willing to accept a PR upgrading xml2js? Semantics have changed and I find it strenuous to refer to an old version of the documentation.

Cheers!

Julien

Error: Invalid/missing file

I use the code https://github.com/julien-c/epub/blob/master/example/example.js in my project

var EPub = require("epub");

var epub = new EPub("./111.epub", "/imagewebroot/", "/articlewebroot/");
epub.on("error", function(err){
    console.log("ERROR\n-----");
    throw err;
});

Error: Invalid/missing file
    at EPub.open (/Users/menglingyu/mly/anywhere-reader/node_modules/epub/epub.js:103:32)
    at EPub.parse (/Users/menglingyu/mly/anywhere-reader/node_modules/epub/epub.js:89:14)
    at Object.<anonymous> (/Users/menglingyu/mly/anywhere-reader/epub-dome/index.js:38:6)
    at Module._compile (module.js:652:30)
    at Object.Module._extensions..js (module.js:663:10)
    at Module.load (module.js:565:32)
    at tryModuleLoad (module.js:505:12)
    at Function.Module._load (module.js:497:3)
    at Function.Module.runMain (module.js:693:10)
    at startup (bootstrap_node.js:191:16)

Add callbacks

I am currently refactoring the whole lib to add callbacks.
One big benefit of this is that the methods are now decoupled (getMimeType does not call getRootFiles...)

I have done this to be able to unit test one method (preparation for the xml2js option update).

Please have a look at mawi12345@a20e6fd and let me know what you think about the changes.
