Comments (10)

achembarpu commented on June 21, 2024

Article Content API: unfortunately, Pocket does not provide extracted article content to API users without partner privileges.

I'm open to other ideas, though. Maybe use a custom extraction method, for example via BeautifulSoup?
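
To make the "custom extraction" idea concrete, here is a minimal sketch. It swaps BeautifulSoup for the standard library's `html.parser` so it runs without extra dependencies, and it uses a deliberately naive heuristic (keep only the text inside `<p>` elements). The names `ParagraphExtractor` and `extract_text` are hypothetical and not part of pockyt.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the text of <p> elements, ignoring <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # finished paragraph strings
        self._buf = None       # text pieces of the <p> being read, if any
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "p":
            self._flush()      # an unclosed <p> ends where the next one starts
            self._buf = []

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._buf is not None and not self._skip_depth:
            self._buf.append(data)

    def _flush(self):
        if self._buf is not None:
            # Collapse runs of whitespace and drop empty paragraphs.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self._buf = None


def extract_text(html):
    """Return the page's paragraph text, one paragraph per blank-line block."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()
    return "\n\n".join(parser.paragraphs)
```

Real articles need smarter scoring than "every `<p>`" (navigation, comments, and ads also use paragraphs), which is the gap that dedicated extraction libraries try to fill.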

from pockyt.

m040601 commented on June 21, 2024

Thanks for your attention to this detail!
I see what you mean about the API issue; it makes sense.

But I'm still confused about how there seem to be other ways to get the whole article text directly from Pocket. For example, with calibre (http://calibre-ebook.com) and its Python 'news recipe' script called 'readitlater.recipe' (1).

I'm no Python expert; I can barely code some shell scripts and grasp a little bit of Python.
So I was wondering:
how is it that, using that script and calibre's command-line tool 'ebook-convert' (http://manual.calibre-ebook.com/cli/ebook-convert.html), I do get the entire text of my Pocket articles?

When I use it like this, for example:
ebook-convert ./readitlater.recipe outputfile.txt --username [email protected] --password my-pocket-account-password
or
ebook-convert ./readitlater.recipe output.OEB --username [email protected] --password my-pocket-account-password

I get either a text file or a set of HTML files,
with all my articles exactly as they are rendered by Pocket.

(1)
a. The recipe as it is distributed when you install calibre (it works for me):
https://gist.github.com/m040601/a4258870759f9ad8a6ee
b. Another fork of the same script (it was not working for me):
tbunnyman/ReadItLater-Calibre-Plugin
https://github.com/tbunnyman/ReadItLater-Calibre-Plugin
Its description: "This is an updated & modified version of the official Calibre plugin for Pocket (Formerly ReadItLater)"

achembarpu commented on June 21, 2024

Interesting. I'll check this out and think of a possible lightweight implementation.

Do you have the time to work on this, by any chance?

m040601 commented on June 21, 2024

Cool! Thanks for your interest.

"Do you have the time to work on this, by any chance?"

Time, yes; unfortunately not the skills to do it.
The only thing I can contribute is research and feedback, as I like to thoroughly investigate and
compare all the available solutions and implementations (Python and others) for this problem.

achembarpu commented on June 21, 2024

Newspaper seems to provide Pocket-like functionality.

If this seems like a good enough alternative, I'm willing to integrate it. Thoughts?

EDIT: Actually, the PyPI distribution of newspaper is outdated and depends on a lot of heavy libraries (see its requirements).

Instead, a better alternative seems to be readability-lxml: significantly lighter and simpler to use.
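
For reference, a minimal sketch of what using readability-lxml looks like. It assumes the package is installed (`pip install readability-lxml`) and that the page's HTML has already been fetched; the wrapper name `extract_article` is hypothetical and is not how pockyt necessarily integrates it.

```python
def extract_article(html):
    """Return (title, cleaned_html) for a page, using readability-lxml."""
    # Imported lazily so the rest of a tool keeps working when the
    # optional dependency is not installed.
    from readability import Document

    doc = Document(html)
    # title() returns the page title; summary() returns the main content
    # as cleaned HTML, not plain text.
    return doc.title(), doc.summary()
```

Because `summary()` yields HTML, a plain-text export would still need a tag-stripping step afterwards.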

achembarpu commented on June 21, 2024

I'm hacking away on this right now. Let's see how it goes.

EDIT: See #7.

achembarpu commented on June 21, 2024

Oops, almost forgot. The reason I'm not considering the scripts you linked to is:

  • They scrape getpocket.com directly, which is forbidden by Pocket's ToS.
  • Since they rely on scraping, they will break the moment Pocket changes its HTML.

However, if this solution isn't good enough, I might reconsider.

billlyzhaoyh commented on June 21, 2024

Is there any hack for this? I am going back through a historic collection of articles, and what I have found is that the articles have since been taken down by the news sites. I think even saving the HTML response of the article at the time, and storing it in the DB, would help tremendously.
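
A rough sketch of that idea, assuming SQLite as the DB and that the HTML has already been downloaded elsewhere. The table layout and the helper names (`open_archive`, `save_snapshot`, `load_snapshot`) are hypothetical and are not taken from pockyt.

```python
import sqlite3
from datetime import datetime, timezone


def open_archive(path=":memory:"):
    """Open (or create) the snapshot database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        " url TEXT PRIMARY KEY,"
        " fetched_at TEXT NOT NULL,"
        " html TEXT NOT NULL)"
    )
    return conn


def save_snapshot(conn, url, html):
    """Store html for url unless one is already saved; return True if stored."""
    # INSERT OR IGNORE keeps the earliest copy, so a later fetch that only
    # returns a "page removed" notice cannot overwrite the original article.
    cur = conn.execute(
        "INSERT OR IGNORE INTO snapshots (url, fetched_at, html) VALUES (?, ?, ?)",
        (url, datetime.now(timezone.utc).isoformat(), html),
    )
    conn.commit()
    return cur.rowcount == 1


def load_snapshot(conn, url):
    """Return the archived HTML for url, or None if nothing was saved."""
    row = conn.execute(
        "SELECT html FROM snapshots WHERE url = ?", (url,)
    ).fetchone()
    return row[0] if row else None
```

Keeping the first snapshot rather than the latest is a design choice for this use case: once a site takes an article down, the newest response is the least useful one.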

achembarpu commented on June 21, 2024

@billlyzhaoyh - Good use case. I had a bit of time to hack on this today and managed to get HTML archiving working in 1.4.0.

For example, to get all favorited items and save offline copies of them:
pockyt get -v 1 -a ./pocket

Let me know if it works for you.

achembarpu commented on June 21, 2024

Closing as stale.
