
temboz's Introduction

The Temboz feed reader

Introduction

Temboz is a web-based RSS/Atom aggregator and feed reader that focuses on saving you time by letting you filter out articles you are not interested in.

It is inspired by FeedOnFeeds (web-based personal aggregator), Google News (two column layout) and TiVo (thumbs up and down).

Features

  • Two-column user interface for better readability and information density. Automatic reflow using CSS.
  • Information Hunter-gatherer user interface: items flagged with a "Thumbs down" disappear immediately off the screen (using Dynamic HTML), making room for new articles.
  • Extensive filtering capabilities:
    • By keyword or phrase
    • By tag
    • Using Python expressions
  • Ratings system for articles
  • Share articles you flagged as "Thumbs Up" via Facebook or as an Atom feed
  • Built-in web server.
  • Ad filtering
  • Multithreaded: feeds are downloaded in parallel.

History

I have been using Temboz as my feed reader since 2004. I currently subscribe to over 500 feeds, and my filtering rules get rid of around a third of the incoming firehose of information.

Screen shots

Reader UI

The home page is the article reading interface, using a two-column layout. Clicking on the "Thumbs down" icon makes the article disappear, bringing a new one in its place (if available). Clicking on the "Thumbs up" icon highlights it in yellow and flags it as interesting in the database.

Feed summary

The feed summary page shows statistics on feeds, starting with feeds with unread articles, then by alphabetical order. Feeds can be sorted based on other metrics. You have the option of "catching up" with a feed (marking all the articles as read). Feeds with errors are highlighted in red (not shown). The default sort order is by feed signal-to-noise ratio.

Feed details

Clicking on the "details" link for a feed brings up this page, which allows you to change the title or feed URL, and shows the RSS or Atom fields accessible for filtering.

Filtering rules

Feeds can be filtered by keyword, phrase, tag, author or using Python expressions. Filtering out junk pop culture makes for tremendous time savings.
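
As an illustration of the Python-expression style of rule, here is a minimal sketch of how such a rule could be evaluated against an article. The helper rule_matches and the variable names title, tags and author are assumptions made for this example, not Temboz's actual rule engine.

```python
def rule_matches(expression, article):
    """Return True if the rule expression is truthy for this article.

    `article` is a plain dict whose keys become variables in the expression.
    Using eval() is only tolerable here because rules are written by the
    reader's own user, never by feed authors.
    """
    namespace = {"__builtins__": {}}  # hide built-ins from the expression
    namespace.update(article)
    try:
        return bool(eval(expression, namespace))
    except Exception:
        # A broken rule should not filter anything out.
        return False


# Hypothetical article and rule, for illustration only.
article = {
    "title": "Celebrity gossip roundup",
    "tags": ["pop culture", "tv"],
    "author": "staff",
}
filtered = rule_matches("'gossip' in title.lower() or 'tv' in tags", article)
```

A rule that raises an error (for example by referring to an unknown field) is treated as not matching, so a typo in a rule never hides articles.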

Known bugs

You can check outstanding bug reports, change requests and more on the GitHub issue tracker.

Installation

  • You will need Python 3.8+ installed on your machine, and a reasonably recent version of SQLite, ideally with the json1 and fts5 extensions enabled for optimum performance.
  • If you do not have pip, install it by running python -m ensurepip (you may need to do this as root depending on how your Python installation is set up, or use a system package manager like apt-get).
  • If you do not have virtualenv installed, install it using pip install virtualenv (or use a package manager if required).
  • Create a directory and virtualenv to run Temboz, in this case tembozdir: virtualenv tembozdir
  • cd tembozdir
    • If you are a bash/ksh user: . bin/activate
    • If you are a tcsh/csh user: source bin/activate.csh
  • Install Temboz in the virtualenv: pip install temboz
  • When you run Temboz for the first time, it will prompt you for the network address/port it should listen on, and your login/password: ./bin/temboz
  • Optionally, you can import an OPML subscription file if you have one: ./bin/temboz --import foo.opml
  • If you imported subscriptions, you can trigger a manual refresh: ./bin/temboz --refresh
  • You can now start the Temboz server: ./bin/temboz --server

Keeping informed

I would highly recommend you subscribe to Temboz' RSS feed to be notified of security releases and other major announcements. It's less than one post a year, I promise...

Credits

Temboz is written in Python, and leverages Mark Pilgrim’s Ultra-liberal feed parser, SQLite and Flask.

Post scriptum

The name "Temboz" is a reference to Malima Temboz, "The mountain that walks", an elephant whose tormented spirit is the object of Mike Resnick’s excellent SF novel, Ivory.

temboz's People

Contributors

fazalmajid, sea163


temboz's Issues

category element regression in feedparser.py

The new CVS version of feedparser dropped the category element and replaced it with 'tags', which has different semantics. This breaks filtering rules that depended on the old semantics. normalize.py should supply the old semantics for backwards compatibility.

unit tests for tarball

The initial deployment of release 0.7 was missing degunk.py because it hadn't been committed into CVS. We need an automated acceptance test to verify tarballs in "make dist".

<br/> in feed contents not handled correctly

There is a bug in feedparser that causes <br/> to be replaced by semi-random combinations of ">" or ".". This is particularly visible with Weblogs Inc. blogs. Best to fix this in feedparser.py and contribute the fix back, rather than just a work-around.

See also this bug entry on SourceForge

regression testing against SQLite 3.x/pysqlite 2.x

Make sure Temboz is compatible with the latest versions of SQLite and pysqlite.

This requires a version of pysqlite that can simultaneously access a 2.x or 3.x database (SQLite itself happily lets 2.x and 3.x coexist on the same system).

add filtering functions for tags

Newer versions of feedparser support an element "tags". A filtering function tag_any(|_lc|_words) similar to title_any_words would be useful, if only to get rid of Tim Bray's boring flower photos.
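
A rough sketch of what the requested helpers could look like. The signatures below are assumptions (each takes the article's tag list followed by candidate strings); the real title_any_words may behave differently.

```python
def tag_any(tags, *candidates):
    """True if any candidate exactly matches one of the article's tags."""
    return any(c in tags for c in candidates)


def tag_any_lc(tags, *candidates):
    """Case-insensitive variant: compare lower-cased tags and candidates."""
    lowered = {t.lower() for t in tags}
    return any(c.lower() in lowered for c in candidates)


def tag_any_words(tags, *candidates):
    """True if any candidate appears as a whole word inside any tag."""
    words = {w for t in tags for w in t.lower().split()}
    return any(c.lower() in words for c in candidates)
```

With these, a rule such as tag_any_lc(tags, "flowers") would catch a post tagged "Flowers" regardless of capitalization.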

TypeError in singleton.py

I tried upgrading to 0.8 today and I'm getting the following error message. (I upgraded my database to sqlite3 as you noted).

./temboz --server
46912504432336 1141586308.34 ("select sql from sqlite_master where name='v_feeds'",)
46912504432336 1141586308.37 done
Traceback (most recent call last):
  File "./temboz", line 69, in ?
    server.run()
  File "/home/bdp/Downloads/Temboz/temboz-0.8/server.py", line 376, in run
    from singleton import db
  File "/home/bdp/Downloads/Temboz/temboz-0.8/singleton.py", line 203, in ?
    sql = c.fetchone()[0]
TypeError: unsubscriptable object

This also happens when I try to run with a fresh db.

Feed details update of description with accented characters fails

Traceback (most recent call last):
  File "/home/majid/temboz/server.py", line 370, in process_request
    self.use_template(tmpl, [self.input])
  File "/home/majid/temboz/server.py", line 299, in use_template
    tmpl.respond(trans=self)
  File "pages/feed_info.py", line 154, in respond
    if VFFSL(SL,"getVar",False)('feed_desc', None) and VFFSL(SL,"getVar",False)('feed_desc', None) != feed_desc:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc0 in position 0: ordinal not in range(128)

sort by SNR broken with SQLite 3.x

The sort by SNR option in the AllFeedsPage and the default sort order for that page (feeds with unread articles first, sorted by SNR, then feeds without unread articles, sorted by SNR) are broken when Temboz is running SQLite 3.x/PySQLite 2.x

Follow feed URL redirects

Feeds sometimes move and the old feed redirects to the new one. If the redirection is a permanent one (HTTP status code 301), Temboz should save the new URL as the XML URL for the feed.

Often when this happens, the permalinks/guids in the feed change as well (even though they shouldn't), leading to duplicate posts. An optimization would be to filter out items with duplicate titles after the feed has changed locations.

We should probably create a pseudo-article as a service notification so the user is aware something changed with the feed, along with a list of duplicate items suppressed.

There is a potential privacy issue with this: a feed owner might redirect the user to a feed URL with some tracking code embedded in it for tracking purposes. Putting the URLs in the feed URL change service notification would allow the user to be aware of this if it happens.
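
The two ideas above (store the new URL only for permanent redirects, then suppress items whose titles were already seen) could be sketched as follows. Both helpers and their data shapes are hypothetical: the redirect chain is assumed to be a list of (status, location) pairs collected by whatever HTTP client fetched the feed.

```python
def permanent_target(original_url, hops):
    """Return the URL that should be stored for the feed.

    `hops` is the redirect chain as (status, location) pairs. Only an
    unbroken run of 301 responses from the start moves the stored URL;
    the first temporary redirect stops the update.
    """
    stored = original_url
    for status, location in hops:
        if status != 301:
            break
        stored = location
    return stored


def suppress_duplicate_titles(new_items, known_titles):
    """Split items into (kept, suppressed) by already-seen title."""
    kept, suppressed = [], []
    for item in new_items:
        if item["title"] in known_titles:
            suppressed.append(item)
        else:
            kept.append(item)
    return kept, suppressed
```

The suppressed list, together with the old and new URLs, is what a service-notification pseudo-article would report to the user.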

run as a cgi-script

Hey, Temboz looks great. However, I'd like the ability to run it as a cgi-script (or mod_python) like a lot of typical server-side aggregators, as I use external hosting and am not allowed to run daemons on that machine.

Thanks

Server hanging during an update

Temboz is hanging during an update. The server will accept connections then hang. The log shows the following error message:

Traceback (most recent call last):
  File "/usr/local/lib/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/home/majid/temboz/update.py", line 476, in run
    update()
  File "/home/majid/temboz/update.py", line 463, in update
    update_feed(db, c, *feed_info)
  File "/home/majid/temboz/update.py", line 261, in update_feed
    process_parsed_feed(f, c, feed_uid, feed_dupcheck)
  File "/home/majid/temboz/update.py", line 378, in process_parsed_feed
    c.execute(sql)
  File "/home/majid/temboz/singleton.py", line 16, in execute
    result = self.c.execute(*args, **kwargs)
  File "/usr/local/lib/python2.4/site-packages/sqlite/main.py", line 244, in execute
    self.rs = self.con.db.execute(SQL)
TypeError: execute() argument 1 must be string without null bytes, not str

Exception exceptions.AssertionError: <exceptions.AssertionError instance at 0x857f58c> in  ignored

deadlocks with SQLite 3.x

When Temboz is run on SQLite 3.x/PySQLite 2.x, an action that updates the database (like clicking on "Thumbs down") while the hourly fetch is running will sometimes cause Temboz to deadlock, and it has to be killed.

create a feed SNR trends report

It would be good to have a report that shows trends in the SignalNoiseRatio, e.g. calculated over the last month, last three months, entire history, like a stock chart, to identify feeds that are losing their relevance and are thus possible candidates for culling.

support PostgreSQL as a database backend

SQLite's handling of concurrency is not very good. Things crawl to a halt when trying to read and write at the same time, e.g. when trying to read feeds as the update thread kicks in. A database with multi-version concurrency control like PostgreSQL would do a much better job.

Item annotations

Users should be able to enter annotations on an item, so they can share an annotated list of interesting links rather than one without comments as today.

Feeds Not Being Polled

I updated my Temboz install to the CVS version (CVS-2006-03-06) to try the new buttons, but now it appears that it is not updating my feeds. I can force an update with --refresh, but --server mode doesn't seem to be doing update polling automatically. My interval in param.py is set at the install default of 3600.

I poked around a wee bit, and tried turning on the Debug mode in param.py but I couldn't figure out a workaround. I tried running CVS-2006-03-10 but it didn't seem to make a difference. I have not tried switching back to 0.8 yet. I'm wondering if I missed something important when changing versions. I just copied my 0.8 db file over.

support a "deferred" article status

Peter Janes reports he uses "Thumbs up" to sideline articles for future consumption, rather than to flag them as interesting. This is a perfectly common use case that should be fully supported.

add a site-specific password generator

I am lazy and use the same password for most non-sensitive sites (and another password for sensitive sites like online banking). This is clearly insecure and subject to catastrophic failure.

Temboz is as good a place as any for me to implement a single-sign-on system that would generate a domain-specific password for a site, that way if my password is compromised, the damage is contained within that site.

Of course, this feature has nothing to do with feed reading...

http://www.russellbeattie.com/notebook/1008751.html
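
One possible way to derive a domain-specific password is sketched below. This is an assumption for illustration (an HMAC of the domain keyed by the master password), not necessarily the scheme described at the link above; the function name and the default length are hypothetical.

```python
import base64
import hashlib
import hmac


def site_password(master, domain, length=16):
    """Derive a per-site password from a master password and a domain.

    The same inputs always give the same output, so nothing needs to be
    stored, and one site's password does not reveal the master password.
    """
    digest = hmac.new(
        master.encode("utf-8"),
        domain.lower().encode("utf-8"),
        hashlib.sha256,
    ).digest()
    # URL-safe base64 keeps the result to letters, digits, "-" and "_".
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]
```

A production version would probably use a deliberately slow key-derivation function so that a leaked site password cannot be used to brute-force a weak master password cheaply.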

log why an article was filtered

Sometimes a filtering rule is over-zealous. It would help debug such rules to link the article to the rule that caused it to be filtered, either in the log, or better yet in the fm_articles table itself.

This is not perfect, as the filtering rule text may have changed between the moment it was evaluated to filter out the article and the moment the user checks it, but it's better than nothing.

improve packaging

Packaging Temboz for easier installation using Py2app or Py2exe with all dependencies included would make it easier to use for non-Pythonistas.

add time dimension to SNR

The SignalNoiseRatio report and sorting do not currently take time into consideration. This is problematic, as feeds can lose relevance over time. Implementing something like an exponentially decaying SNR (like that used to compute UNIX load averages) would improve the usefulness of this feature without unduly penalizing high-quality, ultra-low-volume feeds the way an arbitrary cut-off to the last N days would.
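
A minimal sketch of an exponentially decaying SNR. It assumes SNR means the fraction of rated articles flagged as interesting, and that each rating carries an age in days; the function name and the 90-day half-life are arbitrary choices for this example.

```python
import math


def decayed_snr(ratings, now, half_life_days=90.0):
    """Time-weighted share of interesting articles.

    `ratings` is an iterable of (day, is_interesting) pairs, where `day`
    and `now` are measured in days on the same clock. Each rating's weight
    halves every `half_life_days`.
    """
    decay = math.log(2) / half_life_days
    signal = total = 0.0
    for day, interesting in ratings:
        weight = math.exp(-decay * (now - day))
        total += weight
        if interesting:
            signal += weight
    return signal / total if total else 0.0
```

Because the result is a ratio of weights rather than a count, a feed with a single old but interesting article still scores 1.0, which is the property wanted for high-quality, ultra-low-volume feeds.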

style tag contents are not stripped properly

Photos exported from Flickr to Typepad blogs have embedded <style> tags in them. Feedparser strips them out for security reasons (although they could probably be sanitized the same way style attributes are). However the tags themselves are stripped but not the contents, so the CSS code appears in the entries instead of being filtered or sanitized.

{image: screenshot.gif}
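
As a stop-gap illustration of the desired behavior, the sketch below removes <style> elements together with their contents using a regular expression. This is an assumption about how a work-around might look, not feedparser's sanitizer; regular expressions are fragile on malformed HTML, and a proper fix would belong in the parser itself.

```python
import re

# Match an opening <style ...> tag, everything up to the closing tag,
# and the closing tag itself, ignoring case and spanning newlines.
STYLE_BLOCK = re.compile(r"<style\b[^>]*>.*?</style\s*>", re.IGNORECASE | re.DOTALL)


def strip_style(html_text):
    """Remove <style> elements, including the CSS between the tags."""
    return STYLE_BLOCK.sub("", html_text)
```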

Filtering rule titles

Filtering rules should have a title field so we can give a complex rule an easier-to-scan label like "annoying memes".

feedparser debug output not escaped properly

Peter Janes reports the feedparser debug output in the FeedDetailsPage is sometimes not escaped correctly and shows images, etc.

This is probably due to feeds with a literal in the feed text. Any occurrences of such should be escaped to prevent a potential security risk, not to mention UI breakage.
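
A small sketch of escaping debug output before it is placed in a page, using the standard library's html.escape. The render_debug helper and the choice of a <pre> wrapper are assumptions for illustration.

```python
import html


def render_debug(raw):
    """Wrap raw debug text in <pre>, escaping any markup it contains."""
    return "<pre>%s</pre>" % html.escape(raw)
```

Escaped this way, an image tag inside the debug dump is displayed as text instead of being rendered by the browser.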

Rating Buttons at Bottom

I recently installed Temboz. I like it so far, but one thing I noticed was that I end up scrolling around a lot. To be more specific, I read an article and then I have to scroll back up to click on the Interesting or Uninteresting buttons.

It would be nice to either have those display at the bottom and the top or to have an option of having them display at the bottom.

editing rules risks losing changes

There is a potential race condition when editing rules: you might have modified the rule you are editing in another window, and thus unwittingly edit a stale entry. The rules submission should use something like an MD5 hash of the old rule content so that attempts to submit an edited version of a stale rule can be detected and the conflict resolved by the user.
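
The stale-edit check could be sketched like this. The helper names are hypothetical, and SHA-256 is swapped in for the MD5 mentioned above; any stable hash of the rule text serves the same purpose.

```python
import hashlib


def rule_fingerprint(text):
    """Hash of the rule text, sent to the edit form as a hidden field."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def submit_edit(stored_text, fingerprint_at_load, new_text):
    """Return (text_to_store, conflict).

    If the stored rule no longer matches the fingerprint taken when the
    form was loaded, someone changed it in the meantime: keep the stored
    text and report a conflict for the user to resolve.
    """
    if rule_fingerprint(stored_text) != fingerprint_at_load:
        return stored_text, True
    return new_text, False
```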

email alerts

One CreativeUseForTemboz is as a filter for feeds like Craigslist apartment search or Techbargains.com. When time is of the essence, sending an email notification of new articles could make the difference between getting the deal or not.

feed update frequency controls

Feeds are polled hourly by Temboz. It does use HTTP mechanisms like ETags or If-Modified-Since to avoid downloading an entire feed if it hasn't changed (not sure about mod_gzip support, though), but for many feeds a daily or even weekly update would be just fine.

One way would be to use statistics like item inter-arrival times, but we probably should have a user control that sets hourly/daily/weekly/monthly/automatic updates.
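
A sketch of the "automatic" option, choosing a polling interval from item inter-arrival times. The thresholds (poll at roughly half the typical gap, rounded down to hourly, daily or weekly) and the default for feeds with too little history are assumptions for this example.

```python
from statistics import median

HOUR, DAY, WEEK = 3600, 86400, 604800


def suggested_interval(arrival_times):
    """Pick a polling interval in seconds from sorted arrival timestamps."""
    if len(arrival_times) < 2:
        return DAY  # not enough history; assume a daily poll is safe
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    typical = median(gaps)
    if typical >= 2 * WEEK:
        return WEEK
    if typical >= 2 * DAY:
        return DAY
    return HOUR
```

The median is used rather than the mean so that one long hiatus does not push an otherwise busy feed onto a weekly schedule.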

Sort articles in the main view by SNR

It would be very valuable to have articles in the MainView be sorted by SignalNoiseRatio, so articles from high-quality feeds stand out from the rest when you are in a hurry.

Possibly, instead of a simple priority sort based on SNR, do something like a weighted sort, e.g. order by (current_date - date_created)/snr
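
The weighted ordering can be sketched in Python as below. The dict keys created and snr and the floor of 0.01 on the SNR (to avoid dividing by zero for feeds with no interesting articles) are assumptions for this example.

```python
def weighted_order(articles, today):
    """Sort articles so fresh items from high-SNR feeds come first.

    Each article is a dict with `created` (a day number on the same clock
    as `today`) and `snr` (the signal-to-noise ratio of its feed).
    """
    def score(article):
        age = today - article["created"]
        return age / max(article["snr"], 0.01)

    return sorted(articles, key=score)
```

With this score, a five-day-old article from a feed with SNR 1.0 (score 5) sorts ahead of a one-day-old article from a feed with SNR 0.1 (score 10).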

Handle multiple feeds better in autodiscovery

Some sites have multiple feeds listed in autodiscovery: main feed, feed with excerpts, comments, and so on. Temboz should offer a way to let the user select between them.

For simplicity's sake, when both Atom and RSS are offered, don't treat them as different variants; only offer a choice when multiple Atom or multiple RSS feeds are listed.
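
The rule above could be sketched as a single check over the autodiscovery links. The input shape, a list of (MIME type, URL) pairs, and the function name are assumptions for this example.

```python
from collections import defaultdict


def needs_user_choice(links):
    """True if any one format (Atom or RSS) lists more than one feed.

    `links` is a list of (mime_type, href) pairs taken from the page's
    autodiscovery <link> elements.
    """
    by_type = defaultdict(list)
    for mime_type, href in links:
        by_type[mime_type].append(href)
    return any(len(hrefs) > 1 for hrefs in by_type.values())
```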

add feed-level filtering rules

FilteringRules are powerful, but difficult to manage if there are too many. It would make sense to reduce clutter on the filtering rules page by moving feed-specific rules to the FeedDetailsPage instead.

articles purged after 6 months but still in feeds reappear

Temboz purges uninteresting articles after 6 months. Normally, the cutoff is 6 months or the age of the oldest article in a feed, but for slowly updated feeds (including my own), the detection of which article is oldest is not working correctly, and purged articles reappear.
