
Comments (7)

thp commented on September 2, 2024

It might be useful to provide this as a config option. For now, I don't clean up the cache automatically, so that at some point we could have a version browser and detect pages that flip between multiple states (see #22), although none of this is implemented yet.

It might also make sense to have some default cache clearing (e.g. clear entries older than X days) and make the timeout configurable, with the value 0 meaning "clear all old entries".
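
To make the proposed policy concrete, here is a minimal sketch in Python of how such a timeout could select entries for removal. It is not urlwatch code: the function name `entries_to_prune`, the `max_age_days` parameter, and the `(guid, timestamp)` entry shape are all assumptions made for illustration. The newest entry per page is always kept, and a value of 0 selects every older entry.

```python
from datetime import datetime, timedelta


def entries_to_prune(entries, max_age_days, now=None):
    """Return the (guid, timestamp) pairs a cleanup pass would delete.

    Hypothetical helper, not part of urlwatch. `entries` holds
    (guid, timestamp) pairs with datetime timestamps.
    """
    entries = list(entries)
    now = now or datetime.now()
    cutoff = now - timedelta(days=max_age_days)

    # Find the newest timestamp per guid; that entry is never pruned.
    newest = {}
    for guid, ts in entries:
        if guid not in newest or ts > newest[guid]:
            newest[guid] = ts

    # 0 means "clear all old entries"; otherwise only those past the cutoff.
    return [
        (guid, ts)
        for guid, ts in entries
        if ts != newest[guid] and (max_age_days == 0 or ts < cutoff)
    ]
```

Keeping the newest entry unconditionally matters because change detection needs something to compare the next fetch against.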

from urlwatch.

brbsix commented on September 2, 2024

cache.db is a SQLite 3.x database. You can open it and review previous versions of any particular page. I don't know offhand whether anything is ever removed, though I can check.
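
For anyone who wants to look inside the file, the sketch below uses Python's standard `sqlite3` module to list the tables and columns of a SQLite database. It makes no assumption about urlwatch's actual schema and only reads SQLite's own catalog; the function name `describe_sqlite_db` is made up for this example, and the location of cache.db on your system is something you need to supply.

```python
import sqlite3


def describe_sqlite_db(path):
    """Map each table name in the database at `path` to its column names."""
    # Note: sqlite3.connect() creates an empty file if `path` does not exist.
    conn = sqlite3.connect(path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        return {
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }
    finally:
        conn.close()
```

Calling `describe_sqlite_db("cache.db")` with the right path should show which table and columns hold the stored page versions, which you can then query directly.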


Immortalin commented on September 2, 2024

@brbsix this function seems to suggest automatic cleanup, but my Python skills are not good enough to figure out exactly where it is implemented.


brbsix commented on September 2, 2024

I just checked: everything is kept indefinitely unless you run urlwatch with the command-line option --gc-cache, which removes old cache entries.


Immortalin commented on September 2, 2024

@brbsix Can a feature be added to retain only the latest version of the page?


brbsix commented on September 2, 2024

Sure... See the following diff of urlwatch if you want to add it. Essentially, it just does what --gc-cache does once the run completes.

diff --git a/urlwatch b/urlwatch
index 4878c13..328cb6d 100755
--- a/urlwatch
+++ b/urlwatch
@@ -341,6 +341,9 @@ def main(args):
     # Output everything
     report.finish()

+    # Remove old cache entries
+    cache_storage.gc([job.get_guid() for job in jobs])
+
     # Close cache
     cache_storage.close()

Also, if you don't want it to squawk at you about removed entries, you can incorporate the following:

diff --git a/lib/urlwatch/storage.py b/lib/urlwatch/storage.py
index 43463c7..20fa415 100644
--- a/lib/urlwatch/storage.py
+++ b/lib/urlwatch/storage.py
@@ -240,13 +240,10 @@ class CacheStorage(object):

     def gc(self, known_guids):
         for guid in set(self.get_guids()) - set(known_guids):
-            print('Removing: {guid}'.format(guid=guid))
             self.delete(guid)

         for guid in known_guids:
-            count = self.clean(guid)
-            if count > 0:
-                print('Removed {count} old versions of {guid}'.format(count=count, guid=guid))
+            self.clean(guid)


 class CacheDirStorage(CacheStorage):
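
To show what the `gc` call in the diffs above amounts to, here is a toy in-memory stand-in written for this thread. It is not urlwatch's `CacheStorage`: the class name `InMemoryCache`, the `save` method, and the internal dict are invented, and the assumption that `clean` keeps exactly the newest version is mine, based on the "old versions" wording in the removed print statement.

```python
class InMemoryCache:
    """Toy stand-in for a cache storage; not urlwatch's CacheStorage."""

    def __init__(self):
        # guid -> list of saved versions, oldest first
        self._versions = {}

    def save(self, guid, data):
        self._versions.setdefault(guid, []).append(data)

    def get_guids(self):
        return list(self._versions)

    def delete(self, guid):
        self._versions.pop(guid, None)

    def clean(self, guid):
        """Keep only the newest version of `guid`; return how many were dropped."""
        versions = self._versions.get(guid, [])
        removed = max(len(versions) - 1, 0)
        if removed:
            self._versions[guid] = versions[-1:]
        return removed

    def gc(self, known_guids):
        # Drop entries whose job no longer exists.
        for guid in set(self.get_guids()) - set(known_guids):
            self.delete(guid)
        # Trim each remaining entry down to its latest version, silently.
        for guid in known_guids:
            self.clean(guid)
```

Running `gc` with the guids of the current jobs after every run therefore leaves one version per watched page and nothing for pages that are no longer watched.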


Immortalin commented on September 2, 2024

@brbsix @thp thanks!

