
Comments (8)

grantjenks commented on May 18, 2024

#61 merged and updated at c8aecac. I changed None to diskcache.UNKNOWN since None is a valid key/value. I also modified the test to use the first 32 characters of the sha256 hash and added testing to fanout cache.
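The reason a dedicated sentinel helps can be shown with a small, generic sketch. It is not diskcache's code, and this thread does not show which call uses diskcache.UNKNOWN; the `UNKNOWN` object and the `lookup` helper below are hypothetical stand-ins that only illustrate how a sentinel distinguishes "key absent" from "value is None".

```python
# Hypothetical stand-in for a sentinel such as diskcache.UNKNOWN.
UNKNOWN = object()


def lookup(store, key, default=UNKNOWN):
    """Return store[key], or default when the key is absent."""
    return store.get(key, default)


store = {"a": None}           # None is a legitimately stored value
found = lookup(store, "a")    # the key exists and its value is None
missing = lookup(store, "b")  # the key is absent, so the sentinel comes back

print(found is None, missing is UNKNOWN)
```

With None as the default, the two lookups above would be indistinguishable.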

from python-diskcache.

elistevens commented on May 18, 2024

See also: #61

Thanks!


grantjenks commented on May 18, 2024

@elistevens when you do "rsync" backups, take careful note of the switches used. The test_rsync test was failing intermittently when using "rsync" without "--checksum". As I understand it, rsync by default compares file sizes and modification times to decide heuristically which files have been modified. A little database like sqlite is edited quickly and uses underlying block storage, so it's possible to modify the database while it keeps the same size and modification time.

You can pass "--checksum" to rsync to tell it not to use its default heuristic and instead compare checksums. This way, you'll still benefit from incremental transfers. But if you were exceedingly paranoid, you might not trust the checksums and choose "--ignore-times", which will transfer all files and behave like a copy.

The switches I'm using in testing are:

rsync -a --checksum --delete --stats source destination

The purpose of each switch:

  • "-a" -- Archive mode: recurse into directories and preserve symlinks, permissions, modification times, owner, and group.
  • "--checksum" -- Detect file changes using a checksum.
  • "--delete" -- Delete files in destination that are not in source.
  • "--stats" -- View the total number of bytes transferred; useful to see it's working incrementally.
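For a scripted backup, the same command line can be assembled in Python. This is a minimal sketch, not part of diskcache: the helper name `build_rsync_command` and its `checksum` parameter are made up for illustration, and "source" and "destination" are placeholders as in the command above.

```python
import shlex


def build_rsync_command(source, destination, checksum=True):
    """Return the rsync argument list for an incremental mirror of source."""
    command = ["rsync", "-a"]
    if checksum:
        # Compare file contents instead of the default size+mtime heuristic.
        command.append("--checksum")
    command += ["--delete", "--stats", source, destination]
    return command


# Show the command without running it.
print(shlex.join(build_rsync_command("source", "destination")))
```

To actually run it, the list can be passed to `subprocess.run(command, check=True)`, assuming rsync is installed on the machine.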


elistevens commented on May 18, 2024

I'm surprised that the DB could be modified without the modification time on the file being updated.

For our use case, 99% of the content won't have been changed on a day-to-day basis, so our nightly backups will probably do something like using time+size for rsync, and then making sure the DB has been copied every time.

Thanks for letting me know there might be issues there.


grantjenks commented on May 18, 2024

It's not so much that the modification time is not updated but that the resolution of the modification time is coarser than necessary. I develop on a MacBook Pro with an OS X Extended filesystem. According to Wikipedia and an Ars Technica article (both cited by a Stack Overflow answer), the resolution of the modification time on HFS+ file systems is 1 second. It's easy to imagine that, in testing, multiple modifications to the database could occur within the same second while the database remains the same size.
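The effect can be reproduced without an HFS+ volume. The sketch below simulates a one-second timestamp resolution by forcing the same modification time with `os.utime` after each write; the file name, the timestamp, and the helper names are assumptions for the demo, and `quick_check` only approximates what a size-and-time heuristic compares.

```python
import hashlib
import os
import tempfile


def quick_check(path):
    """Size and whole-second mtime, roughly what a size+time heuristic sees."""
    info = os.stat(path)
    return info.st_size, int(info.st_mtime)


def checksum(path):
    """SHA-256 of the file contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "cache.db")
        same_second = 1_500_000_000  # arbitrary timestamp chosen for the demo

        with open(path, "wb") as handle:
            handle.write(b"AAAA")
        os.utime(path, (same_second, same_second))
        before = quick_check(path), checksum(path)

        with open(path, "wb") as handle:
            handle.write(b"BBBB")  # same length, different bytes
        # Simulate a second write landing within the same one-second tick.
        os.utime(path, (same_second, same_second))
        after = quick_check(path), checksum(path)

    # (heuristic sees no change, checksums are equal)
    return before[0] == after[0], before[1] == after[1]


print(demo())
```

The heuristic reports the file as unchanged while the checksums differ, which is the situation "--checksum" guards against.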

Considering that you will likely do nightly backups, I doubt it could ever be an issue. I just want you to be aware that rsync uses heuristics (like file size and modification time) and those may be inaccurate.

You may also want to use the "-z" option to compress the transfer over rsync.


elistevens commented on May 18, 2024

Ahh, that makes much more sense. Yeah, that won't be an issue for our use case. Great!


grantjenks commented on May 18, 2024

V3 tagged in git and deployed to PyPI. I'm waiting now to see Travis and AppVeyor come back green.

The new diskcache is faster than the old one. Yaay! Always a good sign.

I added test_core.py:test_custom_eviction_policy for your custom eviction scenario.

I also think the new design (new-style format strings) would allow you to update the expire_time on every get/incr. Something like:

    dc.EVICTION_POLICY['lru-gt-1s'] = {
        'init': None,
        'get': 'expire_time = {now} + 90 * 24 * 60 * 60',
        'cull': None,
    }

And then just use cache.expire() in your nightly job. I haven't tested that yet but you might want to look into it.

Also note that with your custom eviction policy, you can exceed the cache's size limit. If the culling query returns no rows then the culling stops regardless of the cache's volume.
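A simplified model of that behavior, using only the standard library's sqlite3, is sketched below. It is not diskcache's implementation: the table layout, the `cull` function, and the cutoff-based query are hypothetical and exist only to show how a culling loop that stops on an empty result can leave the cache over its limit.

```python
import sqlite3


def cull(conn, size_limit, cull_query, params=()):
    """Delete rows picked by cull_query until the total size fits size_limit.

    Stops early when the query selects nothing, even if still over the limit.
    Returns the total size that remains.
    """
    while True:
        (volume,) = conn.execute(
            "SELECT COALESCE(SUM(size), 0) FROM Cache"
        ).fetchone()
        if volume <= size_limit:
            return volume
        rows = conn.execute(cull_query, params).fetchall()
        if not rows:
            return volume  # nothing eligible, so culling stops over the limit
        conn.executemany("DELETE FROM Cache WHERE rowid = ?", rows)


def demo(cutoff):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Cache (key TEXT, size INTEGER, access_time REAL)")
    # Ten entries of 100 bytes each, all last accessed at time 1000.0.
    conn.executemany(
        "INSERT INTO Cache VALUES (?, ?, ?)",
        [(f"k{i}", 100, 1000.0) for i in range(10)],
    )
    # Hypothetical policy: only entries older than the cutoff may be evicted.
    query = (
        "SELECT rowid FROM Cache WHERE access_time < ? "
        "ORDER BY access_time LIMIT 2"
    )
    return cull(conn, size_limit=500, cull_query=query, params=(cutoff,))


print(demo(999.0))   # no entry is old enough, so the volume stays above 500
print(demo(2000.0))  # every entry is eligible, so culling reaches the limit
```

In the first call nothing qualifies for eviction and the volume stays at 1000 against a limit of 500; in the second, rows are removed two at a time until the volume drops to 400.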


grantjenks commented on May 18, 2024

All green in Travis and AppVeyor. I think that meets all the v3 milestones.

