Comments (8)
#61 merged and updated at c8aecac. I changed None to diskcache.UNKNOWN since None is a valid key/value. I also modified the test to use the first 32 characters of the sha256 hash and added tests for the fanout cache.
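The reason for a dedicated sentinel can be sketched in a few lines. This is only the general pattern, not the code from the commit: the `_Unknown` class, the `lookup` helper, and the plain dict standing in for a cache are all hypothetical names made up for illustration.

```python
class _Unknown:
    """Hypothetical stand-in for a sentinel such as diskcache.UNKNOWN."""

    def __repr__(self):
        return "UNKNOWN"


# A single shared instance; identity comparison (``is``) tells it apart from None.
UNKNOWN = _Unknown()


def lookup(store, key, default=UNKNOWN):
    """Return store[key]; raise KeyError if the key is missing and no default was given."""
    if key in store:
        return store[key]
    if default is UNKNOWN:
        # The caller did not pass a default at all.
        raise KeyError(key)
    # The caller passed a default explicitly, possibly None.
    return default


# None is used here both as a key and as a value.
store = {None: None, "a": 1}
stored_none = lookup(store, None)
explicit_default = lookup(store, "missing", default=None)
```

With None as the default marker, a caller who explicitly asks for None as the fallback could not be distinguished from one who passed nothing.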
from python-diskcache.
See also: #61
Thanks!
from python-diskcache.
@elistevens when you do "rsync" backups take careful note of the switches used. The test_rsync test was failing intermittently when using "rsync" without "--checksum". As I understand it, rsync checks file sizes and modification times to heuristically determine which files have been modified (this is the default behavior). For a little database like sqlite which is quickly edited and uses an underlying block storage, it's possible to modify the database but retain the same size and modification time.
You can pass "--checksum" to rsync to tell it not to use its default heuristic and instead compare checksums. This way, you'll still benefit from incremental transfers. But if you were exceedingly paranoid, you might not trust the checksums and choose "--ignore-times" which will transfer all files and behave like a copy.
The switches I'm using in testing are:
rsync -a --checksum --delete --stats source destination
The purpose of each switch:
- "-a" -- Archive mode: copy recursively and preserve permissions, times, and symlinks.
- "--checksum" -- Detect file changes using a checksum.
- "--delete" -- Delete files in destination that are not in source.
- "--stats" -- View the total number of bytes transferred; useful to see it's working incrementally.
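If the backup is driven from a script, the same switches could be wrapped along these lines. This is a rough sketch: `build_rsync_command` and `run_backup` are hypothetical helper names, and it assumes rsync is installed and on the PATH.

```python
import subprocess


def build_rsync_command(source, destination):
    """Build the rsync argument list using the switches listed above."""
    return [
        "rsync",
        "-a",          # archive mode
        "--checksum",  # compare checksums instead of size + modification time
        "--delete",    # remove files from destination that are gone from source
        "--stats",     # report how many bytes were actually transferred
        source,
        destination,
    ]


def run_backup(source, destination):
    """Run rsync and return its stats output; raises CalledProcessError on failure."""
    result = subprocess.run(
        build_rsync_command(source, destination),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


command = build_rsync_command("source", "destination")
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.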
from python-diskcache.
I'm surprised that the DB could be modified, but the modification time on the file not be updated.
For our use case, 99% of the content won't have been changed on a day-to-day basis, so our nightly backups will probably do something like using time+size for rsync, and then making sure the DB has been copied every time.
Thanks for letting me know there might be issues there.
from python-diskcache.
It's not so much that the modification time is not updated but that the resolution of the modification time is coarser than necessary. I develop on a MacBook Pro with an OS X Extended filesystem. According to Wikipedia and an Ars Technica article (both cited by a Stack Overflow answer), the resolution of the modification time on HFS+ file systems is 1 second. I think it's easy to imagine in testing that multiple modifications to the database could occur within the same second and the database could remain the same size.
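The effect can be reproduced without rsync. The sketch below is an illustration only: it simulates a second write landing in the same second by pinning the modification time with `os.utime`, and `quick_signature` is a hypothetical stand-in for a size-plus-mtime check, not rsync's actual implementation.

```python
import hashlib
import os
import tempfile


def quick_signature(path):
    """Size and modification time truncated to whole seconds."""
    info = os.stat(path)
    return (info.st_size, int(info.st_mtime))


def checksum(path):
    """SHA-256 of the full file contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache.db")
    with open(path, "wb") as handle:
        handle.write(b"A" * 4096)
    # Pin the modification time to a fixed whole second.
    os.utime(path, (1_000_000_000, 1_000_000_000))
    quick_before, sum_before = quick_signature(path), checksum(path)

    # Overwrite one byte in place: the content changes, the size does not.
    with open(path, "r+b") as handle:
        handle.seek(100)
        handle.write(b"B")
    # Simulate the second write happening within the same second.
    os.utime(path, (1_000_000_000, 1_000_000_000))
    quick_after, sum_after = quick_signature(path), checksum(path)

heuristic_sees_change = quick_before != quick_after
checksum_sees_change = sum_before != sum_after
```

Here the size-and-time comparison reports no change while the checksum comparison does, which is the case "--checksum" guards against.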
Considering that you will likely do nightly backups, I doubt it could ever be an issue. I just want you to be aware that rsync uses heuristics (like file size and modification time) and those may be inaccurate.
You may also want to use the "-z" option to compress the transfer over rsync.
from python-diskcache.
Ahh, that makes much more sense. Yeah, that won't be an issue for our use case. Great!
from python-diskcache.
V3 tagged in git and deployed to PyPI. I'm waiting now to see Travis and AppVeyor come back green.
The new diskcache is faster than the old one. Yaay! Always a good sign.
I added test_core.py:test_custom_eviction_policy for your custom eviction scenario.
I also think the new design (new-style format strings) would allow you to update the expire_time on every get/incr. Something like:
dc.EVICTION_POLICY['lru-gt-1s'] = {
    'init': None,
    'get': 'expire_time = {now} + 90 * 24 * 60 * 60',
    'cull': None,
}
And then just use cache.expire() in your nightly job. I haven't tested that yet but you might want to look into it.
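To make the idea concrete without depending on diskcache itself, here is a toy model using only sqlite3. Everything in it is an assumption for illustration: the simplified three-column table, the `get` and `expire` helpers, and the way the 'get' fragment is formatted with `{now}` and spliced into an UPDATE statement. It is not diskcache's actual code.

```python
import sqlite3

# Same shape as the policy entry above.
POLICY = {
    'init': None,
    'get': 'expire_time = {now} + 90 * 24 * 60 * 60',
    'cull': None,
}

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Cache (key TEXT PRIMARY KEY, value TEXT, expire_time REAL)")

start = 1_000_000.0  # fixed fake clock so the example is deterministic
for key in ("stale", "fresh"):
    # Both rows start out expiring 10 seconds after the fake start time.
    con.execute("INSERT INTO Cache VALUES (?, ?, ?)", (key, "data", start + 10))


def get(key, now):
    """Fetch a value and apply the policy's 'get' fragment to the row."""
    row = con.execute("SELECT value FROM Cache WHERE key = ?", (key,)).fetchone()
    if row is None:
        return None
    fragment = POLICY['get']
    if fragment is not None:
        # The fragment becomes the SET clause; {now} is filled with the current time.
        con.execute("UPDATE Cache SET " + fragment.format(now=now) + " WHERE key = ?", (key,))
    return row[0]


def expire(now):
    """Delete rows whose expire_time has passed; return how many were removed."""
    cursor = con.execute("DELETE FROM Cache WHERE expire_time < ?", (now,))
    return cursor.rowcount


get("fresh", now=start)           # pushes "fresh" out by 90 days
removed = expire(now=start + 20)  # the "nightly job"
remaining = [row[0] for row in con.execute("SELECT key FROM Cache ORDER BY key")]
```

Only the row that was read keeps its place: each read pushes that row's expiry out by 90 days, so the nightly expire removes whatever has gone unread for that long.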
Also note that with your custom eviction policy, you can exceed the cache's size limit. If the culling query returns no rows then the culling stops regardless of the cache's volume.
from python-diskcache.
All green in Travis and AppVeyor. I think that meets all the v3 milestones.
from python-diskcache.