Any value I retrieve from the database is a bytes object:

Retrieved values are byte strings (Python 3.7) about unqlite-python HOT 6 CLOSED

coleifer commented on June 1, 2024

Retrieved values are byte strings (Python 3.7)

from unqlite-python.

Comments (6)

coleifer commented on June 1, 2024

UnQLite doesn't differentiate between bytes and utf8-encoded unicode strings. unqlite-python exposes a Python interface over the C library, and to ensure the greatest flexibility we treat stuff as bytestrings. For example, I think one of the unqlite examples uses unqlite to store mp3 files -- if we treated stuff as unicode strings there would be no way to store binary file data.

It might make sense to provide an option, however, to decode strings... I'll think about it.
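
To illustrate the trade-off described above, here is a minimal sketch of what an opt-in decode step could look like on the caller's side. The helper name `decode_if_text` is hypothetical and not part of unqlite-python; it assumes values arrive as `bytes` and that text was stored as UTF-8. Note that some binary payloads happen to be valid UTF-8, so this is a heuristic rather than a reliable type check.

```python
def decode_if_text(value, encoding="utf-8"):
    """Return a str when the bytes decode cleanly; otherwise return the value unchanged.

    Hypothetical caller-side helper, not an unqlite-python API.
    """
    if not isinstance(value, bytes):
        return value
    try:
        return value.decode(encoding)
    except UnicodeDecodeError:
        # Binary data (for example, audio file contents) stays as bytes.
        return value


text = decode_if_text("Donald Duck".encode("utf-8"))
# These bytes resemble the start of an MP3 frame and are not valid UTF-8.
binary = decode_if_text(b"\xff\xfb\x90\x00")
```

With this approach, text comes back as `str` while undecodable binary data passes through untouched, which is why blanket decoding inside the library would not be safe for binary payloads.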

james-carpenter commented on June 1, 2024

If possible, I would like to bump this issue. In using this on a larger scale as a JSON document store, the filter queries become quite convoluted when having to convert all strings used for comparison to bytes.

Here is a trivial example. The complexity compounds with more deeply nested documents.

Python 3.8.0 (v3.8.0:fa919fdf25, Oct 14 2019, 10:23:27) 
[Clang 6.0 (clang-600.0.57)] on darwin
>>> from unqlite import UnQLite
>>> db = UnQLite()
>>> users = db.collection('users')
>>> users.create()
>>> users.store({'name': 'Donald Duck'})
0
>>> users.filter(lambda u: u['name'].startswith('Donald'))
[]
>>> users.filter(lambda u: u['name'].startswith(b'Donald'))
[{'name': b'Donald Duck', '__id': 0}]

An example of the round-tripping issue this presents.

>>> import json
>>> doc = json.loads('{"name": "Fred Flintstone"}')
>>> users.store(doc)
1
>>> doc = users.filter(lambda u: u['name'].startswith(b'Fred'))[0]
>>> json.dumps(doc)
TypeError: Object of type bytes is not JSON serializable

Certainly not sure about the implementation implications, but the naive expectation would be that the data comes out the way it went in. Since UnQLite doesn't differentiate between bytes and utf8-encoded unicode strings, perhaps there is some way to apply the same concept during filtering, even if the consumer would have to fix the content of all the documents once retrieved.
Thank you.
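
As a workaround for the round-tripping problem shown above, a consumer could decode every bytes value in a retrieved document before serializing it. The following is a sketch only: `decode_document` is a hypothetical helper, it assumes all bytes values are UTF-8 text, and the `retrieved` dict simply mimics the shape of the document returned by `filter()` in the session above.

```python
import json


def decode_document(obj, encoding="utf-8"):
    """Recursively convert bytes to str inside dicts, lists, and tuples.

    Hypothetical helper; raises UnicodeDecodeError if a value is not valid text.
    """
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return {
            decode_document(key, encoding): decode_document(value, encoding)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        return [decode_document(item, encoding) for item in obj]
    return obj


# Stand-in for a document as returned by users.filter(...) in the session above.
retrieved = {"name": b"Fred Flintstone", "__id": 1}
serialized = json.dumps(decode_document(retrieved))
```

After decoding, `json.dumps` no longer raises the `TypeError`, at the cost of an extra pass over every retrieved document.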

coleifer commented on June 1, 2024

"Since UnQLite doesn't differentiate between bytes and utf8-encoded unicode strings, perhaps there is some way to apply the same concept during filtering, even if the consumer would have to fix the content of all the documents once retrieved."

That's the crux of the issue. UnQLite doesn't have a text type - it's just bytes / int / double / bool / array / object / null. When the filter callback is called by unqlite, we receive an array of unqlite values which have to be converted to Python types. Note that we do convert the keys of dictionaries to unicode where possible (see the final function, unqlite_value_to_dict), but values are left as-is. If you're dead set on this change, I'd suggest just forking and patching that function to decode your values as well.

I'm also concerned that changing this behavior at this point could break existing code.

coleifer commented on June 1, 2024

After reading the unqlite maintainers' comments, though, I think I may actually change this... They suggest that binary data should be stored using the regular kv interface and that the jx9 stuff should be treated as text.

I will reopen this and make the change.

coleifer commented on June 1, 2024

I've changed the behavior so that all Jx9 / VM / Collection interfaces return string data as unicode (python3 str), so it is no longer necessary to mess with encoding/decoding.

For those users who wish to store binary data in the collections, the unqlite developers recommend either:

- use the kv store APIs directly
- encode it using base64
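
The base64 option above can be sketched with the standard library alone. This example does not call unqlite-python; the `doc` dict and its field names are illustrative assumptions, and the JSON round trip stands in for storing the document in a collection and reading it back as text.

```python
import base64
import json

# Arbitrary binary payload that is not valid UTF-8.
raw = bytes([0xFF, 0xFB, 0x90, 0x00])

# Encode the payload as ASCII text so it can live in a text-only document.
doc = {"name": "clip", "data": base64.b64encode(raw).decode("ascii")}

# Simulate a store/retrieve cycle through a text-based document format.
serialized = json.dumps(doc)
restored = base64.b64decode(json.loads(serialized)["data"])
```

Base64 grows the payload by roughly a third, so for large binary values the kv store route may be the better fit.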

I will be tagging a new release, 0.8.0, which will also be dropping "official" support for Python 2.

james-carpenter commented on June 1, 2024

That's fantastic! Thank you for being so responsive.
