Comments (6)
Thanks for opening this! This seems like a useful case to handle.
Aside: I'm curious to learn more about your use case here (if you're willing to share). Is this an API server-like-thing? What kind of storage are you planning on storing the msgpack'd payloads in? What benefits are you hoping to get out of using msgspec this way?
I'm also having a hard time finding a DIY workaround to make this work. Any tips would be appreciated!
There unfortunately isn't really a good way to hack around this. The best thing I can recommend for now is to define a second struct subclass that sets array_like=True, manually convert the payload to use the new types, then encode:
>>> import msgspec
>>> class User(msgspec.Struct, array_like=False):
...     name: str
>>> class UserArrayLike(User, array_like=True):
...     # no need to duplicate the field definition, this is inherited
...     pass
>>> u = msgspec.json.decode(b'{"name": "john"}', type=User)
>>> u2 = UserArrayLike(u.name)  # manually convert
>>> msgspec.msgpack.encode(u2)
b'\x91\xa4john'
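For context on why the array-like form is smaller, here is a hand-rolled sketch of the two MessagePack layouts for this one-field struct. It uses only the standard library and covers only short strings and small containers (the fixstr, fixmap, and fixarray formats). The helper names are made up for illustration and are not part of msgspec.

```python
def fixstr(s: str) -> bytes:
    """Encode a short string (under 32 UTF-8 bytes) as a MessagePack fixstr."""
    data = s.encode("utf-8")
    if len(data) >= 32:
        raise ValueError("this sketch only handles strings under 32 bytes")
    return bytes([0xA0 | len(data)]) + data


def encode_as_map(fields: dict) -> bytes:
    """Encode str->str fields as a fixmap: every field name is written out."""
    if len(fields) >= 16:
        raise ValueError("this sketch only handles fewer than 16 fields")
    out = bytes([0x80 | len(fields)])
    for key, value in fields.items():
        out += fixstr(key) + fixstr(value)
    return out


def encode_as_array(values: list) -> bytes:
    """Encode str values as a fixarray: positions replace field names."""
    if len(values) >= 16:
        raise ValueError("this sketch only handles fewer than 16 values")
    out = bytes([0x90 | len(values)])
    for value in values:
        out += fixstr(value)
    return out


as_map = encode_as_map({"name": "john"})
as_array = encode_as_array(["john"])
print(as_map, len(as_map))
print(as_array, len(as_array))
```

The array form matches the bytes shown above and drops the field name entirely, which is where the size savings come from.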
I can see a few different ways to make this work. The main change here is mostly at the config level; making this work in the backends should be fairly straightforward. Note that due to msgspec's design the configuration needs to be applied to the struct class, not to the encoder/decoder.
Option 1: Make array_like accept a dict
This would make array_like (and probably later omit_defaults) accept a dict mapping protocol to value. In your case you'd have:
class User(msgspec.Struct, array_like={"msgpack": True}):
    name: str
The current case of array_like=True would be shorthand for array_like={"json": True, "msgpack": True}.
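A minimal sketch of how the shorthand in Option 1 could be normalized, in plain Python. This is not msgspec code: the function name and the choice that protocols left out of the dict default to False are assumptions made for illustration.

```python
PROTOCOLS = ("json", "msgpack")


def normalize_array_like(value):
    """Expand an array_like setting into an explicit per-protocol dict.

    Hypothetical helper: a bool applies to every protocol, a dict applies
    only to the protocols it names (others are assumed to default to False).
    """
    if isinstance(value, bool):
        return {protocol: value for protocol in PROTOCOLS}
    if isinstance(value, dict):
        unknown = set(value) - set(PROTOCOLS)
        if unknown:
            raise ValueError(f"unknown protocols: {sorted(unknown)}")
        return {protocol: bool(value.get(protocol, False)) for protocol in PROTOCOLS}
    raise TypeError("array_like must be a bool or a dict")


shorthand = normalize_array_like(True)
msgpack_only = normalize_array_like({"msgpack": True})
print(shorthand)
print(msgpack_only)
```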
Option 2: Add protocol-specific config options
This would add new json/msgpack kwargs (or maybe json_options/msgpack_options? idk what a good name would be) that each take a dict mapping config options to apply to their relevant protocols. The config inheritance would then be:
- value in protocol-specific dict, if present (e.g. json_options={"array_like": True})
- value in top-level config, if present (e.g. array_like=True)
- value in base class, if present
So your example would be:
class User(msgspec.Struct, msgpack_options={"array_like": True}):
    name: str
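The lookup order described for Option 2 can be sketched in plain Python as follows. Nothing here is msgspec API: resolve_option and its arguments are hypothetical, the base-class config is modeled as a flat dict, and the final fallback default is an assumption.

```python
def resolve_option(name, protocol, protocol_options, top_level, base_config, default=False):
    """Resolve one config option for one protocol (hypothetical helper).

    Order: protocol-specific dict, then top-level config, then base class.
    """
    per_protocol = protocol_options.get(protocol, {})
    if name in per_protocol:
        return per_protocol[name]
    if name in top_level:
        return top_level[name]
    if name in base_config:
        return base_config[name]
    return default  # assumed fallback when nothing sets the option


# Models: class User(msgspec.Struct, msgpack_options={"array_like": True})
protocol_options = {"msgpack": {"array_like": True}}
top_level = {}
base_config = {}

msgpack_array_like = resolve_option("array_like", "msgpack", protocol_options, top_level, base_config)
json_array_like = resolve_option("array_like", "json", protocol_options, top_level, base_config)
print(msgpack_array_like, json_array_like)
```

With this order, a protocol-specific value always wins over a top-level one, so a class can set array_like=True globally and still opt one protocol out.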
I'm torn between these. Right now I'm leaning towards option 2, if only because it makes it easier to add additional protocol-specific options later.
If you have any thoughts on the apis presented here, I'd love to hear them.
from msgspec.
I'm receiving a stream of json data that I want to store in a nosql database. I want to reduce the size of (a large part of) the payload to reduce network usage and storage size while reading/writing. Much data is infrequently read and does not need to be (human) readable/indexable while at rest. Still in the process of comparing this vs compression, but I like the added validation and having the schemas defined, with very good performance. Might also try doing both.
Manually converting is annoying because we have nested structs, with optional values, within lists, within dicts, etc. So would need to do some custom hacky parsing it feels like. Any suggestion for this?
In terms of support, it feels most natural to me to have the ability to do this during encoding/decoding, as I had above.
msgspec.msgpack.encode(u, array_like=True, omit_defaults=True)
But I guess this is a much more fundamental/unwanted change. As for options 1/2, maybe option 1 makes more sense because you don't have to deal with:
class User(msgspec.Struct, array_like=True, msgpack_options={"array_like": False}):
    name: str
But really either seems good :).
So would need to do some custom hacky parsing it feels like. Any suggestion for this?
Yeah, this seems unpleasant. The good news is the feature I outlined above should be pretty quick to implement. If you're fine waiting a week or two for the next release then there should be no need to hack around this.
I made a workaround for now to be able to test the scenario, but would be great to have this implemented!
I made a workaround for now to be able to test the scenario
Glad to hear it! I'm curious: how did the test turn out? Do you still think this feature would be useful for you?
This is proving a little more complicated to implement than I would have hoped (mostly due to error handling). I have a plan laid out, but it'll require some internal refactoring. I still think this feature makes sense, but don't want to spend the effort (yet) if you don't think you'll still want it.
I still think it would be useful, but so far we haven't prioritized implementing this on our side. So if "pretty quick to implement" didn't turn out to be true, this doesn't need to be a priority here either.