Comments (6)

jcrist commented on May 16, 2024

Thanks for opening this! This seems like a useful case to handle.

Aside: I'm curious to learn more about your use case here (if you're willing to share). Is this an API server-like-thing? What kind of storage are you planning on storing the msgpack'd payloads in? What benefits are you hoping to get out of using msgspec this way?

I'm also having a hard time finding a DIY workaround to make this work. Any tips would be appreciated!

There unfortunately isn't really a good way to hack around this. The best thing I can recommend for now is to define a second struct subclass that sets array_like=True, manually convert the payload to the new type, then encode:

>>> import msgspec

>>> class User(msgspec.Struct, array_like=False):
...     name: str

>>> class UserArrayLike(User, array_like=True):
...     # no need to duplicate the field definition, this is inherited
...     pass

>>> u = msgspec.json.decode(b'{"name": "john"}', type=User)
>>> u2 = UserArrayLike(u.name)  # manually convert
>>> msgspec.msgpack.encode(u2)
b'\x91\xa4john'
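
For reference, the output bytes above can be read by hand. The following is a minimal sketch that relies only on the MessagePack format's single-byte prefixes for small maps, arrays, and strings. `describe_header` is a hypothetical helper written for illustration (it is not part of msgspec), and the `map_like` bytes are what I'd expect the default, non-array-like encoding of the same struct to look like.

```python
# Hand-written payloads for a struct with one field, name="john".
array_like = b"\x91\xa4john"        # array-like: values only
map_like = b"\x81\xa4name\xa4john"  # map-like: field names plus values


def describe_header(byte: int) -> tuple[str, int]:
    """Return (type, length) for the small MessagePack map/array/str prefixes."""
    if 0x80 <= byte <= 0x8F:
        return ("fixmap", byte & 0x0F)
    if 0x90 <= byte <= 0x9F:
        return ("fixarray", byte & 0x0F)
    if 0xA0 <= byte <= 0xBF:
        return ("fixstr", byte & 0x1F)
    raise ValueError(f"not a fixmap/fixarray/fixstr prefix: {byte:#x}")


print(describe_header(array_like[0]))  # ('fixarray', 1)
print(describe_header(array_like[1]))  # ('fixstr', 4)
print(describe_header(map_like[0]))    # ('fixmap', 1)
print(len(array_like), len(map_like))  # 6 11
```

The saving comes from dropping the field names, so it grows with the number of fields and the length of their names.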

I can see a few different ways to make this work. The main change is at the config level; making this work in the backends should be fairly straightforward. Note that due to msgspec's design, the configuration needs to be applied to the struct class, not to the encoder/decoder.

Option 1: Make array_like accept a dict

This would make array_like (and probably later omit_defaults) accept a dict mapping protocol to value. In your case you'd have:

class User(msgspec.Struct, array_like={"msgpack": True}):
    name: str

The current case of array_like=True would be shorthand for array_like={"json": True, "msgpack": True}.
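
To make the shorthand concrete, here is a rough sketch of how a bool-or-dict value might be normalized into a per-protocol mapping. Nothing here is msgspec API: `normalize_array_like` and `PROTOCOLS` are hypothetical names, and treating an unmentioned protocol as False is an assumption of this sketch.

```python
PROTOCOLS = ("json", "msgpack")


def normalize_array_like(value):
    """Expand a bool or a partial dict into a full protocol -> bool mapping."""
    if isinstance(value, bool):
        # array_like=True is shorthand for enabling it on every protocol
        return {protocol: value for protocol in PROTOCOLS}
    if isinstance(value, dict):
        unknown = set(value) - set(PROTOCOLS)
        if unknown:
            raise ValueError(f"unknown protocols: {sorted(unknown)}")
        # Assumption: protocols left out of the dict default to False
        return {protocol: bool(value.get(protocol, False)) for protocol in PROTOCOLS}
    raise TypeError("array_like must be a bool or a dict")


print(normalize_array_like(True))               # {'json': True, 'msgpack': True}
print(normalize_array_like({"msgpack": True}))  # {'json': False, 'msgpack': True}
```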

Option 2: Add protocol-specific config options

This would add new json/msgpack kwargs (or maybe json_options/msgpack_options? idk what a good name would be) that each take a dict mapping config options to apply to their relevant protocols. The config inheritance would then be:

  • value in protocol-specific dict, if present (e.g. json_options={"array_like": True})
  • value in top-level config, if present (e.g. array_like=True)
  • value in base class, if present

So your example would be:

class User(msgspec.Struct, msgpack_options={"array_like": True}):
    name: str
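
The lookup order listed above could be sketched roughly as follows. This is plain Python for illustration, not how msgspec is implemented: `resolve_option` is a hypothetical helper, the configs are modeled as plain dicts, and falling back to False when nothing is set is an assumption.

```python
_MISSING = object()


def resolve_option(name, protocol, protocol_options, top_level, base_config, default=False):
    """Resolve one config option for one protocol.

    Lookup order: protocol-specific dict, then top-level config, then base class.
    """
    candidates = (
        protocol_options.get(protocol, {}),
        top_level,
        base_config,
    )
    for source in candidates:
        value = source.get(name, _MISSING)
        if value is not _MISSING:
            return value
    return default


# Models: class User(msgspec.Struct, msgpack_options={"array_like": True})
protocol_options = {"msgpack": {"array_like": True}}
print(resolve_option("array_like", "msgpack", protocol_options, {}, {}))  # True
print(resolve_option("array_like", "json", protocol_options, {}, {}))     # False
```

A protocol-specific value always wins, so a top-level array_like=True can still be switched off for a single protocol.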

I'm torn between these. Right now I'm leaning towards option 2, if only because it makes it easier to add additional protocol-specific options later.

If you have any thoughts on the APIs presented here, I'd love to hear them.

from msgspec.

RynoM commented on May 16, 2024

I'm receiving a stream of JSON data that I want to store in a NoSQL database. I want to reduce the size of (a large part of) the payload to cut network usage and storage size while reading/writing. Much of the data is read infrequently and does not need to be (human) readable or indexable while at rest. I'm still in the process of comparing this against compression, but I like the added validation and having the schemas defined, with very good performance. I might also try doing both.

Manually converting is annoying because we have nested structs, with optional values, within lists, within dicts, etc. So would need to do some custom hacky parsing it feels like. Any suggestion for this?

In terms of support, it feels most natural to me to have the ability to do this during encoding/decoding, as I had above.

msgspec.msgpack.encode(u, array_like=True, omit_defaults=True)

But I guess this is a much more fundamental/unwanted change. As for options 1/2, maybe option 1 makes more sense because you don't have to deal with:

class User(msgspec.Struct, array_like=True, msgpack_options={"array_like": False}):
    name: str

But really either seems good :).

jcrist commented on May 16, 2024

So would need to do some custom hacky parsing it feels like. Any suggestion for this?

Yeah, this seems unpleasant. The good news is the feature I outlined above should be pretty quick to implement. If you're fine waiting a week or two for the next release then there should be no need to hack around this.

RynoM commented on May 16, 2024

I made a workaround for now to be able to test the scenario, but would be great to have this implemented!

jcrist commented on May 16, 2024

I made a workaround for now to be able to test the scenario

Glad to hear it! I'm curious - how'd the test turn out? Do you still think this feature would be useful for you?

This is proving a little more complicated to implement than I would have hoped (mostly due to error handling). I have a plan laid out, but it'll require some internal refactoring. I still think this feature makes sense, but don't want to spend the effort (yet) if you don't think you'll still want it.

RynoM commented on May 16, 2024

I still think it would be useful, but so far we haven't given implementing this priority on our side. So if 'pretty quick to implement' didn't turn out to be true, this doesn't need to be a priority here either.
