Code Monkey home page Code Monkey logo

Comments (29)

DrAWolf avatar DrAWolf commented on July 16, 2024

I also tried to call
archive.saveValue( ToString() )
instead of
archive( ToString() );
directly, but this results in a crash
the former should either be private or it should have a well defined behaviour...

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

Making enum a single level should be no problem. As far as your first issue goes, I'm not sure I have a good way of solving it that doesn't complicate the API unnecessarily. You can accomplish what you want by moving the actual toString call up a level, e.g.:

archive( cereal::make_nvp("some cool name", myObject.toString()) ); // saving

std::string stringrep; // load the string
archive( cereal::make_nvp("some cool name", stringrep) ); // NVP not strictly required here, but we'll soon support out of order loading
myObject = MyObject( stringrep ); // construct from string, or whatever

The way our serialization code works right now, by the time we've entered into the save function we've already opened up a node in the JSON and started an object or array. I don't like the idea of bringing in more Boost functionality here just to get rid of one word in the JSON.

There are a few functions in the JSON (and likely in the XML) archives that are not intended for public use (they have to be coupled with other functions for setting up/tearing down state) and those will get changed to be private.

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

Actually I just thought of how you can solve your issue quite easily, through the same mechanism I'll be using with the enum types. If you look into the JSON archive header, you'll notice that we have various prologue/epilogue functions that get called immediately before and immediately after the serialize function is called for a specific type.

You can use these to control when the JSON archive creates/ends nodes and get the functionality you desire. It's a little more effort than us providing a macro, but the prologue/epilogue functions provide a lot more power than fixed functionality.

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

Thanks for your answers.

+1 in advance for putting in the std::is_enum or !std::is_enum in the appropriate enable_ifs for JSON and XML.

(As I read from your second thoughts, you agree that the "one level up" approach is quite unsatisfactory, because it breaks encapsulation/unity etc. )

I was indeed playing around with those enable_if lines (tried the enum first).
The only trait I found useful: is_convertible. Got that working easily. But I really do not want to have implicit conversions to string.
I'd have to modify your code anyway.
Your last suggestion also requires that.

I still hold the opinion that you should offer a trait for this (e.g. 'is_cereal_primitive') that the user can set for his class if he needs/wants to and that your archive impls use in all the places you now use 'is_enum'.

I also think you will someday need/want to use the same technique for 'is_bitwise_serializable'

As I am into traits for no longer than a total of 3-4 hours I cannot figure that out atm.
One could use is_convertible<T,cereal_primitive> where cereal_primitive would be a dummy class.
The user would then add a dummy operator cereal_primitive const { return cereal_primitive(); } (using a Macro)
to make that conversion available...
But this idea seems a little non-standard and misleading...

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

It's actually simpler than expected:

I added the following lines to cereal.hpp

  template< typename T >
  struct is_primitive
  {
      static const bool value = false;  // default is non-primitve
  };

  #define CEREAL_PRIMITVE_TYPE( Type )                            \
  namespace cereal                                                \
  {                                                               \
      template<>                                                  \
      struct is_primitive< Type >                                 \
      {                                                           \
          static const bool value = true;                         \
      };                                                          \
  } 

and used cereal::is_primitive in the appropriate enable_if lines of the archive
e.g.

  template <class T>
  typename std::enable_if<!std::is_arithmetic<T>::value && !std::is_enum<T>::value && !cereal::is_primitive<T>::value , void>::type
  prologue( JSONOutputArchive & ar, T const & )
  {
    ar.startNode();
  }

if I now put

CEREAL_PRIMITVE_TYPE( testclass )

after my class definition, it serializes as expected.

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

That's a fairly clean solution but I'm still not sure about introducing it into cereal, since it can be accomplished by just writing your own prologue/epilogue functions (which can be anywhere - these don't need to reside in json.hpp for example), which is what I would prefer to see someone do.

So basically your solution minus the traits, which is more verbose in general but keeps the interface cleaner. It would look like, for your example:

(in one of your headers, so long as it knew about JSONOutputArchive)

void prologue( JSONOutputArchive & ar, testclass const & t )
{
  // whatever is needed/not needed
}

void epilogue( JSONOutputArchive & ar, testclass const & t )
{
  // whatever is needed/not needed
}

You could then use this overload, which would be preferred over the generic template version, to control whether your class needed to do setup/teardown operations for the archive.

I'm not sure if there is going to be a trivial way to make the various internal functionality of the JSON/XML archives private, since unfortunately a friend declaration like template <class U> friend void prologue( JSONOutputArchive &, U const & ); will not catch things such as template <class T> void prologue( JSONOutputArchive &, SomeType<T, MaybeOtherT> const & ); and I don't want peole writing their own epi/prologue or save/load functions to have to manually add themselves as friends.

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

Valid solution, too, but...

I think we are now entering the arena of clean code discussion, always leading to a fight :o)
I'm not a fighter, so I will just state my points, maybe they are 'wrong', maybe you will come back to them in a later version/refactoring.
Do not feel obliged to answer/defend.(except for point 5)
I think you have done a great job so far.

1
IMHO the prologue/epilogue/save/load interface - clean and simple inside cereal.hpp - is already too messy inside e.g. json.hpp. With the cascades of enable_ifs and other templates, it's quite hard to figure out what is really happening. Coders greater than me are advising not to use enable_if.

2
Additional confusion (in me) arises from the fact, that prologue/epilogue/save/load can be called with or without a NVP and even with a SizeTag. In this respect, the interface could be cleaned up a little. I would probably add 'makeName' and 'makeSize' as interface functions and call pro/epi/save/load only with the objects that are serialized.

3
Your definition of the archive-interface is too vague for my taste.
Much is done through calling ar( something ), but then at multiple points along the way it is spread into save/load (for types) an the into pro/epi/save/load (for archives) This is going too deep, I am not your code-analyst, but I just feel it could be cleaner there.

4
All that is needed in json.cpp and other archives (and somehow done by your SFINAE code) is to discriminate
primitive types (int,float,enum, AND user defined) => do not open a new node/object
array (or array like / SizeTag) => open an array
other => open a new node
(binary => ... )

5
Coming back to your proposed solution: Writing custom pro/epi functions will only add to the mess adressed in 1.
If a type/user class is considered/defined 'primitive', this applies to all archives, even the ones not written yet.
I want my primitive classes (as well as enums) not to open a new node in any of those archives.

from cereal.

randvoorhies avatar randvoorhies commented on July 16, 2024

The solution I like is to introduce another set of serialization functions, e.g.

struct party_goer
{
  bool is_fun;
  bool is_male;

  std::string save_string()
  {
     return std::string(is_fun ? "amusing" : "boring") + " " + std::string(is_male ? "guy" : "girl");
  }

  void load_string(std::string const & s)
  {
     std::vector<std::string> vals;
     boost::split(vals,s,boost::is_any_of(" "));

     if(vals[0] == "amusing") is_fun = true;
     else if(vals[0] == "boring") is_fun = false;
     else throw "Bad fun value in string";

     if(vals[1] == "guy") is_male = true;
     else if(vals[1] == "girl") is_male = false;
     else throw "Bad gender value in string";
  }
};

We would obviously add non-member versions, and a load_and_allocate version as well.

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

Initially this was only going to make it in as a change to enums but the current plan is to do something more along your request for something like is_primitive. We can't allow users to blindly write their own serialization functions if they mark something as primitive however, since attempting to serialize more than one item would cause issues. So this will likely be paired with some sort of special set of serialization functions, either in the form of string representations or something else to ensure that only a single value is serialized.

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

Nice to hear.
The randvoorhies way covers 80% of my use cases - classes that simply serialize to/from a string.

1
I guess, it is understood (but you did not mention) that for binary archives, ordinary serialization takes place (to make it smaller and faster), while for text archives save_string and load_string are used if available.

2
You did not mention enums. I guess they would be covered by non-member versions I could define for my enums like this

template<class Archive>  
void save_string( Archive& archive, const myEnum& m ) 
{ 
    std::string a = myEnumToString( m );
    archive( a );
} 

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

My second intention concerning is_primitive was to remove clutter from the JSON(XML) files.
So if a class just saves as - say just one int (or double or string) the result should just be

"testobject": 42,

if a class is flagged as 'primitive'. Maybe this could even happen automatically!?
Also covering this case, would indeed have all my use cases covered. Thx in advance...

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

I do want it to happen automatically if possible, my only concern is if someone tries to serialize something with either an NVP or more than one thing within their serialization function for a class declared as "primitive", it will lead to undefined behavior that we won't catch at compile time. This leads to the potential need for specialized "minimal" serialization functions for this case. Use case from a user would be something like adding either serialize_minimal, load_minimal, (etc), which we would automatically detect and use for a text archive. Still haven't fully decided on the interface for all of this.

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

Quick update on this - the current design, which you can follow here, is to add a new pair of serialization functions to cereal: load_primitive and save_primitive (both member and non-member versions).

The interface will be something like

struct A
{
  template <class Archive>
  std::string save_minimal( Archive & ar )
  { /* somehow bundle self up into a string, or any primitive type */ }
  template <class Archive>
  void load_minimal( Archive & ar, std::string const & data )
  { /* loaded data contained in data, type must match return type of save_minimal */ }
};

You won't be able to mix these with other serialization types unless you tell cereal to explicitly use one type. You can pick the return type of save so that it outputs any primitive type or a string. It may be that we end up not passing the archive so you can't try and save or load extra stuff here, though we'll probably keep the template type to allow specialization.

enums will default to a minimal numerical representation after all of this.

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

Sounds reasonable, but a little confusing to me.

Because we are talking about primitive types, supported by every archive, I don't think, template specialization will ever be needed or make sense.

Possible issues (apart from templatization).
If enums default to int, will it be possible to override that by defining non-member std::string save_minimal( const myEnum& m )?
What if the user defined 'std::string save_minimal()' AND 'int save_minimal()'?
What if the user defined different types in save and load?
What if the user defined 'std::string save_minimal()' AND standard 'save'?
Seems like an awful lot of compile time testing to me...

Maybe you will be better off using just save_string and load_string applicable to text archives only.

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

The problem of 'automatical primitive' saving/loading a class with only one entry could probably better be solved at runtime by just NOT inserting "value0:..." if only one entry is saved. Only if a second value is saved within the same context, the first entry is corrected to "value0:..." and every subesequent one is then named "valueN:...". Just a thought of a non-meta programmer...

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

You are right about the enums, that slipped my mind. The simplest solution is just the one you mentioned earlier in this discussion by marking a class as primitive and not opening a node. However, without storing any state in the archive itself, there is nothing stopping the user from trampling over this promise to only serialize one thing. If we add state for this we add runtime overhead, which leads to our preference for a compile time solution. It also adds a requirement that anyone implementing a text archive needs to implement this excess state machinery, and altering parse trees could potentially be really annoying.

I think the best way to go about this will be to add an internal use only specifier that acts just like the CEREAL_PRIMITIVE_TYPE you described, and then let users define the minimal versions of serialization functions.

It will be impossible to have two of them with differing return types as signatures for functions depend only on name + parameters.

Having a save with a different return value than the load accepts as a parameter will be a compile time error.

If the user defines a standard save and a save_minimal, that will be an error just as if they defined a serialize and a save/load pair. You have to pick one type of serialization, which can be either in or out of the class: serialization function, load/save pair, minimal load/save pair.

The only way to get around this single serialization function type is to use the specialization functionality in access.hpp (see bottom of http://uscilab.github.io/cereal/serialization_functions.html for an example).

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

Your last 2 paragraphs sound very simple and clear. You have too keep it that way.

My only concern is my following use-case (think of files in the >> 1GB range):
I want to serialize binary for the most time, but sometimes, for debugging/testing purposes I want to serialize to JSON or a self-made NVP archive.
The binary case should use the standard load/save pair for performace reasons, NOT strings.
The text archive case should automatically use the save_string/save_text (or maybe std::string save_minimal) functions.

Implementing this might well be in 'my responsibility'. But mentioning save_string made me think, you are going to support exactly that case.

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

Opened a stack overflow post related to this issue: http://stackoverflow.com/questions/22464225/c-detecting-free-function-existence-with-explicit-parameters

For non-member save_minimal/load_minimal pairs, we want to enforce that the return type of the save matches the constant reference parameter of the load (this is the same as in the member case), e.g.:

template <class Archive>
double save_minimal( MyType const & mt );

template <class Archive>
void load_minimal( MyType & mt, double const & value );

The tricky part here is writing a trait that, knowing Archive, MyType, and the fact that save_minimal returns a double, can verify that an exactly matching (no implicit conversion allowed) load_minimal exists.

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

The current implementation of save/load minimal have signatures like (note the lack of actually taking the archive as a parameter):

template <class Archive>
double save_minimal( MyType const & mt );

template <class Archive>
void load_minimal( MyType & mt, double const & value );

The archive template is so that users can specialize on different archive types (e.g. only do minimal for JSON or something). We don't currently pass archive as an actual parameter since there is nothing the user could potentially be doing with it in this situation.

This design is presenting problems with name resolution for various reasons. Essentially if we have a load_minimal (or save_minimal) non-member function whose type does not exist in the same namespace as the function itself, we can't resolve it. Normally types are defined in the same namespace as their serialization functions, but if any serialization function is written in the cereal namespace (such as the built in functions we provide), we'll run into name resolution errors.

The easiest way I can think of solving this is to change the signatures to look like:

template <class Archive>
double save_minimal( Archive const &, MyType const & mt );

template <class Archive>
void load_minimal( Archive const &, MyType & mt, double const & value );

Note that there is still nothing a user can actually do with the archive in these situations since it will be passed by constant reference, which is different from every other serialization function, which accepts its archive by reference.

This doesn't look as elegant but it is probably necessary. Any objections? This is probably the last roadblock in getting this feature finished.

Ultimately things declared with minimal functions will just end up wrapping their types in something similar to NameValuePair, except it will be something like MinimalWrapper, which archives can detect and handle appropriately (e.g. not opening new nodes).

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

Despite the good work you have put in, I feel we are somehow over-complicating a simple matter.

The fact that there (would) exist two different ways to serialize is a big warning sign to me.
Duplication is the root of all evil...

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

As a much simpler alternative, I suggest the following:

  1. There only exists ONE serialization template (In fact, three: serialize, save, load)
  2. If the user wants to serialize minimalistic, he uses the same template function as he always did.
    But inside, he does not call the standard operator() of the archive, he rather calls a function archive.serialize_minimal( ... ), with the following properties:
  • Only one argument is allowed that can either be a primitive type (int, double, string) or a NVP of such a type.
  • The function need not be archive specific. Only one function for OutputArchive and one for InputArchive must be implemented to do the minimal save/load.

I cannot assess all the implications of this right now, but it seems much simpler and natural to have the choice at this level.

The advanced user (see my use case above) can also choose at this stage, whether he wants to save as a string (for text-archives) or as raw/optimized/compressed/whatever (for binary archives).

The compile time checking now takes place one level "deeper", and should simplify a lot because checking for minimal vs. non-minimal templates drops out.

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

I'd love to go with a simpler solution. Here are the potential issues I see with an archive.serialize_minimal() style:

  1. Once we are inside of a type's serialize/save/load function, it is too late to not open a node in the archive unless we make the archives have some kind of ability to unwind the nodes they make. This is probably possible, but a little annoying.
  2. The other issue is what stops someone from issuing any other operation on the archive in the same context in which they perform the serialize_minimal? We'd have to have some kind of state we'd be keeping in the archive that would give a runtime error in this situation, since there's no way for us to check at compile time what someone is doing in some function.

This is what led me towards the implementation I've been working on - it's more complicated to implement behind the scenes, but easier for an archive to deal with and probably much easier for a user to deal with. The only concern is the interface for the function - it will be different regardless, but the API I'm leaning towards is the one I mentioned earlier (with the const & archive).

Here's what a use case would look like:

struct MyType
{
  // some class variables

  template <class Archive>
  std::string save_minimal( Archive const & ) // no need to use the archive yet, maybe in the future
  {
    return some_complicated_way_of_making_a_string();
  }

  template <class Archive>
  void load_minimal( Archive const &, std::string const & data )
  {
    some_complicated_way_of_loading_from_a_string( data );
  }

  // compile time error if load_minimal's second parameter is not a constant
  // reference to the return type of save_minimal

  // compile time error if a user also defines a serialize, or load/save pair
  // for the same archive type for the same data type (e.g. MyType)

  // user can also define the above with version parameter or
  // outside of a class just like the other serialization functions
};

int main()
{
  {
    cereal::JSONOutputArchive ar( std::cout );
    MyType mt( some, type, of, constructor );
    ar( mt ); // executes the save_minimal with the name being autogenerated
    ar( CEREAL_NVP(mt) ); // provide some NVP

  // note how the interface to the user is identical to the other types of serialization
  // there is also no way for a user to insert more information than one single
  // primitive (arithmetic or string)
  }
}

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

True, I missed your point 2 in my (incomplete) consideration.

What to do if the user calls serialize_minimal(...) and operator() (=standard serialize)?
I now come to the conclusion that I was just trying to move the duplication to another level.

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

Your use case looks pretty simple.

For my purposes, I would like to add a specialization for Binary Archives roughly like this:

template <>
void serialize( BinaryArchive& archive )
{
    archive( my_raw_data );
}

because I want the minimal (to/from string) functions to be applied by default, except for binary archives, where I want to use raw data without the string conversions.

How would I do this correctly?

Possible issues:

  • function template specialization is a little tricky
  • BinaryArchive does not exist, only Binary(In/Out)putArchive.
  • Will the compile time checking handle this correctly?

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

You'd do something like this:

struct MyType
{
  template <class Archive>
  typename std::enable_if<std::is_same<Archive, cereal::BinaryInputArchive>::value ||
                          std::is_same<Archive, cereal::BinaryOutputArchive>::value, void>::type
  serialize( Archive & ar )
  {
    // whatever
  }

  // or alternatively, use this macro already in traits:
  template <class Archive>
  CEREAL_ARCHIVE_RESTRICT(cereal::BinaryInputArchive, cereal::BinaryOutputArchive)
  serialize( Archive & ar )
  {
    // whatever
  }

  template <class Archive>
  typename std::enable_if<!std::is_same<Archive, cereal::BinaryOutputArchive>::value,
                          std::string>::type // replace string with whatever minimal type you need
  save_minimal( Archive const & ) const
  {

  }

  template <class Archive>
  typename std::enable_if<!std::is_same<Archive, cereal::BinaryInputArchive>::value,
                          void>::type 
  load_minimal( Archive const &, std::string const & data )
  {

  }
};

You have to use enable_if in this situation because you are using serialize, which as you identified needs to be active for both the input and output types, meaning it must accept its parameter as a template and cannot specialize it. If you were using load or save instead, you could simply remove the template and hard-code the archive type.

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

99% done with this, all that is left is making sure VS plays nicely and updating comments, then 1.0 is basically done.

from cereal.

Devacor avatar Devacor commented on July 16, 2024

That's exciting! <3 This is really one of the best libraries I've come across.

from cereal.

AzothAmmo avatar AzothAmmo commented on July 16, 2024

Finally closing this - looking good from the testing I've done, but juggling four compilers is getting annoying. In the future I think it is highly likely we will drop support for GCC 4.7.3.

from cereal.

DrAWolf avatar DrAWolf commented on July 16, 2024

Thx for the hints.
Looking forward to release, too...

from cereal.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.