Comments (29)
I also tried to call
archive.saveValue( ToString() )
instead of
archive( ToString() );
directly, but this results in a crash
the former should either be private or it should have a well defined behaviour...
from cereal.
Making enum a single level should be no problem. As far as your first issue goes, I'm not sure I have a good way of solving it that doesn't complicate the API unnecessarily. You can accomplish what you want by moving the actual toString call up a level, e.g.:
archive( cereal::make_nvp("some cool name", myObject.toString()) ); // saving
std::string stringrep; // load the string
archive( cereal::make_nvp("some cool name", stringrep) ); // NVP not strictly required here, but we'll soon support out of order loading
myObject = MyObject( stringrep ); // construct from string, or whatever
The way our serialization code works right now, by the time we've entered into the save function we've already opened up a node in the JSON and started an object or array. I don't like the idea of bringing in more Boost functionality here just to get rid of one word in the JSON.
There are a few functions in the JSON (and likely in the XML) archives that are not intended for public use (they have to be coupled with other functions for setting up/tearing down state) and those will get changed to be private.
from cereal.
Actually I just thought of how you can solve your issue quite easily, through the same mechanism I'll be using with the enum types. If you look into the JSON archive header, you'll notice that we have various prologue/epilogue functions that get called immediately before and immediately after the serialize function is called for a specific type.
You can use these to control when the JSON archive creates/ends nodes and get the functionality you desire. It's a little more effort than us providing a macro, but the prologue/epilogue functions provide a lot more power than fixed functionality.
from cereal.
Thanks for your answers.
+1 in advance for putting in the std::is_enum or !std::is_enum in the appropriate enable_ifs for JSON and XML.
(As I read from your second thoughts, you agree that the "one level up" approach is quite unsatisfactory, because it breaks encapsulation/unity etc. )
I was indeed playing around with those enable_if lines (tried the enum first).
The only trait I found useful: is_convertible. Got that working easily. But I really do not want to have implicit conversions to string.
I'd have to modify your code anyway.
Your last suggestion also requires that.
I still hold the opinion that you should offer a trait for this (e.g. 'is_cereal_primitive') that the user can set for his class if he needs/wants to and that your archive impls use in all the places you now use 'is_enum'.
I also think you will someday need/want to use the same technique for 'is_bitwise_serializable'
As I am into traits for no longer than a total of 3-4 hours I cannot figure that out atm.
One could use is_convertible<T,cereal_primitive> where cereal_primitive would be a dummy class.
The user would then add a dummy operator cereal_primitive const { return cereal_primitive(); } (using a Macro)
to make that conversion available...
But this idea seems a little non-standard and misleading...
from cereal.
It's actually simpler than expected:
I added the following lines to cereal.hpp
template< typename T >
struct is_primitive
{
static const bool value = false; // default is non-primitve
};
#define CEREAL_PRIMITVE_TYPE( Type ) \
namespace cereal \
{ \
template<> \
struct is_primitive< Type > \
{ \
static const bool value = true; \
}; \
}
and used cereal::is_primitive in the appropriate enable_if lines of the archive
e.g.
template <class T>
typename std::enable_if<!std::is_arithmetic<T>::value && !std::is_enum<T>::value && !cereal::is_primitive<T>::value , void>::type
prologue( JSONOutputArchive & ar, T const & )
{
ar.startNode();
}
if I now put
CEREAL_PRIMITVE_TYPE( testclass )
after my class definition, it serializes as expected.
from cereal.
That's a fairly clean solution but I'm still not sure about introducing it into cereal, since it can be accomplished by just writing your own prologue/epilogue functions (which can be anywhere - these don't need to reside in json.hpp for example), which is what I would prefer to see someone do.
So basically your solution minus the traits, which is more verbose in general but keeps the interface cleaner. It would look like, for your example:
(in one of your headers, so long as it knew about JSONOutputArchive)
void prologue( JSONOutputArchive & ar, testclass const & t )
{
// whatever is needed/not needed
}
void epilogue( JSONOutputArchive & ar, testclass const & t )
{
// whatever is needed/not needed
}
You could then use this overload, which would be preferred over the generic template version, to control whether your class needed to do setup/teardown operations for the archive.
I'm not sure if there is going to be a trivial way to make the various internal functionality of the JSON/XML archives private, since unfortunately a friend declaration like template <class U> friend void prologue( JSONOutputArchive &, U const & );
will not catch things such as template <class T> void prologue( JSONOutputArchive &, SomeType<T, MaybeOtherT> const & );
and I don't want peole writing their own epi/prologue or save/load functions to have to manually add themselves as friends.
from cereal.
Valid solution, too, but...
I think we are now entering the arena of clean code discussion, always leading to a fight :o)
I'm not a fighter, so I will just state my points, maybe they are 'wrong', maybe you will come back to them in a later version/refactoring.
Do not feel obliged to answer/defend.(except for point 5)
I think you have done a great job so far.
1
IMHO the prologue/epilogue/save/load interface - clean and simple inside cereal.hpp - is already too messy inside e.g. json.hpp. With the cascades of enable_ifs and other templates, it's quite hard to figure out what is really happening. Coders greater than me are advising not to use enable_if.
2
Additional confusion (in me) arises from the fact, that prologue/epilogue/save/load can be called with or without a NVP and even with a SizeTag. In this respect, the interface could be cleaned up a little. I would probably add 'makeName' and 'makeSize' as interface functions and call pro/epi/save/load only with the objects that are serialized.
3
Your definition of the archive-interface is too vague for my taste.
Much is done through calling ar( something ), but then at multiple points along the way it is spread into save/load (for types) an the into pro/epi/save/load (for archives) This is going too deep, I am not your code-analyst, but I just feel it could be cleaner there.
4
All that is needed in json.cpp and other archives (and somehow done by your SFINAE code) is to discriminate
primitive types (int,float,enum, AND user defined) => do not open a new node/object
array (or array like / SizeTag) => open an array
other => open a new node
(binary => ... )
5
Coming back to your proposed solution: Writing custom pro/epi functions will only add to the mess adressed in 1.
If a type/user class is considered/defined 'primitive', this applies to all archives, even the ones not written yet.
I want my primitive classes (as well as enums) not to open a new node in any of those archives.
from cereal.
The solution I like is to introduce another set of serialization functions, e.g.
struct party_goer
{
bool is_fun;
bool is_male;
std::string save_string()
{
return std::string(is_fun ? "amusing" : "boring") + " " + std::string(is_male ? "guy" : "girl");
}
void load_string(std::string const & s)
{
std::vector<std::string> vals;
boost::split(vals,s,boost::is_any_of(" "));
if(vals[0] == "amusing") is_fun = true;
else if(vals[0] == "boring") is_fun = false;
else throw "Bad fun value in string";
if(vals[1] == "guy") is_male = true;
else if(vals[1] == "girl") is_male = false;
else throw "Bad gender value in string";
}
};
We would obviously add non-member versions, and a load_and_allocate version as well.
from cereal.
Initially this was only going to make it in as a change to enums but the current plan is to do something more along your request for something like is_primitive
. We can't allow users to blindly write their own serialization functions if they mark something as primitive however, since attempting to serialize more than one item would cause issues. So this will likely be paired with some sort of special set of serialization functions, either in the form of string representations or something else to ensure that only a single value is serialized.
from cereal.
Nice to hear.
The randvoorhies way covers 80% of my use cases - classes that simply serialize to/from a string.
1
I guess, it is understood (but you did not mention) that for binary archives, ordinary serialization takes place (to make it smaller and faster), while for text archives save_string
and load_string
are used if available.
2
You did not mention enums. I guess they would be covered by non-member versions I could define for my enums like this
template<class Archive>
void save_string( Archive& archive, const myEnum& m )
{
std::string a = myEnumToString( m );
archive( a );
}
from cereal.
My second intention concerning is_primitive
was to remove clutter from the JSON(XML) files.
So if a class just saves as - say just one int (or double or string) the result should just be
"testobject": 42,
if a class is flagged as 'primitive'. Maybe this could even happen automatically!?
Also covering this case, would indeed have all my use cases covered. Thx in advance...
from cereal.
I do want it to happen automatically if possible, my only concern is if someone tries to serialize something with either an NVP or more than one thing within their serialization function for a class declared as "primitive", it will lead to undefined behavior that we won't catch at compile time. This leads to the potential need for specialized "minimal" serialization functions for this case. Use case from a user would be something like adding either serialize_minimal, load_minimal, (etc), which we would automatically detect and use for a text archive. Still haven't fully decided on the interface for all of this.
from cereal.
Quick update on this - the current design, which you can follow here, is to add a new pair of serialization functions to cereal: load_primitive and save_primitive (both member and non-member versions).
The interface will be something like
struct A
{
template <class Archive>
std::string save_minimal( Archive & ar )
{ /* somehow bundle self up into a string, or any primitive type */ }
template <class Archive>
void load_minimal( Archive & ar, std::string const & data )
{ /* loaded data contained in data, type must match return type of save_minimal */ }
};
You won't be able to mix these with other serialization types unless you tell cereal to explicitly use one type. You can pick the return type of save so that it outputs any primitive type or a string. It may be that we end up not passing the archive so you can't try and save or load extra stuff here, though we'll probably keep the template type to allow specialization.
enums will default to a minimal numerical representation after all of this.
from cereal.
Sounds reasonable, but a little confusing to me.
Because we are talking about primitive types, supported by every archive, I don't think, template specialization will ever be needed or make sense.
Possible issues (apart from templatization).
If enums default to int, will it be possible to override that by defining non-member std::string save_minimal( const myEnum& m )
?
What if the user defined 'std::string save_minimal()' AND 'int save_minimal()'?
What if the user defined different types in save and load?
What if the user defined 'std::string save_minimal()' AND standard 'save'?
Seems like an awful lot of compile time testing to me...
Maybe you will be better off using just save_string
and load_string
applicable to text archives only.
from cereal.
The problem of 'automatical primitive' saving/loading a class with only one entry could probably better be solved at runtime by just NOT inserting "value0:..." if only one entry is saved. Only if a second value is saved within the same context, the first entry is corrected to "value0:..." and every subesequent one is then named "valueN:...". Just a thought of a non-meta programmer...
from cereal.
You are right about the enums, that slipped my mind. The simplest solution is just the one you mentioned earlier in this discussion by marking a class as primitive and not opening a node. However, without storing any state in the archive itself, there is nothing stopping the user from trampling over this promise to only serialize one thing. If we add state for this we add runtime overhead, which leads to our preference for a compile time solution. It also adds a requirement that anyone implementing a text archive needs to implement this excess state machinery, and altering parse trees could potentially be really annoying.
I think the best way to go about this will be to add an internal use only specifier that acts just like the CEREAL_PRIMITIVE_TYPE
you described, and then let users define the minimal versions of serialization functions.
It will be impossible to have two of them with differing return types as signatures for functions depend only on name + parameters.
Having a save with a different return value than the load accepts as a parameter will be a compile time error.
If the user defines a standard save and a save_minimal, that will be an error just as if they defined a serialize and a save/load pair. You have to pick one type of serialization, which can be either in or out of the class: serialization function, load/save pair, minimal load/save pair.
The only way to get around this single serialization function type is to use the specialization functionality in access.hpp (see bottom of http://uscilab.github.io/cereal/serialization_functions.html for an example).
from cereal.
Your last 2 paragraphs sound very simple and clear. You have too keep it that way.
My only concern is my following use-case (think of files in the >> 1GB range):
I want to serialize binary for the most time, but sometimes, for debugging/testing purposes I want to serialize to JSON or a self-made NVP archive.
The binary case should use the standard load/save pair for performace reasons, NOT strings.
The text archive case should automatically use the save_string/save_text (or maybe std::string save_minimal) functions.
Implementing this might well be in 'my responsibility'. But mentioning save_string
made me think, you are going to support exactly that case.
from cereal.
Opened a stack overflow post related to this issue: http://stackoverflow.com/questions/22464225/c-detecting-free-function-existence-with-explicit-parameters
For non-member save_minimal
/load_minimal
pairs, we want to enforce that the return type of the save matches the constant reference parameter of the load (this is the same as in the member case), e.g.:
template <class Archive>
double save_minimal( MyType const & mt );
template <class Archive>
void load_minimal( MyType & mt, double const & value );
The tricky part here is writing a trait that, knowing Archive
, MyType
, and the fact that save_minimal
returns a double
, can verify that an exactly matching (no implicit conversion allowed) load_minimal
exists.
from cereal.
The current implementation of save/load minimal have signatures like (note the lack of actually taking the archive as a parameter):
template <class Archive>
double save_minimal( MyType const & mt );
template <class Archive>
void load_minimal( MyType & mt, double const & value );
The archive template is so that users can specialize on different archive types (e.g. only do minimal for JSON or something). We don't currently pass archive as an actual parameter since there is nothing the user could potentially be doing with it in this situation.
This design is presenting problems with name resolution for various reasons. Essentially if we have a load_minimal (or save_minimal) non-member function whose type does not exist in the same namespace as the function itself, we can't resolve it. Normally types are defined in the same namespace as their serialization functions, but if any serialization function is written in the cereal namespace (such as the built in functions we provide), we'll run into name resolution errors.
The easiest way I can think of solving this is to change the signatures to look like:
template <class Archive>
double save_minimal( Archive const &, MyType const & mt );
template <class Archive>
void load_minimal( Archive const &, MyType & mt, double const & value );
Note that there is still nothing a user can actually do with the archive in these situations since it will be passed by constant reference, which is different from every other serialization function, which accepts its archive by reference.
This doesn't look as elegant but it is probably necessary. Any objections? This is probably the last roadblock in getting this feature finished.
Ultimately things declared with minimal functions will just end up wrapping their types in something similar to NameValuePair, except it will be something like MinimalWrapper, which archives can detect and handle appropriately (e.g. not opening new nodes).
from cereal.
Despite the good work you have put in, I feel we are somehow over-complicating a simple matter.
The fact that there (would) exist two different ways to serialize is a big warning sign to me.
Duplication is the root of all evil...
from cereal.
As a much simpler alternative, I suggest the following:
- There only exists ONE serialization template (In fact, three: serialize, save, load)
- If the user wants to serialize minimalistic, he uses the same template function as he always did.
But inside, he does not call the standardoperator()
of the archive, he rather calls a functionarchive.serialize_minimal( ... )
, with the following properties:
- Only one argument is allowed that can either be a primitive type (int, double, string) or a NVP of such a type.
- The function need not be archive specific. Only one function for OutputArchive and one for InputArchive must be implemented to do the minimal save/load.
I cannot assess all the implications of this right now, but it seems much simpler and natural to have the choice at this level.
The advanced user (see my use case above) can also choose at this stage, whether he wants to save as a string (for text-archives) or as raw/optimized/compressed/whatever (for binary archives).
The compile time checking now takes place one level "deeper", and should simplify a lot because checking for minimal vs. non-minimal templates drops out.
from cereal.
I'd love to go with a simpler solution. Here are the potential issues I see with an archive.serialize_minimal()
style:
- Once we are inside of a type's serialize/save/load function, it is too late to not open a node in the archive unless we make the archives have some kind of ability to unwind the nodes they make. This is probably possible, but a little annoying.
- The other issue is what stops someone from issuing any other operation on the archive in the same context in which they perform the
serialize_minimal
? We'd have to have some kind of state we'd be keeping in the archive that would give a runtime error in this situation, since there's no way for us to check at compile time what someone is doing in some function.
This is what led me towards the implementation I've been working on - it's more complicated to implement behind the scenes, but easier for an archive to deal with and probably much easier for a user to deal with. The only concern is the interface for the function - it will be different regardless, but the API I'm leaning towards is the one I mentioned earlier (with the const & archive).
Here's what a use case would look like:
struct MyType
{
// some class variables
template <class Archive>
std::string save_minimal( Archive const & ) // no need to use the archive yet, maybe in the future
{
return some_complicated_way_of_making_a_string();
}
template <class Archive>
void load_minimal( Archive const &, std::string const & data )
{
some_complicated_way_of_loading_from_a_string( data );
}
// compile time error if load_minimal's second parameter is not a constant
// reference to the return type of save_minimal
// compile time error if a user also defines a serialize, or load/save pair
// for the same archive type for the same data type (e.g. MyType)
// user can also define the above with version parameter or
// outside of a class just like the other serialization functions
};
int main()
{
{
cereal::JSONOutputArchive ar( std::cout );
MyType mt( some, type, of, constructor );
ar( mt ); // executes the save_minimal with the name being autogenerated
ar( CEREAL_NVP(mt) ); // provide some NVP
// note how the interface to the user is identical to the other types of serialization
// there is also no way for a user to insert more information than one single
// primitive (arithmetic or string)
}
}
from cereal.
True, I missed your point 2 in my (incomplete) consideration.
What to do if the user calls serialize_minimal(...)
and operator()
(=standard serialize)?
I now come to the conclusion that I was just trying to move the duplication to another level.
from cereal.
Your use case looks pretty simple.
For my purposes, I would like to add a specialization for Binary Archives roughly like this:
template <>
void serialize( BinaryArchive& archive )
{
archive( my_raw_data );
}
because I want the minimal (to/from string) functions to be applied by default, except for binary archives, where I want to use raw data without the string conversions.
How would I do this correctly?
Possible issues:
- function template specialization is a little tricky
BinaryArchive
does not exist, only Binary(In/Out)putArchive.- Will the compile time checking handle this correctly?
from cereal.
You'd do something like this:
struct MyType
{
template <class Archive>
typename std::enable_if<std::is_same<Archive, cereal::BinaryInputArchive>::value ||
std::is_same<Archive, cereal::BinaryOutputArchive>::value, void>::type
serialize( Archive & ar )
{
// whatever
}
// or alternatively, use this macro already in traits:
template <class Archive>
CEREAL_ARCHIVE_RESTRICT(cereal::BinaryInputArchive, cereal::BinaryOutputArchive)
serialize( Archive & ar )
{
// whatever
}
template <class Archive>
typename std::enable_if<!std::is_same<Archive, cereal::BinaryOutputArchive>::value,
std::string>::type // replace string with whatever minimal type you need
save_minimal( Archive const & ) const
{
}
template <class Archive>
typename std::enable_if<!std::is_same<Archive, cereal::BinaryInputArchive>::value,
void>::type
load_minimal( Archive const &, std::string const & data )
{
}
};
You have to use enable_if
in this situation because you are using serialize
, which as you identified needs to be active for both the input and output types, meaning it must accept its parameter as a template and cannot specialize it. If you were using load or save instead, you could simply remove the template and hard-code the archive type.
from cereal.
99% done with this, all that is left is making sure VS plays nicely and updating comments, then 1.0 is basically done.
from cereal.
That's exciting! <3 This is really one of the best libraries I've come across.
from cereal.
Finally closing this - looking good from the testing I've done, but juggling four compilers is getting annoying. In the future I think it is highly likely we will drop support for GCC 4.7.3.
from cereal.
Thx for the hints.
Looking forward to release, too...
from cereal.
Related Issues (20)
- c++20 std::span support
- Binary Serialization of String is still a readable string HOT 3
- Embedded RapidJSON can conflict with another RapidJSON used by the client (ODR violation!)
- Compilation error with struct as smat_pointer
- How to deserialize std::vector with XMLInputArchive
- Macos Build error: cannot bind lvalue of type unsigned long long to value of unrelated type unsigned long HOT 1
- I encountered a segmentation fault when processing XML file data with version 1.3.2 of Cereal. Could this be a security vulnerability
- pkg-config file
- std::aligned_storage is deprecated in C++23 HOT 1
- GCC 13.2 build error: possible dangling reference to a temporary HOT 2
- About the use of CEREAL_CLASS_VERSION HOT 1
- types/tuple.hpp does not compile on MSVC2019
- copy constructor is implicitly deleted because 'OutputArchiveBase' has a user-declared move constructor HOT 1
- Issue with closing braces in JSON using std::stringstream HOT 1
- Any plans for a new version?
- Is there a way to register polymorphic types during program execution? HOT 3
- How do I remove excess layers HOT 1
- RapidJson causes C5054 on MSVC
- JSONOutputArchive destructor is not really noexcept(true) despite it's declared noexcept(true)
- cereal unable to export xml comments
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from cereal.