`xml-avro-converter` provides a framework for translating XML schemas and data into an equivalent, yet more efficient, Avro format. This enables transmission and storage of the same data while using less bandwidth and disk space. The Avro-formatted data can also be translated back into the equivalent XML data if desired. Data being converted from XML to Avro, or vice versa, is mediated through Java objects created from a single set of Java classes generated from the XML schema, so no additional translation software or XSLT stylesheets are needed to map data values between the two formats.
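As a rough illustration of where the space savings come from: the Avro specification encodes `int` and `long` values in binary as zig-zag, variable-length integers, whereas XML carries both the digits and the surrounding tags as text. The sketch below reimplements that encoding with the standard library only. It is not part of `xml-avro-converter`, and the `altitude` element is a made-up example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class ZigZagSizeSketch {

    /**
     * Encodes a long as described in the Avro specification for int/long:
     * zig-zag mapping first, then a base-128 variable-length encoding.
     */
    public static byte[] encodeLong(long value) {
        // Zig-zag maps small negative and positive values to small unsigned values.
        long zigzag = (value << 1) ^ (value >> 63);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Emit 7 bits at a time; the high bit marks "more bytes follow".
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80));
            zigzag >>>= 7;
        }
        out.write((int) zigzag);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Hypothetical XML element used only for the size comparison.
        String xml = "<altitude>64</altitude>";
        byte[] avro = encodeLong(64);
        System.out.println("XML bytes:  " + xml.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("Avro bytes: " + avro.length);
    }
}
```

Actual savings depend on the data: long strings cost about the same in both formats, while numeric and heavily tagged data tend to shrink the most.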
`xml-avro-converter` uses Avro's `ReflectData` class to generate a schema from a class on the classpath. `ReflectData` does not natively support adding inherited types to an Avro schema. `xml-avro-converter` resolves this by providing an interface that automatically modifies the schema to accommodate inherited types. All that is required is a one-line declaration for each inherited type; `xml-avro-converter` then replaces every instance of the base type in the schema with a union of that type and all of its declared subtypes. This lets developers quickly create an Avro schema from an existing Java class hierarchy, even when that hierarchy uses polymorphic types.
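The substitution idea can be sketched without any Avro dependency. The toy model below represents a record as a map from field name to type name and rewrites each field to a list of union branches. All class and method names here (`UnionSubstitutionSketch`, `declareSubtype`, `rewriteFields`) are hypothetical and do not reflect the project's actual API, which operates on real Avro `Schema` objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UnionSubstitutionSketch {

    // Hypothetical registry: base type name -> subtype names declared by the user.
    private final Map<String, List<String>> declaredSubtypes = new LinkedHashMap<>();

    /** One call per inherited type, mirroring the "one-line declaration" idea. */
    public void declareSubtype(String baseType, String subtype) {
        declaredSubtypes.computeIfAbsent(baseType, k -> new ArrayList<>()).add(subtype);
    }

    /** Union branches for a type: the type itself followed by its declared subtypes. */
    public List<String> unionFor(String typeName) {
        List<String> branches = new ArrayList<>();
        branches.add(typeName);
        branches.addAll(declaredSubtypes.getOrDefault(typeName, Collections.<String>emptyList()));
        return branches;
    }

    /** Rewrites every field of a record (field name -> type name) to its union branches. */
    public Map<String, List<String>> rewriteFields(Map<String, String> fields) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // Types with no declared subtypes end up with a single branch, i.e. unchanged.
            result.put(field.getKey(), unionFor(field.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        UnionSubstitutionSketch sketch = new UnionSubstitutionSketch();
        sketch.declareSubtype("Shape", "Circle");
        sketch.declareSubtype("Shape", "Square");

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("outline", "Shape");
        fields.put("label", "string");

        System.out.println(sketch.rewriteFields(fields));
    }
}
```

In this sketch only the `outline` field gains extra branches, because `Shape` is the only type with declared subtypes.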
Support for inherited types allows `xml-avro-converter` to generate a full Avro schema from a Java class hierarchy created using JAXB. JAXB can generate a Java class hierarchy from a set of XML schema definitions, and that hierarchy can then be used to create an Avro schema, which makes it possible to derive an Avro schema from an XML schema. To convert data, JAXB deserializes XML documents into Java objects, and these objects are then serialized using the Avro library and the generated Avro schema. The reverse process works as well: Avro data is converted into Java objects and then into XML.
Since this process uses the same Java class hierarchy in both directions, schema and data conversion can often be achieved without writing any translation logic at all. However, there are certain situations where users must guide the conversion process. Where JAXB takes advantage of Java's inheritance model, users must find and declare the inherited types so that the generated schema accommodates them. In some cases, JAXB generates generic `JAXBElement<>` types to wrap certain member variables when Java's object model cannot capture the full expressiveness of XML's element model; in these cases, users must manually define the portions of the schema and the translation logic for those specific elements. Several examples follow which demonstrate this functionality.
For more information on how to use this package, consult the documentation.
`xml-avro-converter`'s functional source code resides entirely in `xml-avro-converter-core`. The three main classes that users interact with are `AvroSchemaGenerator`, `AvroSerializer`, and `XmlSerializer`.
- AvroSchemaGenerator.java implements the logic for adding inherited types to the Avro schema. It exposes the API for declaring inherited types and regenerating the schema with those types. Users are expected to use this class to create an Avro schema, which is then used to create an `AvroSerializer`.
- AvroSerializer.java is instantiated with this schema and, optionally, a `ReflectData` instance. When writing data to Avro, it uses the `ReflectData` instance to create a `DatumWriter`, and then uses the `DatumWriter` to write the data to the provided `OutputStream`. It does the same when reading Avro data from an `InputStream`, using the `ReflectData` instance to create a `DatumReader`.
- XmlSerializer.java is instantiated with a `JAXBContext` and an XML schema, both of which the user must provide. It has utility methods to create JAXB `Marshaller`s and `Unmarshaller`s. `XmlSerializer` also provides methods for specifying a `NamespacePrefixMapper`, which allows the user to specify more convenient XML namespace prefixes, such as `<xsd:schema/>` instead of `<ns2:schema/>`.
- AvroSchemaIterator.java is an internal utility class which is used by `AvroSchemaGenerator` to iterate over all the `Schema`s in an Avro `Schema`.
- SchemaGenerationException.java is a specific class for raising `RuntimeException`s. It is thrown when an unexpected input occurs, such as when users try to set fields for a non-named schema or declare an inherited type which doesn't exist.
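To make the role of a schema iterator more concrete, the following sketch walks a nested, possibly self-referencing structure and yields each node once. It is only an illustration under stated assumptions: `Node` is a hypothetical stand-in rather than Avro's `Schema` class, the type names are invented, and the actual `AvroSchemaIterator` may traverse schemas differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

public class SchemaWalkSketch {

    /** Hypothetical stand-in for a schema node; not Avro's Schema class. */
    public static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        public Node(String name) {
            this.name = name;
        }

        public Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Returns an iterator that yields every reachable node exactly once. */
    public static Iterator<Node> walk(Node root) {
        final Deque<Node> pending = new ArrayDeque<>();
        // Identity-based set so that recursive references do not cause endless loops.
        final Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        pending.push(root);
        seen.add(root);
        return new Iterator<Node>() {
            @Override
            public boolean hasNext() {
                return !pending.isEmpty();
            }

            @Override
            public Node next() {
                if (pending.isEmpty()) {
                    throw new NoSuchElementException();
                }
                Node current = pending.pop();
                // Push children in reverse so they are visited in declaration order.
                for (int i = current.children.size() - 1; i >= 0; i--) {
                    Node child = current.children.get(i);
                    if (seen.add(child)) {
                        pending.push(child);
                    }
                }
                return current;
            }
        };
    }

    /** Convenience helper that collects the visited node names in order. */
    public static List<String> names(Node root) {
        List<String> result = new ArrayList<>();
        for (Iterator<Node> it = walk(root); it.hasNext();) {
            result.add(it.next().name);
        }
        return result;
    }

    public static void main(String[] args) {
        Node route = new Node("Route");
        Node waypoint = new Node("Waypoint").add(route); // refers back to Route
        route.add(waypoint);
        Node flight = new Node("Flight").add(new Node("Aircraft")).add(route);
        System.out.println(names(flight));
    }
}
```

Tracking visited nodes matters because record types can refer to themselves, directly or through other types, and a naive recursive walk over such a structure would never terminate.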
Copyright 2016-2017 MIT Lincoln Laboratory, Massachusetts Institute of Technology
Licensed under the Apache License, Version 2.0 (the "License"); you may not use these files except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This material is based upon work supported by the Federal Aviation Administration under Air Force Contract No. FA8721-05-C-0002 and/or FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Federal Aviation Administration.
Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.