xml-avro-converter

xml-avro-converter provides a framework for translating XML schemas and data into an equivalent but more compact Avro format. This allows the same data to be transmitted and stored using less bandwidth and disk space. The Avro-formatted data can also be translated back into the equivalent XML if desired. Conversion in either direction is mediated by Java objects created from a single set of Java classes generated from the XML schema, so no additional translation software or XSLT style sheets are needed to map data values between the two formats.

xml-avro-converter uses Avro's ReflectData class to generate a schema from a class on the classpath. ReflectData does not natively support adding inherited types to an Avro schema. xml-avro-converter resolves this by providing an interface to automatically modify the schema to accommodate inherited types. All that is required is a one-line declaration for each inherited type, and xml-avro-converter will replace all instances of the base type in the schema with a union for that type and all the subtypes which have been declared. This enables developers to quickly create an Avro schema from an existing Java class hierarchy, even when the Java class hierarchy uses polymorphic types.
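The substitution described above can be pictured with a small toy model. This is only a sketch of the idea, not the library's implementation: the class UnionSubstitutionSketch, its methods, and the Shape/Circle/Square type names are all hypothetical, and a schema is reduced to a map from field name to type name. A one-element list stands for a plain type, and a longer list stands for a union.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration only: none of these names come from xml-avro-converter or Avro. */
final class UnionSubstitutionSketch {

    /** Declared subtypes, keyed by base type name, in declaration order. */
    private final Map<String, List<String>> subtypes = new LinkedHashMap<>();

    /** Stands in for the one-line declaration made for each inherited type. */
    void declareSubtype(String baseType, String subType) {
        subtypes.computeIfAbsent(baseType, k -> new ArrayList<>()).add(subType);
    }

    /**
     * Rewrites field types: a field whose type has declared subtypes becomes
     * a union of the base type followed by those subtypes. Other fields keep
     * their single type (a one-element list).
     */
    Map<String, List<String>> rewriteFields(Map<String, String> fields) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            List<String> types = new ArrayList<>();
            types.add(field.getValue());
            types.addAll(subtypes.getOrDefault(field.getValue(), Collections.emptyList()));
            result.put(field.getKey(), types);
        }
        return result;
    }

    public static void main(String[] args) {
        UnionSubstitutionSketch sketch = new UnionSubstitutionSketch();
        sketch.declareSubtype("Shape", "Circle");
        sketch.declareSubtype("Shape", "Square");

        Map<String, String> drawingFields = new LinkedHashMap<>();
        drawingFields.put("title", "string");
        drawingFields.put("outline", "Shape");

        // prints {title=[string], outline=[Shape, Circle, Square]}
        System.out.println(sketch.rewriteFields(drawingFields));
    }
}
```

In a real Avro schema the same effect corresponds to a field whose type is a JSON array listing the base record and each declared subtype record.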

This enables xml-avro-converter to generate a full Avro schema from a Java class hierarchy created using JAXB. JAXB can generate a Java class hierarchy from a set of XML schema definitions, and that hierarchy can in turn be used to create an Avro schema, which makes it possible to derive an Avro schema from an XML schema. To convert data, JAXB deserializes XML documents into Java objects, and these objects are then serialized using the Avro library and the generated Avro schema. The reverse process works as well: Avro data is converted into Java objects and then into XML.
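The shape of this round trip, with one Java class sitting between the two formats, can be sketched using only the JDK. Note the swapped-in techniques: the JDK's DOM parser stands in for JAXB unmarshalling, hand-written string building stands in for JAXB marshalling, and DataOutputStream stands in for Avro's binary encoding. The Flight class and every method name here are hypothetical and are not part of xml-avro-converter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

/** Hypothetical sketch with JDK stand-ins; not the xml-avro-converter API. */
final class HubClassSketch {

    /** The single Java class that both formats are mapped through. */
    static final class Flight {
        final String callsign;
        final int altitudeFeet;

        Flight(String callsign, int altitudeFeet) {
            this.callsign = callsign;
            this.altitudeFeet = altitudeFeet;
        }
    }

    /** XML to object (stand-in for JAXB unmarshalling). */
    static Flight fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            String callsign = root.getElementsByTagName("callsign").item(0).getTextContent();
            String altitude = root.getElementsByTagName("altitudeFeet").item(0).getTextContent();
            return new Flight(callsign, Integer.parseInt(altitude));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse flight XML", e);
        }
    }

    /** Object to XML (stand-in for JAXB marshalling; no escaping, fine for the sample). */
    static String toXml(Flight f) {
        return "<flight><callsign>" + f.callsign + "</callsign><altitudeFeet>"
                + f.altitudeFeet + "</altitudeFeet></flight>";
    }

    /** Object to compact binary (stand-in for Avro serialization). */
    static byte[] toBinary(Flight f) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(f.callsign);
            out.writeInt(f.altitudeFeet);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Compact binary to object (stand-in for Avro deserialization). */
    static Flight fromBinary(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return new Flight(in.readUTF(), in.readInt());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<flight><callsign>ABC123</callsign>"
                + "<altitudeFeet>35000</altitudeFeet></flight>";
        byte[] binary = toBinary(fromXml(xml));          // XML -> object -> binary
        String roundTripped = toXml(fromBinary(binary)); // binary -> object -> XML
        System.out.println("binary bytes: " + binary.length);
        System.out.println("round trip matches: " + roundTripped.equals(xml));
    }
}
```

The point of the sketch is the design choice rather than the encodings: because both directions pass through the same class, neither format needs to know about the other, and no field-by-field mapping between XML and the binary form is written anywhere.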

Because both directions use the same Java class hierarchy, schema and data conversion can often be achieved without writing any translation logic at all. However, there are situations where users must guide the conversion process. Where JAXB relies on Java's inheritance model, users must find and declare the inherited types so that the schema can accommodate them. In some cases, JAXB generates generic JAXBElement<> types to wrap certain member variables when Java's object model cannot capture the full expressiveness of XML's element model; in these cases, users must manually define the portions of the schema and the translation logic for those specific elements. Several examples follow which demonstrate this functionality.

For more information on how to use this package, consult the documentation.

Navigating the Source Code

xml-avro-converter's functional source code resides entirely in xml-avro-converter-core. The three main classes that users interact with are AvroSchemaGenerator, AvroSerializer, and XmlSerializer.

  • AvroSchemaGenerator.java implements the logic for adding inherited types to the Avro schema. It exposes the API for declaring inherited types and re-generating the schema with those types. Users are expected to use this class to create an Avro schema which will then be used to create an AvroSerializer.

  • AvroSerializer.java is instantiated with this schema and, optionally, a ReflectData instance. When writing data to Avro, it uses the ReflectData instance to create a DatumWriter, and then uses the DatumWriter to write the data to the provided OutputStream. It does the same when reading Avro data from an InputStream, using the ReflectData instance to create a DatumReader.

  • XmlSerializer.java is instantiated with a JAXBContext and XML schema which the user must provide. It has utility methods to create JAXB Marshallers and Unmarshallers. XmlSerializer also provides methods for specifying a NamespacePrefixMapper, which allows the user to specify more convenient XML namespace prefixes, such as <xsd:schema/> instead of <ns2:schema/>.

  • AvroSchemaIterator.java is an internal utility class which is used by AvroSchemaGenerator to iterate over all the Schemas in an Avro Schema.

  • SchemaGenerationException.java is a dedicated class for raising RuntimeExceptions. It is thrown on unexpected input, for example when users try to set fields on a non-named schema or declare an inherited type that doesn't exist.

License

Copyright 2016-2017 MIT Lincoln Laboratory, Massachusetts Institute of Technology

Licensed under the Apache License, Version 2.0 (the "License"); you may not use these files except in compliance with the License.

You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

This material is based upon work supported by the Federal Aviation Administration under Air Force Contract No. FA8721-05-C-0002 and/or FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Federal Aviation Administration.

Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.

Contributors

  • ashish-banerjee-mitll
