
avrodite's Introduction


A fast, lightweight, POJO-driven Apache Avro serialization/deserialization library.

Motivation

While designing this library, we had the following concerns in mind:

  1. POJO-driven: This is particularly useful for an existing application that already exposes a set of plain Java objects as its API. We tried the Avro core implementation, but it adds an additional layer of indirection. Moreover, the mapping between the Record objects/dictionaries and the existing API can quickly turn into tedious boilerplate to maintain.

  2. Generics and complex types: Existing Java API models often leverage the typing features that the core language offers. We need a library that supports any type, including types with generics.

  3. Fast: Latency is a critical requirement for modern applications and real-time event processors. High performance is therefore a key criterion for adopting a library.

  4. Low memory footprint: Relieving pressure on the heap by minimizing the objects created during serialization/deserialization helps applications meet their SLAs. We need the lowest possible memory footprint and an API for reusing existing objects when applicable.
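To make the first concern more concrete, here is a minimal sketch of the kind of hand-written mapping that a record-based API tends to require. It uses only the JDK: a plain Map stands in for a generic, name-keyed record, and the User class and both mapping methods are hypothetical names chosen for illustration. It is not Avro's Record API and not Avrodite code.

```java
import java.util.HashMap;
import java.util.Map;

public class RecordMappingSketch {

    // A plain Java object that is already part of an application's API (hypothetical).
    static final class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Hand-written mapping from the POJO to a generic, name-keyed record.
    // A Map stands in for a record type here; it is NOT Avro's record class.
    static Map<String, Object> toRecord(User user) {
        Map<String, Object> record = new HashMap<>();
        record.put("name", user.name);
        record.put("age", user.age);
        return record;
    }

    // The reverse mapping: every field needs a lookup by name and a cast.
    static User fromRecord(Map<String, Object> record) {
        String name = (String) record.get("name");
        int age = (Integer) record.get("age");
        return new User(name, age);
    }

    public static void main(String[] args) {
        User original = new User("Ada", 36);
        User copy = fromRecord(toRecord(original));
        System.out.println(copy.name + " " + copy.age);
    }
}
```

Each new field means touching both mapping methods, and a misspelled key or a wrong cast only fails at runtime. That is the maintenance cost a POJO-driven codec is meant to avoid.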

Current Status

Currently the library fulfills most of its design objectives and performs better than many widely adopted libraries. It has full support for Java generics and complex types (we've stress-tested it with fields such as List<Map<String, Event<Map<String, String>, Model<A,B>>>>, and it passes).
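Supporting nested generics relies on the fact that Java keeps the full generic signature of a declared field. The sketch below shows how that signature can be walked recursively with the standard java.lang.reflect API. It is only an illustration of the general idea: the Holder class and the describe helpers are hypothetical, and nothing here is taken from Avrodite's own introspection code, which runs at build time and may work differently.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;
import java.util.Map;

public class GenericFieldSketch {

    // Hypothetical model with a nested generic field.
    static class Holder {
        List<Map<String, Integer>> entries;
    }

    // Renders a type as text, recursing into its type arguments.
    static String describe(Type type) {
        if (type instanceof ParameterizedType) {
            ParameterizedType parameterized = (ParameterizedType) type;
            StringBuilder sb = new StringBuilder(describe(parameterized.getRawType()));
            sb.append('<');
            Type[] typeArgs = parameterized.getActualTypeArguments();
            for (int i = 0; i < typeArgs.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(describe(typeArgs[i]));
            }
            return sb.append('>').toString();
        }
        if (type instanceof Class) {
            return ((Class<?>) type).getSimpleName();
        }
        // Type variables, wildcards and generic arrays fall back to their default name.
        return type.getTypeName();
    }

    // Looks up a declared field by name and describes its generic type.
    static String describeField(Class<?> owner, String fieldName) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            return describe(field.getGenericType());
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("Unknown field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        // prints "List<Map<String, Integer>>"
        System.out.println(describeField(Holder.class, "entries"));
    }
}
```

A codec generator can follow the same recursion to decide, level by level, which collection, map or bean codec to emit for a field.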

From a performance perspective, Avrodite performs 3x-4x better than its closest competitor and between 5x and 15x better than the Avro core Record API (see the benchmark reports).

Regarding usability, interacting with the Avrodite API looks like this:

// API configuration would typically happen in your dependency injection layer
Avrodite<AvroStandard, AvroCodec<?>> avrodite = AvroStandardV19.avrodite()
                                                      .build();
// Once you have an Avrodite instance, you can get a codec for a given target type as follows:
AvroCodec<Model> codec = avrodite.getCodec(Model.class);

// Serialization and deserialization are then trivial
byte[] avroData = codec.encode(new Model());
Model decodedModel = codec.decode(avroData);

// When needed, you can get the schema from the codec instance:
Schema schema = codec.getSchema();

Modules description

The library is modular and was designed with the idea of implementing other serialization formats in the future (JSON, for example). Currently the modules are:

  1. avrodite-api: The global API that projects depend on for serialization/deserialization.
  2. avrodite-avro: An extension and implementation of the public API that provides an Avro-specific API (access to Schema instances, for example).
  3. avrodite-tools: Mainly a compile-time/build-phase dependency that contains the logic for introspecting your beans and plain-object API.
  4. avrodite-tools-avro: An avrodite-tools plugin that generates custom classes for the Avro binary format.
  5. avrodite-avro-maven-plugin: A Maven plugin that hides the avrodite-tools stack from you and generates your codec classes during the build phase.
  6. avrodite-avro-benchmarks: A test module that benchmarks the Avro implementation against other libraries.

Benchmark Results Snapshot

Refer to this document for detailed results.

Throughput

(Higher is better)

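Framework             | T1 throughput [ops/ms] | T1 relative perf. | T2 throughput [ops/ms] | T2 relative perf.
----------------------|------------------------|-------------------|------------------------|------------------
avrodite              | 1858                   | 100.00%           | 2208                   | 100.00%
protocolBuffers       | 585                    | 31.47%            | 589                    | 26.67%
avroCoreNoHydration   | 132                    | 7.08%             | 501                    | 22.67%
avroCoreWithHydration | 102                    | 5.48%             | 239                    | 10.82%
jacksonAvro           | 87                     | 4.68%             | 158                    | 7.14%
jacksonJSON           | 88                     | 4.74%             | 89                     | 4.02%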


Heap Usage

(Lower is better)

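Framework             | T1 heap allocation rate [bytes/op] | T1 relative perf. | T2 heap allocation rate [bytes/op] | T2 relative perf.
----------------------|------------------------------------|-------------------|------------------------------------|------------------
avrodite              | 1024                               | 100.00%           | 1024                               | 100.00%
avroCoreNoHydration   | 2248                               | 219.53%           | 2248                               | 219.53%
avroCoreWithHydration | 4840                               | 472.66%           | 4840                               | 472.66%
protocolBuffers       | 6088                               | 594.53%           | 6088                               | 594.53%
jacksonJSON           | 9472                               | 925.00%           | 9472                               | 925.00%
jacksonAvro           | 16264                              | 1588.28%          | 16296                              | 1591.41%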
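The relative column in the heap table appears to be each framework's allocation rate divided by Avrodite's, expressed as a percentage. The short sketch below recomputes the T1 column from the figures in the table under that assumption. The class and method names are hypothetical and are not part of the benchmark module.

```java
import java.util.Locale;

public class RelativeAllocationSketch {

    // Allocation rate of a framework as a percentage of the baseline's rate.
    static double relativePercent(long bytesPerOp, long baselineBytesPerOp) {
        return 100.0 * bytesPerOp / baselineBytesPerOp;
    }

    public static void main(String[] args) {
        long baseline = 1024; // avrodite, T1, bytes per operation (from the table)
        String[] frameworks = {
            "avroCoreNoHydration", "avroCoreWithHydration",
            "protocolBuffers", "jacksonJSON", "jacksonAvro"
        };
        long[] bytesPerOp = {2248, 4840, 6088, 9472, 16264}; // T1 values from the table

        for (int i = 0; i < frameworks.length; i++) {
            double relative = relativePercent(bytesPerOp[i], baseline);
            System.out.printf(Locale.ROOT, "%-22s %8.2f%%%n", frameworks[i], relative);
        }
    }
}
```

For example, 2248 / 1024 = 2.1953, which gives the 219.53% shown for avroCoreNoHydration. A value above 100% therefore means more bytes allocated per operation than Avrodite.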


Roadmap

The following features are planned for the future:

  • Enum support (easy).
  • Avro union types, which can be useful when you have a supertype with various subtypes (easy/medium).
  • Mapping the evolution of a model API to schema migrations (most likely hard).
  • Support for other serialization formats: a good portion of the library (type introspection, codec compilation) can be reused to support other formats such as JSON (average difficulty, depending on the format).

Licence

Copyright (c) 2020 Yassine Echabbi

Licensed under the Apache License, Version 2.0
