avromatic's Introduction

Avromatic

Avromatic generates Ruby models from Avro schemas and provides utilities to encode and decode them.

This README reflects Avromatic 2.0. Please see the 1-0-stable branch for Avromatic 1.0.

Installation

Add this line to your application's Gemfile:

gem 'avromatic'

And then execute:

$ bundle

Or install it yourself as:

$ gem install avromatic

See the Logical Types section below for details on using Avromatic with unreleased Avro features.

Usage

Configuration

Avromatic supports the following configuration:

Model Generation

  • schema_store: A schema store is required to load Avro schemas from the filesystem. It should be an object that responds to find(name, namespace = nil) and returns an Avro::Schema object. An AvroTurf::SchemaStore can be used. The schema_store is unnecessary if models are generated directly from Avro::Schema objects. See Models.
  • nested_models: An optional ModelRegistry that is used to store, by full schema name, the generated models that are embedded within top-level models. By default a new Avromatic::ModelRegistry is created.
  • eager_load_models: An optional array of models, or strings with class names for models, that are added to nested_models at the end of Avromatic.configure and during code reloading in Rails applications. This option is useful for defining models that will be extended when the load order is important.
  • allow_unknown_attributes: Optionally allow model constructors to silently ignore unknown attributes. Defaults to false. WARNING: Setting this to true will result in incorrect union member coercions if an earlier union member is satisfied by a subset of a later union member's attributes.
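
Because the schema_store only needs to respond to find(name, namespace = nil), any object with that method can stand in for a file-based store. The sketch below is a hypothetical in-memory store: the class name InMemorySchemaStore and its parser: argument are illustrative and not part of Avromatic or AvroTurf. The parser defaults to JSON.parse so the example runs without the avro gem; in real use it would need to be something like Avro::Schema.method(:parse) so that find returns an Avro::Schema object.

```ruby
require 'json'

# Hypothetical in-memory schema store; not part of Avromatic or AvroTurf.
class InMemorySchemaStore
  # definitions: Hash of full schema name => schema JSON string.
  # parser: callable that turns JSON into a schema object. It defaults to
  # JSON.parse so this sketch runs standalone; with the avro gem loaded you
  # would pass Avro::Schema.method(:parse) instead.
  def initialize(definitions, parser: JSON.method(:parse))
    @definitions = definitions
    @parser = parser
    @cache = {}
  end

  # Same signature that the schema_store option expects.
  def find(name, namespace = nil)
    fullname = [namespace, name].compact.join('.')
    @cache[fullname] ||= @parser.call(
      @definitions.fetch(fullname) { raise KeyError, "unknown schema: #{fullname}" }
    )
  end
end

store = InMemorySchemaStore.new(
  { 'com.example.my_model' =>
      '{"type":"record","name":"my_model","namespace":"com.example","fields":[]}' }
)
schema = store.find('my_model', 'com.example')
```

Parsed schemas are cached by full name, so repeated lookups return the same object.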

Custom Types

See the section below on configuring Custom Types.

Using a Schema Registry/Messaging API

The configuration options below are required when using a schema registry (see Confluent Schema Registry) and the Messaging API.

  • schema_registry: An AvroSchemaRegistry::Client or AvroTurf::ConfluentSchemaRegistry object used to store Avro schemas so that they can be referenced by id. Either schema_registry or registry_url must be configured. If using build_schema_registry!, only registry_url is required. See example below.
  • registry_url: URL for the schema registry. This must be configured when using build_schema_registry!. The build_schema_registry! method may be used to create a caching schema registry client instance based on other configuration values.
  • use_schema_fingerprint_lookup: Avromatic supports a Schema Registry extension that provides an endpoint to look up existing schema ids by fingerprint. A successful response from this GET request can be cached indefinitely. The use of this additional endpoint can be disabled by setting this option to false, which is recommended if using a Schema Registry that does not support the endpoint.
  • messaging: An AvroTurf::Messaging object to be shared by all generated models. The build_messaging! method may be used to create an Avromatic::Messaging instance based on the other configuration values.
  • logger: The logger to use for the schema registry client.

Example using a schema registry:

Avromatic.configure do |config|
  config.schema_store = AvroTurf::SchemaStore.new(path: 'avro/schemas')
  config.registry_url = Rails.configuration.x.avro_schema_registry_url

  config.build_messaging!
end

NOTE: build_messaging! ultimately calls build_schema_registry! so you don't have to call both.

Decoding

  • use_custom_datum_reader: Avromatic includes a modified subclass of Avro::IO::DatumReader. This subclass returns additional information about the index of union members when decoding Avro messages. This information is used to optimize model creation when decoding. By default this information is included in the hash returned by the DatumReader but can be omitted by setting this option to false.

Encoding

  • use_custom_datum_writer: Avromatic includes a modified subclass of Avro::IO::DatumWriter. This subclass supports caching avro encodings for immutable models and uses additional information about the index of union members to optimize the encoding of Avro messages. By default this information is included in the hash passed to the encoder but can be omitted by setting this option to false.

Models

Models are defined based on an Avro schema for a record.

The Avro schema can be specified by name and loaded using the schema store:

class MyModel
  include Avromatic::Model.build(schema_name: :my_model)
end

# Construct instances by passing in a hash of attributes
instance = MyModel.new(id: 123, name: 'Tesla Model 3', enabled: true)

# Access attribute values with readers
instance.name # => "Tesla Model 3"

# Models are immutable by default
instance.name = 'Tesla Model X' # => NoMethodError (private method `name=' called for #<MyModel:0x00007ff711e64e60>)

# Booleans can also be accessed by '?' readers that coerce nil to false
instance.enabled? # => true

# Models implement ==, eql? and hash
instance == MyModel.new(id: 123, name: 'Tesla Model 3', enabled: true) # => true
instance.eql?(MyModel.new(id: 123, name: 'Tesla Model 3', enabled: true)) # => true
instance.hash # => -1279155042741869898

# Retrieve a hash of the model's attributes via to_h, to_hash or attributes
instance.to_h # => {:id=>123, :name=>"Tesla Model 3", :enabled=>true}

Or an Avro::Schema object can be specified directly:

class MyModel
  include Avromatic::Model.build(schema: schema_object)
end

A specific subject name can be associated with the schema:

class MyModel
  include Avromatic::Model.build(schema_name: 'my_model',
                                 schema_subject: 'my_model-value')
end

Models are generated as immutable value objects by default, but can optionally be defined as mutable:

class MyModel
  include Avromatic::Model.build(schema_name: :my_model, mutable: true)
end

Generated models include attributes for each field in the Avro schema including any default values defined in the schema.

A model may be defined with both a key and a value schema:

class MyTopic
  include Avromatic::Model.build(value_schema_name: :topic_value,
                                 key_schema_name: :topic_key)
end

When key and value schemas are both specified, attributes are added to the model for the union of the fields in the two schemas.

By default, optional fields are not allowed in key schemas, since their values may be accidentally omitted, leading to problems if data is partitioned based on the key values.

This behavior can be overridden by specifying the :allow_optional_key_fields option for the model:

class MyTopic
  include Avromatic::Model.build(value_schema_name: :topic_value,
                                 key_schema_name: :topic_key,
                                 allow_optional_key_fields: true)
end
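
To see which key fields would be affected, one option is to look for fields whose type is a union that includes null. The sketch below works on a parsed schema hash; optional_field_names is a hypothetical helper and the topic_key schema is made up. How Avromatic itself detects optional fields is not described here, so treat the rule as an assumption.

```ruby
require 'json'

# Hypothetical helper: names of fields whose type is a union containing "null".
def optional_field_names(schema)
  optional = schema.fetch('fields').select do |field|
    type = field['type']
    type.is_a?(Array) && type.include?('null')
  end
  optional.map { |field| field['name'] }
end

# Made-up key schema with one required and one optional field.
key_schema = JSON.parse(<<~SCHEMA)
  {
    "type": "record",
    "name": "topic_key",
    "fields": [
      {"name": "id", "type": "long"},
      {"name": "region", "type": ["null", "string"], "default": null}
    ]
  }
SCHEMA

optional_field_names(key_schema) # => ["region"]
```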

A specific subject name can be associated with both the value and key schemas:

class MyTopic
  include Avromatic::Model.build(value_schema_name: :topic_value,
                                 value_schema_subject: 'topic_value-value',
                                 key_schema_name: :topic_key,
                                 key_schema_subject: 'topic_key-value')
end

A model can also be generated as an anonymous class that can be assigned to a constant:

MyModel = Avromatic::Model.model(schema_name: :my_model)

Experimental: Union Support

Avromatic contains experimental support for unions containing more than one non-null member type. This feature is experimental because Avromatic may attempt to coerce between types too aggressively.

For now, if a union contains nested models then it is recommended that you assign model instances.

Some combination of the ordering of member types in the union and relying on model validation may be required so that the correct member is selected, especially when deserializing from Avro.

In the future, the type coercion used in the gem will be enhanced to better support the union use case.
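
Why member ordering matters can be shown with a simplified first-match selection. This is a sketch only, not Avromatic's coercion code: the member names, their fields, and select_member are all hypothetical. It also illustrates the allow_unknown_attributes warning, where an earlier member is satisfied by a subset of a later member's attributes.

```ruby
# Hypothetical union members, in union order, described only by field names.
MEMBERS = {
  'Point2D' => %i[x y],
  'Point3D' => %i[x y z]
}.freeze

# Return the name of the first member that accepts the attributes.
def select_member(attributes, allow_unknown: false)
  MEMBERS.each do |name, fields|
    missing = fields - attributes.keys
    unknown = attributes.keys - fields
    next unless missing.empty?
    next unless unknown.empty? || allow_unknown

    return name
  end
  nil
end

attrs = { x: 1, y: 2, z: 3 }
select_member(attrs)                      # => "Point3D"
select_member(attrs, allow_unknown: true) # => "Point2D" (z is silently dropped)
```

Assigning model instances instead of hashes avoids this ambiguity, since no member has to be guessed.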

Nested Models

Nested models are models that are embedded within top-level models generated using Avromatic. Normally these nested models are automatically generated.

By default, nested models are stored in Avromatic.nested_models. This is an Avromatic::ModelRegistry instance that provides access to previously generated nested models by the full name of their Avro schema.

Avromatic.nested_models['com.my_company.test.example']
#=> <model class>

The ModelRegistry can be customized to remove a namespace prefix:

Avromatic.nested_models =
  Avromatic::ModelRegistry.new(remove_namespace_prefix: 'com.my_company')

The :remove_namespace_prefix value can be a string or a regexp.
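
The effect of a string or regexp prefix can be pictured with a small standalone helper. This illustrates the idea only and is not the registry's actual implementation; strip_namespace_prefix is a hypothetical name, and dropping the leading dot after removal is an assumption.

```ruby
# Hypothetical helper illustrating prefix removal with a String or a Regexp.
def strip_namespace_prefix(fullname, prefix)
  stripped =
    case prefix
    when String
      fullname.start_with?(prefix) ? fullname[prefix.length..-1] : fullname
    when Regexp
      fullname.sub(prefix, '')
    else
      fullname
    end
  # Assumption: a leading dot left behind by the removal is dropped.
  stripped.sub(/\A\./, '')
end

strip_namespace_prefix('com.my_company.test.example', 'com.my_company')
# => "test.example"
strip_namespace_prefix('com.my_company.test.example', /\Acom\.my_company/)
# => "test.example"
strip_namespace_prefix('org.other.example', 'com.my_company')
# => "org.other.example"
```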

By default, top-level generated models reuse Avromatic.nested_models. This allows nested models to be shared across different generated models. A :nested_models option can be specified when generating a model. This allows the reuse of nested models to be scoped:

Avromatic::Model.model(schema_name: :my_model,
                       nested_models: Avromatic::ModelRegistry.new)

Only models without a key schema can be used as nested models. When a model is generated with just a value schema then it is automatically registered so that it can be used as a nested model.

To extend a model that will be used as a nested model, you must ensure that it is defined, which will register it, prior to it being referenced by another model.

Using the Avromatic.eager_load_models option allows models that are extended and will be used as nested models to be defined at the end of the .configure block. In Rails applications, these models are also re-registered after nested_models is cleared when code reloads to ensure that classes load in the correct order:

Avromatic.configure do |config|
  config.eager_load_models = [
    # reference any extended models that should be defined first
    'MyNestedModel'
  ]
end

Custom Type Configuration

Custom types can be configured for fields of named types (record, enum, fixed). These customizations are registered on the Avromatic module. Once a custom type is registered, it is used for all models with a schema that references that type. It is recommended to register types within a block passed to Avromatic.configure:

Avromatic.configure do |config|
  config.register_type('com.example.my_string', MyString)
end

The full name of the type and an optional class may be specified. When a class is provided then values for attributes of that type are defined using the specified class.

If the provided class responds to the class methods from_avro and to_avro then those methods are used to convert values when assigning to the model and before encoding using Avro respectively.

from_avro and to_avro methods may also be specified as Procs when registering the type:

Avromatic.configure do |config|
  config.register_type('com.example.updown_string') do |type|
    type.from_avro = ->(value) { value.upcase }
    type.to_avro = ->(value) { value.downcase }
  end
end

Nil handling is not required as the conversion methods are not called if the inbound or outbound value is nil.

If a custom type is registered for a record-type field, then any to_avro method/Proc should return a Hash with string keys for encoding using Avro.
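
A class-based custom type might look like the sketch below. UpdownString mirrors the Proc example above; the convert helper is a hypothetical stand-in for the library's conversion step, included only to show that nil never reaches from_avro or to_avro.

```ruby
# Hypothetical custom type exposing the two class methods described above.
class UpdownString
  def self.from_avro(value)
    value.upcase
  end

  def self.to_avro(value)
    value.downcase
  end
end

# Hypothetical stand-in for the conversion step (not Avromatic's code):
# nil is passed through without calling the conversion method.
def convert(type, direction, value)
  return nil if value.nil?

  type.public_send(direction, value)
end

convert(UpdownString, :from_avro, 'abc') # => "ABC"
convert(UpdownString, :to_avro, 'ABC')   # => "abc"
convert(UpdownString, :from_avro, nil)   # => nil
```

Such a class would be registered with config.register_type('com.example.updown_string', UpdownString), as in the first example of this section.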

Encoding and Decoding

Avromatic provides two different interfaces for encoding the key (optional) and value associated with a model.

Manually Managed Schemas

The attributes for the value schema used to define a model can be encoded using:

encoded_value = model.avro_raw_value

In order to decode this data, a copy of the value schema is required.

If a model also has an Avro schema for a key, then the key attributes can be encoded using:

encoded_key = model.avro_raw_key

If attributes were encoded using the same schema(s) used to define a model, then the data can be decoded to create a new model instance:

MyModel.avro_raw_decode(key: encoded_key, value: encoded_value)

If the attributes were encoded using a different version of the model's schemas, then a new model instance can be created by also providing the schemas used to encode the data:

MyModel.avro_raw_decode(key: encoded_key,
                        key_schema: writers_key_schema,
                        value: encoded_value,
                        value_schema: writers_value_schema)

Messaging API

The other interface for encoding and decoding attributes uses the AvroTurf::Messaging API. This interface leverages a schema registry and prefixes the encoded data with an id to identify the schema. In this approach, a schema registry is used to ensure that the correct schemas are available during decoding.

The attributes for the value schema can be encoded with a schema id prefix using:

message_value = model.avro_message_value

If a model has an Avro schema for a key, then those attributes can also be encoded prefixed with a schema id:

message_key = model.avro_message_key

A model instance can be created from a key and value encoded in this manner:

MyTopic.avro_message_decode(message_key, message_value)

Or just a value if only one schema is used:

MyValue.avro_message_decode(message_value)

The schemas associated with a model can also be added to a schema registry without encoding a message:

MyTopic.register_schemas!
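
The schema id prefix can be inspected without a registry. The sketch below assumes the Confluent wire format (a zero magic byte followed by a 4-byte big-endian schema id), which this README does not spell out; split_message is a hypothetical helper and the sample bytes are made up.

```ruby
# Hypothetical helper: split a message into its schema id and Avro payload,
# assuming a 1-byte magic value (0) followed by a 4-byte big-endian schema id.
def split_message(message)
  bytes = message.b # binary-encoded copy, so offsets are byte offsets
  raise ArgumentError, 'message too short' if bytes.bytesize < 5

  magic, schema_id = bytes.unpack('CN')
  raise ArgumentError, "unexpected magic byte: #{magic}" unless magic.zero?

  [schema_id, bytes.byteslice(5, bytes.bytesize - 5)]
end

# Made-up message: magic byte 0, schema id 42, then five payload bytes.
message = [0, 42].pack('CN') + [6, 97, 98, 99, 0].pack('C*')
schema_id, payload = split_message(message)
schema_id     # => 42
payload.bytes # => [6, 97, 98, 99, 0]
```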

Avromatic::Model::MessageDecoder

A stream of messages encoded from various models using the messaging approach can be decoded using Avromatic::Model::MessageDecoder. The decoder must be initialized with the list of models to decode:

decoder = Avromatic::Model::MessageDecoder.new(MyModel1, MyModel2)

decoder.decode(model1_message_key, model1_message_value)
# => instance of MyModel1
decoder.decode(model2_message_value)
# => instance of MyModel2

Validations and Coercions

An exception will be thrown if an attribute value cannot be coerced to the corresponding Avro schema field's type. The following coercions are supported:

Ruby Type                     Avro Type
String, Symbol                string
Array                         array
Hash                          map
Integer                       int
Integer                       long
Float, Integer                float
Float, Integer                double
String                        bytes
Date, Time, DateTime          date
Time, DateTime                timestamp-millis
Time, DateTime                timestamp-micros
Float, Integer, BigDecimal    decimal
TrueClass, FalseClass         boolean
NilClass                      null
Hash                          record
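
The table can be transcribed into a lookup for quick checks in application code. This sketch tests class membership only: it performs no conversion or range checking and says nothing about how Avromatic implements coercion. ACCEPTED_CLASSES and listed_for? are hypothetical names.

```ruby
require 'date'
require 'bigdecimal'

# Accepted Ruby classes per Avro type, transcribed from the table above.
ACCEPTED_CLASSES = {
  'string' => [String, Symbol],
  'array' => [Array],
  'map' => [Hash],
  'int' => [Integer],
  'long' => [Integer],
  'float' => [Float, Integer],
  'double' => [Float, Integer],
  'bytes' => [String],
  'date' => [Date, Time, DateTime],
  'timestamp-millis' => [Time, DateTime],
  'timestamp-micros' => [Time, DateTime],
  'decimal' => [Float, Integer, BigDecimal],
  'boolean' => [TrueClass, FalseClass],
  'null' => [NilClass],
  'record' => [Hash]
}.freeze

# Hypothetical check: is the value's class listed for the given Avro type?
def listed_for?(value, avro_type)
  ACCEPTED_CLASSES.fetch(avro_type).any? { |klass| value.is_a?(klass) }
end

listed_for?(:tesla, 'string') # => true
listed_for?(1, 'double')      # => true
listed_for?('1', 'int')       # => false
```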

Validation of required fields is done automatically when serializing a model to Avro. It can also be done explicitly by calling the valid? or invalid? methods from the ActiveModel::Validations interface.

RSpec Support

This gem also includes an "avromatic/rspec" file that can be required to support using Avromatic with a fake schema registry during tests.

Requiring this file configures an RSpec before hook that directs any schema registry requests to a fake, in-memory schema registry (instead of port 21001) and rebuilds the Avromatic::Messaging object for each example.

Note: Use of avromatic/rspec requires installing the sinatra gem for the in-memory schema registry to work properly.

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/salsify/avromatic.

License

The gem is available as open source under the terms of the MIT License.

avromatic's People

Contributors

atsheehan, fgarces, georgesheppardbanked, gremerritt, jkapell, joshbranham, jturkel, kbarrette, kphelps, marcoserrato, mkrisher, opti, robindaugherty, skarger, tjwp, tripwar, will89

avromatic's Issues

Possible to include module in anonymous nested model?

I have a nested schema that looks much like the schema in spec/avro/schema/test/nested_nested_record.avsc - i.e. multiple levels of nested records. I would like to manually define a class for each of the generated nested models, instead of the anonymous generated classes, so that I can include some validation methods that are written in a separate module. Concretely,

{
  "type": "record",
  "name": "nested_nested_record",
  "namespace": "test",
  "fields": [
    {
      "name": "sub",
      "type": {
        "type": "record",
        "name": "__nested_nested_record_sub_record",
        "namespace": "test",
        "fields": [
          {
            "name": "subsub",
            "type": {
              "type": "record",
              "name": "__nested_nested_record_sub_subsub_record",
              "namespace": "test",
              "fields": [
                {
                  "name": "n",
                  "type": "null"
                }
              ]
            }
          }
        ]
      }
    }
  ]
}

I would like to write an Avromatic model for the sub and subsub records so I can include custom validations. It's unclear from the docs and the code how I would go about doing this. If I write a model for the top-level record like this:

class NestedNestedRecord
  include Avromatic::Model.build(schema_name: 'test.nested_nested_record')
end

and then a model for the sub record like this:

class SubRecord
  include Avromatic::Model.build(schema_name: 'test.__nested_nested_record_sub_record')  
  validates :foo, numericality: { greater_than_or_equal_to: 0 }
end

Avromatic throws an exception saying it can't find the schema file test/__nested_nested_record_sub_record.avsc on disk (I am using a schema_store).

I do see that the TypeFactory will always build an anonymous class for record types, unless I register a type using config.register_type('test.__nested_nested_record_sub_record', SubRecord), so I'm guessing this is the correct way to go about getting what I need. What I need help with is building an Avromatic model for that SubRecord. Any tips?

Edit: I was able to get it working like this, more or less:

Avromatic.configure do |config|
  config.eager_load_models = [
    "NestedNestedRecord"
  ]

  config.prepare!

  config.register_type('test.__nested_nested_record_sub_record', SubRecord)
end

class SubRecord < Avromatic.nested_models['test.__nested_nested_record_sub_record']
  include MyValidations
end

but this feels strange. Still wondering if this is the recommended way.

support models for nested schemas

Sometimes it is useful to be able to define a model for a schema that is embedded within another definition.

For this purpose, we could walk a parsed Avro schema to find all the referenced schemas. This would support something like the following to define a model for a nested/referenced schema:

Avromatic::Model.build(schema_name: 'com.example.outer_record', nested_schema_name: 'com.example.inner_record')

The model would be defined based on the fields for the inner_record schema.

#avro_message_value should always return binary encoded string

This almost certainly depends on environmental factors; however, in a Rails app where we have a fairly vanilla set-up with Avromatic, calling #avro_message_value yields a UTF-8 encoded string.

That's mostly harmless, except when comparing strings (e.g. in tests), which this project does to guard against other teams sneakily changing Avro schemas. The same bytes are represented quite differently depending on the encoding, e.g.:

[0, 0, 0, 0, 0, 128, 248, 192, 198, 221, 87, 40, 130, 218, 196, 9].pack("c*").force_encoding(Encoding::UTF_8)
"\u0000\u0000\u0000\u0000\u0000\x80\xF8\xC0\xC6\xDDW(\x82\xDA\xC4\t"

[0, 0, 0, 0, 0, 128, 248, 192, 198, 221, 87, 40, 130, 218, 196, 9].pack("c*").force_encoding(Encoding::ASCII_8BIT)
"\x00\x00\x00\x00\x00\x80\xF8\xC0\xC6\xDDW(\x82\xDA\xC4\t"

It should be as simple as forcing the encoding on that method return value before returning it.

The value for new strings can (I believe) depend on Encoding.default_internal and .default_external which can be controlled by env vars and other systemwide configuration.
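
A minimal sketch of the comparison problem and the proposed fix, in plain Ruby with no Avromatic involved. The variable names are illustrative; the bytes are the ones from the example above.

```ruby
bytes = [0, 0, 0, 0, 0, 128, 248, 192, 198, 221, 87, 40, 130, 218, 196, 9]

# Same bytes, tagged with two different encodings.
utf8_tagged = bytes.pack('C*').force_encoding(Encoding::UTF_8)
binary      = bytes.pack('C*').force_encoding(Encoding::BINARY)

utf8_tagged == binary             # => false: non-ASCII bytes, different encodings
utf8_tagged.bytes == binary.bytes # => true: the underlying bytes are identical

# String#b returns a copy tagged as binary (ASCII-8BIT) without touching
# the bytes, which is what forcing the encoding on the return value would do.
utf8_tagged.b == binary           # => true
```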

Are Avro Union fields supported?

Trying to generate a model for this example schema that uses a Union of Foo and Bar...

require 'avromatic'

json = '[{
    "type": "record",
    "name": "Foo",
    "fields": [
      {"name": "fooMessage", "type": "string"}
    ]
  },

  {
    "type": "record",
    "name": "Bar",
    "fields": [
      {"name": "barMessage", "type": "string"}
    ]
  },

  {
    "type": "record",
    "name": "Root",
    "fields": [
      {"name": "header", "type": "string"},
      {"name": "message", "type": ["Foo", "Bar"]}
    ]
  }
]
'
Avromatic::Model.model(schema: Avro::Schema.parse(json))

...I encounter an exception:

/Users/ascassidy/.rvm/gems/ruby-2.3.0/gems/avromatic-0.7.0/lib/avromatic/model/attributes.rb:54:in `define_avro_attributes': undefined method `fields' for #<Avro::Schema::UnionSchema:0x007f9b789eb040> (NoMethodError)
        from /Users/ascassidy/.rvm/gems/ruby-2.3.0/gems/avromatic-0.7.0/lib/avromatic/model/attributes.rb:27:in `add_avro_fields'
        from /Users/ascassidy/.rvm/gems/ruby-2.3.0/gems/avromatic-0.7.0/lib/avromatic/model/builder.rb:57:in `block (2 levels) in define_included_method'
        from /Users/ascassidy/.rvm/gems/ruby-2.3.0/gems/avromatic-0.7.0/lib/avromatic/model/builder.rb:22:in `include'
        from /Users/ascassidy/.rvm/gems/ruby-2.3.0/gems/avromatic-0.7.0/lib/avromatic/model/builder.rb:22:in `block in model'
        from /Users/ascassidy/.rvm/gems/ruby-2.3.0/gems/avromatic-0.7.0/lib/avromatic/model/builder.rb:21:in `initialize'
        from /Users/ascassidy/.rvm/gems/ruby-2.3.0/gems/avromatic-0.7.0/lib/avromatic/model/builder.rb:21:in `new'
        from /Users/ascassidy/.rvm/gems/ruby-2.3.0/gems/avromatic-0.7.0/lib/avromatic/model/builder.rb:21:in `model'
        from /Users/ascassidy/.rvm/gems/ruby-2.3.0/gems/avromatic-0.7.0/lib/avromatic/model.rb:47:in `model'
        from lib/test/avro/generate.rb:32:in `<main>'

Avromatic Support for Zeitwerk

Avromatic currently doesn't support the zeitwerk classloader when using eager loaded nested modules. Attempts to do so result in an error like this:

Attempted to replace existing Avromatic model Emails::Events::EmailRecipient with new model Emails::Events::EmailRecipient as 'emails.email_recipient'. Perhaps 'Emails::Events::EmailRecipient' needs to be eager loaded via the Avromatic eager_load_models setting?

I'm working on a fix...

Registered schemas end up with the incorrect name

Hello! I'm working on some code that uses both Avromatic (4.0.0) and AvroTurf (1.5.0) and I've run into a problem that I think is on the Avromatic side of things.

I have an Avromatic model

class LogEntry
    include Avromatic::Model.build(value_schema_name: "uachieve-etl.application_log")
end

And I'm publishing that to a Kafka cluster that uses Confluent's schema registry. When I create a LogEntry and call avro_message_value it registers the schema, as I'd expect.

But the schema gets registered under the subject name of "uachieve-etl.application_log". I'd expect it to be registered under "uachieve-etl.application_log-value".

This change in subject name leads to some problems. ksqlDB expects the subject name to end in -value, so I can't work with the data. And Kafka Connect adapters have trouble with it too, for the same reason.

I looked at AvroTurf and saw that their messaging encode method supports an optional subject param. If passed the schema is registered under that subject. If not, then the schema is registered under the full schema name.

It looks like Avromatic's avro_message_value is calling this encode method but not providing the subject param.

Is there something I'm doing wrong here? I'd like to be able to use Avromatic here and have my schemas registered under the expected subject name of uachieve-etl.application_log-value.

Thanks!

Create model from remote schema registry

Please provide an example of how to configure the gem to create models only from a remote schema registry.

The provided examples are based on locally stored schemas, while all my schemas are stored remotely in Confluent Cloud with password-protected access. I don't have any schemas locally. Which attributes should I pass to the configure method?

LoadError: cannot load such file -- avro_turf/messaging

Because of some other gem dependencies, I am locked at version 0.5.0 of avromatic and avro_turf. From Gemfile.lock:

    avro (1.7.7)
      multi_json
    avro_turf (0.5.0)
      avro (~> 1.7.7)
    avromatic (0.5.0)
      activemodel
      activesupport
      avro (>= 1.7.7)
      avro_turf
      virtus

Version 0.5.0 of avromatic calls require 'avro_turf/messaging' in avromatic.rb, but that module doesn't actually exist in v0.5.0 of avro_turf.

RuntimeError: a null type in a union must be the first member

Hi, thanks for the nice gem!

I'd like to report an issue.

Description

We have the following Avro schema:

{
  "type": "record",
  "namespace": "xyz",
  "name": "failureExample",
  "version": 1,
  "fields": [
    {
      "name": "position",
      "type": ["string", "null"],
      "default": "1",
      "logicalType": "standardDecimal"
    }
  ]
}

That results in the following error when attempting to build a model from it:

RuntimeError: a null type in a union must be the first member

This exception comes from here: https://github.com/salsify/avromatic/blob/master/lib/avromatic/model/attributes.rb#L180

The problem is that if we swap the string and null types in the array, the schema becomes invalid, because we need to use a default value, and according to the Avro specification the default value's type must match the first type in the union.

https://avro.apache.org/docs/1.8.2/spec.html#schema_record

default: A default value for this field, used when reading instances that lack this field (optional). Permitted values depend on the field's schema type, according to the table below. Default values for union fields correspond to the first schema in the union. Default values for bytes and fixed fields are JSON strings, where Unicode code points 0-255 are mapped to unsigned 8-bit byte values 0-255.
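
The conflict between the two rules can be reproduced with a small check of the quoted "first schema in the union" rule. This is a simplified sketch covering primitive types only; candidate_types and default_matches_first_member? are hypothetical helpers, not part of avro or Avromatic.

```ruby
# Hypothetical mapping from a JSON default value to the Avro primitive
# types it could represent (simplified; named and complex types are ignored).
def candidate_types(default)
  case default
  when nil then %w[null]
  when true, false then %w[boolean]
  when Integer then %w[int long float double]
  when Float then %w[float double]
  when String then %w[string bytes]
  else []
  end
end

# True when the field is not a union, has no default, or its default
# corresponds to the first member of the union.
def default_matches_first_member?(field)
  type = field['type']
  return true unless type.is_a?(Array) && field.key?('default')

  candidate_types(field['default']).include?(type.first)
end

as_reported = { 'name' => 'position', 'type' => %w[string null], 'default' => '1' }
null_first  = { 'name' => 'position', 'type' => %w[null string], 'default' => '1' }

default_matches_first_member?(as_reported) # => true
default_matches_first_member?(null_first)  # => false
```

The schema as reported satisfies the specification's rule, while the null-first ordering that avoids the RuntimeError does not.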

Expected behaviour

The error should not be raised

value_attributes_for_avro doesn't behave as expected on 0.28.1

It appears that classes aren't preserved for some reason; I haven't had time to investigate:

- [{"__avromatic_encoding_provider"=>#<Core::Events::BooleanPropertyValueCollection property: #<Reference id: SalsifyUuid(s-00007360-0000-0000-0000-000000000000), type: "property", external_id: "boolean property">, value_data_type: "boolean", values: [true]>, "__avromatic_member_index"=>2}],
+ [{"__avromatic_encoding_provider"=>{"property"=>{"id"=>"s-00007360-0000-0000-0000-000000000000", "type"=>"property", "external_id"=>"boolean property"}, "value_data_type"=>"boolean", "values"=>[true]}, "__avromatic_member_index"=>2}],

https://circleci.com/gh/salsify/content-flow-service/1451

Seems to be related to 94b4904 and 18f6c69, because reverting to 0.27.0 fixes the issue, @jkapell.

[2.0] Provide more context in coercion errors

Coercion errors include details on the value and the target type (e.g. Avromatic::Model::CoercionError: Could not coerce '1' to a String) but they don't provide context on where these attributes appear in a model which makes tracking down errors difficult.

Validation passes for required fields that are nil

When calling .valid? on an avromatic model with a missing field that is required, the validation passes with no errors.

However, when we then try to call avro_message_value on the object, it raises an Avro::IO::AvroTypeError (as expected).

Error handling when working with a registry

I am trying to understand potential errors when working with a registry (only for decoding).

If I understand decoding correctly, we have a reader's schema that comes from the configured schema store (static, file system), and a writer's schema that is fetched as needed from the registry, and cached.

If I am right, that means that the registry isn't hit to fetch any writer schema until a message arrives (makes sense by definition), and after that you only hit the registry if a message with a different version arrives.

If that is correct, I would like to understand these things:

  1. Is there a way to fail fast when booting (before messages arrive) in order to detect a misconfigured URL?

  2. What is the behavior of the gem if the registry is unreachable? Does it have a retry mechanism or does it raise right away? If it raises, I guess you should be ready for that with any message, right? Because of schema evolution and the need to fetch future schemas.

  3. If a writer's schema is incompatible with the reader's schema you get Avro::IO::SchemaMatchException, right?

  4. If a schema is invalid Avro::SchemaValidator::ValidationError seems to be raised, right? What is an invalid schema? Does the registry accept invalid schemas?

Anything else that could be relevant, please share!

Also, if you'd like to have any of this information included in the docs, would be glad to volunteer a patch.

Sub-schema for Avro::Schema::UnionSchema not a valid Avro schema

Avromatic version 0.32.0

Generating the avsc

We're using Avromatic to load .avsc files generated by

avro-tools idl2schemata

The avro-tools version is 1.8.2.

The Avromatic error

The error and stack trace from Avromatic are:

Gem Load Error is: Sub-schema for Avro::Schema::UnionSchema not a valid Avro schema. Bad schema: {"type"=>"record", "name"=>"JobCreated", "doc"=>"* Represents the fact that a new job has been created in the platform and that\n     * the system will start looking for drivers to fulfill that job whenever\n     * possible (See ScheduledJob)", "fields"=>[{"name"=>"job_id", "type"=>"int"}, {"name"=>"client_id", "type"=>"int"}, {"name"=>"deliveries", "type"=>{"type"=>"array", "items"=>{"type"=>"record", "name"=>"Delivery", "doc"=>"* Represents a Delivery being the concept of bringing package X from point A to\n     * point B", "fields"=>[{"name"=>"delivery_id", "type"=>"int"}, {"name"=>"pickup", "type"=>{"type"=>"record", "name"=>"Location", "doc"=>"* Describes a point in a map", "fields"=>[{"name"=>"latitude", "type"=>"double"}, {"name"=>"longitude", "type"=>"double"}]}}, {"name"=>"dropoff", "type"=>"Location"}, {"name"=>"package_type", "type"=>{"type"=>"enum", "name"=>"PackageType", "doc"=>"* The set of standarized sizes for packages", "symbols"=>["SIZE_UNKNOWN", "XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"]}}, {"name"=>"time_constraints", "type"=>{"type"=>"array", "items"=>{"type"=>"record", "name"=>"TimeConstraint", "doc"=>"* Represents a time constraint on a delivery action", "fields"=>[{"name"=>"action", "type"=>{"type"=>"enum", "name"=>"DeliveryAction", "doc"=>"* The set of available actions for a delivery", "symbols"=>["PICKUP", "DROPOFF"]}}, {"name"=>"time_window", "type"=>{"type"=>"record", "name"=>"TimeWindow", "doc"=>"* Describes a time window closed at the beginning and open at the end \"[from, to[\"", "fields"=>[{"name"=>"from", "type"=>{"type"=>"long", "logicalType"=>"timestamp-millis"}}, {"name"=>"to", "type"=>{"type"=>"long", "logicalType"=>"timestamp-millis"}}]}}]}}}]}}}, {"name"=>"requested_transport", "type"=>["null", {"type"=>"enum", "name"=>"TransportType", "doc"=>"* The set of available Transport Types", "symbols"=>["WALK", "BIKE", "MOTORBIKE", 
"CAR", "CARGOBIKE", "VAN", "CARGOBIKEXL", "MOTORBIKEXL", "STEERWAGON"]}]}, {"name"=>"priority", "type"=>"int", "default"=>0}, {"name"=>"fleets", "type"=>["null", {"type"=>"array", "items"=>"int"}], "default"=>nil}]}
Backtrace for gem load error is:
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-1.8.2/lib/avro/schema.rb:170:in `rescue in subparse'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-1.8.2/lib/avro/schema.rb:166:in `subparse'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-patches-0.4.0/lib/avro-patches/schema_compatibility/schema.rb:36:in `block in initialize'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-patches-0.4.0/lib/avro-patches/schema_compatibility/schema.rb:35:in `each'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-patches-0.4.0/lib/avro-patches/schema_compatibility/schema.rb:35:in `each_with_object'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-patches-0.4.0/lib/avro-patches/schema_compatibility/schema.rb:35:in `initialize'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-patches-0.4.0/lib/avro-patches/logical_types/schema.rb:49:in `new'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-patches-0.4.0/lib/avro-patches/logical_types/schema.rb:49:in `real_parse'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-1.8.2/lib/avro/schema.rb:167:in `subparse'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-1.8.2/lib/avro/schema.rb:367:in `initialize'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro_turf-0.8.1/lib/avro_turf/schema_to_avro_patch.rb:27:in `initialize'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-patches-0.4.0/lib/avro-patches/default_validation/schema.rb:5:in `initialize'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-1.8.2/lib/avro/schema.rb:216:in `new'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-1.8.2/lib/avro/schema.rb:216:in `block in make_field_objects'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-1.8.2/lib/avro/schema.rb:210:in `each'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-1.8.2/lib/avro/schema.rb:210:in `each_with_index'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-1.8.2/lib/avro/schema.rb:210:in `make_field_objects'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro_turf-0.8.1/lib/avro_turf/schema_to_avro_patch.rb:14:in `make_field_objects'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-patches-0.4.0/lib/avro-patches/schema_compatibility/schema.rb:24:in `initialize'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-patches-0.4.0/lib/avro-patches/logical_types/schema.rb:33:in `new'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro-patches-0.4.0/lib/avro-patches/logical_types/schema.rb:33:in `real_parse'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avro_turf-0.8.1/lib/avro_turf/schema_store.rb:21:in `find'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model/configuration.rb:48:in `find_schema_by_option'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model/configuration.rb:42:in `find_avro_schema'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model/configuration.rb:25:in `initialize'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model/builder.rb:39:in `new'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model/builder.rb:39:in `initialize'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model/builder.rb:27:in `new'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model/builder.rb:27:in `block in model'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model/builder.rb:26:in `initialize'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model/builder.rb:26:in `new'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model/builder.rb:26:in `model'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/avromatic-0.32.0/lib/avromatic/model.rb:48:in `model'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/business_events-1.8.0/lib/business_events/events.rb:2:in `<module:BusinessEvent>'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/business_events-1.8.0/lib/business_events/events.rb:1:in `<main>'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/activesupport-4.2.10/lib/active_support/dependencies.rb:274:in `require'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/activesupport-4.2.10/lib/active_support/dependencies.rb:274:in `block in require'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/activesupport-4.2.10/lib/active_support/dependencies.rb:240:in `load_dependency'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/activesupport-4.2.10/lib/active_support/dependencies.rb:274:in `require'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/business_events-1.8.0/lib/business_events.rb:4:in `<main>'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/bundler-1.16.1/lib/bundler/runtime.rb:81:in `require'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/bundler-1.16.1/lib/bundler/runtime.rb:81:in `block (2 levels) in require'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/bundler-1.16.1/lib/bundler/runtime.rb:76:in `each'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/bundler-1.16.1/lib/bundler/runtime.rb:76:in `block in require'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/bundler-1.16.1/lib/bundler/runtime.rb:65:in `each'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/bundler-1.16.1/lib/bundler/runtime.rb:65:in `require'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/bundler-1.16.1/lib/bundler.rb:114:in `require'
/Users/sean/src/stuart/stuart-api/config/application.rb:8:in `<top (required)>'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/spring-2.0.1/lib/spring/application.rb:82:in `require'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/spring-2.0.1/lib/spring/application.rb:82:in `preload'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/spring-2.0.1/lib/spring/application.rb:143:in `serve'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/spring-2.0.1/lib/spring/application.rb:131:in `block in run'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/spring-2.0.1/lib/spring/application.rb:125:in `loop'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/spring-2.0.1/lib/spring/application.rb:125:in `run'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/spring-2.0.1/lib/spring/application/boot.rb:19:in `<top (required)>'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
/Users/sean/.rbenv/versions/2.4.4/lib/ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
-e:1:in `<main>'

Summary

This started happening after we introduced the following union type in the avdl:

union { null, array<int> } fleets = null;

This seems to generate the corresponding avsc field (shown here as the parsed Ruby hash from the error message):

{"name"=>"fleets", "type"=>["null", {"type"=>"array", "items"=>"int"}], "default"=>nil}

We're pretty stumped as to why Avromatic would fail to validate this. The generated avsc is being used by our Scala-based toolset without any issues.

Is this something you've seen before? Any ideas what the root of the problem could be?
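
For reference, here is a minimal sketch of why the field looks valid on its face, using only Ruby's standard library rather than the avro gem. The Avro specification requires a union field's default to match the first branch of the union. The check below is a simplified, hand-written version of that rule (it only covers a "null" first branch) and is not how avro or avro-patches validate defaults.

```ruby
require 'json'

# The fleets field as it would appear in the JSON .avsc file.
# In JSON the default is null; Ruby shows it as nil after parsing.
field_json = '{"name": "fleets", "type": ["null", {"type": "array", "items": "int"}], "default": null}'
field = JSON.parse(field_json)

# Simplified check (assumption: only the "null" first-branch case is covered).
first_branch = field['type'].first
default_matches_first_branch = first_branch == 'null' && field['default'].nil?

puts "first branch: #{first_branch.inspect}"
puts "default matches first branch: #{default_matches_first_branch}"
```

Under that rule the null default is consistent with "null" being listed first, so the field definition itself appears well-formed.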

Nested Records are Not Instantiated

When I define an Avromatic class and instantiate it, I expect all nested records to be instantiated as well. Instead, only the top-level record is instantiated, and every sub-record is nil.

Here's a simple reproduction:

Avro Schema:

{
  "type": "record",
  "name": "top_rec",
  "namespace": "com.example",
  "fields": [
    {
      "name": "sub",
      "type": {
        "type": "record",
        "name": "sub_rec",
        "namespace": "com.example",
        "fields": [
          {
            "name": "i",
            "type": "int"
          }
        ]
      }
    }
  ]
}

Then we load the schema and create a class:

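require 'avromatic'

schema_json = File.read('top_rec.avsc')
schema = Avro::Schema.parse(schema_json)

# Generate a model class from the parsed schema, then instantiate it with no attributes.
model_class = Avromatic::Model.model(schema: schema)

instance = model_class.new
puts instance.inspect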

Actual Result:

#<TopRec sub: nil>

Expected Result:

#<TopRec sub: #<SubRec i: nil>>

Obviously, it's possible to manually instantiate the entire class hierarchy, but this quickly becomes difficult as the level of nesting increases.
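
One possible workaround, sketched with only the standard library: walk the schema JSON and build a nested attributes hash with one entry per inline nested record. The helper name skeleton_attributes is hypothetical. Passing the resulting hash to the model constructor assumes that Avromatic coerces nested hashes into nested models, which is worth verifying against the version in use.

```ruby
require 'json'

# Recursively build a nested attributes hash for an Avro record schema.
# Inline nested records become nested hashes; every other field is left nil.
# Named type references, unions, arrays and maps are not expanded in this sketch.
def skeleton_attributes(record_schema)
  record_schema.fetch('fields').each_with_object({}) do |field, attrs|
    type = field['type']
    nested_record = type.is_a?(Hash) && type['type'] == 'record'
    attrs[field['name'].to_sym] = nested_record ? skeleton_attributes(type) : nil
  end
end

schema_json = <<~JSON
  {
    "type": "record",
    "name": "top_rec",
    "namespace": "com.example",
    "fields": [
      {
        "name": "sub",
        "type": {
          "type": "record",
          "name": "sub_rec",
          "namespace": "com.example",
          "fields": [
            { "name": "i", "type": "int" }
          ]
        }
      }
    ]
  }
JSON

attributes = skeleton_attributes(JSON.parse(schema_json))
p attributes
```

The resulting hash has a key for sub containing a nested hash with a key for i. It could then be passed to the generated model's constructor instead of calling new with no arguments, subject to the assumption above.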

Forced Immutability?

I was hoping to use Avromatic models to replace hashes. However, the writers for fields are private, and I can't figure out how to turn off immutability so that I can actually use the model.

Avro array insert

Hello,

I have a couple of Avro schemas, User and UserList:

{
  "type": "record",
  "name": "user",
  "fields": [
    {
      "name": "id",
      "type": "int"
    },
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "email",
      "type": "string"
    }
  ]
}

{
  "type": "record",
  "name": "user_list",
  "fields": [
      {
        "name": "users",
        "type": {
            "type": "array",
            "items": "user"
        }
      }
  ]
}

I am using Avromatic to build the Ruby models. However, when I try to insert into the UserList, the users field is nil, so appending a new User to the array fails.

user = Avro::User.new(id: 1, name: "Joe", email: "[email protected]")
userList = Avro::UserList.new
userList.users << user

Why is the users field on UserList not an array? What am I doing wrong?
