Wolox on Rails - Batchifier

A gem that lets you split processing or requests that handle large amounts of data into several batches of smaller chunks. It collects the result of each batch and returns the results merged together, using one of several strategies for combining the responses.

Installation

Add this line to your application's Gemfile:

gem 'wor-batchifier'

And then execute:

$ bundle

Or install it yourself as:

$ gem install wor-batchifier

Usage

Basic use:

The first step is to include Wor::Batchifier in the class or module where you intend to use it:

class MyClass
  include Wor::Batchifier
end

If you're going to use the gem in your controllers, a good practice is to include it in a parent controller that all other controllers inherit from, so they all have access to the batchifier's methods. So, let's do that in our ApplicationController:

class ApplicationController < ActionController::Base
  include Wor::Batchifier
end

You could also include the batchifier just in the controllers you intend to use it in.

The final step is to find any request or process you want to perform on smaller chunks of data and use the batchifier's methods to divide it into smaller tasks.

For example, let's pretend we have an endpoint called bulk_request that communicates with a third-party API and sends it a lot of information.

def bulk_request
  ThirdAPI.bulk_request(params[:information])
end

Now we will partition that request into chunks using the batchifier so as not to overburden ThirdAPI:

def bulk_request
  execute_in_batches(params[:information], batch_size: 100, strategy: :add) do |chunks|
    ThirdAPI.bulk_request(chunks)
  end
end

execute_in_batches takes three arguments: the collection that needs to be partitioned, the batch_size to use, and the strategy, a symbol naming how the response of each batch is merged into the final result. The block you pass is called once per chunk.
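
To make these arguments concrete, here is a minimal, gem-free sketch of the general pattern: slice the collection, run the block on each slice, and fold the results together. The names batch_and_merge, merge: and base: are hypothetical and exist only for this illustration; this is not the gem's implementation.

```ruby
# Hypothetical helper illustrating the pattern; not part of wor-batchifier.
def batch_and_merge(collection, batch_size:, merge:, base: [])
  collection.each_slice(batch_size).inject(base) do |accumulated, chunk|
    # The caller's block receives one chunk and returns that chunk's result,
    # which is then folded into the accumulated value.
    merge.call(accumulated, yield(chunk))
  end
end

# Merge step that simply concatenates array results.
concat = ->(accumulated, result) { accumulated + result }

doubled = batch_and_merge((1..5).to_a, batch_size: 2, merge: concat) do |chunk|
  chunk.map { |n| n * 2 }
end
# doubled == [2, 4, 6, 8, 10]
```

The block only ever sees batch_size elements at a time, which is what keeps each call to the downstream service small.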

Available strategies

  • Add: Joins the responses of all batches together, regardless of their content.
  • Maintain-Unique: Only adds results that are not already present in the accumulated response.
  • No-Response: Discards the responses of the batches and returns nothing.
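
As a rough illustration of how the three behaviours differ, the sketch below models each one as a lambda over array responses. These lambdas are assumptions made for the example: the gem's own strategy classes may operate on other types (the MaintainUnique class shown later in this README merges hashes) and may differ in detail.

```ruby
# Illustrative stand-ins only; not the gem's strategy classes.
add             = ->(total, batch) { total + batch }   # keep everything
maintain_unique = ->(total, batch) { total | batch }   # set union, no repeats
no_response     = ->(_total, _batch) { nil }           # discard results

# Pretend these are the responses returned by three batches.
batch_responses = [[1, 2], [2, 3], [3, 4]]

added  = batch_responses.inject([]) { |total, batch| add.call(total, batch) }
unique = batch_responses.inject([]) { |total, batch| maintain_unique.call(total, batch) }
silent = batch_responses.inject(nil) { |total, batch| no_response.call(total, batch) }
# added  == [1, 2, 2, 3, 3, 4]
# unique == [1, 2, 3, 4]
# silent == nil
```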

Adding new strategies

To add a new strategy, create a new class and define a method called merge_strategy that holds the logic for parsing and merging the results of each batch. Let's look at an example:

module Wor
  module Batchifier
    class MaintainUnique < Strategy
      def merge_strategy(response, memo)
        response.merge(memo) { |_, v1, _| v1 }
      end
    end
  end
end

merge_strategy receives two parameters: "response", the total response that execute_in_batches will return, and "memo", the response from each batch, which is added to response on each iteration. You are free to merge the two or to do something else entirely.

All strategies have a base_case, which by default is an empty hash {}. To override it, define a method called base_case in your strategy that returns the initial value you need.

def base_case
  # The initial value into which the responses will be merged.
end

The new class that holds merge_strategy should inherit from the Strategy class. If a strategy doesn't define the method, an exception is raised when you try to use it, warning that the class does not respect the contract set by the Strategy interface.
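
The contract described above (a default base_case, an overridable merge_strategy, and an error when it is missing) can be sketched without the gem. The Sketch* classes below are hypothetical stand-ins, and the use of NotImplementedError is an assumption; the gem may raise a different exception class.

```ruby
# Hypothetical stand-ins for the gem's classes, prefixed Sketch to avoid confusion.
class SketchStrategy
  # Default starting value, mirroring the README's default of {}.
  def base_case
    {}
  end

  # Subclasses are expected to override this; the exception type is an assumption.
  def merge_strategy(_response, _memo)
    raise NotImplementedError, "#{self.class} must define merge_strategy"
  end
end

# A strategy that sums numeric batch results, starting from 0 instead of {}.
class SketchSum < SketchStrategy
  def base_case
    0
  end

  def merge_strategy(response, memo)
    response + memo
  end
end

# A strategy that forgets to define merge_strategy.
class SketchBroken < SketchStrategy; end

strategy = SketchSum.new
total = [3, 4, 5].inject(strategy.base_case) do |running, batch_result|
  strategy.merge_strategy(running, batch_result)
end
# total == 12
```

Calling SketchBroken.new.merge_strategy(1, 2) raises NotImplementedError in this sketch, which is the kind of early, explicit failure the paragraph above describes.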

You can also define a merge strategy as a Proc, without creating a new class. The Proc should receive two parameters, "response" and "memo", which work the same way as in a class-based merge strategy. All Procs have {} as their base case, which cannot be changed. Let's look at an example:

merge_strategy = Proc.new do |response, memo|
  memo = [] if memo.empty?
  memo + response
end

execute_in_batches(collection, batch_size: 10, strategy: merge_strategy) do |chunk|
  ...
end
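
The memo.empty? guard in the Proc above exists because of the fixed {} base case. The sketch below traces that behaviour with a plain inject standing in for execute_in_batches. It assumes that the second argument carries the value accumulated so far, which is what the guard suggests; the driver loop is hypothetical and not the gem's code.

```ruby
merge_strategy = proc do |response, memo|
  memo = [] if memo.empty? # swap the fixed {} base case for an array
  memo + response
end

# Pretend these are the values returned by the block for three batches.
batch_responses = [[1, 2], [3], [4, 5]]

# Hypothetical driver: start from {} and fold each batch response in.
merged = batch_responses.inject({}) do |memo, response|
  merge_strategy.call(response, memo)
end
# merged == [1, 2, 3, 4, 5]
```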

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Run rubocop lint (rubocop -R --format simple)
  5. Run rspec tests (bundle exec rspec)
  6. Push your branch (git push origin my-new-feature)
  7. Create a new Pull Request

About

This project was developed by Pedro Jara along with Diego Raffo at Wolox.

Maintainers: Pedro Jara, Federico Volonnino

Contributors: Pedro Jara, Federico Volonnino

License

wor-batchifier is available under the MIT license.

Copyright (c) 2017 Wolox

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

wor-batchifier's Issues

Replace .to_a with lazy array mapping

The method execute_in_batches(collection, batch_size: 100, strategy: :add) calls to_a, which loads the whole array into memory.
With high batch sizes this will be very costly. The suggestion is to replace it with lazy mapping instead.
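
A small sketch of what lazy slicing looks like in plain Ruby, using Enumerator::Lazy and each_slice. The source enumerator and the pulled array are hypothetical and only serve to show that elements are generated on demand; this is not a patch for the gem.

```ruby
pulled = [] # records which elements were actually generated

# An enumerator that could produce 1,000 elements, one at a time.
source = Enumerator.new do |yielder|
  1.upto(1_000) do |n|
    pulled << n
    yielder << n
  end
end

# Slices are built on demand; asking for two slices stops the enumeration early.
first_two = source.lazy.each_slice(3).first(2)
# first_two == [[1, 2, 3], [4, 5, 6]]
# pulled holds only the handful of elements needed, not all 1,000
```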

Custom merge strategies as blocks AND classes

Give users of the gem the option to define a custom strategy as a block passed as a parameter to the "execute_in_batches" method.

Example:

execute_in_batches(information, batch_size: 100, { |batch| merge_batch(batch) })

in addition to being able to use the gem in the following way:

execute_in_batches(information, batch_size: 100, strategy: :add)

Rescue exceptions from batch processing

When wor-batchifier executes the block that processes each batch and an exception is raised, it should be caught, processed, and reported in a more formal way, using exceptions that belong to the gem rather than a stack trace coming from the gem's code.
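
One possible shape for this, sketched without the gem: wrap the call to the block and re-raise a dedicated error that records which batch failed. SketchBatchError and each_batch_safely are hypothetical names; nothing here reflects an existing wor-batchifier API.

```ruby
# Hypothetical error class carrying the index of the failing batch.
class SketchBatchError < StandardError
  attr_reader :batch_index

  def initialize(batch_index, message)
    @batch_index = batch_index
    super("batch #{batch_index} failed: #{message}")
  end
end

# Hypothetical helper: runs the block per chunk and wraps any failure.
def each_batch_safely(collection, batch_size:)
  collection.each_slice(batch_size).with_index.map do |chunk, index|
    begin
      yield(chunk)
    rescue StandardError => e
      # Raising inside a rescue keeps the original error available as #cause.
      raise SketchBatchError.new(index, e.message)
    end
  end
end

failure = nil
begin
  each_batch_safely([1, 2, 0, 4], batch_size: 2) { |chunk| chunk.map { |n| 10 / n } }
rescue SketchBatchError => e
  failure = e
end
# failure.batch_index == 1, and failure.cause is the original ZeroDivisionError
```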

Parallel Batch Execution

Add a feature to process batches in parallel in separate processes, letting the user of the gem choose how many processes the batches should be run in.

Example:

execute_in_batches(information, batch_size: 100, strategy: :add, processes: 4)

The default should be 1 and the gem should allow you to do:

execute_in_batches(information, batch_size: 100, strategy: :add)
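
To give a feel for the idea, the sketch below runs batches concurrently with a fixed number of workers. Note the substitution: it uses threads rather than the separate processes the issue asks for, to stay dependency-free and portable. parallel_batches and workers: are hypothetical names, not an existing or planned API of the gem.

```ruby
# Hypothetical helper: processes chunks with a pool of threads (not processes).
def parallel_batches(collection, batch_size:, workers: 1)
  queue = Queue.new
  collection.each_slice(batch_size).with_index { |chunk, index| queue << [index, chunk] }
  queue.close # once closed and drained, pop returns nil so workers can stop

  results = []
  mutex = Mutex.new

  threads = Array.new(workers) do
    Thread.new do
      while (job = queue.pop)
        index, chunk = job
        value = yield(chunk)
        # Store by index so the output order matches the input order.
        mutex.synchronize { results[index] = value }
      end
    end
  end

  threads.each(&:join)
  results
end

per_batch = parallel_batches((1..10).to_a, batch_size: 3, workers: 4) do |chunk|
  chunk.map { |n| n * 2 }
end
# per_batch == [[2, 4, 6], [8, 10, 12], [14, 16, 18], [20]]
```

With workers: 1 this degrades to sequential processing, which matches the default the issue proposes.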
