brc's Introduction

Brc

Update 1

The main change is switching from File.stream to Erlang's prim_file. For some reason, which I still want to investigate, the File.stream-based approach only used 8 of 16 processors. That's on my machine with 8 physical and 16 logical CPUs.

Erlang's prim_file is very handy here. You can read a block of bytes, then at that stopping point you can read_line to get to the end of the line that could have been chopped off from the block read. Thanks to icedragon200 for pointing me to prim_file.
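The block-then-read_line idea can be sketched roughly as follows. This is a minimal illustration, not the code from brc: the module name ChunkReader and the function chunks/2 are made up for the example, and prim_file is an undocumented internal OTP module, so the calls shown (open/2, read/2, read_line/1, close/1) are assumed to behave as they do in recent OTP releases.

```elixir
defmodule ChunkReader do
  # Hypothetical helper, not taken from the brc source.
  # Returns a list of binaries, each ending on a line boundary.
  def chunks(path, block_size) do
    {:ok, fd} = :prim_file.open(path, [:read, :binary])

    try do
      read_chunks(fd, block_size, [])
    after
      :prim_file.close(fd)
    end
  end

  defp read_chunks(fd, block_size, acc) do
    case :prim_file.read(fd, block_size) do
      :eof ->
        Enum.reverse(acc)

      {:ok, block} ->
        # The block most likely stops mid-line, so read up to the next
        # newline and append that to the block. If the block happened to
        # end exactly on a newline, this reads one extra whole line,
        # which still leaves the chunk on a line boundary.
        tail =
          case :prim_file.read_line(fd) do
            {:ok, rest} -> rest
            :eof -> ""
          end

        read_chunks(fd, block_size, [block <> tail | acc])
    end
  end
end
```

Calling ChunkReader.chunks("measurements.txt", 1_000_000) would then give a list of roughly 1 MB binaries that can each be split into complete lines and handed to a worker.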

Original

My attempt at the billion row challenge in Elixir. It targets Elixir 1.16.0, which matters because the argument order for File.stream changed between 1.15 and 1.16. It's vanilla Elixir, with no extra libraries needed to run. I am including eflambe to run and make flamegraphs to help tune performance.
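To illustrate the argument order change, here is a small sketch. The module StreamOrder and its blocks/2 function are hypothetical names for the example; the point is only that from Elixir 1.16 the line_or_bytes argument comes before the modes list.

```elixir
defmodule StreamOrder do
  # Hypothetical helper: stream a file in fixed-size byte blocks.
  #
  # Elixir 1.16 and later: File.stream!(path, line_or_bytes, modes)
  # Elixir 1.15 and earlier: File.stream!(path, modes, line_or_bytes)
  def blocks(path, bytes) do
    File.stream!(path, bytes, [])
  end
end
```

On 1.15 the same call would have to be written as File.stream!(path, [], bytes), which 1.16 still accepts but reports as deprecated.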

brc_city was my first attempt. It used a process for every city. Don't do this: most of your app's time will be spent with processes sleeping.

brc uses a pool of workers. Each worker receives a list of cities. Fewer workers with more work eliminates idle time.
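A rough sketch of the fewer-workers-with-more-work idea is below. It is not the pool from brc: it uses Task.async_stream with a capped concurrency as a stand-in for a hand-written pool, the module name PoolSketch is made up, and the {min, count, sum, max} tuple with temperatures stored as integer tenths of a degree is an assumption based on the output code quoted in the issue further down.

```elixir
defmodule PoolSketch do
  # Hypothetical sketch, not the brc implementation.

  # Reduce one batch of "City;12.3" lines into
  # %{city => {min, count, sum, max}}, temperatures in tenths of a degree.
  def process_lines(lines) do
    Enum.reduce(lines, %{}, fn line, acc ->
      [city, temp] = String.split(line, ";")
      t = temp |> String.replace(".", "") |> String.to_integer()

      Map.update(acc, city, {t, 1, t, t}, fn {mn, c, s, mx} ->
        {min(mn, t), c + 1, s + t, max(mx, t)}
      end)
    end)
  end

  # Combine the partial results of two workers.
  def merge(a, b) do
    Map.merge(a, b, fn _city, {mn1, c1, s1, mx1}, {mn2, c2, s2, mx2} ->
      {min(mn1, mn2), c1 + c2, s1 + s2, max(mx1, mx2)}
    end)
  end

  # Run at most `workers` batches at a time and fold the results together.
  def run(batches, workers \\ System.schedulers_online()) do
    batches
    |> Task.async_stream(&process_lines/1,
      max_concurrency: workers,
      ordered: false,
      timeout: :infinity
    )
    |> Enum.reduce(%{}, fn {:ok, map}, acc -> merge(acc, map) end)
  end
end
```

Each task gets a whole batch of lines rather than a single city, so the number of live processes stays close to the number of schedulers and little time is lost to idle processes.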

If using eflambe, run something like

iex -S mix

:eflambe.apply({Brc, :run_file_buf, ["measurements.txt"]}, [output_format: :brendan_gregg, open: :speedscope])

Installation

After cloning, run 'mix deps.get' to get eflambe. Then run 'mix escript.build', then './brc measurements.txt'.

brc's People

Contributors

rrcook

brc's Issues

String comparison

Not sure it has any performance impact, but just FYI: Elixir compares bitstrings byte by byte, and your city keys are guaranteed to be unique strings (enforced by using them as keys for a map). So you don't gain anything by comparing against the key versus comparing against the generated string in this function:

    keys_strings =
      Map.keys(combined_map)
      |> Enum.map(fn key ->
        {min_temp, count, sum, max_temp} = Map.get(combined_map, key)

        {key,
         "#{key}=#{min_temp / 10}/#{:erlang.float_to_binary(sum / (count * 10), decimals: 1)}/#{max_temp / 10}"}
      end)

    # sort the strings by city/key then discard the key, keep the output
    sorted_strings = Enum.sort_by(keys_strings, &elem(&1, 0)) |> Enum.map(&elem(&1, 1))

and you might get a tiny performance improvement by avoiding a second Enum.map call.
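One possible reading of this suggestion, sketched under assumptions: sort the map's entries by key first, then build each output line in a single Enum.map pass. The module name Report and the function format/1 are hypothetical; the tuple layout and the string format are copied from the snippet above.

```elixir
defmodule Report do
  # Hypothetical rewrite of the quoted snippet with one Enum.map pass.
  # Enumerating a map yields {key, value} tuples, so no Map.get is needed.
  def format(combined_map) do
    combined_map
    |> Enum.sort_by(fn {key, _stats} -> key end)
    |> Enum.map(fn {key, {min_temp, count, sum, max_temp}} ->
      "#{key}=#{min_temp / 10}/#{:erlang.float_to_binary(sum / (count * 10), decimals: 1)}/#{max_temp / 10}"
    end)
  end
end
```

This keeps the sort on the city key, so the output order is the same as in the original snippet.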
