LineServer

Requirements

Docker

Questions

How does your system work? (if not addressed in comments in source)

At application start (lib/line_server/application.ex), the file to be served is read to create the line index, which is saved in ETS (http://erlang.org/doc/man/ets.html). This is done by the IndexCreator and IndexRepository modules. After this, lines can be served through the REST API.

For each line, the index stores the number of bytes in the file up to the end of that line. To get the start and end of a line (as byte offsets), retrieve the index entries for that line and the previous one: the previous entry is where the line starts, and the line's own entry is where it ends.
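The index described above can be sketched as a list of cumulative end offsets. This is a minimal illustration in Python, not the project's Elixir code; the names `build_index` and `line_span` are hypothetical, and 0-based line numbers are an assumption.

```python
def build_index(path):
    """Return a list where entry i is the byte offset just past line i (0-based)."""
    ends = []
    offset = 0
    with open(path, "rb") as f:
        # Iterating a binary file yields chunks ending in b"\n"
        # (the last chunk may lack a trailing newline).
        for raw in f:
            offset += len(raw)
            ends.append(offset)
    return ends


def line_span(ends, line_number):
    """Return (start, length) in bytes for a 0-based line number."""
    # A line starts where the previous one ends; the first line starts at 0.
    start = ends[line_number - 1] if line_number > 0 else 0
    return start, ends[line_number] - start
```

Storing only end offsets keeps one integer per line, at the cost of two lookups per request.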

Lines are served by the GET /lines/:line_number endpoint. If the submitted :line_number is out of bounds for the file, a 415 error code is returned. When a request comes in, a lookup is performed in the index to retrieve the information needed to read that line from the file. The file is then read at the byte offset where the line starts, for the number of bytes in that line. The request returns a 200 code and the line.
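The request flow above (bounds check, index lookup, positioned read) might look roughly like the following Python sketch. It is not the project's Elixir handler: `fetch_line` is a hypothetical name, `ends` is assumed to hold the byte offset just past each line, line numbers are assumed to be 0-based, and stripping the trailing newline is a choice made for the example.

```python
def fetch_line(path, ends, line_number):
    """Return (status, body) for a 0-based line number.

    ends[i] is assumed to be the byte offset just past line i.
    """
    if line_number < 0 or line_number >= len(ends):
        # Out-of-bounds status code as stated in this README.
        return 415, b""
    start = ends[line_number - 1] if line_number > 0 else 0
    length = ends[line_number] - start
    with open(path, "rb") as f:
        f.seek(start)            # jump straight to the start of the line
        body = f.read(length)    # read only the bytes of that line
    return 200, body.rstrip(b"\r\n")
```

Only the requested bytes are read, so the cost of a request does not depend on where in the file the line sits.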

What documentation, websites, papers, etc did you consult in doing this assignment?

https://hexdocs.pm/elixir/File.html
http://erlang.org/doc/man/file.html#type-mode
http://erlang.org/doc/man/file.html#pread-2
http://erlang.org/doc/man/ets.html
https://gist.github.com/brienw/85db445a0c3976d323b859b1cdccef9a
http://engineering.avvo.com/articles/using-env-with-elixir-and-docker.html
https://stackoverflow.com/questions/965053/extract-filename-and-extension-in-bash
https://www.digitalocean.com/community/tutorials/how-to-share-data-between-the-docker-container-and-the-host

What third-party libraries or other tools does the system use? How did you choose each library or framework you used?

How long did you spend on this exercise? If you had unlimited more time to spend on this, how would you spend it and how would you prioritize each item?

Around 7 hours I believe, though I'm not sure. I would run performance tests covering different file sizes and numbers of users to observe how the system behaves, and use those results to decide whether the solution needs adjustments. I would also consider adding a cache to avoid reading from disk on every request, although the effectiveness of a cache is highly dependent on the traffic patterns and the format of the file being served (e.g. number of lines and size of each line).
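One possible shape for such a cache is a small least-recently-used (LRU) map in front of the disk read. The sketch below is in Python and is only an illustration: the project has no cache, and `LineCache`, its `loader` argument, and the default capacity are all hypothetical.

```python
from collections import OrderedDict


class LineCache:
    """LRU cache keyed by line number; `loader` performs the real disk read."""

    def __init__(self, loader, capacity=1024):
        self._loader = loader
        self._capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, line_number):
        if line_number in self._entries:
            # Mark the entry as most recently used.
            self._entries.move_to_end(line_number)
            self.hits += 1
            return self._entries[line_number]
        self.misses += 1
        value = self._loader(line_number)
        self._entries[line_number] = value
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return value
```

The hit and miss counters make it easy to check, under a given traffic pattern, whether the cache is earning its memory. Note that the capacity counts entries, not bytes, so very long lines would need a size-aware limit.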

How will your system perform with a 1 GB file? a 10 GB file? a 100 GB file?

For 1 GB and 10 GB I believe it would perform well, but 100 GB raises some questions. The one case I can think of at the moment is the index becoming too big to fit into memory. Let's say we're working with a server with 8 GB of memory: a 100+ GB file with only a few characters on each line could surpass that in index size. Also, constantly reading lines large enough to fill up memory would cause slowness. Apart from that, not having a caching solution and always reading from disk does affect performance.
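The memory concern can be made concrete with a rough estimate. The Python sketch below assumes 8 bytes per index entry, which is an optimistic lower bound chosen for illustration; entries stored in ETS carry additional per-object overhead, so real usage would likely be higher. The function name and the average line lengths are hypothetical.

```python
GB = 10**9


def index_size_bytes(file_size_bytes, avg_line_bytes, bytes_per_entry=8):
    """Estimate index size: one fixed-size entry per line in the file."""
    line_count = file_size_bytes // avg_line_bytes
    return line_count * bytes_per_entry


# A 100 GB file with lines averaging 100 bytes (newline included).
long_lines = index_size_bytes(100 * GB, 100)

# The same file size with lines averaging only 10 bytes.
short_lines = index_size_bytes(100 * GB, 10)
```

Under these assumptions the index takes about 8 GB with 100-byte lines and about 80 GB with 10-byte lines, so on an 8 GB server the short-line case clearly does not fit in memory.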

How will your system perform with 100 users? 10000 users? 1000000 users?

If I'm not mistaken, the system would scale well horizontally. To cope with an increase in req/s we could add more servers, each with its own copy of the file to serve.

If you were to critique your code, what would you have to say about it?

I think the IndexCreator needs some improvement: it could be easier to read and have a simpler implementation. I'm not very used to using {:ok} and {:err} for contracts. It is a commonly used pattern in the Elixir community and something I wanted to try out. I think I would need to use them in a larger project to understand the best cases for them.

line_server's People

Contributors

axfcampos
