s2n-netbench

An efficiency, performance, and correctness analysis tool for transport protocols.

Why does this exist?

There are many transport protocols and several implementations of each. This tool exists to provide users with the ability to perform a direct comparison and decide the best implementation for their workloads.

Here are a few examples of questions that s2n-netbench aims to answer:

  • What is the cost of encrypting traffic?
    • How much more capacity will I need when deploying this?
  • Which transport protocol performs best
    • in a data center?
    • in networks with high packet loss?
    • in networks with high latency?
    • with many concurrent, multiplexed streams?
  • Which implementation of "X" protocol is best for my workload?
  • What is the optimal configuration of the transport's settings for my workload?
  • How does certificate chain length affect handshake throughput?
  • Is implementation "X" interoperable with implementation "Y" of "Z" protocol?

Quickstart

A basic use of s2n-netbench is demonstrated in the netbench-run.sh script. This script will:

  • compile all necessary s2n-netbench utilities
  • generate scenario files
  • execute the request-response.json scenario using s2n-quic and s2n-tls drivers
  • execute the connect.json scenario using s2n-quic and s2n-tls drivers
  • collect statistics from the drivers using netbench-collector
  • generate a report in the ./target/netbench/report directory

From the main netbench folder, run the following commands:

./scripts/netbench-run
cd target/netbench/report
python3 -m http.server 9000

Then navigate to localhost:9000 in a browser to view the netbench results.

Note that this script does not support bpftrace, as it runs without the elevated permissions required for BPF programs.

How it works

netbench-scenarios

netbench provides tools to write scenarios that describe application workloads. An example of a scenario is a simple request/response pattern between a client and server:

use netbench_scenario::prelude::*;

config!({
    /// The size of the client's request to the server
    let request_size: Byte = 1.kilobytes();

    /// The size of the server's response to the client
    let response_size: Byte = 10.megabytes();
});

pub fn scenario(config: Config) -> Scenario {
    let Config {
        request_size,
        response_size,
    } = config;

    Scenario::build(|scenario| {
        let server = scenario.create_server();

        scenario.create_client(|client| {
            client.connect_to(server, |conn| {
                conn.open_bidirectional_stream(
                    |local| {
                        local.send(request_size);
                        local.receive(response_size);
                    },
                    |remote| {
                        remote.receive(request_size);
                        remote.send(response_size);
                    },
                );
            });
        });
    })
}
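
The same builder API can express heavier workloads. Below is a minimal sketch of a many-stream variant that reuses only the calls shown above (create_server, create_client, connect_to, open_bidirectional_stream, send, receive); opening streams in a plain loop is an assumption about the builder, not a documented pattern:

use netbench_scenario::prelude::*;

config!({
    /// The size of the client's request to the server
    let request_size: Byte = 1.kilobytes();

    /// The size of the server's response to the client
    let response_size: Byte = 10.megabytes();
});

pub fn scenario(config: Config) -> Scenario {
    let Config {
        request_size,
        response_size,
    } = config;

    Scenario::build(|scenario| {
        let server = scenario.create_server();

        scenario.create_client(|client| {
            client.connect_to(server, |conn| {
                // open ten multiplexed streams on a single connection to
                // model a concurrent request/response workload
                for _ in 0..10 {
                    conn.open_bidirectional_stream(
                        |local| {
                            local.send(request_size);
                            local.receive(response_size);
                        },
                        |remote| {
                            remote.receive(request_size);
                            remote.send(response_size);
                        },
                    );
                }
            });
        });
    })
}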

Building a scenario generates a JSON file of instructions. These instructions are protocol- and language-independent, which means they can be executed by a "netbench driver" written in any language or runtime.

netbench-driver

Netbench drivers are responsible for executing netbench scenarios. Each transport protocol has a client and a server implementation, and each implementation is a self-contained binary that consumes a scenario.json file. Implemented drivers include:

  • TCP
  • native-tls
    • OpenSSL on Linux
    • Secure Transport on macOS
    • SChannel on Windows
  • s2n-quic
  • s2n-tls

netbench-collector

Driver metrics are collected with the netbench-collector utility. Two implementations are available: a generic utility and a bpftrace utility. The generic utility uses procfs to gather information about the driver process, while the bpftrace implementation can collect a wider variety of statistics through eBPF probes.

The collector binary takes a netbench-driver as an argument and spawns the driver binary as a child process. The collector continuously gathers metrics from the driver and emits those metrics to stdout.
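
For example, a collection run might look like the sketch below. The binary names here are hypothetical, and how the scenario file is supplied to the driver is omitted; the point is only that the collector wraps a driver and writes metrics to stdout, which can be redirected into the directory layout that report-tree (described below) expects.

# hypothetical binary names; the collector takes the driver as an argument
# and emits the gathered metrics on stdout
./netbench-collector ./netbench-driver-s2n-quic-server > request-response/quic/server.json
./netbench-collector ./netbench-driver-s2n-quic-client > request-response/quic/client.json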

netbench-cli

netbench-cli is used to visualize the results of the scenarios. The reports use Vega, which is "a declarative format for creating, saving, and sharing visualization designs".

report is used to generate individual .json reports. These can be visualized by pasting them into the Vega editor.

report-tree is used to generate a human-readable .html report. Given a directory structure like the following:

request-response/ # scenario
├─ tls/ # driver
│  ├─ client.json
│  ├─ server.json
├─ quic/
   ├─ client.json
   ├─ server.json

report-tree will generate the individual reports and package them into a human-readable index.html file that can be used to view graphs of the results.
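
As a hedged sketch (the subcommand names come from the descriptions above, but the argument order is an assumption; consult the CLI's help output for the real interface):

# generate a single Vega report from one driver's metrics
./netbench-cli report request-response/quic/client.json > client-report.json

# generate a browsable index.html for the whole results tree
./netbench-cli report-tree ./request-response ./report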

A sample report can be found here.

Note that you will not be able to open the report directly from the filesystem, since it loads resources from the jsDelivr CDN. When the page is opened with a file:// URL, those requests fail with a "CORS request not HTTP" error.

To get around this, use a local server.

# assuming the report is in ./report
cd report
# start a local server on port 9000
python3 -m http.server 9000

In a browser, navigate to localhost:9000 to view the netbench report.

s2n-netbench's Issues

ci: build individual crates

Problem:

Building the entire s2n-netbench workspace may result in different dependencies and features being enabled than building each crate individually. This can result in problems when publishing to crates.io, such as #5.

Solution:

Add CI tasks to build each crate individually, similar to what was done in aws/s2n-quic#1673.

Requirements / Acceptance Criteria:

If a dependency is used that causes an individual crate to be unbuildable, the CI task will fail.

[orchestrator]: investigate `native_tls_driver` stuck behavior

Problem:

When run as part of the orchestrator, the native_tls client driver process gets stuck and never terminates. This results in the orchestrator hanging and never finishing. Only the native_tls driver exhibited this behavior.

Observed behavior

  • the behavior would occur when running the incast scenario with 5+ servers
  • the issue became more reproducible when using multiple servers

[orchestrator]: Investigate zombie process during client worker workflow

Something is causing the collector to become a zombie process during the Russula client worker workflow.

We can detect the zombie process and continue with Russula shutdown, which causes the process to be killed. This indicates that Russula is possibly preventing a clean close of the collector.

root       54245  Sl ./target/debug/russula_cli --protocol NetbenchClientWorker --port 9000 --peer-list 54.198.168.151:4433
root       54688  Z  [netbench-collec] <defunct>

Tracking issue: netbench-orchestrator

Goals

The goal of the Orchestrator is to automate Netbench runs in the cloud.

Why?
Often, developers of transport protocols are interested in gathering performance data for the protocol they are developing. Netbench is a tool that can be used to measure this performance data. However, it is often necessary to run Netbench scenarios in the cloud so that the results better match production systems.

Project plan

Branch with working POC: https://github.com/aws/s2n-netbench/tree/ak-fullOrchImport

  • Russula: component to coordinate remote workers #20
  • AWS utility wrappers
  • Orchestrator
    • base components (error, state) #22
    • cli #26
    • report, dashboard (index.html) #27
    • orchestrator run req_res.json #28

[orchestrator]: Separate Russula constructors for Coordinator and Worker

Problem:

Currently, the Coordinator and Worker both take a list of peers to connect to. While this works, the Worker should only ever connect to a single Coordinator and doesn't need to take a list.

Solution:

Create different builders/endpoints for the Coordinator and Worker to capture their different usage patterns.
