Code Monkey home page Code Monkey logo

libprotobuf-mutator's Introduction

libprotobuf-mutator

Build Status Fuzzing Status

Overview

libprotobuf-mutator is a library to randomly mutate protobuffers.
It could be used together with guided fuzzing engines, such as libFuzzer.

Quick start on Debian/Ubuntu

Install prerequisites:

sudo apt-get update
sudo apt-get install protobuf-compiler libprotobuf-dev binutils cmake \
  ninja-build liblzma-dev libz-dev pkg-config autoconf libtool

Compile and test everything:

mkdir build
cd build
cmake .. -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug
ninja check

Clang is only needed for libFuzzer integration.
By default, the system-installed version of protobuf is used. However, on some systems, the system version is too old. You can pass LIB_PROTO_MUTATOR_DOWNLOAD_PROTOBUF=ON to cmake to automatically download and build a working version of protobuf.

Installation:

ninja
sudo ninja install

This installs the headers, pkg-config, and static library. By default the headers are put in /usr/local/include/libprotobuf-mutator.

Usage

To use libprotobuf-mutator simply include mutator.h and mutator.cc into your build files.

The ProtobufMutator class implements mutations of the protobuf tree structure and mutations of individual fields. The field mutation logic is very basic -- for better results you should override the ProtobufMutator::Mutate* methods with more sophisticated logic, e.g. using libFuzzer's mutators.

To apply one mutation to a protobuf object do the following:

class MyProtobufMutator : public protobuf_mutator::Mutator {
 public:
  // Optionally redefine the Mutate* methods to perform more sophisticated mutations.
}
void Mutate(MyMessage* message) {
  MyProtobufMutator mutator;
  mutator.Seed(my_random_seed);
  mutator.Mutate(message, 200);
}

See also the ProtobufMutatorMessagesTest.UsageExample test from mutator_test.cc.

Integrating with libFuzzer

LibFuzzerProtobufMutator can help to integrate with libFuzzer. For example

#include "src/libfuzzer/libfuzzer_macro.h"

DEFINE_PROTO_FUZZER(const MyMessageType& input) {
  // Code which needs to be fuzzed.
  ConsumeMyMessageType(input);
}

Please see libfuzzer_example.cc as an example.

Mutation post-processing (experimental)

Sometimes it's necessary to keep particular values in some fields without which the proto is going to be rejected by fuzzed code. E.g. code may expect consistency between some fields or it may use some fields as checksums. Such constraints are going to be significant bottleneck for fuzzer even if it's capable of inserting acceptable values with time.

PostProcessorRegistration can be used to avoid such issue and guide your fuzzer towards interesting code. It registers callback which will be called for each message of particular type after each mutation.

static protobuf_mutator::libfuzzer::PostProcessorRegistration<MyMessageType> reg = {
    [](MyMessageType* message, unsigned int seed) {
      TweakMyMessage(message, seed);
    }};

DEFINE_PROTO_FUZZER(const MyMessageType& input) {
  // Code which needs to be fuzzed.
  ConsumeMyMessageType(input);
}

Optional: Use seed if callback uses random numbers. It may help later with debugging.

Important: Callbacks should be deterministic and avoid modifying good messages. Callbacks are called for both: mutator generated and user provided inputs, like corpus or bug reproducer. So if callback performs unnecessary transformation it may corrupt the reproducer so it stops triggering the bug.

Note: You can add callback for any nested message and you can add multiple callbacks for the same message type.

static PostProcessorRegistration<MyMessageType> reg1 = {
    [](MyMessageType* message, unsigned int seed) {
      TweakMyMessage(message, seed);
    }};
static PostProcessorRegistration<MyMessageType> reg2 = {
    [](MyMessageType* message, unsigned int seed) {
      DifferentTweakMyMessage(message, seed);
    }};
static PostProcessorRegistration<MyMessageType::Nested> reg_nested = {
    [](MyMessageType::Nested* message, unsigned int seed) {
      TweakMyNestedMessage(message, seed);
    }};

DEFINE_PROTO_FUZZER(const MyMessageType& input) {
  // Code which needs to be fuzzed.
  ConsumeMyMessageType(input);
}

UTF-8 strings

"proto2" and "proto3" handle invalid UTF-8 strings differently. In both cases string should be UTF-8, however only "proto3" enforces that. So if fuzzer is applied to "proto2" type libprotobuf-mutator will generate any strings including invalid UTF-8. If it's a "proto3" message type, only valid UTF-8 will be used.

Extensions

Currently the library does not mutate extensions. This can be a problem if extension contains required fields so the library will not be able to change the message into valid initialized state. You can use post processing hooks to cleanup/initialize the message as workaround.

Users of the library

Grammars

Bugs found with help of the library

Chromium

Envoy

Related materials

libprotobuf-mutator's People

Contributors

allen-webb avatar ammaraskar avatar bshastry avatar dende avatar emmettneyman avatar georgthegreat avatar hctim avatar iii-i avatar jonathanmetzman avatar jorgbrown avatar kcc avatar l4l avatar lebdron avatar ligurio avatar morehouse avatar nareddyt avatar pefoley2 avatar spq avatar toshipiazza avatar tpopela avatar vadorovsky avatar vitalybuka avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.