supercop-blockciphers

This repository contains fast block cipher implementations for x86-64, running in counter mode, for the SUPERCOP cryptographic benchmarking framework. They are the artifacts of the implementation and integration part of my Master's Thesis, "Block Ciphers: Fast Implementations on x86-64 Architecture". The full text is available at: http://urn.fi/URN:NBN:fi:oulu-201305311409
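
For readers unfamiliar with counter mode, the following is a minimal Python sketch of how it turns a block cipher into a stream cipher. It is illustrative only and is not taken from this repository: `toy_block_encrypt` is a hypothetical stand-in for a real cipher such as Camellia or Twofish, and the counter block layout (nonce followed by a big-endian counter) is an assumption, not necessarily the layout the implementations here use.

```python
import hashlib

BLOCK_SIZE = 16  # bytes; a 128-bit block, as in AES, Camellia, Serpent and Twofish


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for a real block cipher. NOT secure.

    Counter mode only ever uses the encryption direction, so a keyed
    deterministic function is enough for this illustration.
    """
    return hashlib.sha256(key + block).digest()[:BLOCK_SIZE]


def ctr_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Generate `length` keystream bytes by encrypting successive counter blocks."""
    if len(nonce) >= BLOCK_SIZE:
        raise ValueError("nonce must leave room for the counter")
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Assumed layout: nonce || big-endian counter, padded to one block.
        counter_block = nonce + counter.to_bytes(BLOCK_SIZE - len(nonce), "big")
        out += toy_block_encrypt(key, counter_block)
        counter += 1
    return bytes(out[:length])


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR the data with the keystream (same operation both ways)."""
    keystream = ctr_keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))


if __name__ == "__main__":
    key, nonce = b"k" * 16, b"n" * 8
    message = b"counter mode encrypts each block independently"
    ciphertext = ctr_xor(key, nonce, message)
    assert ctr_xor(key, nonce, ciphertext) == message
```

Each counter block is encrypted independently of the others, which is what allows several blocks to be processed in parallel by a multi-way implementation.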

SUPERCOP: http://bench.cr.yp.to/supercop.html

Installation: Copy the contents of crypto_stream/ in this repository to crypto_stream/ in the SUPERCOP package.
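
The copy step can also be scripted. The sketch below is one possible way to do it in Python (3.8 or newer, for `dirs_exist_ok`); the function name `install_implementations` and both example paths are hypothetical and need to be adjusted to where the repository and the SUPERCOP package live on your system.

```python
import shutil
from pathlib import Path


def install_implementations(repo_root: str, supercop_root: str) -> Path:
    """Copy crypto_stream/ from this repository into SUPERCOP's crypto_stream/.

    Existing SUPERCOP entries are kept; files with the same path are overwritten.
    """
    src = Path(repo_root) / "crypto_stream"
    dst = Path(supercop_root) / "crypto_stream"
    if not src.is_dir():
        raise FileNotFoundError(f"no crypto_stream/ directory under {repo_root}")
    # Merge into the destination instead of failing when it already exists.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


if __name__ == "__main__":
    # Hypothetical locations, for illustration only.
    install_implementations("supercop-blockciphers", "supercop")
```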

Licensing note: Some implementations contain GPLv2-licensed code, while others use a mix of permissive licenses (ISC, new BSD, MIT, public domain).

-Jussi Kivilinna

beyond_master branch

This branch contains new implementations that were not included in the Master's Thesis.

New implementations so far:

  • Camellia AES-NI/AVX2
  • Serpent AVX2 (by Johannes Götzfried)
  • Twofish AVX2 (using vpgatherdd)
  • Twofish AVX2 (without vpgatherdd, based on AVX impl.)
  • Blowfish AVX2 (using vpgatherdd)

Results on Intel Core i5-4570 (Haswell, cpuid: 306C3h):

  • Blowfish
    • Improved 16-way word-sliced with table look-ups (AVX): 8.11 cycles/byte
    • 4-way table look-up: 8.55 cycles/byte
    • Götzfried's 16-way word-sliced with table look-ups (AVX): 10.35 cycles/byte
    • 32-way word-sliced (AVX2, vpgatherdd): 12.95 cycles/byte
    • 1-way table look-up: 24.26 cycles/byte
    • OpenSSL: 26.59 cycles/byte
    • Crypto++: 28.07 cycles/byte
  • AES
    • Crypto++ (AES-NI): 0.82 cycles/byte
    • 8-way AVX bit-sliced: 6.16 cycles/byte
    • 8-way SSSE3 bit-sliced (Käsper & Schwabe): 6.36 cycles/byte
    • 2-way table look-up: 7.85 cycles/byte
    • 1-way table look-up: 10.87 cycles/byte
  • Camellia
    • 32-way byte-sliced (AVX2 & AES-NI): 3.72 cycles/byte
    • 16-way byte-sliced (AVX & AES-NI): 5.93 cycles/byte
    • 2-way table look-up: 10.37 cycles/byte
    • 1-way table look-up: 16.72 cycles/byte
    • OpenSSL: 18.91 cycles/byte
    • Crypto++: 22.12 cycles/byte
  • Serpent
    • Götzfried's 16-way word-sliced (AVX2): 5.18 cycles/byte
    • Götzfried's 8-way word-sliced (AVX): 10.29 cycles/byte
    • 8-way word-sliced (SSE2): 10.47 cycles/byte
    • C impl. from Linux kernel: 34.18 cycles/byte
  • Twofish
    • 16-way word-sliced with table look-ups (AVX2, without vpgatherdd): 8.37 cycles/byte
    • Improved 8-way word-sliced with table look-ups (AVX): 8.81 cycles/byte
    • Götzfried's 16-way word-sliced with table look-ups (AVX): 10.33 cycles/byte
    • 3-way table look-up: 11.24 cycles/byte
    • 2-way table look-up: 12.10 cycles/byte
    • 16-way word-sliced (AVX2, vpgatherdd): 12.73 cycles/byte
    • Assembly impl. from Linux kernel: 16.85 cycles/byte
    • Crypto++: 18.10 cycles/byte
    • 1-way table look-up: 18.71 cycles/byte

Results on Intel Core i3-6100 (Skylake, measured with the 'bench-slope' tool of libgcrypt):

  • Blowfish
    • 32-way word-sliced (AVX2, vpgatherdd): 5.41 cycles/byte
    • 4-way table look-up (libgcrypt impl.): 7.91 cycles/byte
  • AES
    • libgcrypt (AES-NI): 0.63 cycles/byte
  • Camellia
    • 32-way byte-sliced (AVX2 & AES-NI, libgcrypt impl.): 3.12 cycles/byte
  • Serpent
    • 16-way word-sliced (AVX2, libgcrypt impl.): 4.77 cycles/byte
  • Twofish
    • 16-way word-sliced (AVX2, vpgatherdd): 6.40 cycles/byte
    • 3-way table look-up (libgcrypt impl.): 10.1 cycles/byte
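
The cycles/byte figures above can be turned into approximate throughput or relative speedup with a little arithmetic. The helper below is a rough sketch, not part of the benchmark tooling: the function names are hypothetical, and the 3.2 GHz clock used in the example is an assumed nominal frequency for the i5-4570; the frequency in effect during the measurements is not stated here.

```python
def throughput_mb_per_s(cycles_per_byte: float, clock_hz: float) -> float:
    """Approximate throughput in MB/s (10^6 bytes/s) for a given clock frequency."""
    return clock_hz / cycles_per_byte / 1e6


def speedup(baseline_cpb: float, improved_cpb: float) -> float:
    """How many times faster the improved implementation is than the baseline."""
    return baseline_cpb / improved_cpb


if __name__ == "__main__":
    ASSUMED_CLOCK_HZ = 3.2e9  # assumption: nominal clock, not a measured value

    # Camellia on the i5-4570: 32-way AVX2 & AES-NI versus OpenSSL.
    camellia_avx2 = 3.72
    camellia_openssl = 18.91

    print(f"{throughput_mb_per_s(camellia_avx2, ASSUMED_CLOCK_HZ):.0f} MB/s")
    print(f"{speedup(camellia_openssl, camellia_avx2):.2f}x faster than OpenSSL")
```

Lower cycles/byte means higher throughput, so the speedup is the baseline figure divided by the improved one.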
