stim's Introduction

Stim

What is Stim?

Stim is a tool for high-performance simulation and analysis of quantum stabilizer circuits, especially quantum error correction (QEC) circuits. Typically Stim is used as a Python package (pip install stim), though it can also be used as a command line tool or a C++ library.
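
For example, a minimal sampling session (a hedged sketch assuming a reasonably recent stim version installed via pip) looks like this:

    import stim

    # Build a small Bell-pair circuit and bulk-sample its measurements.
    circuit = stim.Circuit("""
        H 0
        CNOT 0 1
        M 0 1
    """)
    sampler = circuit.compile_sampler()
    print(sampler.sample(shots=5))  # 5 rows of two correlated measurement bits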

Stim's key features:

  1. Really fast simulation of stabilizer circuits. Have a circuit with thousands of qubits and millions of operations? stim.Circuit.compile_sampler() will perform a few seconds of analysis and then produce an object that can sample shots at kilohertz rates.

  2. Semi-automatic decoder configuration. stim.Circuit.detector_error_model() converts a noisy circuit into a detector error model (a Tanner graph) which can be used to configure decoders. Adding the option decompose_errors=True will additionally suggest how hyper errors can be decomposed into graphlike errors, making it easier to configure matching-based decoders (see the example after this list).

  3. Useful building blocks for working with stabilizers, such as stim.PauliString, stim.Tableau, and stim.TableauSimulator.
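
As a rough illustration of items 2 and 3, a hedged sketch using one of the example circuits bundled with stim (the parameter values are arbitrary):

    import stim

    # Item 2: turn a noisy circuit into a detector error model for decoder setup.
    circuit = stim.Circuit.generated(
        "surface_code:rotated_memory_z",
        distance=3,
        rounds=3,
        after_clifford_depolarization=0.001)
    dem = circuit.detector_error_model(decompose_errors=True)
    print(dem.num_detectors, dem.num_observables)

    # Item 3: stabilizer building blocks.
    p = stim.PauliString("XZ")
    t = stim.Tableau.from_named_gate("CNOT")
    print(t(p))  # the Pauli string conjugated by the CNOT tableau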

Stim's main limitations are:

  1. There is no support for non-Clifford operations, such as T gates and Toffoli gates. Only stabilizer operations are supported.
  2. stim.Circuit only supports Pauli noise channels (eg. no amplitude decay). For more complex noise you must manually drive a stim.TableauSimulator.
  3. stim.Circuit only supports single-control Pauli feedback. For multi-control feedback, or non-Pauli feedback, you must manually drive a stim.TableauSimulator (see the sketch after this list).
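
For example, a minimal sketch of multi-control feedback driven by hand through a stim.TableauSimulator (the circuit itself is arbitrary):

    import stim

    sim = stim.TableauSimulator()
    sim.h(0)
    sim.cnot(0, 1)
    m0 = sim.measure(0)
    m1 = sim.measure(1)
    # Multi-control feedback: apply X to qubit 2 only if both results were True.
    # stim.Circuit cannot express this condition, but ordinary Python can.
    if m0 and m1:
        sim.x(2)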

Stim's design philosophy:

  • Performance is king. The goal is not to be fast enough, it is to be fast in an absolute sense. Think of it this way. The difference between doing one thing per second (human speeds) and doing ten billion things per second (computer speeds) is 100 decibels (100 factors of 1.26). Because software slowdowns tend to compound, the choices we make act multiplicatively; each one can be thought of as spending or saving decibels. For example, under default usage, Python is 100 times slower than C++. That's 20 dB of the 100 dB budget! A fifth of the multiplicative performance budget allocated to language choice! Too expensive! Although Stim will never achieve the glory of 30 GiB per second of FizzBuzz, it at least wishes it could.
  • Bottom up. Stim is intended to be like an assembly language: a mostly straightforward layer upon which more complex layers can be built. The user may define QEC constructions at some high level, perhaps as a set of stabilizers or as a parity check matrix, but these concepts are explained to Stim at a low level (e.g. as circuits). Stim is not necessarily the abstraction the user wants, but it aims to make the low-level pieces simple enough and fast enough that the high-level pieces the user does want can be built on top.
  • Backwards compatibility. Stim's python package uses semantic versioning. Within a major version (1.X), stim guarantees backwards compatibility of its python API and of its command line API. Note stim DOESN'T guarantee backwards compatibility of the underlying C++ API.

How do I use Stim?

See the Getting Started Notebook.

Stuck? Get help on the Quantum Computing Stack Exchange, using the stim tag.

See the reference documentation.

How does Stim work?

See the paper describing Stim. Stim makes three core improvements over previous stabilizer simulators:

  1. Vectorized code. Stim's hot loops are heavily vectorized, using 256 bit wide AVX instructions. This makes them very fast. For example, Stim can multiply Pauli strings with 100 billion terms in one second (a small-scale example appears after this list).
  2. Reference Frame Sampling. When bulk sampling, Stim only uses a general stabilizer simulator for an initial reference sample. After that, it cheaply derives as many samples as needed by propagating simulated errors diffed against the reference. This simple trick is ridiculously cheaper than the alternative: constant cost per gate, instead of linear cost or even quadratic cost.
  3. Inverted Stabilizer Tableau. When doing general stabilizer simulation, Stim tracks the inverse of the stabilizer tableau that was historically used. This has the unexpected benefit of making measurements that commute with the current stabilizers take linear time instead of quadratic time. This is beneficial in error correcting codes, because the measurements they perform are usually redundant and so commute with the current stabilizers.
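
For a sense of scale, Pauli string multiplication can be exercised directly from Python; this sketch uses a smaller size than the quoted figure, since 100 billion terms would need tens of gigabytes of memory:

    import stim

    # 100 million terms per string; scale up as memory allows.
    a = stim.PauliString.random(100_000_000)
    b = stim.PauliString.random(100_000_000)
    c = a * b  # term-by-term multiplication with sign tracking, vectorized internally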

How do I cite Stim?

When using Stim for research, please cite:

@article{gidney2021stim,
  doi = {10.22331/q-2021-07-06-497},
  url = {https://doi.org/10.22331/q-2021-07-06-497},
  title = {Stim: a fast stabilizer circuit simulator},
  author = {Gidney, Craig},
  journal = {{Quantum}},
  issn = {2521-327X},
  publisher = {{Verein zur F{\"{o}}rderung des Open Access Publizierens in den Quantenwissenschaften}},
  volume = {5},
  pages = {497},
  month = jul,
  year = {2021}
}

stim's People

Contributors

alexbourassa, chrispattison, danielbarter, dougthor42, dstrain115, ecpeterson, fdmalone, folded, justinledford, maffoo, markturner289, mghibaudi, mmcewen-g, newmanmg, nickdgardner, noajshu, oon3m0oo, scottpjones, speller26, strilanc, viathor

stim's Issues

Feature request: append instructions to circuit

I don't see any way to append a CircuitInstruction directly to a Circuit (working in the Python interface).

Use case: I want to separate out the code for the noiseless QEC circuit from the noise. So I write the noiseless version. Then I want to be able to iterate over the circuit and create a new circuit by appending the previous instructions plus new ones for noise (and dealing appropriately with repeat blocks).
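
In the meantime, a hedged workaround sketch in this direction rebuilds the circuit instruction by instruction; the helper name and the choice of which gates get noise are illustrative, and the append calls assume a reasonably recent stim version:

    import stim

    def with_noise(circuit: stim.Circuit, p: float) -> stim.Circuit:
        noisy = stim.Circuit()
        for inst in circuit:
            if isinstance(inst, stim.CircuitRepeatBlock):
                # Recurse into repeat blocks; multiplying a circuit by an integer
                # wraps the result back into a REPEAT block.
                noisy += with_noise(inst.body_copy(), p) * inst.repeat_count
            else:
                noisy.append(inst.name, inst.targets_copy(), inst.gate_args_copy())
                if inst.name in ("H", "CX", "CNOT"):
                    qubits = [t.value for t in inst.targets_copy()]
                    noisy.append("DEPOLARIZE1", qubits, p)
        return noisy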

Compilation Error

When trying to compile on mac, I get the following error:

[ 24%] Building CXX object CMakeFiles/stim_benchmark.dir/src/simulators/measure_record_batch_writer.cc.o
In file included from /Users/mgnewman/miscellaneous/Stim/src/simulators/measure_record_batch_writer.cc:17:
In file included from /Users/mgnewman/miscellaneous/Stim/src/simulators/measure_record_batch_writer.h:20:
In file included from /Users/mgnewman/miscellaneous/Stim/src/simulators/../simd/simd_bit_table.h:20:
In file included from /Users/mgnewman/miscellaneous/Stim/src/simulators/../simd/simd_bits.h:21:
In file included from /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/random:1641:
In file included from /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/algorithm:643:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:2338:5: error:
delete called on 'MeasureRecordWriter' that is abstract but has
non-virtual destructor [-Werror,-Wdelete-abstract-non-virtual-dtor]
delete __ptr;
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:2651:7: note:
in instantiation of member function
'std::__1::default_delete::operator()' requested here
_ptr.second()(__tmp);
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:2605:19: note:
in instantiation of member function
'std::__1::unique_ptr<MeasureRecordWriter,
std::__1::default_delete >::reset' requested here
~unique_ptr() { reset(); }
^
/Users/mgnewman/miscellaneous/Stim/src/simulators/measure_record_batch_writer.cc:33:27: note:
in instantiation of member function
'std::__1::unique_ptr<MeasureRecordWriter,
std::__1::default_delete >::~unique_ptr' requested
here
writers.push_back(MeasureRecordWriter::make(out, f));
^
1 error generated.
make[2]: *** [CMakeFiles/stim_benchmark.dir/src/simulators/measure_record_batch_writer.cc.o] Error 1
make[1]: *** [CMakeFiles/stim_benchmark.dir/all] Error 2
make: *** [all] Error 2

Add noisy measurement operation

Maybe just add an argument to the measurement operation giving the probability of flipping the result? Or else add specialized MX_NOISY operations.
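
As a stopgap, much of the effect can already be expressed with existing instructions by flipping the qubit just before measuring it; in this sketch the X_ERROR also flips the post-measurement state, which is harmless because the measurement is followed by a reset:

    import stim

    # Flip the qubit with probability 1% just before the measure-reset.
    circuit = stim.Circuit("""
        X_ERROR(0.01) 0
        MR 0
    """)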

Observations about SIMD, benchmarking, possible improvements

First of all, thank you for creating this, it is incredibly useful!!! Not least as a reference for other code writers!

I am the developer of QuantumClifford.jl, which implements some useful Clifford circuit tools. It does not have the elegant Pauli frame tracking that enables your ridiculously high speeds, and generally, it has not been hand-tuned, but it is fairly fast and it has its uses elsewhere ;)

I made some recent improvements in the inner-loop methods (the Pauli multiplication method) that were inspired by your work and wanted to see how it compares against stim. I was surprised that in one (very restricted) micro-benchmark the pure julia code was faster than your C++ SIMD code.

Here is the example:

julia> using QuantumClifford, BenchmarkTools
julia> a = random_pauli(1_000_000);
julia> b = random_pauli(1_000_000);
julia> @btime a*b;
  41.598 μs (4 allocations: 244.39 KiB)

julia> b = random_pauli(1_000_000_000);
julia> a = random_pauli(1_000_000_000);
julia> @btime a*b;
  101.032 ms (4 allocations: 238.42 MiB)

In [2]: from stim import *
In [3]: a = PauliString.random(1_000_000);
In [4]: b = PauliString.random(1_000_000);
In [5]: %timeit a*b
42.7 µs ± 17.8 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [6]: b = PauliString.random(1_000_000_000);
In [7]: a = PauliString.random(1_000_000_000);
In [8]: %timeit a*b
214 ms ± 4.65 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The results are basically the same, except that for larger (really large) Pauli strings, Julia is faster by a factor of roughly 2. I do not know whether this is just some slowdown caused by the Python interface, or whether it is due to the version of gcc/llvm that was used to compile stim.

Just to be clear, I am not claiming that my Julia library is generally faster: its aim is somewhat different from the aim of Stim and I am certain that, holistically, Stim blows everyone out of the water for the task for which it was designed. But given how crazy fast Stim is, I assumed that you would want to know that there might still be some more optimizations to be done. I can try to dig up the machine code that Julia compiles if that would be of use. Here is the Julia code performing this operation, which is fairly similar to your C++ code.

Add standalone methods for reading/writing samples data

The method should take a SampleFormat, a FILE *, a max_shots (which must be a multiple of 64) or similar, and an expected_bits_per_shot, and return a simd_bit_table.

enum SampleFormat {
    /// Human readable format.
    ///
    /// For each shot:
    ///     For each measurement:
    ///         Output '0' if false, '1' if true
    ///     Output '\n'
    SAMPLE_FORMAT_01,

    /// Binary format.
    ///
    /// For each shot:
    ///     For each group of 8 measurements (padded with 0s if needed):
    ///         Output a bit packed byte (least significant bit of byte has first measurement)
    SAMPLE_FORMAT_B8,

    /// Transposed binary format.
    ///
    /// For each measurement:
    ///     For each group of 64 shots (padded with 0s if needed):
    ///         Output bit packed bytes (least significant bit of first byte has first shot)
    SAMPLE_FORMAT_PTB64,

    /// Human readable compressed format.
    ///
    /// For each shot:
    ///     For each measurement_index where the measurement result was 1:
    ///         Output decimal(measurement_index)
    SAMPLE_FORMAT_HITS,

    /// Binary run-length format.
    ///
    /// For each shot:
    ///     Append a one to the shot
    ///     For each run length d of zeros between ones (including runs of length 0):
    ///         Output [0xFF] * (d // 255) + [d % 255]
    SAMPLE_FORMAT_R8,

    /// Specific to detection event data.
    ///
    /// For each shot:
    ///     Output "shot" + " D#" for each detector that fired + " L#" for each observable that was inverted + "\n".
    SAMPLE_FORMAT_DETS,
};

Improve the performance of PAULI_CHANNEL_2

Currently, PAULI_CHANNEL_2 is 10x slower than DEPOLARIZE2.

An idea to try is, if the operation is applied to enough targets, create an alias sampling table for the probability distribution over the non-identity part of the channel. Then use rare-error sampling to decide which qubits to apply the alias sampling to.
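
A rough Python sketch of that idea (the function name and structure are illustrative rather than stim internals, and numpy's rng.choice stands in for a real alias table):

    import numpy as np

    def sample_pauli_channel_2(num_pairs, probs, rng=None):
        # probs: the 15 probabilities of the non-identity two-qubit Paulis.
        rng = rng if rng is not None else np.random.default_rng()
        probs = np.asarray(probs, dtype=np.float64)
        p_any = probs.sum()
        conditional = probs / p_any
        hits = []
        # Rare-error sampling: geometric skips jump straight to the next target
        # pair that suffers any error, so cost scales with the number of errors.
        k = rng.geometric(p_any) - 1
        while k < num_pairs:
            pauli = rng.choice(15, p=conditional)  # stand-in for an alias table lookup
            hits.append((k, int(pauli)))
            k += rng.geometric(p_any)
        return hits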

Repetition code distance off by one

The X distance of a repetition code generated with stim.Circuit.generated is off by 1 (inputting distance d generates a distance d+1 repetition code).

E.g. rather than getting 2d-1 qubits I get:

>>> import stim
>>> stim.Circuit.generated("repetition_code:memory", distance=3, rounds=1).num_qubits
7
>>> stim.Circuit.generated("repetition_code:memory", distance=5, rounds=1).num_qubits
11

Looks like it could be fixed by subtracting one here.

Random exception in python when using multiprocessing due to rdseed failure

When I run a large number of sampling simulations simultaneously using stim in a massively multi-process environment (since Python does not support true multi-threading, I use multi-process computation via multiprocessing's starmap), this exception happens semi-randomly; see the stack trace below:

(stack trace screenshot omitted)

This seems to be related to a multi-threading bug in libstdc++, so perhaps adopting the fix used by the google cloud c++ api would be a good approach. See:

googleapis/google-cloud-cpp-common#208

googleapis/google-cloud-cpp-common#272

Seeing as the call to random_device is only done once per process in PYBIND_SHARED_RNG(), it is probably safe performance-wise to use the second fix, as the performance regression should be very minor.

This seems to be related to the shared rdseed buffer in certain Intel cpus.

Add some utilities to stim.Tableau

  • stim.Tableau.x_output_pauli(out, in)
  • stim.Tableau.z_output_pauli(out, in)
  • stim.Tableau.inverse_x_output_pauli(out, in)
  • stim.Tableau.inverse_z_output_pauli(out, in)
  • stim.Tableau.inverse_x_output(out, unsigned: bool = false)
  • stim.Tableau.inverse_z_output(out, unsigned: bool = false)
  • stim.Tableau.inverse(unsigned: bool = false)
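
Until dedicated methods exist, a hedged sketch of how the single-Pauli lookups could be emulated with existing ones (argument names and order here are illustrative; indexing a stim.PauliString yields 0/1/2/3 for I/X/Y/Z):

    import stim

    t = stim.Tableau.random(4)

    def x_output_pauli(tableau: stim.Tableau, input_index: int, output_index: int) -> int:
        # The image of X on the input qubit, restricted to one output qubit.
        return tableau.x_output(input_index)[output_index]

    print(x_output_pauli(t, 0, 2))  # 0=I, 1=X, 2=Y, 3=Z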

FrameSimulator is implementing MY as if it were MRY

    auto r = FrameSimulator::sample_flipped_measurements(Circuit(R"CIRCUIT(
        RY 0
        MY 0
        MY 0
        Z_ERROR(1) 0
        MY 0
        MY 0
        Z_ERROR(1) 0
        MY 0
        MY 0
    )CIRCUIT"), 10000, SHARED_TEST_RNG());
    ASSERT_EQ(r[0].popcnt(), 0);
    ASSERT_EQ(r[1].popcnt(), 0);
    ASSERT_EQ(r[2].popcnt(), 10000);
    ASSERT_EQ(r[3].popcnt(), 10000);
    ASSERT_EQ(r[4].popcnt(), 0);
    ASSERT_EQ(r[5].popcnt(), 0);

fails

Stim package cannot run properly on PyCharm

I installed the stim package as required and imported it, but it cannot run properly in PyCharm. The result is "ImportError: PauliString: PyType_Ready failed (UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 46-47: invalid continuation byte)!"

Compressing circuit representation

Many simulators group noise as (unitary, error channel) pairs. Translating this into Stim can cause a significant blowup in the representation of the circuit (e.g. requiring a new line for each noisy gate). It would be a nice feature if Stim had a function that could automate the compression of such a circuit into an equivalent representation that minimizes the number of lines.
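
One hedged, partial sketch of such a compaction pass: it relies on the behavior that appending an operation whose gate name and arguments match the previous line fuses onto that line, assumes stim.Circuit.flattened() is available to expand repeat blocks, and only helps when the compatible lines are already back-to-back (interleaved unitary/noise pairs would additionally need a safe reordering pass):

    import stim

    def compact(circuit: stim.Circuit) -> stim.Circuit:
        out = stim.Circuit()
        for inst in circuit.flattened():
            # Appends whose gate name and argument list match the previous line
            # fuse onto that line, shrinking runs of repeated noise instructions.
            out.append(inst.name, inst.targets_copy(), inst.gate_args_copy())
        return out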

Allow Pauli targets on OBSERVABLE_INCLUDE instructions for debugging

For some simulation experiments, it would be handy to be able to track both the X and Z logical observables without having to set up an EPR pair on the side. These Pauli targets could be used to artificially adjust the simulated observable. They would of course be totally incompatible with physical experiments (you can't do the conversion from measurements to detection events).

Example:

OBSERVABLE_INCLUDE(0) X1 X2 X3
OBSERVABLE_INCLUDE(1) Z1 Z2 Z3
DEPOLARIZE1(0.001) 0 1 2
H 0 1 2
DEPOLARIZE2(0.001) 0 1
CNOT 0 1
CNOT 1 0
CNOT 0 1
H 0 1 2
OBSERVABLE_INCLUDE(0) X1 X2 X3
OBSERVABLE_INCLUDE(1) Z1 Z2 Z3

Build warnings preventing compilation

When building on a mac, I receive the following warnings:

stim-src/src/stabilizers/pauli_string.test.cc:201:9: error: moving a temporary object
      prevents copy elision [-Werror,-Wpessimizing-move]
    x = std::move(PauliString::from_str("XXY"));
        ^
stim-src/src/stabilizers/pauli_string.test.cc:201:9: note: remove std::move call here
    x = std::move(PauliString::from_str("XXY"));
        ^~~~~~~~~~                            ~
stim-src/src/stabilizers/pauli_string.test.cc:203:9: error: moving a temporary object
      prevents copy elision [-Werror,-Wpessimizing-move]
    x = std::move(PauliString::from_str("-IIX"));
        ^
stim-src/src/stabilizers/pauli_string.test.cc:203:9: note: remove std::move call here
    x = std::move(PauliString::from_str("-IIX"));
        ^~~~~~~~~~                             ~
2 errors generated.
