cereal's Introduction

MSGQ: A lock free single producer multi consumer message queue

What is this library?

MSGQ is a generic, high-performance IPC pub/sub system with a single publisher and multiple subscribers. It is designed as a high-performance replacement for ZMQ-like SUB/PUB patterns and uses a ring buffer in shared memory to read and write data efficiently. Each read requires a copy. Writing can be done without a copy, as long as the size of the data is known in advance. While MSGQ is the core of this library, the backend can also be replaced with ZMQ or with a spoofed implementation for deterministic testing. The library also contains visionipc, an IPC system specifically for large contiguous buffers (such as images or video).

Storage

The storage for the queue consists of an area of metadata, and the actual buffer. The metadata contains:

  1. A counter of the number of active readers
  2. A pointer to the head of the queue for writing. From now on referred to as write pointer
  3. A cycle counter for the writer. This counter is incremented when the writer wraps around
  4. N pointers, pointing to the current read position for all the readers. From now on referred to as read pointer
  5. N counters, counting the number of cycles for all the readers
  6. N booleans, indicating validity for all the readers. From now on referred to as validity flag

Each cycle counter and its matching pointer are 32-bit values, packed together into a single 64-bit value so that both can be read and written atomically.
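
As a rough illustration of this packing, the sketch below stores the cycle counter in the upper 32 bits and the pointer in the lower 32 bits of one atomic 64-bit word. The helper names (pack, cycle_of, pointer_of, advance, wrap) and the choice of which half holds which value are assumptions made for this example, not necessarily the library's layout.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical layout: cycle counter in the high half, pointer in the low half.
constexpr uint64_t pack(uint32_t cycle, uint32_t pointer) {
  return (static_cast<uint64_t>(cycle) << 32) | pointer;
}
constexpr uint32_t cycle_of(uint64_t packed) {
  return static_cast<uint32_t>(packed >> 32);
}
constexpr uint32_t pointer_of(uint64_t packed) {
  return static_cast<uint32_t>(packed & 0xFFFFFFFFu);
}

// Move the pointer forward by n bytes, keeping the cycle counter.
// Only the owner of the slot writes to it, so a load followed by a store is enough.
inline void advance(std::atomic<uint64_t>& slot, uint32_t n) {
  const uint64_t v = slot.load();
  assert(pointer_of(v) <= UINT32_MAX - n);  // the pointer must not overflow
  slot.store(pack(cycle_of(v), pointer_of(v) + n));
}

// Wrap around: bump the cycle counter and reset the pointer in one atomic store.
inline void wrap(std::atomic<uint64_t>& slot) {
  const uint64_t v = slot.load();
  slot.store(pack(cycle_of(v) + 1, 0));
}
```

Because both halves live in one word, another process never observes a new pointer paired with an old cycle counter, or the reverse.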

The data buffer is a ring buffer. All messages are prefixed by an 8 byte size field, followed by the data. A size of -1 indicates a wrap-around, and means the next message is stored at the beginning of the buffer.

Writing

Writing involves the following steps:

  1. Check if the area that is to be written overlaps with any of the read pointers, mark those readers as invalid by clearing the validity flag.
  2. Write the message
  3. Increase the write pointer by the size of the message

In case there is not enough space at the end of the buffer, a special empty message with a prefix of -1 is written. The cycle counter is incremented by one. In this case step 1 will check there are no read pointers pointing to the remainder of the buffer. Then another write cycle will start with the actual message.

There always needs to be 8 bytes of empty space at the end of the buffer. By doing this there is always space to write the -1.
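
The write path described above can be sketched as a single-process model. This is a simplified illustration, not the library's code: the Queue and Reader types and the function names are hypothetical, shared memory, atomics and alignment are left out, and the rule that a reader on the writer's own cycle counts as caught up (and is therefore not invalidated) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr uint32_t kHeader = 8;   // 8-byte size prefix in front of every message
constexpr int64_t kWrapTag = -1;  // size value that marks a wrap-around

struct Reader {
  uint32_t pos = 0;
  uint32_t cycle = 0;
  bool valid = true;
};

struct Queue {
  std::vector<uint8_t> buf;
  uint32_t write_pos = 0;
  uint32_t write_cycle = 0;
  std::vector<Reader> readers;
  Queue(uint32_t size, uint32_t n_readers) : buf(size), readers(n_readers) {}
};

// Step 1: invalidate readers whose read pointer lies in [start, end).
// Assumption: a reader on the writer's own cycle has caught up and is spared.
inline void invalidate_overlapping(Queue& q, uint32_t start, uint32_t end) {
  for (Reader& r : q.readers) {
    if (r.pos >= start && r.pos < end && r.cycle != q.write_cycle) r.valid = false;
  }
}

inline bool write_message(Queue& q, const void* data, uint32_t size) {
  assert(data != nullptr);
  const uint64_t capacity = q.buf.size();
  const uint64_t total = uint64_t{kHeader} + size;
  // The message plus the reserved 8 bytes for a wrap tag must fit at all.
  if (total + kHeader > capacity) return false;

  if (q.write_pos + total + kHeader > capacity) {
    // Not enough room at the end: clear out the remainder, write -1, wrap.
    invalidate_overlapping(q, q.write_pos, static_cast<uint32_t>(capacity));
    std::memcpy(q.buf.data() + q.write_pos, &kWrapTag, kHeader);
    q.write_pos = 0;
    q.write_cycle += 1;
  }

  // Step 1 for the actual message.
  invalidate_overlapping(q, q.write_pos, static_cast<uint32_t>(q.write_pos + total));
  // Step 2: size prefix followed by the payload.
  const int64_t prefix = size;
  std::memcpy(q.buf.data() + q.write_pos, &prefix, kHeader);
  std::memcpy(q.buf.data() + q.write_pos + kHeader, data, size);
  // Step 3: advance the write pointer.
  q.write_pos += static_cast<uint32_t>(total);
  return true;
}
```

With a 64-byte buffer and 16-byte payloads, the first two writes land at offsets 0 and 24; the third does not fit before the reserved 8 bytes, so a -1 tag is written at offset 48 and the message goes to offset 0, invalidating a reader that never moved.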

Reset reader

When a reader lags too far behind, its read pointer becomes invalid and no longer points to the beginning of a valid message. To reset a reader to the current write pointer, the following steps are performed:

  1. Set valid flag
  2. Set read cycle counter to that of the writer
  3. Set read pointer to write pointer
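
A minimal sketch of the reset, assuming the cycle counter and pointer share one packed 64-bit word as described under Storage. The ReaderSlot type and the function name are hypothetical. With this packing, steps 2 and 3 collapse into a single atomic store of the writer's word.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical per-reader metadata: packed (cycle << 32 | pointer) plus the validity flag.
struct ReaderSlot {
  std::atomic<uint64_t> read_state{0};
  std::atomic<bool> valid{false};
};

inline void reset_reader(ReaderSlot& reader, const std::atomic<uint64_t>& write_state) {
  // Step 1: set the validity flag.
  reader.valid.store(true);
  // Steps 2 and 3: copy the writer's cycle counter and pointer in one store.
  reader.read_state.store(write_state.load());
}
```

Setting the flag first means that a writer which overruns the reader right after the reset can clear the flag again, so the overrun is not lost.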

Reading

Reading involves the following steps:

  1. Read the size field at the current read pointer
  2. Read the validity flag
  3. Copy the data out of the buffer
  4. Increase the read pointer by the size of the message
  5. Check the validity flag again

Before starting the copy, the validity flag is checked. This guards against a race in which the writer has already overwritten the size prefix, so that the reader would otherwise use a garbage size and read outside of the buffer. Make sure that steps 1 and 2 are not reordered by your compiler or CPU.

If a writer overwrites the data while it is being copied out, the copied data will be invalid. Therefore the validity flag is checked again after the copy. The order of steps 4 and 5 does not matter.

If the validity flag is not set at step 2 or step 5, the reader is reset. Any data that was already read is discarded. After the reader is reset, reading restarts from step 1.

If a message with size -1 is encountered, steps 3 and 4 are replaced by increasing the cycle counter and setting the read pointer to the beginning of the buffer. After that another read is performed.
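
The read steps can be sketched as follows. This is a single-process model for illustration only: the Channel type and function names are hypothetical, the buffer is a plain vector instead of shared memory, and the initial "nothing new" check (reader equal to writer) is an assumption that is not part of the five steps above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

struct Channel {
  std::vector<uint8_t> buf;
  uint32_t write_pos = 0, write_cycle = 0;  // owned by the writer
  uint32_t read_pos = 0, read_cycle = 0;    // owned by this reader
  std::atomic<bool> valid{true};            // cleared by the writer on overrun
};

inline void reset_reader(Channel& c) {
  c.valid.store(true);
  c.read_cycle = c.write_cycle;
  c.read_pos = c.write_pos;
}

// Returns true and fills `out` if a message was read, false if nothing new is available.
inline bool read_message(Channel& c, std::vector<uint8_t>& out) {
  for (;;) {
    // Assumption of this sketch: reader equal to writer means nothing to read.
    if (c.read_cycle == c.write_cycle && c.read_pos == c.write_pos) return false;
    assert(c.read_pos + 8 <= c.buf.size());

    int64_t size = 0;
    std::memcpy(&size, c.buf.data() + c.read_pos, sizeof(size));  // step 1
    std::atomic_thread_fence(std::memory_order_seq_cst);  // keep step 1 before step 2
    if (!c.valid.load()) {                                // step 2
      reset_reader(c);
      continue;  // the size may be garbage, so it is never used
    }
    if (size == -1) {  // wrap-around marker replaces steps 3 and 4
      c.read_cycle += 1;
      c.read_pos = 0;
      continue;
    }
    out.resize(static_cast<size_t>(size));
    if (!out.empty()) {
      std::memcpy(out.data(), c.buf.data() + c.read_pos + 8, out.size());  // step 3
    }
    c.read_pos += 8 + static_cast<uint32_t>(size);  // step 4
    if (!c.valid.load()) {                          // step 5
      out.clear();  // discard data that may have been overwritten mid-copy
      reset_reader(c);
      continue;
    }
    return true;
  }
}
```

Note that the size is only trusted after the first validity check has passed, which is the point of checking the flag between steps 1 and 3.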

cereal's Issues

Make msgq work on MacOS

Current msgq relies on sending a signal to the receiver thread for the poll function to work correctly. On linux we use syscall(SYS_tkill, tid, SIGUSR2) for this. Does a MacOS equivalent exist?

Debug_controls.py

I am not able to run debug_controls.py. When I run it, I get this error:

(openpilot) cyngn@vivekgr:~/roadtrain/openpilot/tools/carcontrols$ python debug_controls.py
make: Nothing to be done for 'all'.
make: Nothing to be done for 'all'.
[ CC ] can_list_to_can_capnp.o
clang++ -std=c++11 -g -fPIC -I../ -I../../ -O2 -Werror=implicit-function-declaration -Werror=incompatible-pointer-types -Werror=int-conversion -Werror=return-type -Werror=format-extra-args -I/usr/include/libusb-1.0 -MMD
-Iinclude -I.. -I../..
-I../../phonelibs/capnp-cpp/include
-I../../phonelibs/zmq/aarch64/include
-I../../selfdrive/messaging
-c -o 'can_list_to_can_capnp.o' 'can_list_to_can_capnp.cc'
can_list_to_can_capnp.cc:6:10: fatal error: 'cereal/gen/cpp/log.capnp.h' file not found
#include "cereal/gen/cpp/log.capnp.h"

Would be great if someone can help me.

OpenCL stuff deprecated: first deprecated in macOS 10.14

I hit commaai/openpilot#1358 and did scons -j4 as suggested there,
but I get 7 errors generated from deprecation warnings from OpenCL that look like this:

clang++ -o cereal/messaging/bridge cereal/messaging/bridge.o -Lphonelibs/libyuv/mac/lib -L/usr/local/lib -L/opt/homebrew/lib -L/usr/local/opt/openssl/lib -L/opt/homebrew/opt/openssl/lib -L/System/Library/Frameworks/OpenGL.framework/Libraries -Lcereal -Lphonelibs -Lopendbc/can -Lselfdrive/boardd -Lselfdrive/common cereal/libmessaging.a -lzmq -framework OpenCL
cereal/visionipc/visionbuf_cl.cc:48:18: error: 'clCreateCommandQueue' is deprecated: first deprecated in macOS 10.14 - (Define
      CL_SILENCE_DEPRECATION to hide this warning) [-Werror,-Wdeprecated-declarations]
  this->copy_q = clCreateCommandQueue(ctx, device_id, 0, &err);

I see a similar issue was fixed in commaai/openpilot#1394. Does the fix just need porting here?

Cannot build on macOS due to eventfd

eventfd is only available on Linux, so building openpilot and its tools (specifically I'm trying to run cabana) fails on macOS due to the way IPC synchronization was implemented in #439.

Here is the full error I see, after freshly cloning master and running scons a second time:

$ scons -j8 -u -k
scons: Entering directory `/Users/ebrown1/PycharmProjects/openpilot3'
scons: Reading SConscript files ...
Git commit hash for gitversion.h: e1805f65
scons: done reading SConscript files.
scons: Building targets ...
clang++ -o cereal/messaging/event.os -c -std=c++1z -DGL_SILENCE_DEPRECATION -DSWAGLOG="\"common/swaglog.h\"" -g -fPIC -O2 -Wunused -Werror -Wshadow -Wno-unknown-warning-option -Wno-deprecated-register -Wno-register -Wno-inconsistent-missing-override -Wno-c99-designator -Wno-reorder-init-list -Wno-error=unused-but-set-variable -DGL_SILENCE_DEPRECATION -DSWAGLOG="\"common/swaglog.h\"" -fPIC -I/opt/homebrew/include -I/opt/homebrew/opt/[email protected]/include -I. -Ithird_party/acados/include -Ithird_party/acados/include/blasfeo/include -Ithird_party/acados/include/hpipm/include -Ithird_party/catch2/include -Ithird_party/libyuv/include -Ithird_party/json11 -Ithird_party/curl/include -Ithird_party/linux/include -Ithird_party/snpe/include -Ithird_party/mapbox-gl-native-qt/include -Ithird_party/qrcode -Ithird_party -Icereal -Iopendbc/can -Ithird_party/json11 cereal/messaging/event.cc
generate_dbc_json(["tools/cabana/generate_dbc_json"], [])
tools/cabana/dbc/generate_dbc_json.py --out tools/cabana/dbc/car_fingerprint_to_dbc.json
Traceback (most recent call last):
  File "tools/cabana/dbc/generate_dbc_json.py", line 5, in <module>
    from selfdrive.car.car_helpers import get_interface_attr
  File "/Users/ebrown1/PycharmProjects/openpilot3/selfdrive/car/car_helpers.py", line 5, in <module>
    from common.params import Params
  File "/Users/ebrown1/PycharmProjects/openpilot3/common/params.py", line 1, in <module>
    from common.params_pyx import Params, ParamKeyType, UnknownKeyName, put_nonblocking, put_bool_nonblocking # pylint: disable=no-name-in-module, import-error
ModuleNotFoundError: No module named 'common.params_pyx'
scons: *** Error 1
cereal/messaging/event.cc:9:10: fatal error: 'sys/eventfd.h' file not found
#include <sys/eventfd.h>
         ^~~~~~~~~~~~~~~
1 error generated.
scons: *** [cereal/messaging/event.os] Error 1
scons: done building targets (errors occurred during build).

As a temporary workaround, I was able to build cabana by checking out d0a8b3780c in the openpilot repo.

Track average frequency in SubMaster

Track average receive frequency over the last 10 seconds. With this we can add a second check on frequencies that can be a lot stricter than on instantaneous frequency.
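
One possible shape for such a tracker is sketched below. The FrequencyTracker class and its methods are hypothetical and not part of SubMaster; timestamps are passed in as seconds so the logic stays independent of any clock.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical sliding-window tracker: keeps receive timestamps from the last window_s seconds.
class FrequencyTracker {
 public:
  explicit FrequencyTracker(double window_s = 10.0) : window_s_(window_s) {}

  // Record a receive time (in seconds) and drop samples older than the window.
  void record(double t) {
    times_.push_back(t);
    while (!times_.empty() && times_.front() < t - window_s_) times_.pop_front();
  }

  // Average frequency in Hz over the retained samples: intervals divided by elapsed time.
  // Returns 0 when fewer than two samples are available.
  double average_hz() const {
    if (times_.size() < 2) return 0.0;
    const double span = times_.back() - times_.front();
    return span > 0.0 ? static_cast<double>(times_.size() - 1) / span : 0.0;
  }

  std::size_t samples() const { return times_.size(); }

 private:
  double window_s_;
  std::deque<double> times_;
};
```

A single late message barely moves this average, which is why a check on it could use a tighter tolerance than a check on the instantaneous frequency.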

pub_socket.connect Error

When I try to run the stress.py,
I get an error saying
terminate called after throwing an instance of 'YAML::BadFile'
what(): bad file
Aborted (core dumped)

When I used breakpoints I could see that the error is caused in this line pub_socket.connect(c,"controlsState")

Any help ??

msgq: optimize queue size

msgq uses a fixed default size for all queues. Since our structs are all different sizes, this leaves some services with only tens of seconds of buffer and others with more than an hour; if left running for long enough, the queues will use up ~400MB. At build time, we know the struct size (except for lists) and the frequency of every service, so we can set each service's queue size based on buffer time, which should save quite a bit of memory.
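
The sizing arithmetic could look roughly like this. The function name is hypothetical, and the formula is an assumption based on the storage layout described in the README (an 8-byte size prefix per message plus 8 reserved bytes for the wrap-around tag); it ignores alignment and variable-length lists.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: bytes needed to hold buffer_seconds worth of fixed-size messages.
inline uint64_t queue_size_bytes(uint64_t msg_bytes, double frequency_hz, double buffer_seconds) {
  assert(frequency_hz > 0.0 && buffer_seconds > 0.0);
  const uint64_t size_prefix = 8;   // per-message size field
  const uint64_t wrap_reserve = 8;  // space kept free for the -1 tag
  const uint64_t n_messages = static_cast<uint64_t>(std::ceil(frequency_hz * buffer_seconds));
  return n_messages * (msg_bytes + size_prefix) + wrap_reserve;
}
```

For example, a 100-byte message published at 100 Hz with a 10 second buffer needs 1000 * 108 + 8 = 108008 bytes under these assumptions.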

Set ZeroMQ snd/rcv watermark and linger?

Would it be worthwhile to set the ZeroMQ send/receive high-water mark and linger options, to avoid flooding the in-memory pub/sub queues?

Here is an example that sets them once for all sockets, but it can also be done per socket instance.

zsys_set_linger(0);
size_t hwm = 30;
zsys_set_sndhwm(hwm);
zsys_set_rcvhwm(hwm);

In the msgq implementation, is there any pub/sub queue size limit?

VisionIPC doesn't work for remote ip address

When going through the code, I noticed that the visionipc_client is bound to "127.0.0.1":
sock = SubSocket::create(msg_ctx, get_endpoint_name(name, type), "127.0.0.1", conflate, false);

But when I hard-coded it with some ip (192.168.1.10), it still doesn't work.

I tried it with the following commands:

# in host machine: ip: 192.168.1.10
export ZMQ=1
replay --demo

# in client machine: ip: 192.168.1.123
export ZMQ=1
python ui.py 192.168.1.10
