vectorscan's Introduction

About Vectorscan

A fork of Intel's Hyperscan, modified to run on more platforms. Currently ARM NEON/ASIMD and Power VSX are 100% functional. ARM SVE2 support is ongoing, now that we have access to hardware. More platforms will follow in the future. Furthermore, starting with 5.4.12 there is a SIMDe port, which can be used either for platforms without official SIMD support, as SIMDe can emulate SIMD instructions, or as an alternative backend for existing architectures, for reference and comparison purposes.

Vectorscan will follow Intel's API and internal algorithms where possible, but will not hesitate to make code changes where they are expected to give better performance or better portability. In addition, the code will be gradually simplified and made more uniform, and all architecture-specific (currently Intel) #ifdefs will be removed and abstracted away.

Why was there a need for a fork?

Originally, the ARM port was intended to be merged into Intel's own Hyperscan, and the relevant Pull Requests were made to the project for this reason. Unfortunately, the PRs were rejected for now and the foreseeable future, so we created Vectorscan for our own multi-architectural and open-source collaborative needs.

The recent license change of Hyperscan makes Vectorscan even more relevant for the FLOSS ecosystem.

What is Vectorscan/Hyperscan?

Hyperscan and by extension Vectorscan is a high-performance multiple regex matching library. It follows the regular expression syntax of the commonly-used libpcre library, but is a standalone library with its own C API.

Hyperscan/Vectorscan uses hybrid automata techniques to allow simultaneous matching of large numbers (up to tens of thousands) of regular expressions and for the matching of regular expressions across streams of data.

Vectorscan is typically used in a DPI (deep packet inspection) library stack, just like Hyperscan.

License

Vectorscan follows a BSD License like the original Hyperscan (up to 5.4).

Vectorscan continues to be an open source project and we are committed to keeping it that way. See the LICENSE file in the project repository.

Hyperscan License Change after 5.4

According to "Accelerate Snort Performance with Hyperscan and Intel Xeon Processors on Public Clouds", versions of Hyperscan later than 5.4 are going to be closed-source:

The latest open-source version (BSD-3 license) of Hyperscan on Github is 5.4. Intel conducts continuous internal development and delivers new Hyperscan releases under Intel Proprietary License (IPL) beginning from 5.5 for interested customers. Please contact authors to learn more about getting new Hyperscan releases.

Versioning

The master branch on Github will always contain the most recent stable release of Vectorscan. Each version released to master goes through QA and testing before it is released; if you're a user, rather than a developer, this is the version you should be using.

Further development towards the next release takes place on the develop branch. All PRs are first made against the develop branch, and if they pass the Vectorscan CI, they get merged. The same applies to PRs from develop to master.

Compatibility with Hyperscan

Vectorscan aims to be ABI and API compatible with the last open source version of Intel Hyperscan, 5.4. After careful consideration we decided that we will NOT aim to achieve compatibility with later Hyperscan versions 5.5/5.6 that have extended Hyperscan's API. If you need to keep up to date with the latest Hyperscan API, you should talk to Intel and get a license to use it. However, we intend to extend Vectorscan's API with user-requested changes or API extensions and improvements that we think are best for the project.

Installation

Debian/Ubuntu

On recent Debian/Ubuntu systems, vectorscan should be directly available for installation:

$ sudo apt install libvectorscan5

To install the development package, install libvectorscan-dev:

$ sudo apt install libvectorscan-dev

For other distributions/OSes please check the Wiki

Build Instructions

The build system has recently been refactored to be more modular and easier to extend. For that reason, some small but necessary changes were made that might break compatibility with how Hyperscan was built.

Install Common Dependencies

Debian/Ubuntu

In order to build on Debian/Ubuntu, make sure you install the following build dependencies:

$ sudo apt install build-essential cmake ragel pkg-config libsqlite3-dev libpcap-dev

Other distributions

TBD

MacOS X (M1/M2/M3 CPUs only)

Assuming an existing Homebrew installation:

% brew install boost cmake gcc libpcap pkg-config ragel sqlite

*BSD

On NetBSD you will almost certainly need to have a newer compiler installed. You will also need cmake, sqlite, boost and ragel, plus libpcap, which some of the benchmarks require. When using pkgsrc, you would typically install them with something similar to:

pkg_add gcc12-12.3.0.tgz
pkg_add boost-headers-1.83.0.tgz  boost-jam-1.83.0.tgz      boost-libs-1.83.0nb1.tgz
pkg_add ragel-6.10.tgz
pkg_add cmake-3.28.1.tgz
pkg_add sqlite3-3.44.2.tgz
pkg_add libpcap-1.10.4.tgz

Version numbers etc will of course vary. One would either download the binary packages or build them using pkgsrc. There exist some NetBSD pkg tools like pkgin which help download e.g. dependencies as binary packages, but overall NetBSD leaves a lot of detail exposed to the user. The main package system used in NetBSD is pkgsrc and one will probably want to read up more about it than is in the scope of this document. See https://www.netbsd.org/docs/software/packages.html for more information.

This will not replace the compiler in the standard base distribution, and cmake will probably find the base dist's compiler when it checks automatically. Using the example of gcc12 from pkgsrc, one will need to set two environment variables before starting:

export CC="/usr/pkg/gcc12/bin/cc"
export CXX="/usr/pkg/gcc12/bin/g++"

Similarly, on FreeBSD you might want to install a different compiler. If you want to use gcc, gcc12 is recommended. As on NetBSD, you will also need the cmake, sqlite, boost and ragel packages. Using gcc12 from pkg as an example, install the compiler and the dependencies:

pkg install gcc12
pkg install boost-all
pkg install ragel
pkg install cmake
pkg install sqlite
pkg install libpcap
pkg install ccache

and then before beginning the cmake and build process, set the environment variables to point to this compiler:

export CC="/usr/local/bin/gcc"
export CXX="/usr/local/bin/g++"

One further note for FreeBSD: on the PowerPC and ARM platforms, the gcc12 package installs under a slightly different name. On FreeBSD/ppc, gcc12 will be found using:

export CC="/usr/local/bin/gcc12"
export CXX="/usr/local/bin/g++12"

Then continue with the build as below.

Configure & build

In order to configure with cmake first create and cd into a build directory:

$ mkdir build
$ cd build

Then call cmake from inside the build directory:

$ cmake ../

Common options for CMake are:

  • -DBUILD_STATIC_LIBS=[On|Off] Build static libraries
  • -DBUILD_SHARED_LIBS=[On|Off] Build shared libraries (if none are set static libraries are built by default)
  • -DCMAKE_BUILD_TYPE=[Release|Debug|RelWithDebInfo|MinSizeRel] Configure build type and determine optimizations and certain features.
  • -DUSE_CPU_NATIVE=[On|Off] Native CPU detection is off by default, however it is possible to build a performance-oriented non-fat library tuned to your CPU
  • -DFAT_RUNTIME=[On|Off] Fat Runtime is only available for X86 32-bit/64-bit and AArch64 architectures and only on Linux. It is incompatible with Debug type and USE_CPU_NATIVE.

Specific options for X86 32-bit/64-bit (Intel/AMD) CPUs

  • -DBUILD_AVX2=[On|Off] Enable code for AVX2.
  • -DBUILD_AVX512=[On|Off] Enable code for AVX512. Implies BUILD_AVX2.
  • -DBUILD_AVX512VBMI=[On|Off] Enable code for AVX512 with VBMI extension. Implies BUILD_AVX512.

Specific options for Arm 64-bit CPUs

  • -DBUILD_SVE=[On|Off] Enable code for SVE, like on AWS Graviton3 CPUs. Not much code is ported just for SVE, but enabling SVE code production does improve code generation; see Benchmarks.
  • -DBUILD_SVE2=[On|Off] Enable code for SVE2, implies BUILD_SVE. Most non-Neon code is written for SVE2.
  • -DBUILD_SVE2_BITPERM=[On|Off] Enable code for the SVE2_BITPERM hardware feature, implies BUILD_SVE2.

Other options

  • SANITIZE=[address|memory|undefined] (experimental) Use libasan sanitizer to detect possible bugs. For now only address is tested. This will eventually be integrated in the CI.

SIMDe options

  • SIMDE_BACKEND=[On|Off] Enable SIMDe backend. If this is chosen all native (SSE/AVX/AVX512/Neon/SVE/VSX) backends will be disabled and a SIMDe SSE4.2 emulation backend will be enabled. This will enable Vectorscan to build and run on architectures without SIMD.
  • SIMDE_NATIVE=[On|Off] Enable SIMDe native emulation of x86 SSE4.2 intrinsics on the building platform. That is, SSE4.2 intrinsics will be emulated using Neon on an Arm platform, or VSX on a Power platform, etc.

Build

If cmake has completed successfully, you can run make in the same directory. If you have a multi-core system with N cores, running

$ make -j <N>

will speed up the process. If all goes well, you should have the vectorscan library compiled.

Contributions

The official homepage for Vectorscan is at www.github.com/VectorCamp/vectorscan.

Vectorscan Development

All development of Vectorscan is done in public.

Original Hyperscan links

For reference, the official homepage for Hyperscan is at www.hyperscan.io.

Hyperscan Documentation

Information on building the Hyperscan library and using its API is available in the Developer Reference Guide.

And you can find the source code on Github.

For Intel Hyperscan related issues and questions, please follow the relevant links there.

vectorscan's People

Contributors

a16bitsysop, abdulawal1, abondarev84, anatolyburakov, apostolos00tapsas, azat, carenas, coytea, danilak-g, danlark1, fatchanghao, flip111, georgewort, gtsoul-tech, hongyang7, hs-zhuwenjun, isildur-g, jeffplaisance, jlintonarm, jmtaylor90, jth, liquidaty, luchy0120, markos, pallas, starius, vectorscan-ci, wls, xiangwang1, ypicchi-arm

vectorscan's Issues

New release with recent perf improvements?

Not really an issue but rather a question. We (ClickHouse) are interested in the latest perf improvements, especially #119 and #118. Both exist currently only in branch develop. If they are considered stable enough, it would be great to merge them to master or make a new release (not sure what VectorScan's release policy is). I feel a bit uneasy with compiling a develop state of vectorscan into ClickHouse. Thanks!

BTW: We recently enabled vectorscan-on-ARM in ClickHouse and it works nicely.

build failure unless build type is release

Building for alpine linux I get this error (as build type has to be None for alpine)

/hyperscan-5.3.1/unit/internal/masked_move.cpp:35:10: fatal error: util/masked_move.h: No such file or directory
   35 | #include "util/masked_move.h"

https://gitlab.alpinelinux.org/a16bitsysop/aports/-/jobs/277952/raw

If I patch and add "NONE" to the types that set RELEASE_BUILD:

--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -30,7 +30,7 @@
     message(STATUS "Build type ${CMAKE_BUILD_TYPE}")
 endif()

-if(CMAKE_BUILD_TYPE MATCHES RELEASE|RELWITHDEBINFO|MINSIZEREL)
+if(CMAKE_BUILD_TYPE MATCHES NONE|RELEASE|RELWITHDEBINFO|MINSIZEREL)
     message(STATUS "using release build")
     set(RELEASE_BUILD TRUE)
 else()

It compiles as it misses the problem files, but the tools are not compiled now.
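
The effect of the patch can be illustrated outside CMake. CMake's MATCHES operator performs an unanchored regular-expression match, so adding NONE to the alternation makes a build type of NONE take the release branch. The sketch below uses std::regex_search as a stand-in for CMake's own regex engine; is_release_build is a hypothetical helper, and the example assumes the build type has already been upper-cased, as the patterns in the diff suggest.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Patterns copied from the diff above.
constexpr const char *kOriginalPattern = "RELEASE|RELWITHDEBINFO|MINSIZEREL";
constexpr const char *kPatchedPattern = "NONE|RELEASE|RELWITHDEBINFO|MINSIZEREL";

// Hypothetical helper: true when `build_type` contains a match for `pattern`
// anywhere in the string, mirroring the unanchored behaviour of MATCHES.
// std::regex is only a stand-in for CMake's regex engine.
inline bool is_release_build(const std::string &build_type,
                             const std::string &pattern) {
    return std::regex_search(build_type, std::regex(pattern));
}

// Usage: NONE is only treated as a release build after the patch.
inline void demo_build_type_match() {
    assert(!is_release_build("NONE", kOriginalPattern));
    assert(is_release_build("NONE", kPatchedPattern));
    assert(!is_release_build("DEBUG", kPatchedPattern));
}
```

Because the match is unanchored, any build type that merely contains one of the alternatives would also take the release branch.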

https://gitlab.alpinelinux.org/a16bitsysop/aports/-/jobs/277980

I would like to keep the tools, as there is already a hyperscan-tools package on alpine.

ARM Intrinsics don't build on clang

I'm trying to build vectorscan on an ARM machine using clang (arm64, macOS, M1 chip). Clang compilation works on x86_64, but fails on ARM. The build fails while compiling the NEON intrinsics:

Version (clang -v):

Apple clang version 12.0.5 (clang-1205.0.22.9)
Target: arm64-apple-darwin20.4.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Compiler output:

Consolidate compiler generated dependencies of target hs_exec
[  1%] Building C object CMakeFiles/hs_exec.dir/src/runtime.c.o
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/runtime.c:45:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/nfa_rev_api.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/vermicelli.h:36:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/bitutils.h:51:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/bitutils.h:41:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/common/bitutils.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/simd_utils.h:67:
/Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/simd_utils.h:98:19: error: argument to '__builtin_neon_vshlq_n_v' must be a constant integer
    return (m128) vshlq_n_s32((int64x2_t)a, b);
                  ^                         ~
/Library/Developer/CommandLineTools/usr/lib/clang/12.0.5/include/arm_neon.h:24760:23: note: expanded from macro 'vshlq_n_s32'
  __ret = (int32x4_t) __builtin_neon_vshlq_n_v((int8x16_t)__s0, __p1, 34); \
                      ^                                         ~~~~
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/runtime.c:45:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/nfa_rev_api.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/vermicelli.h:36:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/bitutils.h:51:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/bitutils.h:41:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/common/bitutils.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/simd_utils.h:67:
/Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/simd_utils.h:103:19: error: argument to '__builtin_neon_vshrq_n_v' must be a constant integer
    return (m128) vshrq_n_s32((int64x2_t)a, b);
                  ^                         ~
/Library/Developer/CommandLineTools/usr/lib/clang/12.0.5/include/arm_neon.h:25168:23: note: expanded from macro 'vshrq_n_s32'
  __ret = (int32x4_t) __builtin_neon_vshrq_n_v((int8x16_t)__s0, __p1, 34); \
                      ^                                         ~~~~
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/runtime.c:45:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/nfa_rev_api.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/vermicelli.h:36:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/bitutils.h:51:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/bitutils.h:41:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/common/bitutils.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/simd_utils.h:67:
/Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/simd_utils.h:108:19: error: argument to '__builtin_neon_vshlq_n_v' must be a constant integer
    return (m128) vshlq_n_s64((int64x2_t)a, b);
                  ^                         ~
/Library/Developer/CommandLineTools/usr/lib/clang/12.0.5/include/arm_neon.h:24778:23: note: expanded from macro 'vshlq_n_s64'
  __ret = (int64x2_t) __builtin_neon_vshlq_n_v((int8x16_t)__s0, __p1, 35); \
                      ^                                         ~~~~
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/runtime.c:45:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/nfa_rev_api.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/vermicelli.h:36:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/bitutils.h:51:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/bitutils.h:41:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/common/bitutils.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/simd_utils.h:67:
/Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/simd_utils.h:113:19: error: argument to '__builtin_neon_vshrq_n_v' must be a constant integer
    return (m128) vshrq_n_s64((int64x2_t)a, b);
                  ^                         ~
/Library/Developer/CommandLineTools/usr/lib/clang/12.0.5/include/arm_neon.h:25186:23: note: expanded from macro 'vshrq_n_s64'
  __ret = (int64x2_t) __builtin_neon_vshrq_n_v((int8x16_t)__s0, __p1, 35); \
                      ^                                         ~~~~
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/runtime.c:45:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/nfa_rev_api.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/vermicelli.h:36:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/bitutils.h:51:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/bitutils.h:41:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/common/bitutils.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/simd_utils.h:67:
/Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/simd_utils.h:166:12: error: argument to '__builtin_neon_vgetq_lane_i32' must be a constant integer
    return vgetq_lane_u32((uint32x4_t) in, imm);
           ^                               ~~~
/Library/Developer/CommandLineTools/usr/lib/clang/12.0.5/include/arm_neon.h:7511:22: note: expanded from macro 'vgetq_lane_u32'
  __ret = (uint32_t) __builtin_neon_vgetq_lane_i32((int32x4_t)__s0, __p1); \
                     ^                                              ~~~~
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/runtime.c:45:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/nfa_rev_api.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/vermicelli.h:36:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/bitutils.h:51:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/bitutils.h:41:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/common/bitutils.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/simd_utils.h:67:
/Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/simd_utils.h:190:12: error: argument to '__builtin_neon_vgetq_lane_i64' must be a constant integer
    return vgetq_lane_u64((uint64x2_t) in, imm);
           ^                               ~~~
/Library/Developer/CommandLineTools/usr/lib/clang/12.0.5/include/arm_neon.h:7534:22: note: expanded from macro 'vgetq_lane_u64'
  __ret = (uint64_t) __builtin_neon_vgetq_lane_i64((int64x2_t)__s0, __p1); \
                     ^                                              ~~~~
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/runtime.c:45:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/nfa_rev_api.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/nfa/vermicelli.h:36:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/bitutils.h:51:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/bitutils.h:41:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/common/bitutils.h:38:
In file included from /Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/simd_utils.h:67:
/Users/mg/projects/hyperscan-java-native/cppbuild/hyperscan-5.4.0/src/util/arch/arm/simd_utils.h:304:18: error: argument to '__builtin_neon_vextq_v' must be a constant integer
    return (m128)vextq_s8((int8x16_t)l, (int8x16_t)r, offset);
                 ^                                    ~~~~~~
/Library/Developer/CommandLineTools/usr/lib/clang/12.0.5/include/arm_neon.h:6896:23: note: expanded from macro 'vextq_s8'
  __ret = (int8x16_t) __builtin_neon_vextq_v((int8x16_t)__s0, (int8x16_t)__s1, __p2, 32); \
                      ^                                                        ~~~~
7 errors generated.
make[2]: *** [CMakeFiles/hs_exec.dir/src/runtime.c.o] Error 1
make[1]: *** [CMakeFiles/hs_exec.dir/all] Error 2
make: *** [all] Error 2

It seems like clang doesn't understand how to inline parameters it expects to be constant.

Maybe some of the abstraction functions for the intrinsics need to be converted to macros?
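
One way to picture the macro idea, as a sketch and not as the fix the project chose: the intrinsics in the errors above need their shift or lane argument to be a compile-time constant, so a wrapper that receives a runtime value can switch over the possible values and make a constant-argument call in each branch. The example is portable C++ with no NEON header involved. A template parameter stands in for the immediate operand, and shift_left_imm, shift_left_runtime and SHIFT_CASE are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for an intrinsic such as vshlq_n_s32: the shift amount N is a
// template parameter, so it must be known at compile time, much like the
// immediate operand the compiler demands for the real intrinsic.
template <unsigned N>
inline std::uint32_t shift_left_imm(std::uint32_t value) {
    static_assert(N < 32, "shift amount must be smaller than the lane width");
    return value << N;
}

// Hypothetical wrapper: maps a runtime shift amount onto constant-argument
// calls. The macro expands to one case per literal, so every call site
// passes a compile-time constant.
inline std::uint32_t shift_left_runtime(std::uint32_t value, unsigned n) {
    assert(n < 8 && "this sketch only covers shift amounts 0..7");
#define SHIFT_CASE(i) case i: return shift_left_imm<i>(value)
    switch (n) {
        SHIFT_CASE(0);
        SHIFT_CASE(1);
        SHIFT_CASE(2);
        SHIFT_CASE(3);
        SHIFT_CASE(4);
        SHIFT_CASE(5);
        SHIFT_CASE(6);
        SHIFT_CASE(7);
    default:
        return value;  // not reached while the assertion above holds
    }
#undef SHIFT_CASE
}

// Usage: the runtime value 3 is routed to shift_left_imm<3>.
inline void demo_shift_dispatch() {
    assert(shift_left_runtime(1u, 3) == 8u);
}
```

The switch costs a branch per call, so a real wrapper would likely keep a direct path for arguments that are already constant.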

vectorscan 5.4.8 error: cannot use 'char' with '__vector bool' on alpine ppc64le

When compiling 5.4.8 with clang 14.0.6 on alpine I get the following error:

/builds/a16bitsysop/aports/community/vectorscan/src/vectorscan-vectorscan-5.4.8/src/util/supervector/arch/ppc64el/impl.cpp:52:49: error: cannot use 'char' with '__vector bool'
really_inline SuperVector<16>::SuperVector(char __bool __vector v)

Full log:
https://gitlab.alpinelinux.org/a16bitsysop/aports/-/jobs/839733/raw

If it is patched with:

--- a/src/util/supervector/arch/ppc64el/impl.cpp
+++ b/src/util/supervector/arch/ppc64el/impl.cpp
@@ -49,7 +49,7 @@
 
 template<>
 template<>
-really_inline SuperVector<16>::SuperVector(char __bool __vector v)
+really_inline SuperVector<16>::SuperVector(__vector __bool char v)
 {
     u.u8x16[0] = (uint8x16_t) v;
 };

It compiles, but there are a lot of test failures:
https://gitlab.alpinelinux.org/a16bitsysop/aports/-/jobs/839792

[question] install vectorscan package

@markos do we support installing the package through apt-get, or do we have to build it locally?
For hyperscan we run the apt-get install -y libhyperscan5 command. Do we have something similar for vectorscan?
Thanks.

'Illegal instruction' error on x86_64 CentOS 7.9

I get an 'Illegal instruction' error when running the unit tests and an example on x86_64 CentOS 7.9. Any help from the community would be appreciated.

build

wget https://github.com/VectorCamp/vectorscan/archive/refs/tags/vectorscan/5.4.7.zip
unzip 5.4.7.zip
cd vectorscan-vectorscan-5.4.7
mkdir build
cd build
PATH=$RAGEL_PATH:$PATH cmake -DBOOST_ROOT=$BOOST_SOURCE ..
make

run

./bin/unit-hyperscan
Illegal instruction

./bin/simplegrep cpp Makefile
Illegal instruction

Env

  • CentOS Linux release 7.9.2009 (Core)
  • Linux hostname 3.10.0-1160.59.1.el7.x86_64 #1 SMP Wed Feb 23 16:47:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • gcc version 11.1.0

Using built-in specs.
COLLECT_GCC=/home/xiaokang/opt/ldb/bin/gcc-11
COLLECT_LTO_WRAPPER=/mnt/disk3/xiaokang/opt/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.1.0-1ubuntu118.04.1' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --disable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-YRKbe7/gcc-11-11.1.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-YRKbe7/gcc-11-11.1.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.1.0 (Ubuntu 11.1.0-1ubuntu1~18.04.1)

  • cat /proc/cpuinfo

processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 49
model name : AMD EPYC 7K62 48-Core Processor
stepping : 0
microcode : 0x1000065
cpu MHz : 2595.124
cache size : 512 KB
physical id : 0
siblings : 48
core id : 0
cpu cores : 24
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm art rep_good nopl extd_apicid amd_dcm eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext retpoline_amd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 arat
bogomips : 5190.24
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:
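
An 'Illegal instruction' crash typically indicates that the binary contains instructions the running CPU does not implement. As a hedged diagnostic sketch, not something taken from the report, the snippet below asks the CPU at run time which of the SIMD levels named in the x86 build options it supports. It relies on the GCC/Clang builtin __builtin_cpu_supports and returns an empty report on non-x86 targets. simd_feature_report is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Returns (feature name, supported?) pairs for a few x86 SIMD levels.
// On other architectures or compilers the list is left empty.
inline std::vector<std::pair<std::string, bool>> simd_feature_report() {
    std::vector<std::pair<std::string, bool>> report;
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();  // harmless if the runtime already initialised it
    report.emplace_back("sse4.2", __builtin_cpu_supports("sse4.2") != 0);
    report.emplace_back("avx2", __builtin_cpu_supports("avx2") != 0);
    report.emplace_back("avx512f", __builtin_cpu_supports("avx512f") != 0);
    report.emplace_back("avx512vbmi", __builtin_cpu_supports("avx512vbmi") != 0);
#endif
    return report;
}

// Usage: the report is either empty (non-x86) or has one entry per feature.
inline void demo_simd_feature_report() {
    const auto report = simd_feature_report();
    assert(report.empty() || report.size() == 4);
}
```

Comparing such a report from the failing machine with the flags the compiler was given is one way to narrow down which instruction set is involved.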

HS_FLAG_UTF8 flag doesn't seem to work as expected on aarch64 platforms

Here is my test result on aarch64:

pattern: 星{2}
scan text: 星星点灯
flags: HS_FLAG_UTF8 | HS_FLAG_SINGLEMATCH
expect: matched
actual: unmatched

On the Linux x86_64 platform this works fine.
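
When reading a result like this, it helps to keep in mind that Hyperscan scans bytes and reports match offsets in bytes, also in UTF-8 mode. The sketch below is plain C++ with no Hyperscan calls, and utf8_code_points, scan_text and byte_offset_after are hypothetical helpers. It shows that each character of the test text occupies three bytes, so a match for 星{2} would be expected to end at byte offset 6 of the 12-byte input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count code points by skipping UTF-8 continuation bytes (10xxxxxx).
// Assumes the input is well-formed UTF-8.
inline std::size_t utf8_code_points(const std::string &s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// The scan text from the report, spelled out as explicit UTF-8 bytes so the
// example does not depend on the source file's encoding.
inline std::string scan_text() {
    return "\xE6\x98\x9F"   // 星 (U+661F)
           "\xE6\x98\x9F"   // 星 (U+661F)
           "\xE7\x82\xB9"   // 点 (U+70B9)
           "\xE7\x81\xAF";  // 灯 (U+706F)
}

// Byte offset just past the first `n` code points of `s`.
inline std::size_t byte_offset_after(const std::string &s, std::size_t n) {
    assert(n <= utf8_code_points(s));
    std::size_t seen = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) {
            if (seen == n) {
                return i;
            }
            ++seen;
        }
    }
    return s.size();
}

// Usage: four code points, twelve bytes, and 星星 ends at byte offset 6.
inline void demo_utf8_offsets() {
    const std::string text = scan_text();
    assert(text.size() == 12);
    assert(utf8_code_points(text) == 4);
    assert(byte_offset_after(text, 2) == 6);
}
```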


env

  • vectorscan 5.4.8
  • gcc 9.5.0
  • ragel 6.10
  • boost 1.67.0
  • Linux hostname-PC 4.19.0-arm64-desktop #5214 SMP Tue Sep 6 16:40:43 CST 2022 aarch64 GNU/Linux

test code

  std::string pattern = u8"星{2}";
  hs_compile_error_t *error = nullptr;
  hs_database_t *db         = nullptr;
  int flags = HS_FLAG_UTF8 | HS_FLAG_SINGLEMATCH;

  hs_error_t ret = hs_compile(pattern.c_str(), flags, HS_MODE_BLOCK,
                              nullptr,&db,&error);
  if (ret != HS_SUCCESS) {
      std::cout << "hs_compile error msg: " << error->message << ". expression: " << error->expression << std::endl;
      hs_free_compile_error(error);
      error = nullptr;
  }
  ASSERT_EQ(ret, HS_SUCCESS);

  hs_scratch_t *scratch = nullptr;
  ret = hs_alloc_scratch(db, &scratch);
  ASSERT_EQ(ret, HS_SUCCESS);

  std::string scanText = u8"星星点灯";
  static int matchCount = 0;
  ret = hs_scan(db, scanText.c_str(), scanText.size(), 0, scratch,
                [](unsigned int id, unsigned long long from,
                   unsigned long long to, unsigned int flags,
                   void *context) -> int {
        printMatchedInfo(id, from, to, flags, context);
        matchCount++;
        return 0;
      },
      nullptr);

  EXPECT_EQ(ret, HS_SUCCESS);
  EXPECT_EQ(matchCount,1);

  // cleanup
  hs_free_database(db);
  hs_free_scratch(scratch);
  matchCount = 0;

cmake file

cmake_minimum_required(VERSION 3.16)
cmake_policy(SET CMP0074 NEW)
cmake_policy(SET CMP0115 NEW)
if(CMAKE_VERSION VERSION_GREATER_EQUAL "3.24.0")
    cmake_policy(SET CMP0135 NEW)
endif()
project(utf8-test)

set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
include(FetchContent)

FetchContent_Declare(vectorscan URL https://github.com/VectorCamp/vectorscan/archive/refs/tags/vectorscan/5.4.8.tar.gz)
FetchContent_MakeAvailable(vectorscan)
FetchContent_GetProperties(vectorscan)

FetchContent_Declare(googletest URL https://github.com/google/googletest/archive/refs/tags/release-1.12.1.tar.gz)
FetchContent_MakeAvailable(googletest)
find_package(Boost REQUIRED)

file(GLOB TEST_SRCS tests/*.cc)
add_executable(utf8-test ${TEST_SRCS})
target_include_directories(utf8-test PRIVATE ${vectorscan_SOURCE_DIR}/src)
target_link_libraries(utf8-test PUBLIC hs gtest gmock)

steps to reproduce

# on project dir
mkdir build
cd build
cmake ..
cmake --build .
./utf8-test

cmake output

-- The C compiler identification is GNU 9.5.0
-- The CXX compiler identification is GNU 9.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/local/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/local/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Deprecation Warning at build/_deps/vectorscan-src/CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.

Update the VERSION argument value or use a ... suffix to tell
CMake that the project does not need compatibility with older versions.

-- Performing Test ARCH_X86_64
-- Performing Test ARCH_X86_64 - Failed
-- Performing Test ARCH_IA32
-- Performing Test ARCH_IA32 - Failed
-- Performing Test ARCH_AARCH64
-- Performing Test ARCH_AARCH64 - Success
-- Performing Test ARCH_ARM32
-- Performing Test ARCH_ARM32 - Failed
-- Performing Test ARCH_PPC64EL
-- Performing Test ARCH_PPC64EL - Failed
-- Default build type 'Release with debug info'
-- using release build
-- Boost version: 1.67.0
-- Found Python: /usr/bin/python3.7 (found version "3.7.3") found components: Interpreter
-- Build date: 2022-10-31
-- Building static libraries
-- gcc version 9.5.0
CMake Warning at build/_deps/vectorscan-src/CMakeLists.txt:184 (message):
Something went wrong determining gcc tune: -mtune=armv8-a not valid,
falling back to -mtune=native

-- ARCH_C_FLAGS :
-- ARCH_CXX_FLAGS :
-- g++ version 9.5.0
-- Looking for include file unistd.h
-- Looking for include file unistd.h - found
-- Looking for C++ include arm_neon.h
-- Looking for C++ include arm_neon.h - found
-- Looking for posix_memalign
-- Looking for posix_memalign - found
-- Looking for _aligned_malloc
-- Looking for _aligned_malloc - not found
-- Performing Test HAS_C_HIDDEN
-- Performing Test HAS_C_HIDDEN - Success
-- Performing Test HAS_CXX_HIDDEN
-- Performing Test HAS_CXX_HIDDEN - Success
-- Looking for _LIBCPP_VERSION
-- Looking for _LIBCPP_VERSION - not found
-- generator is Unix Makefiles
-- Performing Test HAS_C_ATTR_IFUNC
-- Performing Test HAS_C_ATTR_IFUNC - Success
-- Performing Test HAVE_NEON
-- Performing Test HAVE_NEON - Success
-- Performing Test HAVE_CC_BUILTIN_ASSUME_ALIGNED
-- Performing Test HAVE_CC_BUILTIN_ASSUME_ALIGNED - Success
-- Performing Test HAVE_CXX_BUILTIN_ASSUME_ALIGNED
-- Performing Test HAVE_CXX_BUILTIN_ASSUME_ALIGNED - Success
-- Performing Test HAVE__BUILTIN_CONSTANT_P
-- Performing Test HAVE__BUILTIN_CONSTANT_P - Success
-- Performing Test C_FLAG_Wvla
-- Performing Test C_FLAG_Wvla - Success
-- Performing Test C_FLAG_Wpointer_arith
-- Performing Test C_FLAG_Wpointer_arith - Success
-- Performing Test C_FLAG_Wstrict_prototypes
-- Performing Test C_FLAG_Wstrict_prototypes - Success
-- Performing Test C_FLAG_Wmissing_prototypes
-- Performing Test C_FLAG_Wmissing_prototypes - Success
-- Performing Test CXX_FLAG_Wvla
-- Performing Test CXX_FLAG_Wvla - Success
-- Performing Test CXX_FLAG_Wpointer_arith
-- Performing Test CXX_FLAG_Wpointer_arith - Success
-- Performing Test CC_SELF_ASSIGN
-- Performing Test CC_SELF_ASSIGN - Failed
-- Performing Test CXX_SELF_ASSIGN
-- Performing Test CXX_SELF_ASSIGN - Failed
-- Performing Test CC_PAREN_EQUALITY
-- Performing Test CC_PAREN_EQUALITY - Failed
-- Performing Test CXX_UNUSED_CONST_VAR
-- Performing Test CXX_UNUSED_CONST_VAR - Success
-- Performing Test CXX_IGNORED_ATTR
-- Performing Test CXX_IGNORED_ATTR - Success
-- Performing Test CXX_REDUNDANT_MOVE
-- Performing Test CXX_REDUNDANT_MOVE - Success
-- Performing Test CXX_WEAK_VTABLES
-- Performing Test CXX_WEAK_VTABLES - Failed
-- Performing Test CXX_MISSING_DECLARATIONS
-- Performing Test CXX_MISSING_DECLARATIONS - Success
-- Performing Test CXX_UNUSED_LOCAL_TYPEDEFS
-- Performing Test CXX_UNUSED_LOCAL_TYPEDEFS - Success
-- Performing Test CXX_WUNUSED_VARIABLE
-- Performing Test CXX_WUNUSED_VARIABLE - Success
-- Performing Test CC_STRINGOP_OVERFLOW
-- Performing Test CC_STRINGOP_OVERFLOW - Success
-- Building for current host CPU: -march=armv8-a -mtune=native
-- Looking for mmap
-- Looking for mmap - found
-- Doxygen not found, unable to generate API reference
-- Sphinx not found, unable to generate developer reference
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29")
-- Checking for module 'libpcre>=8.41'
-- No package 'libpcre' found
-- PCRE version 8.41 or above not found
-- PCRE 8.41 or above not found
-- Could not find libpcap - some examples will not be built
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found Boost: /usr/include (found version "1.67.0")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/skywo/workzone/vectorscan-utf8-test/build

result snapshot

[image: result snapshot]

Build fails on x86

Build fails on 32-bit x86 with:

[  380s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.8/src/util/state_compress.c:579:12: error: incompatible types when assigning to type 'm512' {aka '__vector(8) long long int'} from type 'int'

macOS support

Is there any chance we could support building/installing vectorscan under macOS?

hyperscan supports building the project under macOS using clang, and many developers use macOS as their dev machine, but for this project, it seems hard coded to use gcc/Linux, which makes it not easy to get people to start using it.

Illegal instruction on startup when running rspamd on aarch64 on Alpine Linux

It ran fine in Alpine 3.14 when it was first released, but some change led to this error. I originally could not stop it.

It worked on Alpine Edge at that time, but now the same issue is happening on Edge.

It does not create a core dump when it crashes like this; the log output is:

2021-10-07 06:43:49 #5913(main) lua; lua_cfg_transform.lua:161: group excessqp has no symbols
2021-10-07 06:43:49 #5913(main) lua; lua_cfg_transform.lua:161: group excessb64 has no symbols
2021-10-07 06:43:49 #5913(main) <>; lua; lua_cfg_transform.lua:511: enable `options.check_all_filters` for neural network
2021-10-07 06:43:49 #5913(main) <>; lua; lua_cfg_transform.lua:569: converted surbl rules to rbl rules
2021-10-07 06:43:49 #5913(main) <>; lua; lua_cfg_transform.lua:583: converted emails rules to rbl rules
2021-10-07 06:43:49 #5913(main) cfg; rspamd_rcl_maybe_apply_lua_transform: configuration has been transformed in Lua
2021-10-07 06:43:49 #5913(main) <1wtzg4>; cfg; rspamd_config_set_action_score: action add header has been already registered with priority 0, override it with new priority: 0, old score: nan
2021-10-07 06:43:49 #5913(main) <1wtzg4>; cfg; rspamd_config_set_action_score: action reject has been already registered with priority 0, override it with new priority: 0, old score: nan
2021-10-07 06:43:49 #5913(main) <1wtzg4>; cfg; rspamd_config_set_action_score: action greylist has been already registered with priority 0, override it with new priority: 0, old score: nan
2021-10-07 06:43:49 #5913(main) rspamd_regexp_library_init: pcre2 is compiled with JIT for ARM-64 64bit (little endian + unaligned)
2021-10-07 06:43:49 #5913(main) <1wtzg4>; lua; init.lua:58: controller plugin neural; register webui path learn
2021-10-07 06:43:49 #5913(main) <1wtzg4>; lua; init.lua:58: controller plugin maps; register webui path query
2021-10-07 06:43:49 #5913(main) <1wtzg4>; lua; init.lua:58: controller plugin maps; register webui path list
2021-10-07 06:43:49 #5913(main) <1wtzg4>; lua; init.lua:58: controller plugin maps; register webui path query_specific
2021-10-07 06:43:49 #5913(main) <1wtzg4>; lua; init.lua:58: controller plugin selectors; register webui path list_transforms
2021-10-07 06:43:49 #5913(main) <1wtzg4>; lua; init.lua:58: controller plugin selectors; register webui path check_message
2021-10-07 06:43:49 #5913(main) <1wtzg4>; lua; init.lua:58: controller plugin selectors; register webui path list_extractors
2021-10-07 06:43:49 #5913(main) <1wtzg4>; lua; init.lua:58: controller plugin selectors; register webui path check_selector

There is more info here:
https://gitlab.alpinelinux.org/alpine/aports/-/issues/12822

Now if I disable vectorscan support it runs without error, so it appears to be related to vectorscan even though the version hasn't changed.

Sanitizers fail a lot, known?

Hi, we decided to run ASan with vectorscan, and it fails in various places.

Is that a known issue? I believe we tested hyperscan and the tests were generally fine; however, with some rewrites it's hard to say what changed.

As an example:

[ RUN      ] HyperscanTestBehaviour.SerializedDogfood1
=================================================================
==3279==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7f55c3b63c1f at pc 0x7f55c248feb2 bp 0x7ffcfbca82a0 sp 0x7ffcfbca8298
READ of size 16 at 0x7f55c3b63c1f thread T0
    #0 0x7f55c248feb1 in loadu [src/util/supervector/arch/x86/impl.cpp:502]:12
    #1 0x7f55c248feb1 in scanDoubleUnaligned<(unsigned short)16> [src/hwlm/noodle_engine_simd.hpp:98]:24
    #2 0x7f55c248feb1 in scanDoubleMain<(unsigned short)16> [src/hwlm/noodle_engine_simd.hpp:204]:12
    #3 0x7f55c248feb1 in scanDouble [src/hwlm/noodle_engine_simd.hpp:230]:12
    #4 0x7f55c248feb1 in scan [src/hwlm/noodle_engine.cpp:118]:16
    #5 0x7f55c248feb1 in noodExec [src/hwlm/noodle_engine.cpp:132]:12
    #6 0x7f55c2f50bdf in pureLiteralBlockExec [src/runtime.c:218]:5
    #7 0x7f55c2f50bdf in hs_scan [src/runtime.c:422]:9
    #8 0x7f55c3cf59aa in (anonymous namespace)::HyperscanTestBehaviour_SerializedDogfood1_Test::TestBody() [unit/hyperscan/behaviour.cpp:651]:11
    #9 0x7f5576ad39d4 in testing::Test::Run() [googletest/src/gtest.cc:2731]:5
    #10 0x7f5576ad64bb in testing::TestInfo::Run() [googletest/src/gtest.cc:2910]:11
    #11 0x7f5576ad882b in testing::TestSuite::Run() [googletest/src/gtest.cc:3069]:30
    #12 0x7f5576b0a124 in testing::internal::UnitTestImpl::RunAllTests() [googletest/src/gtest.cc:5942]:44
    #13 0x7f5576b08fe9 in testing::UnitTest::Run() [googletest/src/gtest.cc:5511]:10
    #14 0x55fd78fbf84f in RUN_ALL_TESTS [googletest/include/gtest/gtest.h:2326]:46
    #15 0x55fd78fbf84f in main [testing/base/internal/gunit_main.cc:83]:10

0x7f55c3b63c1f is located 33 bytes to the left of global variable '<string literal>' defined in (0x7f55c3b63c40) of size 2
  '<string literal>' is ascii string '<'
0x7f55c3b63c1f is located 7 bytes to the right of global variable '<string literal>' defined in '[unit/hyperscan/behaviour.cpp:646]:24' (0x7f55c3b63c00) of size 24
  '<string literal>' is ascii string 'delicious puppy treats!'
SUMMARY: AddressSanitizer: global-buffer-overflow [src/util/supervector/arch/x86/impl.cpp:502]:12 in loadu
Shadow bytes around the buggy address:
  0x0feb38764730: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x0feb38764740: 07 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x0feb38764750: 01 f9 f9 f9 f9 f9 f9 f9 00 06 f9 f9 00 06 f9 f9
  0x0feb38764760: 00 00 01 f9 f9 f9 f9 f9 07 f9 f9 f9 00 07 f9 f9
  0x0feb38764770: 00 f9 f9 f9 00 01 f9 f9 00 00 00 07 f9 f9 f9 f9
=>0x0feb38764780: 00 00 00[f9]f9 f9 f9 f9 02 f9 f9 f9 00 00 00 00
  0x0feb38764790: 00 00 00 00 00 00 00 00 07 f9 f9 f9 f9 f9 f9 f9
  0x0feb387647a0: 00 00 00 00 00 00 00 00 01 f9 f9 f9 f9 f9 f9 f9
  0x0feb387647b0: 00 00 04 f9 f9 f9 f9 f9 00 00 04 f9 f9 f9 f9 f9
  0x0feb387647c0: 04 f9 f9 f9 00 05 f9 f9 00 03 f9 f9 07 f9 f9 f9
  0x0feb387647d0: 00 04 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==3279==ABORTING

build with mingw-w64?

It would be great if this fork supported compilation with mingw-w64.

UPDATE: With some minor modifications, we've managed to compile simplegrep.exe on OSX+mingw-w64. This requires a minor change in src/util/alloc.cpp to use __mingw_aligned_malloc() instead of posix_memalign() and __mingw_aligned_free() instead of free() in aligned_free_internal.

Will you consider incorporating these changes if we create a PR? There are still other issues to resolve (such as getting it to properly use AVX2 instead of only working with SSE), but if this fork doesn't officially support mingw-w64, this change would at least provide a way to get an unsupported minimal working version with this toolchain, without impacting other builds.
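The allocation change described above could look roughly like the following. This is only a sketch: the wrapper names aligned_malloc_sketch and aligned_free_sketch are hypothetical and are not the functions in src/util/alloc.cpp. Only __mingw_aligned_malloc(), __mingw_aligned_free(), posix_memalign() and free() are real library calls, and the __MINGW32__ check is assumed to be an acceptable way to detect the toolchain.

```cpp
#include <cassert>
#include <stdint.h>
#include <stdlib.h>
#if defined(__MINGW32__)
#include <malloc.h>  // __mingw_aligned_malloc, __mingw_aligned_free
#endif

// Hypothetical wrapper: allocate `size` bytes aligned to `align`.
// `align` must be a power of two and at least sizeof(void *).
void *aligned_malloc_sketch(size_t size, size_t align) {
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);
#if defined(__MINGW32__)
    // mingw-w64 has no posix_memalign(); use its own aligned allocator.
    return __mingw_aligned_malloc(size, align);
#else
    void *mem = nullptr;
    if (posix_memalign(&mem, align, size) != 0) {
        return nullptr;
    }
    return mem;
#endif
}

// Hypothetical wrapper: memory must be released by the matching allocator.
void aligned_free_sketch(void *ptr) {
#if defined(__MINGW32__)
    __mingw_aligned_free(ptr);
#else
    free(ptr);
#endif
}
```

The point of keeping both paths behind one pair of functions is that memory from __mingw_aligned_malloc() must not be passed to plain free(), so allocation and release have to switch together.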

Allow to cancel hs_scan*()

We (ClickHouse) recently encountered some patterns which are extremely expensive to evaluate with vector/hyperscan, for example bounded repeats "x{n,m}" (these are also documented as being expensive). As a mitigation, we now check patterns on a best-effort basis and reject them when they will likely be expensive.

A better solution would be to either

  • add a new method to vector/hyperscan that predicts runtime costs ("fast"/"slow" will be sufficient), or
  • (the preferred alternative) allow canceling the scan. Functions hs_scan_*() (*) are provided callbacks which can stop the scan but they are only called when a match is found. Ideally, a second callback can be provided which is called regularly (every N "steps" - whatever that means in the context of vectorscan). I know that vectorscan attempts to stay API-compatible with hyperscan, so these callbacks could be added as new parameters with default value.

EDIT: Just noticed that pattern compilation, i.e. hs_compile_multi(), becomes slow (not: the scan). A callback for canceling hs_compile_*() would be great.

(*) ClickHouse actually only uses block mode, not streaming or vector modes.
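As an illustration of the best-effort pattern check mentioned in this request, a pre-filter could look for bounded repeats with large bounds before a pattern is handed to the compiler. This is a minimal sketch under stated assumptions: the function name, the threshold parameter and the heuristic itself are hypothetical and are not ClickHouse's actual check. It also ignores character classes, so a literal '{' inside [...] may be misread.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical heuristic: returns true if `pattern` contains a bounded
// repeat {n}, {n,} or {n,m} whose largest bound exceeds `max_bound`.
bool has_large_bounded_repeat(const std::string &pattern, unsigned long max_bound) {
    const unsigned long cap = 1000000UL;  // saturate to avoid overflow
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        if (pattern[i] == '\\') {  // skip the escaped character
            ++i;
            continue;
        }
        if (pattern[i] != '{') {
            continue;
        }
        std::size_t j = i + 1;
        unsigned long lo = 0, hi = 0;
        bool have_lo = false, have_hi = false;
        while (j < pattern.size() && std::isdigit(static_cast<unsigned char>(pattern[j]))) {
            lo = std::min(lo * 10 + static_cast<unsigned long>(pattern[j] - '0'), cap);
            have_lo = true;
            ++j;
        }
        if (j < pattern.size() && pattern[j] == ',') {
            ++j;
            while (j < pattern.size() && std::isdigit(static_cast<unsigned char>(pattern[j]))) {
                hi = std::min(hi * 10 + static_cast<unsigned long>(pattern[j] - '0'), cap);
                have_hi = true;
                ++j;
            }
        }
        // Not a well-formed quantifier: treat '{' as a literal and move on.
        if (!have_lo || j >= pattern.size() || pattern[j] != '}') {
            continue;
        }
        const unsigned long largest = have_hi ? std::max(lo, hi) : lo;
        if (largest > max_bound) {
            return true;
        }
        i = j;  // continue after the closing brace
    }
    return false;
}
```

A caller would reject or warn on patterns for which this returns true. Such a filter can only be approximate, which is why a cancellation callback is the preferred option in this request.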

Support for PCRE2

While building an RPM with pcre2-devel on a RHEL-based distro:

-- Checking for module 'libpcre>=8.41'
-- Package 'libpcre', required by 'virtual:world', not found
-- PCRE version 8.41 or above not found
-- PCRE 8.41 or above not found

but with the older pcre-devel it detects PCRE correctly:

-- Checking for module 'libpcre>=8.41'
-- Found libpcre, version 8.42
-- PCRE version 8.41 or above

Is it possible to support the newer library?

different content of build/bin folder on x86 and aarch64

on an aarch64 system (graviton3 server) the build/bin folder contains the following files

  1. benchmarks
  2. hscheck
  3. simplegrep
  4. unit-hyperscan
  5. unit-internal

on an x86 system (icelake server) the build/bin folder contains the following files

  1. hscheck
  2. simplegrep
  3. unit-hyperscan

I'd like to compare performance under several scenarios on both systems.

Is this intended behavior? I can see that there are instructions to make the file in build/benchmarks, but I haven't looked deeper into it yet.
Thanks in advance.

build fails when AVX512 is enabled

Build fails with:

<>/src/util/arch/x86/simd_utils.h:655:16: warning: implicit declaration of function ‘set512_64’; did you mean ‘set2x64’? [-Wimplicit-function-declaration]
655 | m512 idx = set512_64(3ULL, 2ULL, 1ULL, 0ULL, 7ULL, 6ULL, 5ULL, 4ULL);

Seems there is a helper missing. Hyperscan 5.4.0 has it.

Full log with further failure here:
https://launchpadlibrarian.net/519994259/buildlog_ubuntu-focal-amd64.vectorscan_5.4.0-0ubuntu1~focal_BUILDING.txt.gz

Build fails with glibc >= 2.34

Build fails with glibc >= 2.34 with:

[ 6426s] [ 91%] Building CXX object tools/hscollider/CMakeFiles/hscollider.dir/sig.cpp.o
[ 6426s] cd /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/build/tools/hscollider && /usr/bin/c++  -I/home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/tools/hscollider -I/home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/build/tools/hscollider -I/home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/tools/hscollider -I/home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/src -I/home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/build -I/home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7 -I/home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/tools/src -I/home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/util -isystem /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/include -mbranch-protection=standard -O2 -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=3 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type  -g -march=armv8-a+crypto+crc -mtune=native  -O3 -std=c++17 -Wall -Wextra -Wshadow -Wswitch -Wreturn-type -Wcast-qual -Wno-deprecated -Wnon-virtual-dtor -fno-strict-aliasing -fno-new-ttp-matching -DNDEBUG -Wno-maybe-uninitialized -Wno-abi -fno-omit-frame-pointer -fvisibility=hidden -Wvla -Wpointer-arith -Wno-unused-const-variable -Wno-ignored-attributes -Wno-redundant-move -Wmissing-declarations  -g -DNDEBUG -MD -MT tools/hscollider/CMakeFiles/hscollider.dir/sig.cpp.o -MF CMakeFiles/hscollider.dir/sig.cpp.o.d -o CMakeFiles/hscollider.dir/sig.cpp.o -c /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/tools/hscollider/sig.cpp
[ 6428s] In file included from /usr/include/signal.h:328,
[ 6428s]                  from /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/tools/hscollider/sig.cpp:40:
[ 6428s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.7/tools/hscollider/sig.cpp:169:40: error: size of array 'alt_stack_loc' is not an integral constant-expression
[ 6428s]   169 | static TLS_VARIABLE char alt_stack_loc[SIGSTKSZ];
[ 6428s]       |                                        ^~~~~~~~

This was found on openSUSE Tumbleweed aarch64.
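For context, since glibc 2.34 SIGSTKSZ may expand to a run-time sysconf() call, so it can no longer be used as a static array size. One possible way around this, sketched below, is to allocate the alternate signal stack at run time. This is an assumption-laden illustration and not the fix applied in the project: setup_alt_stack and alt_stack_mem are hypothetical names.

```cpp
#include <cassert>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical replacement for a fixed-size `char alt_stack_loc[SIGSTKSZ]`:
// the buffer is allocated per thread once the size is known at run time.
static thread_local void *alt_stack_mem = nullptr;

// Allocates and installs an alternate signal stack for the calling thread.
// Returns true on success.
bool setup_alt_stack() {
    const size_t size = static_cast<size_t>(SIGSTKSZ);  // may be a sysconf() call
    assert(size > 0);
    alt_stack_mem = malloc(size);
    if (alt_stack_mem == nullptr) {
        return false;
    }
    stack_t ss;
    ss.ss_sp = alt_stack_mem;
    ss.ss_size = size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, nullptr) != 0) {
        free(alt_stack_mem);
        alt_stack_mem = nullptr;
        return false;
    }
    return true;
}
```

The trade-off is an extra heap allocation per thread, in exchange for not depending on SIGSTKSZ being a compile-time constant.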

building with clang for alpine fails on x86_64 and x86 with linking errors

I tried building with clang as gcc tunes too much for arm64 so it fails at runtime with rspamd.

When attempting to build with clang 12.0.1, there are also lots of nm errors on x86_64 and x86 about finding glibc:

'libc.so.6': No such file
nm: 'libc.so.6': No such file

Then at the end it has lots of linker errors, aarch64 compiles fine with clang.

https://gitlab.alpinelinux.org/a16bitsysop/aports/-/jobs/611687

When using gcc 11.2.1 on arm64 (https://gitlab.alpinelinux.org/a16bitsysop/aports/-/jobs/605071/raw), it sets these:

-- gcc version 11.2.1
CMake Warning at CMakeLists.txt:185 (message):
  Something went wrong determining gcc tune:
  -mtune=armv8.2-a+crypto+fp16+rcpc+dotprod+ssbs not valid, falling back to
  -mtune=native

then further down
-- Building for current host CPU: -march=armv8.2-a+crypto+fp16+rcpc+dotprod+ssbs -mtune=native

So I have created a temporary patch:
https://gitlab.alpinelinux.org/alpine/aports/-/blob/1d08133a1b57f07565f7a679210c24db09681a32/community/vectorscan/armv8.patch

So I can build with gcc.

shuftiExec() and truffleExec() are vectorscan code hot spot for Snort3 on arm

Vectorscan version: v5.4.2+vectorscan
Hyperscan version: v5.4.2

When Snort3 inspects the same .pcap file with the same configuration, the profiled time of the vectorscan module on arm (around 2900 milliseconds) is almost twice that of hyperscan on x86 (around 1600 milliseconds). According to the perf analysis, shuftiExec() and truffleExec() are the vectorscan hot spots on arm, which is not the case for hyperscan on x86. Any ideas or suggestions for vectorscan here? Thanks.

Snort 3 with Arm vectorscan perf result:

[image: arm]

Snort 3 with x86 hyperscan perf result:

[image: x86]

build failure on x86

Build fails for x86 when building on Alpine Linux, log here:
https://gitlab.alpinelinux.org/a16bitsysop/aports/-/jobs/278030/raw

/builds/a16bitsysop/aports/community/hyperscan/src/hyperscan-5.3.1/src/util/arch/x86/simd_utils.h: In function 'movq':
/builds/a16bitsysop/aports/community/hyperscan/src/hyperscan-5.3.1/src/util/arch/x86/simd_utils.h:128:12: error: implicit declaration of function '_mm_cvtsi128_si64'; did you mean '_mm_cvtsi128_si32'? [-Werror=implicit-function-declaration]
  128 |     return _mm_cvtsi128_si64(in);
      |            ^~~~~~~~~~~~~~~~~
      |            _mm_cvtsi128_si32

Version comparison between that of Intel Hyperscan

I see that the latest version of vectorscan is 5.4.2+vectorscan (is there anything newer?). I am curious which Intel Hyperscan release (https://github.com/intel/hyperscan) it corresponds to.

There seem to be some code differences between v5.4.2+vectorscan and Intel Hyperscan v5.4.0.
So which release is v5.4.2+vectorscan based on?

cannot build on apple silicon

I am trying to compile the project on my mac (MacBook Pro (14-inch, 2021), Monterey 12.0.1)
I did the following steps:

git clone git@github.com:VectorCamp/vectorscan.git
brew install make
brew install boost
brew install ragel
brew install sqlite

cd vectorscan
mkdir build
cd build
cmake ..

and I am getting:

CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- The C compiler identification is AppleClang 13.0.0.13000027
-- The CXX compiler identification is AppleClang 13.0.0.13000027
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test ARCH_X86_64
-- Performing Test ARCH_X86_64 - Failed
-- Performing Test ARCH_IA32
-- Performing Test ARCH_IA32 - Failed
-- Performing Test ARCH_AARCH64
-- Performing Test ARCH_AARCH64 - Success
-- Performing Test ARCH_ARM32
-- Performing Test ARCH_ARM32 - Failed
-- Default build type 'Release with debug info'
-- using release build
-- Boost version: 1.76.0
-- Found PythonInterp: /usr/bin/python (found version "2.7.18")
-- Build date: 2022-01-05
-- Building static libraries
-- Looking for include file unistd.h
-- Looking for include file unistd.h - found
-- Looking for C++ include arm_neon.h
-- Looking for C++ include arm_neon.h - found
-- Looking for posix_memalign
-- Looking for posix_memalign - found
-- Looking for _aligned_malloc
-- Looking for _aligned_malloc - not found
-- Performing Test HAS_C_HIDDEN
-- Performing Test HAS_C_HIDDEN - Success
-- Performing Test HAS_CXX_HIDDEN
-- Performing Test HAS_CXX_HIDDEN - Success
-- Looking for _LIBCPP_VERSION
-- Looking for _LIBCPP_VERSION - found
-- Performing Test HAVE_NEON
-- Performing Test HAVE_NEON - Success
-- Performing Test HAVE_CC_BUILTIN_ASSUME_ALIGNED
-- Performing Test HAVE_CC_BUILTIN_ASSUME_ALIGNED - Success
-- Performing Test HAVE_CXX_BUILTIN_ASSUME_ALIGNED
-- Performing Test HAVE_CXX_BUILTIN_ASSUME_ALIGNED - Success
-- Performing Test HAVE__BUILTIN_CONSTANT_P
-- Performing Test HAVE__BUILTIN_CONSTANT_P - Success
-- Performing Test C_FLAG_Wvla
-- Performing Test C_FLAG_Wvla - Success
-- Performing Test C_FLAG_Wpointer_arith
-- Performing Test C_FLAG_Wpointer_arith - Success
-- Performing Test C_FLAG_Wstrict_prototypes
-- Performing Test C_FLAG_Wstrict_prototypes - Success
-- Performing Test C_FLAG_Wmissing_prototypes
-- Performing Test C_FLAG_Wmissing_prototypes - Success
-- Performing Test CXX_FLAG_Wvla
-- Performing Test CXX_FLAG_Wvla - Success
-- Performing Test CXX_FLAG_Wpointer_arith
-- Performing Test CXX_FLAG_Wpointer_arith - Success
-- Performing Test CC_SELF_ASSIGN
-- Performing Test CC_SELF_ASSIGN - Success
-- Performing Test CXX_SELF_ASSIGN
-- Performing Test CXX_SELF_ASSIGN - Success
-- Performing Test CC_PAREN_EQUALITY
-- Performing Test CC_PAREN_EQUALITY - Success
-- Performing Test CXX_UNUSED_CONST_VAR
-- Performing Test CXX_UNUSED_CONST_VAR - Success
-- Performing Test CXX_IGNORED_ATTR
-- Performing Test CXX_IGNORED_ATTR - Success
-- Performing Test CXX_REDUNDANT_MOVE
-- Performing Test CXX_REDUNDANT_MOVE - Success
-- Performing Test CXX_WEAK_VTABLES
-- Performing Test CXX_WEAK_VTABLES - Success
-- Performing Test CXX_MISSING_DECLARATIONS
-- Performing Test CXX_MISSING_DECLARATIONS - Success
-- Performing Test CXX_UNUSED_LOCAL_TYPEDEFS
-- Performing Test CXX_UNUSED_LOCAL_TYPEDEFS - Success
-- Performing Test CXX_WUNUSED_VARIABLE
-- Performing Test CXX_WUNUSED_VARIABLE - Success
-- Performing Test CC_STRINGOP_OVERFLOW
-- Performing Test CC_STRINGOP_OVERFLOW - Failed
-- Building for current host CPU: -march= -mtune=native
-- Looking for mmap
-- Looking for mmap - not found
-- Doxygen not found, unable to generate API reference
-- Sphinx not found, unable to generate developer reference
-- Found PkgConfig: //opt/homebrew/bin/pkg-config (found version "0.29.2")
-- Checking for module 'libpcre>=8.41'
--   Found libpcre, version 8.45
-- PCRE version 8.41 or above
-- Looking for pthread.h
-- Looking for pthread.h - not found
-- Could NOT find Threads (missing: Threads_FOUND)
-- Checking for module 'sqlite3'
--   Found sqlite3, version 3.36.0
-- Performing Test SQLITE_VERSION_OK
-- Performing Test SQLITE_VERSION_OK - Failed
CMake Error at cmake/sqlite3.cmake:34 (message):
  sqlite3 is broken from 3.8.7 to 3.8.10 - please find a working version
Call Stack (most recent call first):
  tools/hsbench/CMakeLists.txt:1 (include)

Any chance you know what could be missing, so that I can update the doc?

sqlite is there:

 brew install sqlite
Running `brew update --preinstall`...
==> Auto-updated Homebrew!
Updated 1 tap (homebrew/core).
==> Updated Formulae
Updated 1 formula.

Warning: sqlite 3.37.1 is already installed and up-to-date.
To reinstall 3.37.1, run:
  brew reinstall sqlite

and pthread as well:

ls -la /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/pthread
total 96
drwxr-xr-x   10 root  wheel    320 Jul 22 21:03 .
drwxr-xr-x  295 root  wheel   9440 Jan  5 10:28 ..
-rw-r--r--    1 root  wheel   5885 Jul 22 20:30 introspection.h
-rw-r--r--    1 root  wheel  24358 Jul 22 20:30 pthread.h
-rw-r--r--    1 root  wheel   1977 Jul 18 23:22 pthread_impl.h
-rw-r--r--    1 root  wheel   3630 Jul 22 20:30 pthread_spis.h
-rw-r--r--    1 root  wheel  10256 Jul 22 20:30 qos.h
-rw-r--r--    1 root  wheel   1410 Jul 18 23:22 sched.h
-rw-r--r--    1 root  wheel   2761 Jul 22 20:30 spawn.h
-rw-r--r--    1 root  wheel   2062 Jul 22 20:30 stack_np.h

and finally:

cmake --version
cmake version 3.22.1

msvc toolchain support?

The cmake function CHECK_C_SOURCE_COMPILES has poor support for msvc, but it seems to be rarely documented anywhere.

The current status is that we will fail on the arch.cmake check. I'm not sure if I need to change more than that.

can not build on armv7

I'm trying to build with an armv7 gcc and hit the errors below. It looks like some of the intrinsics used are only available on armv8. Is it possible to make them armv7 compatible?

In file included from /home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/simd_utils.h:67:0,
                 from /home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/common/bitutils.h:38,
                 from /home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/bitutils.h:41,
                 from /home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/bitutils.h:51,
                 from /home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/bitfield.h:38,
                 from /home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/charreach.h:40,
                 from /home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/ue2string.h:37,
                 from /home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/compiler/compiler.h:41,
                 from /home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/hs.cpp:38:
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h: In function 'int diff128(m128, m128)':
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:60:15: error: 'vaddvq_s8' was not declared in this scope
     int res = vaddvq_s8((int8x16_t) vceqq_s32(a, b));
               ^~~~~~~~~
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:60:15: note: suggested alternative: 'vaddq_s8'
     int res = vaddvq_s8((int8x16_t) vceqq_s32(a, b));
               ^~~~~~~~~
               vaddq_s8
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h: In function 'u32 diffrich128(m128, m128)':
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:74:12: error: 'vaddvq_u32' was not declared in this scope
     return vaddvq_u32(vandq_u32(vmvnq_s32(vceqq_s32((int32x4_t)a, (int32x4_t)b)), movemask));
            ^~~~~~~~~~
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:74:12: note: suggested alternative: 'vaddq_u32'
     return vaddvq_u32(vandq_u32(vmvnq_s32(vceqq_s32((int32x4_t)a, (int32x4_t)b)), movemask));
            ^~~~~~~~~~
            vaddq_u32
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h: In function 'u32 diffrich64_128(m128, m128)':
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:83:43: error: 'vceqq_s64' was not declared in this scope
     return vaddvq_u64(vandq_u64(vmvnq_s32(vceqq_s64((int64x2_t)a, (int64x2_t)b)), movemask));
                                           ^~~~~~~~~
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:83:43: note: suggested alternative: 'vceq_p64'
     return vaddvq_u64(vandq_u64(vmvnq_s32(vceqq_s64((int64x2_t)a, (int64x2_t)b)), movemask));
                                           ^~~~~~~~~
                                           vceq_p64
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:83:12: error: 'vaddvq_u64' was not declared in this scope
     return vaddvq_u64(vandq_u64(vmvnq_s32(vceqq_s64((int64x2_t)a, (int64x2_t)b)), movemask));
            ^~~~~~~~~~
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:83:12: note: suggested alternative: 'vaddq_u64'
     return vaddvq_u64(vandq_u64(vmvnq_s32(vceqq_s64((int64x2_t)a, (int64x2_t)b)), movemask));
            ^~~~~~~~~~
            vaddq_u64
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h: In function 'm128 eq64_m128(m128, m128)':
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:121:19: error: 'vceqq_u64' was not declared in this scope
     return (m128) vceqq_u64((int64x2_t)a, (int64x2_t)b);
                   ^~~~~~~~~
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:121:19: note: suggested alternative: 'vceq_p64'
     return (m128) vceqq_u64((int64x2_t)a, (int64x2_t)b);
                   ^~~~~~~~~
                   vceq_p64
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h: In function 'm128 pshufb_m128(m128, m128)':
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:379:18: error: 'vqtbl1q_s8' was not declared in this scope
     return (m128)vqtbl1q_s8((int8x16_t)a, (uint8x16_t)btranslated);
                  ^~~~~~~~~~
/home/ltekken/workref/fortipkg/build/vectorscan/vectorscan-5.4.2-1/armv7/src/util/arch/arm/simd_utils.h:379:18: note: suggested alternative: 'vtbl1_s8'
     return (m128)vqtbl1q_s8((int8x16_t)a, (uint8x16_t)btranslated);
                  ^~~~~~~~~~
                  vtbl1_s8
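The across-vector intrinsics reported as missing here, such as vaddvq_u32, are AArch64-only, so an armv7 build would need them emulated with intrinsics that 32-bit NEON does provide. The sketch below shows one such emulation for a horizontal add of four 32-bit lanes. It is an illustration only, not a patch for simd_utils.h: horizontal_add_u32 is a hypothetical helper name, and the scalar branch exists only so the example also builds on machines without NEON.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical helper: sums the four u32 lanes (modulo 2^32) without
// using vaddvq_u32, which is not available on armv7.
uint32_t horizontal_add_u32(const uint32_t lanes[4]) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint32x4_t q = vld1q_u32(lanes);
    // Add the low and high halves: {l0 + l2, l1 + l3}.
    uint32x2_t half = vadd_u32(vget_low_u32(q), vget_high_u32(q));
    // Pairwise add leaves the full sum in both lanes.
    half = vpadd_u32(half, half);
    return vget_lane_u32(half, 0);
#else
    // Scalar fallback for hosts without NEON.
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
}
```

The other missing intrinsics in the log (vceqq_s64, vceqq_u64, vqtbl1q_s8) would need their own emulations. This sketch covers only the horizontal add.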

Different behavior on x64 and aarch64

I noticed inconsistent matching behavior between x64 and aarch64 for a certain regex and input.

Setup

Machine 1 (x64):

user@ubuntu:~$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

user@ubuntu:~$ ragel -version
Ragel State Machine Compiler version 6.10 March 2017
Copyright (c) 2001-2009 by Adrian Thurston
user@ubuntu:~$ uname -a
Linux ubuntu 5.13.0-39-generic #44~20.04.1-Ubuntu SMP Thu Mar 24 16:43:35 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Machine 2 (aarch64) - QEMU Cortex-A72:

user@aarch64-vm:~$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

user@aarch64-vm:~$ ragel -version
Ragel State Machine Compiler version 6.10 March 2017
Copyright (c) 2001-2009 by Adrian Thurston
user@aarch64-vm:~$ uname -a
Linux aarch64-vm 5.4.0-107-generic #121-Ubuntu SMP Thu Mar 24 16:07:22 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux

Version

Tested with vectorscan built from master and from release 5.4.6; both exhibit the same behavior.

Test program

I compile it using g++.

#include <iostream>
#include <cstdio>
#include <cstring>
#include "hs/hs.h"

// Match callback: record that at least one match was reported.
const auto match_cb = [](auto, auto, auto, auto, void* ctx) {
    *((bool*) ctx) = true;
    return 0;
};

bool vectorscan_scan(const char* regex, const char* input, unsigned int id) {
    hs_database_t* db = nullptr;
    hs_scratch_t* scratch = nullptr;
    hs_compile_error_t* error = nullptr;
    bool is_match = false;
    if (hs_compile(regex, HS_FLAG_DOTALL | HS_FLAG_SINGLEMATCH | HS_FLAG_UTF8 | HS_FLAG_CASELESS, HS_MODE_BLOCK, nullptr, &db, &error) != HS_SUCCESS) {
        printf("ERROR hs_compile(): %s\n", error->message);
        hs_free_compile_error(error);
        return false;
    }
    if (hs_alloc_scratch(db, &scratch) != HS_SUCCESS) {
        printf("ERROR hs_alloc_scratch()\n");
        hs_free_database(db);
        return false;
    }
    // Note: the fourth argument of hs_scan() is its flags parameter, which receives id here.
    hs_error_t err = hs_scan(db, input, strlen(input), id, scratch, match_cb, &is_match);
    if (err != HS_SUCCESS) {
        // No compile error object exists at this point, so report the scan error code.
        printf("ERROR hs_scan(): %d\n", err);
    }
    hs_free_scratch(scratch);
    hs_free_database(db);
    return err == HS_SUCCESS && is_match;
}

int main() {
    char regex[] = R"(schtasks(\.exe)?\s.*\/create.*cscript.*)";
    char input[] = R"(schtasks.exe /create "cscript abcd")";
    if (!vectorscan_scan(regex, input, 1)) {
        printf("FAIL - SHOULD MATCH\n");
    } else {
        printf("SUCCESS - OK\n");
    }
    return 0;
}

The problem

When I run the program on x64, the input is successfully matched; on the ARM machine it is not. I observed that the issue occurs only when both the HS_FLAG_UTF8 and HS_FLAG_CASELESS flags are provided. Sometimes small tweaks to the regex make the problem go away, but I'm unable to pinpoint what exactly triggers this inconsistency.

Unexpected behavior in aarch64

Hello,

First, thank you for taking the time to make arm support possible :)

Second, I have found a case where vectorscan reports a false positive match on ARM aarch64. The same input does not produce a false positive in the original hyperscan on x64.

I have isolated a very small reproducible example with 2 input regexes and a couple of bytes of corpus text. The text that is scanned is:
xxxxxxxxxx?y\nTEXT12345xxxxxxxxxxxx

whereas the two regexes are:

^x\\z*x
y\\z*TEXT12345

The single match is reported as follows:

  • Match id: 1
  • Ending position of match: 23
  • Matched pattern: y\\z*TEXT12345
  • Input from 0 to 23: xxxxxxxxxx?y\nTEXT12345

As far as I know, this should not match.

What I think could help is that the two regexes only produce a match if compiled without the flag HS_FLAG_SOM_LEFTMOST (this is why I only report the ending position of the match). For example, in my tests I was using flags HS_FLAG_DOTALL | HS_FLAG_MULTILINE, but the moment you include HS_FLAG_SOM_LEFTMOST, the match is no longer falsely reported.

Furthermore, if I remove one or more x chars from the end of the input string (even though these are not matched), the match is no longer reported. The same happens with the x chars at the beginning. I know this is a strange example, but it comes from a much larger dataset of inputs and this is the smallest case I could pinpoint. Also note that when the regexes are compiled individually, neither of them produces a match.

The self-contained code of the example (notice the multiple backslashes for the escaping character \\\\):

#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>
#include <hs.h>
#include <cstring>

typedef struct match{
    unsigned int id;
    unsigned int from;
    unsigned int to;
} Match;

int on_match_counter(unsigned int id, unsigned long long from, unsigned long long to, unsigned int flags, void *ctx)
{
    std::vector<Match> * matches = (std::vector<Match> *) ctx;
    Match m;
    m.id = id;
    m.from = from;
    m.to = to;
    matches->push_back(m);
    return 0;
}

int main(int ac, char ** av)
{
    const char input[] = "xxxxxxxxxx?y\\nTEXT12345xxxxxxxxxxxx";

    std::vector<const char *> cstr_patterns;
    std::vector<unsigned> patterns_flags;
    std::vector<unsigned> patterns_ids;
    
    cstr_patterns.push_back("^x\\\\z*x");
    cstr_patterns.push_back("y\\\\z*TEXT12345");

    for(int i=0; i<(int) cstr_patterns.size(); i++)
    {
        patterns_flags.push_back(HS_FLAG_DOTALL | HS_FLAG_MULTILINE); // produces one match
        //patterns_flags.push_back(HS_FLAG_DOTALL | HS_FLAG_MULTILINE | HS_FLAG_SOM_LEFTMOST); // does not produce any
        patterns_ids.push_back(i);
    }


    hs_database_t * db_block = NULL;
    hs_compile_error_t * compile_err = NULL;
    hs_scratch_t * scratch = NULL;

    hs_error_t err = hs_compile_multi(cstr_patterns.data(), patterns_flags.data(),
                    patterns_ids.data(), cstr_patterns.size(), HS_MODE_BLOCK,
                    NULL, &db_block, &compile_err);

    if (err != HS_SUCCESS)
    {
        hs_free_compile_error(compile_err);
        throw std::runtime_error("ERROR: Unable to compile.\n");
    }
    err = hs_alloc_scratch(db_block, &scratch);
    if (err != HS_SUCCESS) {
        hs_free_database(db_block);
        throw std::runtime_error("ERROR: Unable to allocate scratch space. Exiting.\n");
    }

    std::vector<Match> matches;
    err = hs_scan(db_block, input, strlen(input), 0, scratch, on_match_counter, (void *) &matches);
    if (err != HS_SUCCESS) 
        throw std::runtime_error("ERROR: Scanning");
    

    std::cout << "Found " << matches.size() << " match(es) with " << cstr_patterns.size() << " patterns" << std::endl;
    for(int i=0; i<(int)matches.size(); i++)
    {
        std::cout << "Match " << i << "\n\t[id:" << matches.at(i).id << "]@(" << matches.at(i).from << "," << matches.at(i).to << ")" << std::endl;
        std::cout << "\t" << cstr_patterns.at(matches.at(i).id) << std::endl;
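        // substr() takes (position, length), so pass the match length rather than the end offset.
        std::cout << "\t" << std::string(input).substr(matches.at(i).from, matches.at(i).to - matches.at(i).from) << std::endl;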
        
    }
    
    hs_free_database(db_block);
    hs_free_scratch(scratch);

    return 0;
}

I compiled with g++-10 (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0 on x64 and gcc10-g++ (GCC) 10.3.1 20210422 (Red Hat 10.3.1-1) on aarch64. Ragel version is Ragel State Machine Compiler version 6.10 March 2017 for both.

I noticed there was also a recent post with a similar problem here and that maybe this PR fixes the problem. I can try rerunning the test when the PR is merged.

Let me know if there is anything else I can provide. Thank you for your time.

Add Debian packaging

Adding a few issues to keep track of things to do. This one is obvious and should allow Vectorscan to be used as a drop-in replacement in Debian-based distributions.

Why does vectorscan require GCC 9+ while hyperscan requires only GCC 4.8?

I tried to compile vectorscan with GCC 6.5 and got this error:

-- The C compiler identification is GNU 6.5.1
-- The CXX compiler identification is GNU 6.5.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /apsara/alicpp/built/gcc-6.5.1/gcc-6.5.1/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /apsara/alicpp/built/gcc-6.5.1/gcc-6.5.1/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test ARCH_X86_64
-- Performing Test ARCH_X86_64 - Failed
-- Performing Test ARCH_IA32
-- Performing Test ARCH_IA32 - Failed
-- Performing Test ARCH_AARCH64
-- Performing Test ARCH_AARCH64 - Success
-- Performing Test ARCH_ARM32
-- Performing Test ARCH_ARM32 - Failed
-- Performing Test ARCH_PPC64EL
-- Performing Test ARCH_PPC64EL - Failed
-- Default build type 'Release with debug info'
-- using release build
-- Boost version: 1.64.0
-- Found PythonInterp: /usr/bin/python (found version "2.7.5")
-- Build date: 2022-06-01
-- Building static libraries
-- gcc version 6.5.1
CMake Warning at CMakeLists.txt:187 (message):
Something went wrong determining gcc tune: -mtune=ARCH not valid, falling
back to -mtune=native
-- ARCH_C_FLAGS   :
-- ARCH_CXX_FLAGS :
-- g++ version 6.5.1
CMake Error at CMakeLists.txt:273 (message):
A minimum of g++ 9 is required for C++17 support
-- Configuring incomplete, errors occurred!

It seems odd that hyperscan supports GCC 4.8 while vectorscan raised its minimum requirement to GCC 9+ with this commit.

Could anyone explain this?

Installation

Hi, I know this isn't the place to ask about this, but I'm struggling to understand how to install vectorscan. Does it require hyperscan to be installed, or is it standalone?
Also, I don't want to cross-compile, as I'm trying to install it directly on my Raspberry Pi 4 running DietPi OS (Debian-based, 64-bit ARMv8, Bullseye). How should I go about configuring cmake/setenv-arm64-cross.sh as mentioned in the README?

Thank you

the actual type of char is different on different platforms

On Android devices, plain char is unsigned, unlike on a PC. The function unique_ptr parse(const char *ptr, ParseMode &globalMode) reads each byte through a char pointer and assigns it to a variable of type short. On Android devices this step can produce unexpected results, so the type should be int8_t rather than char.
After I replaced those "char" with "int8_t" or "uint8_t", vectorscan works fine on my Android device.
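
The effect described above can be reproduced in isolation. The following is a minimal sketch, not code from the Vectorscan parser: the helper names are hypothetical, and it assumes a two's-complement platform. It shows that widening a byte with the high bit set through plain char gives a platform-dependent value, while going through int8_t or uint8_t gives the same value everywhere.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical helpers for illustration; not taken from the Vectorscan parser.

// Widening through plain char: the result depends on whether the platform
// treats char as signed (typical on x86 Linux) or unsigned (typical on ARM).
short widen_plain(const char *p) { return *p; }

// Widening through an explicitly signed 8-bit type: same result everywhere.
short widen_signed(const char *p) { return static_cast<std::int8_t>(*p); }

// Widening through an explicitly unsigned 8-bit type: same result everywhere.
short widen_unsigned(const char *p) { return static_cast<std::uint8_t>(*p); }

// True when plain char is signed for the target this was compiled for.
constexpr bool plain_char_is_signed() { return CHAR_MIN < 0; }
```

For the byte 0x80, widen_signed yields -128 and widen_unsigned yields 128 on every platform, whereas widen_plain yields whichever of the two matches the platform's char signedness.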

Unable to compile by Visual Studio 2017

Is the Windows platform supported at all by this fork of intel/hyperscan?
When I tried to build it with Visual Studio 2017 for x86_64,
a lot of errors were generated:

c:\devl\hs2\1\src\util/arch/common/bitutils.h(42): error C3861: '__builtin_clz': identifier not found [C:\devl\hs2\1\build\hs_compile.vcxproj]
c:\devl\hs2\1\src\util/arch/common/bitutils.h(47): error C3861: '__builtin_clzll': identifier not found [C:\devl\hs2\1\build\hs_compile.vcxproj]
c:\devl\hs2\1\src\util/arch/common/bitutils.h(53): error C3861: '__builtin_ctz': identifier not found [C:\devl\hs2\1\build\hs_compile.vcxproj]
c:\devl\hs2\1\src\util/arch/common/bitutils.h(58): error C3861: '__builtin_ctzll': identifier not found [C:\devl\hs2\1\build\hs_compile.vcxproj]
c:\devl\hs2\1\src\util/arch/common/bitutils.h(407): error C3861: '__builtin_popcountl': identifier not found [C:\devl\hs2\1\build\hs_compile.vcxproj]
c:\devl\hs2\1\src\util/arch/common/bitutils.h(413): error C3861: '__builtin_clzl': identifier not found [C:\devl\hs2\1\build\hs_compile.vcxproj]
c:\devl\hs2\1\src\util/arch/x86/bitutils.h(146): error C3861: 'ctz64': identifier not found [C:\devl\hs2\1\build\hs_compile.vcxproj]

Moreover, if I work around the above errors with a couple of fixes in the files

src/util/arch/common/bitutils.h
src/util/arch/x86/bitutils.h

then it is possible to proceed with compilation, but the build later fails anyway with the following errors:

C:\devl\hs2\1\src\util/arch/x86/simd_utils.h(56): error C2440: 'type cast': cannot convert from '__m128i' to 'm128' [C:\devl\hs2\1\build\hs_exec.vcxproj]
C:\devl\hs2\1\src\util/arch/x86/simd_utils.h(61): error C2440: 'type cast': cannot convert from '__m128i' to 'm128' [C:\devl\hs2\1\build\hs_exec.vcxproj]
C:\devl\hs2\1\src\fdr\teddy.c(1073): error C2143: syntax error: missing ')' before '(' [C:\devl\hs2\1\build\hs_exec.vcxproj]
...

Please advise.
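
The first group of errors above comes from GCC/Clang builtins (__builtin_clz, __builtin_ctz and friends) that MSVC does not provide. As a rough sketch of the kind of workaround mentioned, assuming hypothetical wrapper names that are not the ones used in Vectorscan's bitutils.h, the builtins can be hidden behind small functions that fall back to the MSVC _BitScanReverse/_BitScanForward intrinsics. This addresses only the missing identifiers, not the later m128 cast errors.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Hypothetical wrapper names for illustration only.
// All functions require x != 0, like the GCC builtins, whose result is
// undefined for a zero argument.

// Count leading zero bits of a 32-bit value.
static inline unsigned portable_clz32(std::uint32_t x) {
    assert(x != 0);
#if defined(_MSC_VER)
    unsigned long idx;
    _BitScanReverse(&idx, x);              // index of the highest set bit
    return 31u - static_cast<unsigned>(idx);
#else
    return static_cast<unsigned>(__builtin_clz(x));
#endif
}

// Count trailing zero bits of a 32-bit value.
static inline unsigned portable_ctz32(std::uint32_t x) {
    assert(x != 0);
#if defined(_MSC_VER)
    unsigned long idx;
    _BitScanForward(&idx, x);              // index of the lowest set bit
    return static_cast<unsigned>(idx);
#else
    return static_cast<unsigned>(__builtin_ctz(x));
#endif
}

// Count trailing zero bits of a 64-bit value (64-bit MSVC targets only).
static inline unsigned portable_ctz64(std::uint64_t x) {
    assert(x != 0);
#if defined(_MSC_VER)
    unsigned long idx;
    _BitScanForward64(&idx, x);
    return static_cast<unsigned>(idx);
#else
    return static_cast<unsigned>(__builtin_ctzll(x));
#endif
}
```

Only the GCC/Clang branch is exercised when this is built on Linux; the MSVC branch is shown as an assumption about how the same interface could be provided there.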

Building fails on macos/x64 with 'mktemp: illegal option -- p'

Building 73695e4 on x64 macos 12.4 (Monterey) fails with the default build instructions, e.g.

$ cmake ../vectorscan
CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.

-- The C compiler identification is AppleClang 13.1.6.13160021
-- The CXX compiler identification is AppleClang 13.1.6.13160021
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test ARCH_X86_64
-- Performing Test ARCH_X86_64 - Success
-- Performing Test ARCH_IA32
-- Performing Test ARCH_IA32 - Failed
-- Performing Test ARCH_AARCH64
-- Performing Test ARCH_AARCH64 - Failed
-- Performing Test ARCH_ARM32
-- Performing Test ARCH_ARM32 - Failed
-- Performing Test ARCH_PPC64EL
-- Performing Test ARCH_PPC64EL - Failed
-- Default build type 'Release with debug info'
-- using release build
-- Boost version: 1.78.0
-- Found Python: /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/bin/python3.8 (found version "3.8.9") found components: Interpreter 
-- Build date: 2022-06-19
-- Building static libraries
-- clang will tune for native, generic
-- ARCH_C_FLAGS   : -msse4.2
-- ARCH_CXX_FLAGS : -msse4.2
-- Looking for include file unistd.h
-- Looking for include file unistd.h - found
-- Looking for include file intrin.h
-- Looking for include file intrin.h - not found
-- Looking for C++ include intrin.h
-- Looking for C++ include intrin.h - not found
-- Looking for include file x86intrin.h
-- Looking for include file x86intrin.h - found
-- Looking for C++ include x86intrin.h
-- Looking for C++ include x86intrin.h - found
-- Looking for posix_memalign
-- Looking for posix_memalign - found
-- Looking for _aligned_malloc
-- Looking for _aligned_malloc - not found
-- Performing Test HAS_C_HIDDEN
-- Performing Test HAS_C_HIDDEN - Success
-- Performing Test HAS_CXX_HIDDEN
-- Performing Test HAS_CXX_HIDDEN - Success
-- Looking for _LIBCPP_VERSION
-- Looking for _LIBCPP_VERSION - found
-- Performing Test HAVE_SSE42
-- Performing Test HAVE_SSE42 - Success
-- Performing Test HAVE_AVX2
-- Performing Test HAVE_AVX2 - Success
-- Performing Test HAVE_AVX512
-- Performing Test HAVE_AVX512 - Failed
-- Performing Test HAVE_AVX512VBMI
-- Performing Test HAVE_AVX512VBMI - Failed
-- Performing Test HAVE_CC_BUILTIN_ASSUME_ALIGNED
-- Performing Test HAVE_CC_BUILTIN_ASSUME_ALIGNED - Success
-- Performing Test HAVE_CXX_BUILTIN_ASSUME_ALIGNED
-- Performing Test HAVE_CXX_BUILTIN_ASSUME_ALIGNED - Success
-- Performing Test C_FLAG_Wvla
-- Performing Test C_FLAG_Wvla - Success
-- Performing Test C_FLAG_Wpointer_arith
-- Performing Test C_FLAG_Wpointer_arith - Success
-- Performing Test C_FLAG_Wstrict_prototypes
-- Performing Test C_FLAG_Wstrict_prototypes - Success
-- Performing Test C_FLAG_Wmissing_prototypes
-- Performing Test C_FLAG_Wmissing_prototypes - Success
-- Performing Test CXX_FLAG_Wvla
-- Performing Test CXX_FLAG_Wvla - Success
-- Performing Test CXX_FLAG_Wpointer_arith
-- Performing Test CXX_FLAG_Wpointer_arith - Success
-- Performing Test CC_SELF_ASSIGN
-- Performing Test CC_SELF_ASSIGN - Success
-- Performing Test CXX_SELF_ASSIGN
-- Performing Test CXX_SELF_ASSIGN - Success
-- Performing Test CC_PAREN_EQUALITY
-- Performing Test CC_PAREN_EQUALITY - Success
-- Performing Test CXX_UNUSED_CONST_VAR
-- Performing Test CXX_UNUSED_CONST_VAR - Success
-- Performing Test CXX_IGNORED_ATTR
-- Performing Test CXX_IGNORED_ATTR - Success
-- Performing Test CXX_REDUNDANT_MOVE
-- Performing Test CXX_REDUNDANT_MOVE - Success
-- Performing Test CXX_WEAK_VTABLES
-- Performing Test CXX_WEAK_VTABLES - Success
-- Performing Test CXX_MISSING_DECLARATIONS
-- Performing Test CXX_MISSING_DECLARATIONS - Success
-- Performing Test CXX_UNUSED_LOCAL_TYPEDEFS
-- Performing Test CXX_UNUSED_LOCAL_TYPEDEFS - Success
-- Performing Test CXX_WUNUSED_VARIABLE
-- Performing Test CXX_WUNUSED_VARIABLE - Success
-- Performing Test CC_STRINGOP_OVERFLOW
-- Performing Test CC_STRINGOP_OVERFLOW - Failed
-- Building runtime for multiple microarchitectures
-- Looking for mmap
-- Looking for mmap - found
-- Doxygen not found, unable to generate API reference
-- Sphinx not found, unable to generate developer reference
-- Found PkgConfig: /usr/local/bin/pkg-config (found version "0.29.2") 
-- Checking for module 'libpcre>=8.41'
--   Found libpcre, version 8.45
-- PCRE version 8.41 or above
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Checking for module 'sqlite3'
--   Found sqlite3, version 3.37.0
-- Performing Test SQLITE_VERSION_OK
-- Performing Test SQLITE_VERSION_OK - Success
-- Looking for sqlite3_open_v2
-- Looking for sqlite3_open_v2 - found
-- Looking for C++ include pthread_np.h
-- Looking for C++ include pthread_np.h - not found
-- Looking for pthread_setaffinity_np
-- Looking for pthread_setaffinity_np - not found
-- Looking for malloc_info
-- Looking for malloc_info - not found
-- Looking for shmget
-- Looking for shmget - found
-- Performing Test BACKTRACE_LIBC
-- Performing Test BACKTRACE_LIBC - Success
-- Performing Test HAS_RDYNAMIC
-- Performing Test HAS_RDYNAMIC - Success
-- Looking for sigaltstack
-- Looking for sigaltstack - found
-- Looking for sigaction
-- Looking for sigaction - found
-- Looking for setrlimit
-- Looking for setrlimit - found
-- Configuring done
CMake Warning (dev) at CMakeLists.txt:1255 (add_library):
  Policy CMP0115 is not set: Source file extensions must be explicit.  Run
  "cmake --help-policy CMP0115" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.

  File:

    /Users/<user>/<path>/vectorscan/src/hs_version.h.in
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Generating done
-- Build files have been written to: /Users/<user>/<path>/vectorscan-build
$ cmake --build .
[  0%] Building C object CMakeFiles/hs_exec_core2.dir/src/crc32.c.o
mktemp: illegal option -- p
usage: mktemp [-d] [-q] [-t prefix] [-u] template ...
       mktemp [-d] [-q] [-u] -t prefix 
make[2]: *** [CMakeFiles/hs_exec_core2.dir/src/crc32.c.o] Error 1
make[1]: *** [CMakeFiles/hs_exec_core2.dir/all] Error 2
make: *** [all] Error 2
$ 

macOS's mktemp has the following options (note that there is no -p option):

OPTIONS
     The available options are as follows:

     -d      Make a directory instead of a file.

     -q      Fail silently if an error occurs.  This is useful if a script does
             not want error output to go to standard error.

     -t prefix
             Generate a template (using the supplied prefix and TMPDIR if set)
             to create a filename template.

     -u      Operate in “unsafe” mode.  The temp file will be unlinked before
             mktemp exits.  This is slightly better than mktemp(3) but still
             introduces a race condition.  Use of this option is not encouraged.

vectorscan/5.4.7 fails to enable SVE2 with gcc 10.3.0

The main issue is that CMakeLists.txt is appending flags for march that are already in the flag list (namely +sve2 and +sve2-bitperm). The second issue is the attempt to set mtune to be the same as march which does not work for aarch64.
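
To make the first problem concrete: the assembler error further down in the log ("must specify extensions to add before specifying those to remove") appears when +sve2-bitperm is appended after +nopredres even though it is already in the list. The following is only an illustrative sketch of the intended logic, written in C++ rather than CMake; the helper name add_march_ext is hypothetical, and treating every extension that starts with "no" as a removal is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: add "+ext" to a -march value only if it is not
// already present, and place it before the first "+no..." removal.
std::string add_march_ext(const std::string &march, const std::string &ext) {
    // Split "armv8.5-a+crypto+..." on '+'; parts[0] is the base architecture.
    std::vector<std::string> parts;
    std::stringstream ss(march);
    std::string tok;
    while (std::getline(ss, tok, '+')) parts.push_back(tok);
    if (parts.empty()) return march;

    std::size_t first_removal = parts.size();
    for (std::size_t i = 1; i < parts.size(); ++i) {
        if (parts[i] == ext) return march;  // already enabled, nothing to do
        if (first_removal == parts.size() && parts[i].compare(0, 2, "no") == 0)
            first_removal = i;              // assumed to be a removal such as "nopredres"
    }
    parts.insert(parts.begin() + static_cast<std::ptrdiff_t>(first_removal), ext);

    // Join the pieces back together with '+'.
    std::string out = parts[0];
    for (std::size_t i = 1; i < parts.size(); ++i) out += "+" + parts[i];
    return out;
}
```

With the -march value from the log, asking for sve2-bitperm leaves the string unchanged instead of producing the duplicated, misordered flag.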

git status

HEAD detached at vectorscan/5.4.7
nothing to commit, working tree clean

/usr/bin/cmake --prefix=/usr/local -DBUILD_SVE2=1 -DBUILD_SVE2_BITPERM=1 -DCMAKE_INSTALL_PREFIX=/usr/local -DFAT_RUNTIME=off -DBOOST_ROOT=/bmark/snort/0/source/boost/ -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_FLAGS="-mcpu=native" -DCMAKE_CXX_FLAGS="-mcpu=native" /bmark/snort/0/source/hyperscan

-- The C compiler identification is GNU 10.3.0
-- The CXX compiler identification is GNU 10.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test ARCH_X86_64
-- Performing Test ARCH_X86_64 - Failed
-- Performing Test ARCH_IA32
-- Performing Test ARCH_IA32 - Failed
-- Performing Test ARCH_AARCH64
-- Performing Test ARCH_AARCH64 - Success
-- Performing Test ARCH_ARM32
-- Performing Test ARCH_ARM32 - Failed
-- Performing Test ARCH_PPC64EL
-- Performing Test ARCH_PPC64EL - Failed
-- Build type RELEASE
-- using release build
-- Boost version: 1.77.0
-- Found PythonInterp: /usr/bin/python (found version "2.7.18")
-- Build date: 2022-05-06
-- Building static libraries
-- gcc version 10.3.0
CMake Warning at CMakeLists.txt:187 (message):
Something went wrong determining gcc tune:
-mtune=armv8.5-a+crypto+rcpc+sve2-sm4+sve2-aes+sve2-sha3+sve2-bitperm+nopredres
not valid, falling back to -mtune=native

-- ARCH_C_FLAGS :
-- ARCH_CXX_FLAGS :
-- g++ version 10.3.0
-- Looking for include file unistd.h
-- Looking for include file unistd.h - found
-- Looking for C++ include arm_neon.h
-- Looking for C++ include arm_neon.h - found
-- Looking for C++ include arm_sve.h
-- Looking for C++ include arm_sve.h - found
-- Looking for posix_memalign
-- Looking for posix_memalign - found
-- Looking for _aligned_malloc
-- Looking for _aligned_malloc - not found
-- Performing Test HAS_C_HIDDEN
-- Performing Test HAS_C_HIDDEN - Success
-- Performing Test HAS_CXX_HIDDEN
-- Performing Test HAS_CXX_HIDDEN - Success
-- Looking for _LIBCPP_VERSION
-- Looking for _LIBCPP_VERSION - not found
-- Performing Test HAVE_NEON
-- Performing Test HAVE_NEON - Success
-- Performing Test HAVE_SVE2_BITPERM
-- Performing Test HAVE_SVE2_BITPERM - Failed
-- Performing Test HAVE_SVE2
-- Performing Test HAVE_SVE2 - Failed
-- Performing Test HAVE_CC_BUILTIN_ASSUME_ALIGNED
-- Performing Test HAVE_CC_BUILTIN_ASSUME_ALIGNED - Success
-- Performing Test HAVE_CXX_BUILTIN_ASSUME_ALIGNED
-- Performing Test HAVE_CXX_BUILTIN_ASSUME_ALIGNED - Success
-- Performing Test HAVE__BUILTIN_CONSTANT_P
-- Performing Test HAVE__BUILTIN_CONSTANT_P - Success
-- Performing Test C_FLAG_Wvla
-- Performing Test C_FLAG_Wvla - Success
-- Performing Test C_FLAG_Wpointer_arith
-- Performing Test C_FLAG_Wpointer_arith - Success
-- Performing Test C_FLAG_Wstrict_prototypes
-- Performing Test C_FLAG_Wstrict_prototypes - Success
-- Performing Test C_FLAG_Wmissing_prototypes
-- Performing Test C_FLAG_Wmissing_prototypes - Success
-- Performing Test CXX_FLAG_Wvla
-- Performing Test CXX_FLAG_Wvla - Success
-- Performing Test CXX_FLAG_Wpointer_arith
-- Performing Test CXX_FLAG_Wpointer_arith - Success
-- Performing Test CC_SELF_ASSIGN
-- Performing Test CC_SELF_ASSIGN - Failed
-- Performing Test CXX_SELF_ASSIGN
-- Performing Test CXX_SELF_ASSIGN - Failed
-- Performing Test CC_PAREN_EQUALITY
-- Performing Test CC_PAREN_EQUALITY - Failed
-- Performing Test CXX_UNUSED_CONST_VAR
-- Performing Test CXX_UNUSED_CONST_VAR - Success
-- Performing Test CXX_IGNORED_ATTR
-- Performing Test CXX_IGNORED_ATTR - Success
-- Performing Test CXX_REDUNDANT_MOVE
-- Performing Test CXX_REDUNDANT_MOVE - Success
-- Performing Test CXX_WEAK_VTABLES
-- Performing Test CXX_WEAK_VTABLES - Failed
-- Performing Test CXX_MISSING_DECLARATIONS
-- Performing Test CXX_MISSING_DECLARATIONS - Success
-- Performing Test CXX_UNUSED_LOCAL_TYPEDEFS
-- Performing Test CXX_UNUSED_LOCAL_TYPEDEFS - Success
-- Performing Test CXX_WUNUSED_VARIABLE
-- Performing Test CXX_WUNUSED_VARIABLE - Success
-- Performing Test CC_STRINGOP_OVERFLOW
-- Performing Test CC_STRINGOP_OVERFLOW - Success
-- Building for current host CPU: -march=armv8.5-a+crypto+rcpc+sve2-sm4+sve2-aes+sve2-sha3+sve2-bitperm+nopredres+sve2-bitperm -mtune=native
-- Looking for mmap
-- Looking for mmap - not found
-- Doxygen not found, unable to generate API reference
-- Sphinx not found, unable to generate developer reference
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Checking for module 'libpcre>=8.41'
-- Found libpcre, version 8.45
-- PCRE version 8.41 or above
-- Looking for pthread.h
-- Looking for pthread.h - not found
-- Could NOT find Threads (missing: Threads_FOUND)
-- Checking for module 'sqlite3'
-- Found sqlite3, version 3.31.1
-- Performing Test SQLITE_VERSION_OK
-- Performing Test SQLITE_VERSION_OK - Failed
CMake Error at cmake/sqlite3.cmake:34 (message):
sqlite3 is broken from 3.8.7 to 3.8.10 - please find a working version
Call Stack (most recent call first):
tools/hsbench/CMakeLists.txt:1 (include)

-- Configuring incomplete, errors occurred!
See also "/bmark/snort/0/source/hyperscan/CMakeFiles/CMakeOutput.log".
See also "/bmark/snort/0/source/hyperscan/CMakeFiles/CMakeError.log".

tail -n 20 CMakeFiles/CMakeError.log

Change Dir: /bmark/snort/0/source/hyperscan/CMakeFiles/CMakeTmp

Run Build Command(s):/usr/bin/make cmTC_4ea79/fast && /usr/bin/make -f CMakeFiles/cmTC_4ea79.dir/build.make CMakeFiles/cmTC_4ea79.dir/build
make[1]: Entering directory '/bmark/snort/0/source/hyperscan/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_4ea79.dir/src.c.o
/usr/bin/cc -mcpu=native -march=armv8.5-a+crypto+rcpc+sve2-sm4+sve2-aes+sve2-sha3+sve2-bitperm+nopredres+sve2-bitperm -mtune=native -DSQLITE_VERSION_OK -o CMakeFiles/cmTC_4ea79.dir/src.c.o -c /bmark/snort/0/source/hyperscan/CMakeFiles/CMakeTmp/src.c
Assembler messages:
Error: must specify extensions to add before specifying those to remove
Error: unrecognized option -march=armv8.5-a+crypto+rcpc+sve2-sm4+sve2-aes+sve2-sha3+sve2-bitperm+nopredres+sve2-bitperm
make[1]: *** [CMakeFiles/cmTC_4ea79.dir/build.make:66: CMakeFiles/cmTC_4ea79.dir/src.c.o] Error 1
make[1]: Leaving directory '/bmark/snort/0/source/hyperscan/CMakeFiles/CMakeTmp'
make: *** [Makefile:121: cmTC_4ea79/fast] Error 2

Source file was:
#include <sqlite3.h>
#if SQLITE_VERSION_NUMBER >= 3008007 && SQLITE_VERSION_NUMBER < 3008010
#error broken sqlite
#endif
int main() {return 0;}

assume_aligned vs. std::assume_aligned

Vectorscan uses a macro assume_aligned(x, y) in a few places (see files "src/util/simd_utils.h", "src/util/supervector/supervector.hpp") which maps to __builtin_assume_aligned() on systems which support that.

We (ClickHouse) use vectorscan as a third-party library. I am currently trying to upgrade our (LLVM) libcxx (= the C++ standard library) from 14 to 15. Libcxx 15 provides an implementation of std::assume_aligned, which is part of the C++20 standard. I noticed that the standard library implementation and the macro definition in vectorscan clash and break compilation.

My initial attempt to fix this was to prefix vectorscan's macro as vectorscan_assume_aligned(x, y) and to rename all vectorscan-internal usages of the macro accordingly. That made the compiler happy and I think it is a viable interim solution.

Unfortunately, it looks like I somehow don't have rights to open a pull request in this repository so I could not upstream the patch. I also could not create a fork of vectorscan in our (ClickHouse's) organization because vectorscan is a fork of hyperscan, we already have an (archived) fork of hyperscan in our organization and GitHub doesn't allow forking a fork if the parent already exists in an organization 😞 So to make things work on our end and resolve the clash, I ended up commenting out std::assume_aligned in libcxx 15.

It would be great if you could prefix assume_aligned() in vectorscan or convert the code to use std::assume_aligned directly (if you are not afraid of the C++20 dependency).
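
For reference, a minimal sketch of the prefixed-macro approach follows. The macro name vectorscan_assume_aligned comes from the description above; the fallback branch and the caller sum_aligned16 are hypothetical and only there to make the example self-contained. A function-like macro named assume_aligned rewrites every later occurrence of assume_aligned( in the translation unit, including the declaration in the standard library headers, which the prefixed name avoids.

```cpp
#include <cassert>
#include <cstddef>

// Prefixed macro: maps to the compiler builtin where available.
#if defined(__GNUC__) || defined(__clang__)
#define vectorscan_assume_aligned(x, y) __builtin_assume_aligned((x), (y))
#else
#define vectorscan_assume_aligned(x, y) (x)   // assumed no-op fallback
#endif

// Including <memory> after the macro definition is harmless, because the
// macro no longer shares its name with std::assume_aligned.
#include <memory>

// Hypothetical caller: sums n bytes from a pointer the caller promises is
// 16-byte aligned.
unsigned sum_aligned16(const unsigned char *p, std::size_t n) {
    const unsigned char *a =
        static_cast<const unsigned char *>(vectorscan_assume_aligned(p, 16));
    unsigned total = 0;
    for (std::size_t i = 0; i < n; ++i) total += a[i];
    return total;
}
```

The alignment promise is the caller's responsibility: passing a pointer that is not actually 16-byte aligned is undefined behavior with the builtin.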

aarch64 and ppc64le: error: narrowing conversion of 'XYZ' from 'int' to 'char' [-Wnarrowing]

Building vectorscan 5.4.6 fails with GCC11 on openSUSE Tumbleweed aarch64 with:

[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s]   765 |         };
[  316s]       |         ^
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-64' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-33' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-32' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-17' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-16' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-9' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-8' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-1' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-64' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-33' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-32' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-17' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-16' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-9' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-8' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-1' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-64' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-33' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-32' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-17' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-16' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-9' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-8' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-1' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-64' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-33' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-32' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-17' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-16' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-9' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-8' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-1' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-128' from 'int' to 'char' [-Wnarrowing]
[  316s] /home/abuild/rpmbuild/BUILD/vectorscan-vectorscan-5.4.6/build/src/parser/Parser.cpp:765:9: error: narrowing conversion of '-65' from 'int' to 'char' [-Wnarrowing]
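
For context on the log above: the rejected values fall into the ranges -128..-65, -64..-33, -32..-17, -16..-9 and -8..-1, which are the byte ranges 0x80..0xBF, 0xC0..0xDF, 0xE0..0xEF, 0xF0..0xF7 and 0xF8..0xFF written as signed numbers. Where plain `char` is unsigned (typical for aarch64 and ppc64le Linux), putting such a value into a braced `char` initializer is a narrowing conversion and the build fails. The sketch below is a minimal illustration of that effect, not the contents of the generated `Parser.cpp`; the names `example_keys`, `key_to_byte` and `check_example_keys` are made up for this example.

```cpp
#include <cassert>
#include <type_traits>

// Whether plain `char` is signed is implementation-defined:
// typically true on x86-64 Linux, false on aarch64 and ppc64le Linux.
constexpr bool plain_char_is_signed = std::is_signed<char>::value;

// Hypothetical table in the style of generated state-machine keys.
// Written as `static const char example_keys[] = { -128, -65 };` this is
// ill-formed where char is unsigned, because -128 is not representable
// (narrowing inside a braced initializer). `signed char` is portable.
static const signed char example_keys[] = { -128, -65, -64, -33 };

// Reinterpret a signed key as the raw byte value (0..255) it stands for.
constexpr unsigned char key_to_byte(signed char k) {
    return static_cast<unsigned char>(k);
}

static_assert(key_to_byte(-128) == 0x80, "-128 is byte 0x80");
static_assert(key_to_byte(-1) == 0xFF, "-1 is byte 0xFF");

// Runtime sanity check: every example key maps to a byte with the top bit set.
inline void check_example_keys() {
    for (signed char k : example_keys) {
        assert(key_to_byte(k) >= 0x80);
    }
}
```

Another workaround people reach for is building with GCC's `-fsigned-char`, which makes plain `char` signed on every target. Whether that is appropriate for the rest of the code base is a separate question.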

FAT_RUNTIME, shared-library-only builds, and various other distro-related requirements.

I've been looking at packaging this project for Fedora/RHEL/EPEL, and it seems many of the x86 options that are in place for distros (Hyperscan is part of Fedora, etc.) don't work on aarch64/ppc. Beyond what others have noted about the signed/unsigned char/colm issues:

For starters, the expectation is that something like the FAT_RUNTIME build option generates binaries with a baseline (say armv8-a, without any optional arch extensions) that runs regardless of the hardware, and selects the NEON/SVE/SVE2 paths at runtime if the hardware supports them. Since most distros also build natively, the CMake file needs to be aware of this and honor the baseline compiler options (which tend to also include things like -mbranch-protection, etc.), only extending them as needed rather than using -march/-mtune=native, because the build machine won't be related to the install machine.
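
To make that expectation concrete, here is a minimal sketch of runtime path selection on Linux/aarch64. It is not how Vectorscan implements FAT_RUNTIME; the `SimdPath` enum and the `select_path` and `detect_path` functions are hypothetical names for illustration. The bit positions mirror the Linux arm64 `HWCAP_ASIMD`, `HWCAP_SVE` and `HWCAP2_SVE2` values and are copied into local constants so the selection logic compiles on any host.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP, AT_HWCAP2
#endif

// Hypothetical set of code paths a fat binary could carry.
enum class SimdPath { Baseline, Neon, Sve, Sve2 };

// Local mirrors of the Linux arm64 hwcap bits (assumed values).
constexpr std::uint64_t kHwcapAsimd = 1ULL << 1;   // HWCAP_ASIMD  (AT_HWCAP)
constexpr std::uint64_t kHwcapSve   = 1ULL << 22;  // HWCAP_SVE    (AT_HWCAP)
constexpr std::uint64_t kHwcap2Sve2 = 1ULL << 1;   // HWCAP2_SVE2  (AT_HWCAP2)

// Pure selection logic: pick the most capable path the CPU reports.
constexpr SimdPath select_path(std::uint64_t hwcap, std::uint64_t hwcap2) {
    return ((hwcap & kHwcapSve) && (hwcap2 & kHwcap2Sve2)) ? SimdPath::Sve2
         : (hwcap & kHwcapSve)                             ? SimdPath::Sve
         : (hwcap & kHwcapAsimd)                           ? SimdPath::Neon
                                                           : SimdPath::Baseline;
}

// Query the kernel on Linux/aarch64; fall back to the baseline elsewhere.
inline SimdPath detect_path() {
#if defined(__linux__) && defined(__aarch64__)
    return select_path(getauxval(AT_HWCAP), getauxval(AT_HWCAP2));
#else
    return SimdPath::Baseline;
#endif
}
```

The point for packaging is that only the per-path translation units would need extra `-march=...+sve`-style flags, while the dispatcher and everything else keep the distro's baseline flags untouched.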

Also, most distros frown on static libraries, but the unit tests won't compile without them (-DBUILD_SHARED_LIB:BOOL=ON). There is also a problem on aarch64 platforms where the exception stack size is a TLS variable but won't be fixed at build time (this is generic to Hyperscan as well, AFAIK). There are various other lower-priority issues: the documentation isn't installed, the project throws a LOT of warnings when building on aarch64, the C++17 requirement means it won't build with the system compiler on older distros, etc.

Basically, I've spent a couple of days trying to create a generic package on aarch64, and it looks like it needs more work beyond the hacky patches I've applied to force it to build in a single case.
