
audiowmark's Introduction

audiowmark - Audio Watermarking

Description

audiowmark is an Open Source (GPL) solution for audio watermarking.

A sound file is read by the software, and a 128-bit message is stored in a watermark in the output sound file. For human listeners, the files typically sound the same.

However, the 128-bit message can be retrieved from the output sound file. Our tests show that even if the file is converted to mp3 or ogg (with a bitrate of 128 kbit/s or higher), the watermark can usually be retrieved without problems. The process of retrieving the message does not need the original audio file (blind decoding).

Internally, audiowmark uses the patchwork algorithm to hide the data in the spectrum of the audio file. The signal is split into 1024-sample frames. For each frame, some pseudo-randomly selected amplitudes of the frequency bands of a 1024-value FFT are increased or decreased slightly, which can be detected later. The algorithm used here is inspired by

Martin Steinebach: Digitale Wasserzeichen für Audiodaten.
Darmstadt University of Technology 2004, ISBN 3-8322-2507-2

If you are interested in the details of how audiowmark works, there is separate documentation for developers.

Open Source License

audiowmark is open source software available under the GPLv3 or later license.

Copyright © 2018-2020 Stefan Westerfeld

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

Adding a Watermark

To add a watermark to the sound file in.wav with a 128-bit message (specified as a hex string):

  $ audiowmark add in.wav out.wav 0123456789abcdef0011223344556677
  Input:        in.wav
  Output:       out.wav
  Message:      0123456789abcdef0011223344556677
  Strength:     10

  Time:         3:59
  Sample Rate:  48000
  Channels:     2
  Data Blocks:  4

If you want to use audiowmark in any serious application, please read the section Recommendations for the Watermarking Payload on how to generate the 128-bit message. Typically these bits should be a hash or HMAC of some sort.

The most important options for adding a watermark are:

--key <filename>

Use watermarking key from file <filename> (see Watermark Key).

--strength <s>

Set the watermarking strength (see Watermark Strength).

Retrieving a Watermark

To get the 128-bit message from the watermarked file, use:

  $ audiowmark get out.wav
  pattern  0:05 0123456789abcdef0011223344556677 1.324 0.059 A
  pattern  0:57 0123456789abcdef0011223344556677 1.413 0.112 B
  pattern  0:57 0123456789abcdef0011223344556677 1.368 0.086 AB
  pattern  1:49 0123456789abcdef0011223344556677 1.302 0.098 A
  pattern  2:40 0123456789abcdef0011223344556677 1.361 0.093 B
  pattern  2:40 0123456789abcdef0011223344556677 1.331 0.096 AB
  pattern   all 0123456789abcdef0011223344556677 1.350 0.054

The output of audiowmark get is designed to be machine readable. Each line that starts with pattern contains one decoded message. The fields are separated by one or more space characters. The first field is a timestamp indicating the position of the data block. The second field is the decoded message. For most purposes this is all you need to know.

The software was designed under the assumption that the message is a hash or HMAC of some sort. Before you start using audiowmark in any serious application, please read the section Recommendations for the Watermarking Payload. You - the user - should be able to decide whether a message is correct or not. To do this, when watermarking song files, you could create a database entry for each message you embedded in a watermark. During retrieval, you should perform a database lookup for each pattern audiowmark get outputs. If the message is not found, you should assume that a decoding error occurred. In our example each pattern was decoded correctly, because the watermark was not damaged at all; but if you use lossy compression with a low bitrate, for instance, it may happen that only some of the decoded patterns are correct - or none, if the watermark was damaged too much.

The third field is the sync score (higher is better). The synchronization algorithm tries to find valid data blocks in the audio file, that become candidates for decoding.

The fourth field is the decoding error (lower is better). During message decoding, we use convolutional codes for error correction, to make the watermarking more robust.

The fifth field is the block type. There are two types of data blocks, A blocks and B blocks. A single data block can be decoded alone, as it contains a complete message. However, if during watermark detection an A block followed by a B block was found, these two can be decoded together (then this field will be AB), resulting in even higher error correction capacity than one block alone would have.

To improve the error correction capacity even further, the all pattern combines all data blocks that are available. The combined decoded message will often be the most reliable result (meaning that even if all other patterns were incorrect, this could still be right).
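
Since the output is machine readable, a small script can extract the decoded messages and validate them. The following is only a minimal sketch; known_messages.txt is a hypothetical file with one expected hex message per line, not something audiowmark provides:

# Extract the third field (the decoded message) of every pattern line and
# check it against a list of known messages (known_messages.txt is hypothetical).
audiowmark get out.wav |
  awk '$1 == "pattern" { print $3 }' |
  sort -u |
  while read msg; do
    if grep -qxF "$msg" known_messages.txt; then
      echo "match: $msg"
    else
      echo "unknown message (possible decoding error): $msg"
    fi
  done

In a real deployment the lookup would typically go against a database rather than a flat file (see Recommendations for the Watermarking Payload).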

The most important options for getting a watermark are:

--key <filename>

Use watermarking key from file <filename> (see Watermark Key).

--strength <s>

Set the watermarking strength (see Watermark Strength).

--detect-speed
--detect-speed-patient

Detect and correct replay speed difference (see Speed Detection).

--json <file>

Write results to <file> in machine readable JSON format.
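
For example, the detection results can be written to a JSON file and then inspected with a tool such as jq (the exact JSON schema is not reproduced here):

audiowmark get --json results.json out.wav
jq . results.json    # pretty-print the detection results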

Watermark Key

Since the software is Open Source, a watermarking key should be used to ensure that the message bits cannot be retrieved by somebody else (which would also allow removing the watermark without loss of quality). The watermark key controls all pseudo-random parameters of the algorithm. This means that it determines which frequency bands are increased or decreased to store a 0 bit or a 1 bit. Without the key, it is impossible to decode the message bits from the audio file alone.

Our watermarking key is a 128-bit AES key. A key can be generated using

audiowmark gen-key test.key

and can be used for the add/get commands as follows:

audiowmark add --key test.key in.wav out.wav 0123456789abcdef0011223344556677
audiowmark get --key test.key out.wav

Keys can be named using the gen-key --name option, and the key name will be reported for each match:

audiowmark gen-key oct23.key --name "October 2023"

Finally, it is possible to use the --key option more than once for watermark detection. In this case, all keys that are specified will be tried. This is useful if you change keys on a regular basis, and passing multiple keys is more efficient than performing watermark detection multiple times with one key.

audiowmark get --key oct23.key --key nov23.key --key dec23.key out.wav

Watermark Strength

The watermark strength parameter affects how much the watermarking algorithm modifies the input signal. A stronger watermark is more audible, but also more robust against modifications. The default strength is 10. A watermark with that strength is recoverable after mp3/ogg encoding at 128 kbit/s or higher. In our informal listening tests, this setting also has a very good subjective quality.

A higher strength (for instance 15) is helpful if robustness against multiple conversions or against conversion to low bit rates (e.g. 64 kbit/s) is desired.

A lower strength (for instance 6) makes the watermark less audible, but also less robust. Strengths below 5 are not recommended. The same strength value has to be passed both when generating and when retrieving the watermark. Fractional strengths (like 7.5) are possible.

audiowmark add --strength 15 in.wav out.wav 0123456789abcdef0011223344556677
audiowmark get --strength 15 out.wav

Recommendations for the Watermarking Payload

Although audiowmark does not specify what the 128-bit message stored in the watermark should be, it was designed under the assumption that the message should be a hash or HMAC of some sort.

Let's look at a typical use case. We have a song called Dreams by an artist called Alice. A user called John Smith downloads a watermarked copy.

Later, we find this file somewhere on the internet. Typically we want to answer the questions:

  • is this one of the files we previously watermarked?

  • what song/artist is this?

  • which user shared it?

When the user downloads a watermarked copy, we construct a string that contains all information we need to answer our questions, for example like this:

Artist:Alice|Title:Dreams|User:John Smith

To obtain the 128-bit message, we can hash this string, for instance by using the first 128 bits of a SHA-256 hash like this:

$ STRING='Artist:Alice|Title:Dreams|User:John Smith'
$ MSG=`echo -n "$STRING" | sha256sum | head -c 32`
$ echo $MSG
ecd057f0d1fbb25d6430b338b5d72eb2

This 128-bit message can be used as watermark:

$ audiowmark add --key my.key song.wav song.wm.wav $MSG

At this point, we should also create a database entry consisting of the hash value $MSG and the corresponding string $STRING.
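
As a minimal sketch of such a database (using a plain tab-separated file, db.tsv, purely for illustration):

# Append one entry: the 128-bit message and the string it was derived from.
printf '%s\t%s\n' "$MSG" "$STRING" >> db.tsv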

The shell commands for creating the hash are listed here to provide a simplified example. Fields (like the song title) can contain the characters ' and |, so these cases need to be dealt with.

If we find a watermarked copy of the song on the net, the first step is to detect the watermark message using

$ audiowmark get --key my.key song.wm.wav
pattern  0:05 ecd057f0d1fbb25d6430b338b5d72eb2 1.377 0.068 A
pattern  0:57 ecd057f0d1fbb25d6430b338b5d72eb2 1.392 0.109 B
[...]

The second step is to perform a database lookup for each result returned by audiowmark. If we find a matching entry in our database, this is one of the files we previously watermarked.

As a last step, we can use the string stored in the database, which contains the song/artist and the user that shared it.
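
Continuing the simplified flat-file sketch from above (db.tsv is illustrative, not part of audiowmark), the lookup could look like this:

# Look up a decoded message; prints the stored string if this is one of our files.
DECODED=ecd057f0d1fbb25d6430b338b5d72eb2
awk -F'\t' -v msg="$DECODED" '$1 == msg { print $2 }' db.tsv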

The advantages of using a hash as message are:

  1. Although audiowmark sometimes produces false positives, this doesn’t matter, because it is extremely unlikely that a false positive will match an existing database entry.

  2. Even if a few bit errors occur, it is extremely unlikely that a song watermarked for user A will be attributed to user B, simply because all hash bits depend on the user. So this is a much better payload than storing a user ID, artist ID and song ID in the message bits directly.

  3. It is easy to extend, because we can add any fields we need to the hash string. For instance, if we want to store the name of the album, we can simply add it to the string.

  4. If the hash matches exactly, it is really hard to deny that it was this user who shared the song. How else could all 128 bits of the hash match the message bits decoded by audiowmark?

Speed Detection

If a watermarked audio signal is played back a little faster or slower than the original speed, watermark detection will fail. This could happen by accident if the digital watermark was converted to an analog signal and back and the original speed was not (exactly) preserved. It could also be done intentionally as an attack to prevent the watermark from being detected.

In order to be able to find the watermark in these cases, audiowmark can try to figure out the speed difference relative to the original audio signal and correct the replay speed before detecting the watermark. The search range for the replay speed is approximately [0.8..1.25].

Example: add a watermark to in.wav and increase the replay speed by 5% using sox.

  $ audiowmark add in.wav out.wav 0123456789abcdef0011223344556677
  [...]
  $ sox out.wav out1.wav speed 1.05

Without speed detection, we get no results. With speed detection, the speed difference is detected and corrected, so we get results.

  $ audiowmark get out1.wav
  $ audiowmark get out1.wav --detect-speed
  speed 1.049966
  pattern  0:05 0123456789abcdef0011223344556677 1.209 0.147 A-SPEED
  pattern  0:57 0123456789abcdef0011223344556677 1.301 0.143 B-SPEED
  pattern  0:57 0123456789abcdef0011223344556677 1.255 0.145 AB-SPEED
  pattern  1:49 0123456789abcdef0011223344556677 1.380 0.173 A-SPEED
  pattern   all 0123456789abcdef0011223344556677 1.297 0.130 SPEED

The speed detection algorithm is not enabled by default because it is relatively slow (in terms of total CPU time) and needs a lot of memory. However, the search is automatically run in parallel using many threads on systems with many CPU cores, so on good hardware it makes sense to always enable this option to be robust against replay speed attacks.

There are two versions of the speed detection algorithm, --detect-speed and --detect-speed-patient. The difference is that the patient version takes more CPU time to detect the speed, but produces more accurate results.

Short Payload (experimental)

By default, the watermark stores a 128-bit message. In this mode, we recommend using a 128-bit hash (or HMAC) as the payload. No error checking is performed; the user needs to check the patterns that the watermarker decodes to ensure that they really are expected patterns and not decoding errors.

As an alternative, an experimental short payload option is available, for very short payloads (12, 16 or 20 bits). It is enabled using the --short <bits> command line option, for instance for 16 bits:

audiowmark add --short 16 in.wav out.wav abcd
audiowmark get --short 16 out.wav

Internally, a larger set of bits is sent to ensure that decoded short patterns are really valid, so in this mode, error checking is performed after decoding, and only valid patterns are reported.

Besides error checking, the advantage of a short payload is that fewer bits need to be sent, so decoding is more likely to be successful on shorter clips.

Video Files

For video files, videowmark can be used to add a watermark to the audio track. To add a watermark, use

  $ videowmark add in.avi out.avi 0123456789abcdef0011223344556677
  Audio Codec:  -c:a mp3 -ab 128000
  Input:        in.avi
  Output:       out.avi
  Message:      0123456789abcdef0011223344556677
  Strength:     10

  Time:         3:53
  Sample Rate:  44100
  Channels:     2
  Data Blocks:  4

To detect a watermark, use

  $ videowmark get out.avi
  pattern  0:05 0123456789abcdef0011223344556677 1.294 0.142 A
  pattern  0:57 0123456789abcdef0011223344556677 1.191 0.144 B
  pattern  0:57 0123456789abcdef0011223344556677 1.242 0.145 AB
  pattern  1:49 0123456789abcdef0011223344556677 1.215 0.120 A
  pattern  2:40 0123456789abcdef0011223344556677 1.079 0.128 B
  pattern  2:40 0123456789abcdef0011223344556677 1.147 0.126 AB
  pattern   all 0123456789abcdef0011223344556677 1.195 0.104

The key and strength can be set using the command line options

--key <filename>

Use watermarking key from file <filename> (see Watermark Key).

--strength <s>

Set the watermarking strength (see Watermark Strength).

Videos can be watermarked on-the-fly using HTTP Live Streaming.

Output as Stream

Usually, an input file is read, watermarked and an output file is written. This means that it takes some time before the watermarked file can be used.

An alternative is to output the watermarked file as a stream to stdout. One use case is sending the watermarked file to a user over the network while the watermarker is still working on the rest of the file. Here is an example of how to watermark a wav file to stdout:

audiowmark add in.wav - 0123456789abcdef0011223344556677 | play -

In this case the file in.wav is read, watermarked, and the output is sent to stdout. The "play -" can start playing the watermarked stream while the rest of the file is being watermarked.

If - is used as output, the output is a valid .wav file, so the programs running after audiowmark will be able to determine sample rate, number of channels, bit depth, encoding and so on from the wav header.

Note that all input formats supported by audiowmark can be used in this way, for instance flac/mp3:

audiowmark add in.flac - 0123456789abcdef0011223344556677 | play -
audiowmark add in.mp3 - 0123456789abcdef0011223344556677 | play -

Input from Stream

Similar to the output, the audiowmark input can be a stream. In this case, the input must be a valid .wav file. The watermarker will be able to start watermarking the input stream before all data is available. An example would be:

cat in.wav | audiowmark add - out.wav 0123456789abcdef0011223344556677

It is possible to do both: input from a stream and output as a stream.

cat in.wav | audiowmark add - - 0123456789abcdef0011223344556677 | play -

Streaming input is also supported for watermark detection.

cat in.wav | audiowmark get -

Wav Pipe Format

In some cases, the length of the streaming input is not known by the program that produces the stream. For instance, consider an mp3 that is being decoded by madplay.

cat in.mp3 |
  madplay -o wave:- - |
  audiowmark add - out.wav f0

Since madplay doesn’t know the length of the output when it starts decoding the mp3, the best it can do is to fill the wav header with a big number. And indeed, audiowmark will watermark the stream, but also print a warning like this:

audiowmark: warning: unexpected EOF; input frames (1073741823) != output frames (8316288)

This may sound harmless, but for very long input streams, the audio input will also be truncated at this length. If you already know that you need to read a wav file from a pipe (without a correct length in the header) and simply want to watermark all of it, it is better to use the wav-pipe format:

cat in.mp3 |
  madplay -o wave:- - |
  audiowmark add --input-format wav-pipe --output-format rf64 - out.wav f0

This will not print a warning, and it also works correctly for long input streams. Note that using rf64 as output format is necessary for huge output files (larger than 4 GB).

Similar to pipe input, audiowmark can write a wav header with a huge number (in cases where it does not know the length in advance) if the output format is set to wav-pipe.

cat in.mp3 |
  madplay -o wave:- - |
  audiowmark add --input-format wav-pipe --output-format wav-pipe - - f0 |
  lame - > out.mp3

If you need both wav-pipe input and output, a shorter way to write it is --format wav-pipe, like this:

cat in.mp3 |
  madplay -o wave:- - |
  audiowmark add --format wav-pipe - - f0 |
  lame - > out.mp3

Raw Streams

So far, all streams described here are essentially wav streams, which means that the wav header allows audiowmark to determine sample rate, number of channels, bit depth, encoding and so forth from the stream itself, and a wav header is written for the program after audiowmark, so that it can determine the parameters of the stream.

If the program before or after audiowmark doesn't support wav headers, raw streams can be used instead. The idea is to set all required information (sample rate, number of channels, and so on) manually. Then, headerless data can be processed from stdin and/or sent to stdout.

--input-format raw
--output-format raw
--format raw

These can be used to set the input format or output format to raw. The last variant sets both the input and the output format to raw.

--raw-rate <rate>

This should be used to set the sample rate. The input sample rate and the output sample rate will always be the same (no resampling is done by the watermarker). There is no default for the sampling rate, so this parameter must always be specified for raw streams.

--raw-input-bits <bits>
--raw-output-bits <bits>
--raw-bits <bits>

These options can be used to set the number of input bits, the number of output bits, or both. The number of bits can be either 16 or 24. The default number of bits is 16.

--raw-input-endian <endian>
--raw-output-endian <endian>
--raw-endian <endian>

These options can be used to set the input endianness, the output endianness, or both. The <endian> parameter can be either little or big. The default endianness is little.

--raw-input-encoding <encoding>
--raw-output-encoding <encoding>
--raw-encoding <encoding>

These options can be used to set the input encoding, the output encoding, or both. The <encoding> parameter can be either signed or unsigned. The default encoding is signed.

--raw-channels <channels>

This can be used to set the number of channels. Note that the number of input channels and the number of output channels must always be the same. The watermarker has been designed and tested for stereo files, so the number of channels should really be 2. This is also the default.
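
Putting the raw options together, here is a sketch of a complete raw-stream pipeline. The ffmpeg invocations are illustrative assumptions; any tool that produces and consumes raw signed 16-bit little-endian stereo PCM would do:

# Decode to raw 16-bit little-endian stereo PCM, watermark the raw stream,
# and re-encode the result to FLAC.
ffmpeg -i in.mp3 -f s16le -ar 44100 -ac 2 - |
  audiowmark add --format raw --raw-rate 44100 --raw-channels 2 - - 0123456789abcdef0011223344556677 |
  ffmpeg -f s16le -ar 44100 -ac 2 -i - out.flac

The default raw bit depth (16), endianness (little) and encoding (signed) match the s16le format used here, so strictly only the sample rate needs to be given; the channel count is passed for clarity.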

Other Command Line Options

--output-format rf64

Regular wav files are limited to 4 GB in size. With this option, audiowmark will write RF64 wave files, which do not have this size limit. This is not the default because not all programs may be able to read RF64 wave files.

-q, --quiet

Disable all information messages generated by audiowmark.

--strict

This option enables strict error checking, which may in some situations make audiowmark return an error where it could otherwise continue.

HTTP Live Streaming

Introduction for HLS

HTTP Live Streaming (HLS) is a protocol to deliver audio or video streams via HTTP. One example of using HLS in practice: a user watches a video in a web browser with a player like hls.js and is free to play, pause, and seek in the video as they want. audiowmark can watermark the audio content while it is being transmitted to the user.

HLS splits the contents of each stream into small segments. For the watermarker this means that if the user seeks to a position far ahead in the stream, the server needs to start sending segments from where the new play position is, but everything in between can be ignored.

Another important property of HLS is that it allows separate segments for the video and audio stream of a video. Since we watermark only the audio track of a video, the video segments can be sent as they are (and different users can get the same video segments). Only the audio segments are watermarked: instead of sending the original audio segments to the user, the audio segments are watermarked individually for each user and then transmitted.

Everything necessary to watermark HLS audio segments is available within audiowmark. The server-side support necessary to send the right watermarked segment to the right user is not included.

HLS Requirements

HLS support requires some headers/libraries from ffmpeg:

  • libavcodec

  • libavformat

  • libavutil

  • libswresample

To enable these as dependencies and build audiowmark with HLS support, use the --with-ffmpeg configure option:

$ ./configure --with-ffmpeg

In addition to the libraries, audiowmark also uses the two command line programs from ffmpeg, so they need to be installed:

  • ffmpeg

  • ffprobe

Preparing HLS segments

The first step for preparing content for streaming with HLS would be splitting a video into segments. For this documentation, we use a very simple example using ffmpeg. No matter what the original codec was, at this point we force transcoding to AAC with our target bit rate, because during delivery the stream will be in AAC format.

$ ffmpeg -i video.mp4 -f hls -master_pl_name replay.m3u8 -c:a aac -ab 192k \
  -var_stream_map "a:0,agroup:aud v:0,agroup:aud" \
  -hls_playlist_type vod -hls_list_size 0 -hls_time 10 vs%v/out.m3u8

This splits the video.mp4 file into an audio stream of segments in the vs0 directory and a video stream of segments in the vs1 directory. Each segment is approximately 10 seconds long, and a master playlist is written to replay.m3u8.

Now we can add the relevant audio context to each audio ts segment. This is necessary so that when the segment is watermarked in order to be transmitted to the user, audiowmark will have enough context available before and after the segment to create a watermark which sounds correct over segment boundaries.

$ audiowmark hls-prepare vs0 vs0prep out.m3u8 video.mp4
AAC Bitrate:  195641 (detected)
Segments:     18
Time:         2:53

This step reads the audio playlist vs0/out.m3u8 and writes all segments contained in it to a new directory, vs0prep, which contains the audio segments prepared for watermarking.

The last argument in this command line is video.mp4 again. All audio that is watermarked is taken from this audio master. It could also be supplied in wav format. This makes a difference if you use lossy compression as the target format (for instance AAC), but your original video has an audio stream with higher quality (e.g. lossless).

Watermarking HLS segments

So with all preparations made, what would the server have to do to send a watermarked version of the 6th audio segment vs0prep/out5.ts?

$ audiowmark hls-add vs0prep/out5.ts send5.ts 0123456789abcdef0011223344556677
Message:      0123456789abcdef0011223344556677
Strength:     10

Time:         0:15
Sample Rate:  44100
Channels:     2
Data Blocks:  0
AAC Bitrate:  195641

So instead of sending out5.ts (which has no watermark) to the user, we would send send5.ts, which is watermarked.

In a real-world use case, it is likely that the server would supply the input segment on stdin and read the output segment from stdout, like this:

$ [...] | audiowmark hls-add - - 0123456789abcdef0011223344556677 | [...]
[...]

The usual parameters are supported in audiowmark hls-add, like

--key <filename>

Use watermarking key from file <filename> (see Watermark Key).

--strength <s>

Set the watermarking strength (see Watermark Strength).

The AAC bitrate for the output segment can be set using:

--bit-rate <bit_rate>

Set the AAC bit-rate for the generated watermarked segment.

The rules for the AAC bit-rate of the newly encoded watermarked segment are:

  • if the --bit-rate option is used during hls-add, this bit-rate will be used

  • otherwise, if the --bit-rate option is used during hls-prepare, this bit-rate will be used

  • otherwise, the bit-rate of the input material is detected during hls-prepare
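
For example, to override the bit-rate of a single watermarked segment at hls-add time (here the value is assumed to be given in bits per second, matching the detected bit-rates printed above; check the option's documentation for the exact unit):

audiowmark hls-add --bit-rate 192000 vs0prep/out5.ts send5.ts 0123456789abcdef0011223344556677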

Compiling from Source

Stable releases are available from http://uplex.de/audiowmark

The steps to compile the source code are:

./configure
make
make install

If you build from git (which doesn’t include configure), the first step is ./autogen.sh. In this case, you need to ensure that (besides the dependencies listed below) the autoconf-archive package is installed.
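
A typical build from git could therefore look like this (the repository URL is the upstream one referenced elsewhere in this document; adjust as needed):

git clone https://code.uplex.de/stefan/audiowmark.git
cd audiowmark
./autogen.sh        # runs autoreconf and configure; needs autoconf-archive
make
sudo make install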

Compiling from Source on Windows/Cygwin

Windows is not an officially supported platform. However, if you want to build audiowmark (and videowmark) from source on Windows, one way to do so is to use Cygwin. Andreas Strohmeier provided build instructions for Cygwin.

Dependencies

If you compile from source, audiowmark needs the following libraries:

  • libfftw3

  • libsndfile

  • libgcrypt

  • libzita-resampler

  • libmpg123

If you want to build with HTTP Live Streaming support, see also HLS Requirements.
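
On Debian or Ubuntu, the corresponding development packages can typically be installed with something like the following (package names are the usual Debian ones and may differ on other distributions):

sudo apt-get install build-essential \
  libfftw3-dev libsndfile1-dev libgcrypt20-dev \
  libzita-resampler-dev libmpg123-dev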

Building fftw

audiowmark needs the single precision variant of fftw3.

If you are building fftw3 from source, use the --enable-float configure parameter to build it, e.g.:

cd ${FFTW3_SOURCE}
./configure --enable-float --enable-sse && \
make && \
sudo make install

or, when building from git

cd ${FFTW3_GIT}
./bootstrap.sh --enable-shared --enable-sse --enable-float && \
make && \
sudo make install

Docker Build

You should be able to execute audiowmark via Docker. Here is an example that outputs the usage message:

docker build -t audiowmark .
docker run -v <local-data-directory>:/data --rm -i audiowmark -h
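
A concrete watermarking run with the container could then look like this (assuming the image's entrypoint is audiowmark, as in the usage example above, and that in.wav is in the mounted directory):

docker run -v $PWD:/data --rm -i audiowmark \
  add /data/in.wav /data/out.wav 0123456789abcdef0011223344556677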

audiowmark's People

Contributors

guofei9987, nigoroll, padmick, swesterfeld, tim-janik

audiowmark's Issues

weird results with hls when original audio source is mono

I accidentally found weird behavior when the original audio input is mono.

When played with vlc, all audio segments generated by the following steps sound ok, except for the output of audiowmark hls-add.

This does not happen when the original audio is stereo (sox -c 2)

This might even be an ffmpeg bug, because I call it with -ac 2, but I thought it might be interesting to log the issue here anyway.

My reproducer:

  • generate noise:
sox -n vtc/noise.wav synth 60 whitenoise
  • create an mp4 video container
ffmpeg -i vtc/noise.wav -filter_complex 'color=c=red' -t 60 \
  vtc/avimark/video.mp4
...
Input #0, wav, from 'vtc/noise.wav':
  Duration: 00:01:00.00, bitrate: 1536 kb/s
  Stream #0:0: Audio: pcm_s32le ([1][0][0][0] / 0x0001), 48000 Hz, mono, s32, 1536 kb/s
Stream mapping:
  color:default (graph 0) -> Stream #0:0 (libx264)
  Stream #0:0 -> #0:1 (pcm_s32le (native) -> aac (native))
...
Output #0, mp4, to 'vtc/avimark/video.mp4':
  Metadata:
    encoder         : Lavf59.27.100
  Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(progressive), 320x240 [SAR 1:1 DAR 4:3], q=2-31, 25 fps, 12800 tbn
    Metadata:
      encoder         : Lavc59.37.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
  Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 69 kb/s
    Metadata:
      encoder         : Lavc59.37.100 aac
  • create a hls video
ffmpeg -i vtc/avimark/video.mp4 -y -g 48 -sc_threshold 0 -ac 2 \
 -map 0:a:0 -b:a:0 140000 -c:a:0 aac \
 -map 0:v:0 -c:v:0 copy \
 -var_stream_map "a:0,agroup:aud v:0,agroup:aud" \
 -f hls -hls_playlist_type event -hls_time 6 \
 -hls_segment_filename vtc/avimark/stream%v-%03d.ts \
 -master_pl_name master.m3u8 \
 vtc/avimark/stream%v.m3u8
...
Output #0, hls, to 'vtc/avimark/stream%v.m3u8':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.27.100
  Stream #0:0(und): Audio: aac (LC), 48000 Hz, stereo, fltp, 140 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc59.37.100 aac
  Stream #0:1(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 320x240 [SAR 1:1 DAR 4:3], q=2-31, 3 kb/s, 25 fps, 25 tbr, 90k tbn (default)
    Metadata:
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc59.37.100 libx264
  • prepare for audiowmark hls
/usr/local/bin/audiowmark hls-prepare \
  vtc/avimark vtc/avimark/preptmp stream0.m3u8 vtc/avimark/video.mp4
AAC Bitrate:  142968 (detected)
Segments:     11
Time:         1:00

Now when I watermark one of the resulting segments...

$ audiowmark hls-add vtc/avimark/aviprep0-000.ts t.ts affe
Message:      affeaffeaffeaffeaffeaffeaffeaffe
Strength:     10

Time:         0:09
Sample Rate:  48000
Channels:     1
Data Blocks:  0
AAC Bitrate:  142968

When I play the file, the audio sounds somewhat like noise, but it "flutters" weirdly.

detect-speed did not work

Hi,

I tried a simple sox tempo transform, and --detect-speed was not able to recover the message. I wonder if it is because of my audio or if there's a bug in the program. Would you be able to provide an example where detect-speed works as expected?

Many thanks for your help!

Tim

audio watermark question

Hi, it's me again. I have a question for you and would be grateful for your help.
I want to implement a program that embeds and hides a 12-digit decimal number in the audio signal of a .wav file without audibly affecting the sound, and then later extracts the same 12-digit number from those files. I know a little about NAudio (.NET) and the Fast Fourier Transform.
I have the following algorithm idea, but I can't implement it:
Using the Fourier transform, I convert the main .wav file to the frequency domain, then build a frequency-domain signal encoding the 12-digit number below 20 Hz (because it is not audible to humans), merge the two signals into one in the frequency domain, and finally apply the inverse Fourier transform to go back to the time domain. On the other end, I should process the files and extract the numbers.

Big audio files cropped

Hi,

I'm evaluating this with bigger audio files, several tens of hours long.

When testing a 5 GB wav, the process looks ok, but the resulting watermarked file is smaller (around 4 GB). ffprobe gives:

Original
Duration: 17:17:22.09, bitrate: 705 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 1 channels, s16, 705 kb/s

Watermarked:
Duration: 13:31:35.77, bitrate: 705 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 1 channels, s16, 705 kb/s

The output file size is 4294967338 bytes, which I find too close to the 32-bit SIZE_MAX that stdint.h defines to be a coincidence:

/* Limit of `size_t' type.  */
# if __WORDSIZE == 64
#  define SIZE_MAX              (18446744073709551615UL)
# else
#  define SIZE_MAX              (4294967295U)
# endif

I'm guessing some counter somewhere is limiting it to that; I haven't found out where yet.

Using 48kHz files

Hi
From the developer documentation I understood that audiowmark converts to 44.1 kHz internally. I wonder if it is possible to watermark 48 kHz files without conversion.

Thanks

Keys do not appear to work

audiowmark --key test.key add test.wav output.wav 1234567890
audiowmark: error parsing commandline args (use audiowmark -h)

Output from a raw audio input stream returns static noise

Hi! First off - thank you for this wonderful project.

While watermarking files saved on disk has worked perfectly, I need to pass a stream through audiowmark with stdin and (eventually) stdout. First, I wanted to try streaming to stdin and saving to disk. As instructed by the documentation, because I am streaming data to stdin, it's necessary to set several raw flags since the full length of the stream is not known at the time processing starts. I think my understanding is incorrect because after watermark processing, I receive a file that is the correct file size but only plays static noise.

In my example, I have a .wav file with the following properties:

Channels: 2
Sample Rate: 44100
Bits per sample: 24
Endian little
Encoding signed

Therefore my command should look like: audiowmark add --key secret.key - out.wav 0123456789abcdef0011223344556677 --format raw --raw-rate 44100 --raw-bits 24 --raw-channels 2

This code is running in a node environment, so I'm passing a createReadStream of my file to the stdin of this command. I didn't add this code here, but I can if needed.

I think most of my confusion comes from the --format flags. I tried setting --input-format raw and --output-format wav, but this fails because I think(?) the only options are raw or auto? I did some digging and found an enum with only those two values.

I figured that with --format raw, I would pass in headless stream data, and be returned headless stream data. As a result, I tried passing this raw stdout to sox to then convert this raw data to an audio file, but I still hear static which leads me to believe that the output that I'm generating from audiowmark is incorrect.

The command I would pass to stdin of sox would look like (only thing changed is passing stdout)

audiowmark add --key secret.key - - 0123456789abcdef0011223344556677 --format raw --raw-rate 44100 --raw-bits 24 --raw-channels 2

Am I on the correct track here? Any initial thoughts as to why static noise is being generated on the output?


EDIT: This may be an issue with how I'm streaming audio to stdin using node's createReadStream. I tried to stream directly to sox but am hearing similar static (more hissing this time around though). If this is the problem, I'll close this issue once confirmed.

Build for Windows

Hello Stefan,

very nice project.
Is there a way to build it for windows?

BR
Andreas

Enhancement Request: Preserve MP3 Metadata When Watermarking

Hi.
When using audiowmark to watermark an MP3 file, it successfully reads the MP3, adds the watermark but outputs the result as a WAV file, regardless of the file extension used as the output file name.
I.e. the resulting out.mp3 file with an MP3 extension actually contains WAV file data.
That behavior might be according to spec, but is quite surprising.
And, if the output file is an out.wav file, and e.g. lame is used to re-encode the WAV into MP3 data, the resulting MP3 file will lack the metadata from the original MP3, e.g. title, artist, album.

Here is the process I have come up with to preserve the metadata:

  1. Extract the metadata from the original MP3 using ffmpeg.
  2. Merge the extracted metadata into the re-encoded MP3 after watermarking into a WAV, using ffmpeg again.

The commands are:

ffmpeg -i input.mp3  -f ffmetadata metadata.txt
src/audiowmark add input.mp3 x.wav 0123456789abcdef0011223344556677 
lame x.wav x.mp3
ffmpeg -i x.mp3 -i metadata.txt -map_metadata 1 -codec copy output.mp3

It would be great if audiowmark could detect that the output file is supposed to be an MP3 file and handle the watermarking process in a way so that the output file is an MP3 with the original metadata intact and encoded with a bitrate similar to the input (some heuristics might be needed to guess the input bit rate). This would greatly improve usability and be less error prone than doing it all manually.

Regarding the WAV output even if the file extension is .mp3, I'd suggest to at least warn the user about the possible surprise and maybe require --force or similar, in case the user really wants that (which is highly unlikely).

To summarize:

  • Please catch WAV-in-MP3-file as output.
  • Ideally, re-encode MP3 output files.
  • Preserve metadata from input MP3 files when producing MP3 files.

If that is preferred, I could help with the creation of a wrapper script to achieve the above. (I just suspect that a standalone script couldn't support streaming mode really well...)

Thanks for consideration.

--key does not work with videowmark

Hello!

Thank you for this excellent program. I've encountered an issue when trying to add a watermark with a key.

The command & output:

videowmark add --key filename.key /source/path /dest/path message
Audio Codec:  -c:a aac -ab 193247
audiowmark: unsupported global option '--key' (use audiowmark -h)
videowmark: error: watermark generation failed (audiowmark)

Looking at the videowmark code, it seems that the key is being added to the global args:
--key ) AUDIOWMARK_ARGS+=("--key" "$2"); shift 2 ;;

A possible solution would be to add a separate variable for after-command args. I will try to do this later and open a PR, though my bash is not very good...

Self Implementation

Hi @swesterfeld.
Thanks again for your great package.
I have a problem with this approach, and it's about running a shell command from my wrapper.
I have many requests and have to respond to them in under 0.5 milliseconds, so it would be preferable to implement the logic of this package directly in my wrapper.
Could you please tell me about the logic of this package and what is going on behind the scenes?
And what sources did you use to implement it?

Thanks

Audible noise in watermarked audio even with --strength 5, also not extractable even at --strength 100

Hello, first of all, thank you very much for this amazing piece of software. I've been experimenting with it for the past month and it's definitely the best open source project for watermarking audio files.

I've encountered a consistent issue when watermarking certain audio files. I can describe the file as "Dark Pulse Flutter"; it produces noticeable noise after watermarking, even when using a minimal strength setting (e.g., --strength 5). On top of that, the watermark is not extractable at all, even when increasing the strength (e.g., --strength 100).

Here's what I've observed and attempted so far:

  • The noise is particularly evident with some audio types, not all (let me know if there is a way to share the audio file securely).
  • The problem persists across various watermark strengths.
  • Applying a low-pass filter seems to reduce the noise's audibility.

Given the above, I'm seeking advice on adjustments that could help mitigate the noise issue without compromising the watermark's extractability. Here are some specific points and questions:

  • Watermark Algorithm Sensitivity: Is there a way to adjust the sensitivity of the patchwork algorithm to these audio types, perhaps by modifying the distribution of frequency bands used for embedding?

  • Parameter Tuning: Could you suggest parameter tuning that might address the noise issue? I am particularly interested in whether there are non-documented parameters or advanced configurations that could be adjusted.

  • Low-Pass Filter Usage: Since a low-pass filter has helped, is there a recommended approach or a set of parameters that would allow for its use without affecting the watermark?

Frequency Resolution: Would changing the frame size, if possible, provide a solution? If so, how could this be achieved given the current limitations of the command-line options?

Any insights or suggestions would be greatly appreciated. The goal is to find a balance between minimizing the watermark's audibility and ensuring its reliability and extractability, especially for this specific audio type.

How do I extract the watermark from the ts file?

When I run this:

audiowmark hls-add vs0prep/out1.ts send1.ts 0123456789abcdef0011223344556677

How do I know it works correctly?
How do I extract the watermark from send1.ts?

And, what should I do to get the watermark added every second?
Please help me.

zita-resampler not found after compiling from source

Hello, I am having trouble after compiling from source. The commands listed in the readme run fine:
./configure
make
make install

I am then trying audiowmark gen-key test.key and get the following error:
error while loading shared libraries: libzita-resampler.so.1: cannot open shared object file: No such file or directory

libzita-resampler was recognized while running ./configure, and the file libzita-resampler.so.1 is located in /usr/local/lib64, but I am not sure where to move it or how to make it accessible to audiowmark.

The server is an EC2 instance running AWS AMI 2 Linux, which is based on RHEL 7, in case that's relevant.

I would appreciate any help in this matter.

Minor issues

rawconverter.cc should "#include < array >"

testrandom.cc: two warnings: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]

audiowmark: libgcrypt version mismatch (GCRYPT_VERSION in random.cc?)
/opt/local/lib/libgcrypt.20.dylib (compatibility version 23.0.0, current version 23.1.0)

hello please

Hello, can I add and get watermarks from files in .ts or .mp4 format? With get I have this problem:

audiowmark: error loading /root/ts/ok.ts: Format not recognised.

Increased file size

Why does the audio file size increase when it is watermarked? The message added is just 128 bits and, even added multiple times, it shouldn't increase the size by even 100 bytes. But I see audio files grow by tens of MB. I'm not sure if I'm missing something. Any help with this is appreciated.

How to find synchronization bits

Hello, thank you for your contribution. Regarding your algorithm, I have a question that I don't quite understand. The size of each A or B block in the algorithm is 2226 * 1024 samples, which is approximately 52 seconds per block. That is, the 510-bit synchronization information and the 858 * 2 bits of watermark information are hidden in 2226 * 1024 samples (about 52 seconds of audio). However, when I watermark the audio and then crop out 15 seconds, the program can still detect the watermark correctly. This is amazing. May I ask how your algorithm works?

syncfinder

Hello, first of all, thank you for your code, but I have a few questions to ask you. 1. In the code, I found that two arrays, up and down, were created, and I am not very clear about their purpose. 2. How does the system find the beginning of the 510 frame synchronization block after cropping the audio? In other words, how does the system find the beginning of the watermark after cropping the audio.

full_flac_mem

Hi @swesterfeld
Thanks for this great package.
I have a question about adding full_flac_mem to the ts's audio.
What do we have to do for that?
Could you please give a brief explanation of it?

Thanks

Zita-resampler issue

I have this error during compilation:

checking for _Z28zita_resampler_major_versionv in -lzita-resampler... no
configure: error: You need to install libzita-resampler to build this package.

Issues building Audiowmark on macOS

  • macOS 10.14.6
  • ZSH shell + Prezto

Hello! I am having problems building Audiowmark. I've downloaded the master zip, unzipped it into ~/music/audiowmark, and issued cd ~/music/audiowmark, but I'm not sure what to do next. Ideally, there would be some make command to build the package, but that doesn't seem to be present. Forgive me for my lack of knowledge in these matters ;-)

Build for Mac OSX High Sierra

Hi,
I'm having trouble installing audiowmark on macOS High Sierra. The error is:
No package 'fftw3f' found

Consider adjusting the PKG_CONFIG_PATH environment variable if you
installed software in a non-standard prefix.

Alternatively, you may set the environment variables FFTW_CFLAGS
and FFTW_LIBS to avoid the need to call pkg-config.

I installed FFTW from the FFTW site; brew install fftw gave me errors.
Any ideas?
Many thanks

Video Watermarking: support for multiple audio tracks

Currently, for videos with multiple audio tracks (for instance different language tracks for a film), videowmark generates an error message like this:

input file must have one audio stream and one video stream

Since multiple audio streams are present in some videos, what we can and should do in this case is watermark all audio tracks.

Unable to stream mp3 to stdin of audiowmark?

I have a use case to watermark mp3 files streamed into stdin of audiowmark. Since mp3 files do not contain the proper headers required for obtaining the metadata necessary for watermarking with a stream, I'd need to use --format raw and manually set some options. However, since mp3 files do not contain a bit depth (right? Source here), I would be unable to set the --raw-bits flag.

Given this scenario, do I understand correctly that I'd need to load the mp3 file completely in memory and then simply pass the whole file to audiowmark to handle to bypass this streaming input limitation?

Also, one thing I've noticed is that when I load the entire mp3 file into memory (7 MB) and then pass it to audiowmark, the output mp3 file size is almost 5x bigger (~50 MB). When doing the same with WAVs the conversion is close to 1:1.

Again, as mentioned in my previous issue, thanks for your excellent work on this library!

"Format not recognized" when input is a stream

Hi,

I am getting this on Intel Mac:

$ wget https://patrickdearteaga.com/audio/Chiptronical.ogg 
$ cat Chiptronical.ogg |  docker run --rm -v $PWD/:/data audiowmark add - b.ogg 1     
audiowmark: error opening -: Format not recognised.

This works fine:

docker run --rm -v $PWD/:/data audiowmark add Chiptronical.ogg   b.ogg 1 

How to decode watermark correctly

Hi
I tested waves of different lengths. For some waves the watermark can be decoded in one pattern, but other patterns can't find the message; sometimes no pattern finds the message at all. I tried the mp3 format, but I still have this problem.

I saw you said "the all pattern combines all data blocks that are available", but when I use audiowmark get I don't see an all pattern.

[screenshot]

My wave information: [screenshot]

I add the watermark with: audiowmark add oriWave/inshi1.wav newWave/inshi1.wav ecd057f0d1fbb25d6430b338b5d72eb2 [screenshot]

I decode the watermark with: audiowmark get newWave/inshi1.wav (only one pattern has the message) [screenshot]

My wave information: [screenshot]

I add the watermark with: audiowmark add oriWave/inshi1.wav newWave/inshi1.wav ecd057f0d1fbb25d6430b338b5d72eb2 [screenshot]

I decode the watermark with: audiowmark get newWave/inshi1.wav (no pattern has the message) [screenshot]

How can I get the all pattern?
Why can't I decode correctly?

Handling HLS format errors

Hello, I encountered an error when using this tool to process files in HLS format, with the following message.

(base) ➜   ffmpeg -i video.mp4 -f hls -master_pl_name replay.m3u8 -c:a aac -ab 192k \
  -var_stream_map "a:0,agroup:aud v:0,agroup:aud" \
  -hls_playlist_type vod -hls_list_size 0 -hls_time 10 vs%v/out.m3u8

(base) ➜   audiowmark hls-prepare vs0 vs0prep out.m3u8 video.mp4
AAC Bitrate:  55000 (detected)
Segments:     7
audiowmark: hls: ff_decode failed: Format not recognised.

At the same time, I executed the test script in the source directory and it reported the same error. Is there something wrong with my usage, or is HLS not supported?

-> test/hls-test.sh
...
...
...
==== audiowmark hls-prepare hls-test-dir.30950/as0 hls-test-dir.30950/as0prep out.m3u8 hls-test-dir.30950/test-input.wav ====
audiowmark: failed to load audio master: hls-test-dir.30950/test-input.wav
./hls-test.sh: failed to run audiowmark hls-prepare hls-test-dir.30950/as0 hls-test-dir.30950/as0prep out.m3u8 hls-test-dir.30950/test-input.wav

MacOS 10.15.7, ffmpeg version 5.0

ffmpeg version 5.0 Copyright (c) 2000-2022 the FFmpeg developers
  built with Apple clang version 12.0.0 (clang-1200.0.32.29)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/5.0 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray 

--enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf 

--enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass 
--enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg 
--disable-libjack --disable-indev=jack --enable-videotoolbox

audiowmark is built from the latest git source.
video.mp4 info:

h264 1920x1080 25fps 1024 kbps, aac 44.1 kHz, 2ch, 32 kbps

Watermarking .wav from stdin can cause SIGPIPE

When watermarking a .wav file from stdin, audiowmark stops reading data after all samples have been read. However, there are valid .wav files that contain data after the actual samples. Here is one example:

$ wget https://github.com/sfzinstruments/karoryfer.weresax/blob/master/Samples/alto/a2_p_rr2_cnd.wav?raw=true -q -O test.wav
$ rifftree -s test.wav
RIFF(WAVE)-> (796900 Bytes)
            fmt ; (16 Bytes)
            data; (793728 Bytes)
            cue ; (28 Bytes)
            LIST(adtl)-> (74 Bytes)
                        ltxt; (20 Bytes)
                        labl; (11 Bytes)
                        note; (17 Bytes)
            _PMX; (2738 Bytes)
            DISP; (4 Bytes)
            SyLp; (184 Bytes)
            smpl; (60 Bytes)
$ strace -o trace dd if=test.wav bs=16 | audiowmark add - - f0 > /dev/null
Input:        -
Output:       -
Message:      f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0
Strength:     10

Time:         0:05
Sample Rate:  44100
Channels:     1
Data Blocks:  0
$ tail trace
read(0, "                ", 16)         = 16
write(1, "                ", 16)        = 16
read(0, "                ", 16)         = 16
write(1, "                ", 16)        = 16
read(0, " \n              ", 16)        = 16
write(1, " \n              ", 16)       = 16
read(0, "                ", 16)         = 16
write(1, "                ", 16)        = -1 EPIPE (Datenübergabe unterbrochen (broken pipe))
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=3253354, si_uid=1000} ---
+++ killed by SIGPIPE +++

We can see that test.wav in this example has a RIFF data chunk (which contains the actual samples), but after it there are a few other RIFF chunks. Reading these is not necessary for watermarking the audio, so currently we don't. Actually this behaviour is not explicitly implemented in audiowmark but is caused by how libsndfile implements reading from the pipe (it simply stops reading shortly after the data chunk).

There are two possible ways to deal with this:

  1. keep it as is, and force the process that provides the data to handle this problem
  2. drain the pipe when watermarking "-" when closing the input file

I slightly prefer (2) because it makes things robust without forcing the user to deal with a few corner cases that occur very infrequently.

Random generator is incompatible with LLVM 15

make[2]: Entering directory `/opt/local/var/macports/build/_Users_user_macports-ports_audio_audiowmark/audiowmark/work/audiowmark-0.6.1/src'
  CXX      audiowmark.o
  CXX      random.o
  CXX      audiostream.o
  CXX      sfinputstream.o
In file included from random.cc:18:
In file included from ./random.hh:26:
In file included from /opt/local/libexec/llvm-15/bin/../include/c++/v1/random:1682:
In file included from /opt/local/libexec/llvm-15/bin/../include/c++/v1/__random/bernoulli_distribution.h:14:
/opt/local/libexec/llvm-15/bin/../include/c++/v1/__random/uniform_real_distribution.h:119:5: error: static assertion failed due to requirement '__libcpp_random_is_valid_urng<Random, void>::value':
    static_assert(__libcpp_random_is_valid_urng<_URNG>::value, "");
    ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/local/libexec/llvm-15/bin/../include/c++/v1/__random/uniform_real_distribution.h:84:17: note: in instantiation of function template specialization 'std::uniform_real_distribution<>::operator()<Random>' requested here
        {return (*this)(__g, __p_);}
                ^
./random.hh:73:24: note: in instantiation of function template specialization 'std::uniform_real_distribution<>::operator()<Random>' requested here
    return double_dist (*this);
                       ^
In file included from audiowmark.cc:21:
In file included from /opt/local/libexec/llvm-15/bin/../include/c++/v1/random:1682:
In file included from /opt/local/libexec/llvm-15/bin/../include/c++/v1/__random/bernoulli_distribution.h:14:
/opt/local/libexec/llvm-15/bin/../include/c++/v1/__random/uniform_real_distribution.h:119:5: error: static assertion failed due to requirement '__libcpp_random_is_valid_urng<Random, void>::value':
    static_assert(__libcpp_random_is_valid_urng<_URNG>::value, "");
    ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/local/libexec/llvm-15/bin/../include/c++/v1/__random/uniform_real_distribution.h:84:17: note: in instantiation of function template specialization 'std::uniform_real_distribution<>::operator()<Random>' requested here
        {return (*this)(__g, __p_);}
                ^
./random.hh:73:24: note: in instantiation of function template specialization 'std::uniform_real_distribution<>::operator()<Random>' requested here
    return double_dist (*this);
                       ^
1 error generated.
1 error generated.
make[2]: *** [audiowmark.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [random.o] Error 1

unexpected `error: input frames (357913941) != output frames` on reading from stdin

I have a hard-to-explain error:

ffmpeg -hide_banner -loglevel error -i sample-mp4-file.mp4 -map 0:a:0 -f wav -c:a pcm_s16le - | ./audiowmark add - - 0123456789abcdef0011223344556677 > /dev/null
Input:        -
Output:       -
Message:      0123456789abcdef0011223344556677
Strength:     10

Time:         124:16
Sample Rate:  48000
Channels:     6
audiowmark: error: input frames (357913941) != output frames (6045696)

but if I do

ffmpeg -hide_banner -loglevel error -i sample-mp4-file.mp4 -map 0:a:0 -f wav -c:a pcm_s16le - > original.wav

and then just

cat original.wav | ./audiowmark add - - 0123456789abcdef0011223344556677 > /dev/null

then the error goes away.

It seems that libsndfile sometimes can't properly detect the wav file duration if it comes from stdin. For example, this sample-mp4-file.mp4 comes from https://www.learningcontainer.com/wp-content/uploads/2020/05/sample-mp4-file.mp4 and it's a 2-minute file.

Maybe I should report this to libsndfile, but since I use it in the context of audiowmark, I think you can describe the problem more accurately than I could.

I tried two versions of libsndfile: 1.1.0 and a17e32fda6ed6883bebe0d5f7e1c83cd88409bd6 (today's master head)
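
My guess at what is going on (an assumption, I have not checked the libsndfile source): when ffmpeg writes a WAV to a non-seekable pipe, it cannot seek back to patch the header, so the 32-bit data chunk size stays at its 0xFFFFFFFF placeholder, and libsndfile derives the frame count from that placeholder instead of from the actual stream length. The reported number matches exactly:

  #include <cstdint>
  #include <cstdio>

  int
  main()
  {
    const uint64_t bytes_per_frame = 6 * 2;          // 6 channels x 16-bit samples
    const uint64_t placeholder     = 0xFFFFFFFFull;  // unpatched WAV data chunk size

    printf ("%llu\n", (unsigned long long) (placeholder / bytes_per_frame));
    // prints 357913941 - the bogus "input frames" value from the error message
  }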

autoconf issue: AC_LIB_HAVE_LINKFLAGS

from https://gitlab.com/uplex/varnish/k8s-ingress/-/jobs/4831696565

Step 6/10 : RUN git clone https://code.uplex.de/stefan/audiowmark.git &&     cd audiowmark &&     ./autogen.sh &&     make && make install &&     cd .. && rm -rf audiowmark
 ---> Running in ebe6a55a7037
Cloning into 'audiowmark'...
Running: autoreconf -i && ./configure 
aclocal: warning: couldn't open directory 'm4': No such file or directory
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, 'build-aux'.
libtoolize: copying file 'build-aux/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
libtoolize: copying file 'm4/libtool.m4'
libtoolize: copying file 'm4/ltoptions.m4'
libtoolize: copying file 'm4/ltsugar.m4'
libtoolize: copying file 'm4/ltversion.m4'
libtoolize: copying file 'm4/lt~obsolete.m4'
configure.ac:35: error: AC_LIB_HAVE_LINKFLAGS
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.
autoreconf: /usr/bin/autoconf failed with exit status: 1
The command '/bin/sh -c git clone https://code.uplex.de/stefan/audiowmark.git &&     cd audiowmark &&     ./autogen.sh &&     make && make install &&     cd .. && rm -rf audiowmark' returned a non-zero code: 1

The solution is probably obvious, but I do not see it straight away. We do install autoconf-archive in that build.

--with-ffmpeg: deprecation warnings with libavcodec 7:5.1.4-0+deb12u1

Compiling audiowmark on Debian 12.4, I get deprecation warnings:

$ apt info libavcodec-dev
Package: libavcodec-dev
Version: 7:5.1.4-0+deb12u1
Priority: optional
Section: libdevel
Source: ffmpeg
...

$ ./configure --with-ffmpeg && make -j20
...
hlsoutputstream.cc: In member function ‘Error HLSOutputStream::add_stream(const AVCodec**, AVCodecID)’:
hlsoutputstream.cc:116:48: warning: ‘uint64_t av_get_channel_layout(const char*)’ is deprecated [-Wdeprecated-declarations]
  116 |   uint64_t want_layout = av_get_channel_layout (m_channel_layout.c_str());
      |                          ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/x86_64-linux-gnu/libavutil/frame.h:33,
                 from /usr/include/x86_64-linux-gnu/libavutil/hwcontext.h:23,
                 from /usr/include/x86_64-linux-gnu/libavcodec/codec.h:27,
                 from /usr/include/x86_64-linux-gnu/libavformat/avformat.h:313,
                 from hlsoutputstream.hh:27,
                 from hlsoutputstream.cc:18:
/usr/include/x86_64-linux-gnu/libavutil/channel_layout.h:408:10: note: declared here
  408 | uint64_t av_get_channel_layout(const char *name);
      |          ^~~~~~~~~~~~~~~~~~~~~
hlsoutputstream.cc:119:10: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  119 |   m_enc->channel_layout = want_layout;
      |          ^~~~~~~~~~~~~~
In file included from hlsoutputstream.hh:32:
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:119:10: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  119 |   m_enc->channel_layout = want_layout;
      |          ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:119:10: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  119 |   m_enc->channel_layout = want_layout;
      |          ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:120:17: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  120 |   if ((*codec)->channel_layouts)
      |                 ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:120:17: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  120 |   if ((*codec)->channel_layouts)
      |                 ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:120:17: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  120 |   if ((*codec)->channel_layouts)
      |                 ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:122:14: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  122 |       m_enc->channel_layout = (*codec)->channel_layouts[0];
      |              ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:122:14: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  122 |       m_enc->channel_layout = (*codec)->channel_layouts[0];
      |              ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:122:14: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  122 |       m_enc->channel_layout = (*codec)->channel_layouts[0];
      |              ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:122:41: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  122 |       m_enc->channel_layout = (*codec)->channel_layouts[0];
      |                                         ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:122:41: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  122 |       m_enc->channel_layout = (*codec)->channel_layouts[0];
      |                                         ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:122:41: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  122 |       m_enc->channel_layout = (*codec)->channel_layouts[0];
      |                                         ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:123:33: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  123 |       for (int i = 0; (*codec)->channel_layouts[i]; i++)
      |                                 ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:123:33: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  123 |       for (int i = 0; (*codec)->channel_layouts[i]; i++)
      |                                 ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:123:33: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  123 |       for (int i = 0; (*codec)->channel_layouts[i]; i++)
      |                                 ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:125:25: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  125 |           if ((*codec)->channel_layouts[i] == want_layout)
      |                         ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:125:25: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  125 |           if ((*codec)->channel_layouts[i] == want_layout)
      |                         ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:125:25: warning: ‘AVCodec::channel_layouts’ is deprecated [-Wdeprecated-declarations]
  125 |           if ((*codec)->channel_layouts[i] == want_layout)
      |                         ^~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/codec.h:226:21: note: declared here
  226 |     const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
      |                     ^~~~~~~~~~~~~~~
hlsoutputstream.cc:126:22: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  126 |               m_enc->channel_layout = want_layout;
      |                      ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:126:22: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  126 |               m_enc->channel_layout = want_layout;
      |                      ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:126:22: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  126 |               m_enc->channel_layout = want_layout;
      |                      ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:129:29: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  129 |   if (want_layout != m_enc->channel_layout)
      |                             ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:129:29: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  129 |   if (want_layout != m_enc->channel_layout)
      |                             ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:129:29: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  129 |   if (want_layout != m_enc->channel_layout)
      |                             ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:131:10: warning: ‘AVCodecContext::channels’ is deprecated [-Wdeprecated-declarations]
  131 |   m_enc->channels = av_get_channel_layout_nb_channels (m_enc->channel_layout);
      |          ^~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1006:9: note: declared here
 1006 |     int channels;
      |         ^~~~~~~~
hlsoutputstream.cc:131:10: warning: ‘AVCodecContext::channels’ is deprecated [-Wdeprecated-declarations]
  131 |   m_enc->channels = av_get_channel_layout_nb_channels (m_enc->channel_layout);
      |          ^~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1006:9: note: declared here
 1006 |     int channels;
      |         ^~~~~~~~
hlsoutputstream.cc:131:10: warning: ‘AVCodecContext::channels’ is deprecated [-Wdeprecated-declarations]
  131 |   m_enc->channels = av_get_channel_layout_nb_channels (m_enc->channel_layout);
      |          ^~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1006:9: note: declared here
 1006 |     int channels;
      |         ^~~~~~~~
hlsoutputstream.cc:131:63: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  131 |   m_enc->channels = av_get_channel_layout_nb_channels (m_enc->channel_layout);
      |                                                               ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:131:63: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  131 |   m_enc->channels = av_get_channel_layout_nb_channels (m_enc->channel_layout);
      |                                                               ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:131:63: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  131 |   m_enc->channels = av_get_channel_layout_nb_channels (m_enc->channel_layout);
      |                                                               ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:131:55: warning: ‘int av_get_channel_layout_nb_channels(uint64_t)’ is deprecated [-Wdeprecated-declarations]
  131 |   m_enc->channels = av_get_channel_layout_nb_channels (m_enc->channel_layout);
      |                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavutil/channel_layout.h:449:5: note: declared here
  449 | int av_get_channel_layout_nb_channels(uint64_t channel_layout);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hlsoutputstream.cc: In member function ‘AVFrame* HLSOutputStream::alloc_audio_frame(AVSampleFormat, uint64_t, int, int, Error&)’:
hlsoutputstream.cc:154:10: warning: ‘AVFrame::channel_layout’ is deprecated [-Wdeprecated-declarations]
  154 |   frame->channel_layout = channel_layout;
      |          ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavutil/frame.h:510:14: note: declared here
  510 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:154:10: warning: ‘AVFrame::channel_layout’ is deprecated [-Wdeprecated-declarations]
  154 |   frame->channel_layout = channel_layout;
      |          ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavutil/frame.h:510:14: note: declared here
  510 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:154:10: warning: ‘AVFrame::channel_layout’ is deprecated [-Wdeprecated-declarations]
  154 |   frame->channel_layout = channel_layout;
      |          ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavutil/frame.h:510:14: note: declared here
  510 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc: In member function ‘Error HLSOutputStream::open_audio(const AVCodec*, AVDictionary*)’:
hlsoutputstream.cc:192:62: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  192 |   m_frame     = alloc_audio_frame (m_enc->sample_fmt, m_enc->channel_layout, m_enc->sample_rate, nb_samples, err);
      |                                                              ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:192:62: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  192 |   m_frame     = alloc_audio_frame (m_enc->sample_fmt, m_enc->channel_layout, m_enc->sample_rate, nb_samples, err);
      |                                                              ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:192:62: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  192 |   m_frame     = alloc_audio_frame (m_enc->sample_fmt, m_enc->channel_layout, m_enc->sample_rate, nb_samples, err);
      |                                                              ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:196:62: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  196 |   m_tmp_frame = alloc_audio_frame (AV_SAMPLE_FMT_FLT, m_enc->channel_layout, m_enc->sample_rate, nb_samples, err);
      |                                                              ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:196:62: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  196 |   m_tmp_frame = alloc_audio_frame (AV_SAMPLE_FMT_FLT, m_enc->channel_layout, m_enc->sample_rate, nb_samples, err);
      |                                                              ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:196:62: warning: ‘AVCodecContext::channel_layout’ is deprecated [-Wdeprecated-declarations]
  196 |   m_tmp_frame = alloc_audio_frame (AV_SAMPLE_FMT_FLT, m_enc->channel_layout, m_enc->sample_rate, nb_samples, err);
      |                                                              ^~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1060:14: note: declared here
 1060 |     uint64_t channel_layout;
      |              ^~~~~~~~~~~~~~
hlsoutputstream.cc:215:66: warning: ‘AVCodecContext::channels’ is deprecated [-Wdeprecated-declarations]
  215 |   av_opt_set_int        (m_swr_ctx, "in_channel_count",   m_enc->channels,       0);
      |                                                                  ^~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1006:9: note: declared here
 1006 |     int channels;
      |         ^~~~~~~~
hlsoutputstream.cc:215:66: warning: ‘AVCodecContext::channels’ is deprecated [-Wdeprecated-declarations]
  215 |   av_opt_set_int        (m_swr_ctx, "in_channel_count",   m_enc->channels,       0);
      |                                                                  ^~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1006:9: note: declared here
 1006 |     int channels;
      |         ^~~~~~~~
hlsoutputstream.cc:215:66: warning: ‘AVCodecContext::channels’ is deprecated [-Wdeprecated-declarations]
  215 |   av_opt_set_int        (m_swr_ctx, "in_channel_count",   m_enc->channels,       0);
      |                                                                  ^~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1006:9: note: declared here
 1006 |     int channels;
      |         ^~~~~~~~
hlsoutputstream.cc:218:66: warning: ‘AVCodecContext::channels’ is deprecated [-Wdeprecated-declarations]
  218 |   av_opt_set_int        (m_swr_ctx, "out_channel_count",  m_enc->channels,       0);
      |                                                                  ^~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1006:9: note: declared here
 1006 |     int channels;
      |         ^~~~~~~~
hlsoutputstream.cc:218:66: warning: ‘AVCodecContext::channels’ is deprecated [-Wdeprecated-declarations]
  218 |   av_opt_set_int        (m_swr_ctx, "out_channel_count",  m_enc->channels,       0);
      |                                                                  ^~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1006:9: note: declared here
 1006 |     int channels;
      |         ^~~~~~~~
hlsoutputstream.cc:218:66: warning: ‘AVCodecContext::channels’ is deprecated [-Wdeprecated-declarations]
  218 |   av_opt_set_int        (m_swr_ctx, "out_channel_count",  m_enc->channels,       0);
      |                                                                  ^~~~~~~~
/usr/include/x86_64-linux-gnu/libavcodec/avcodec.h:1006:9: note: declared here
 1006 |     int channels;
      |         ^~~~~~~~
  CXX      testthreadpool.o
...
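
For reference, these warnings come from the channel-layout API that FFmpeg deprecated in 5.1: the uint64_t channel_layout / int channels pair was replaced by the AVChannelLayout struct. The sketch below shows the replacement calls based on my reading of the FFmpeg 5.1 headers; it is not a tested patch for hlsoutputstream.cc.

  #include <string>

  extern "C" {
  #include <libavcodec/avcodec.h>
  #include <libavutil/channel_layout.h>
  }

  // Configure the encoder's channel layout with the AVChannelLayout API that
  // replaces av_get_channel_layout() and the deprecated channel_layout/channels fields.
  static int
  set_channel_layout (AVCodecContext *enc, const std::string& name)
  {
    AVChannelLayout want_layout;

    if (av_channel_layout_from_string (&want_layout, name.c_str()) < 0)
      return -1;                                  // unknown layout string

    int ret = av_channel_layout_copy (&enc->ch_layout, &want_layout);
    av_channel_layout_uninit (&want_layout);
    return ret;                                   // enc->ch_layout.nb_channels now holds the channel count
  }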

add watermark twice

Is it possible to support adding a watermark twice, meaning that I could get two different watermarks into the same wav?
For example, I would first add a watermark with the copyright owner's information, and then, after transferring the audio file to a client, add a second watermark identifying the user who downloads or plays it.
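
For illustration, a two-pass run might look like the following; the key files and messages are made up, and whether both watermarks remain reliably detectable in the same file is exactly the open question here:

  $ audiowmark add in.wav copyright.wav 0123456789abcdef0011223344556677 --key copyright.key
  $ audiowmark add copyright.wav user42.wav ffeeddccbbaa99887766554433221100 --key user42.key

Detection would then presumably also be run once per key.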

Preserving Video Metadata When Watermarking

Hi.
When using videowmark to watermark an MP4 file, it successfully reads the MP4 and adds the watermark, but it seems to lose some of the metadata, according to mediainfo.

$ videowmark add BigBuckBunny_320x180.mp4 BigBuckBunny_WM.mp4 0000111100002222
$ mediainfo BigBuckBunny_320x180.mp4 >x1
$ mediainfo BigBuckBunny_WM.mp4 >x2
$ diff -u x1 x2

The differences are as follows:

--- x1	2024-04-19 12:05:44.013193748 +0200
+++ x2	2024-04-19 12:05:49.757202964 +0200
@@ -1,19 +1,16 @@
 General
-Complete name                            : BigBuckBunny_320x180.mp4
+Complete name                            : BigBuckBunny_WM.mp4
 Format                                   : MPEG-4
 Format profile                           : Base Media
-Codec ID                                 : isom (mp41)
+Codec ID                                 : isom (isom/iso2/avc1/mp41)
 File size                                : 61.7 MiB
 Duration                                 : 9 min 56 s
-Overall bit rate mode                    : Variable
 Overall bit rate                         : 867 kb/s
 Movie name                               : Big Buck Bunny
 Performer                                : Blender Foundation
 Composer                                 : Blender Foundation
 Recorded date                            : 2008
-Encoded date                             : UTC 1970-01-01 00:00:00
-Tagged date                              : UTC 1970-01-01 00:00:00
-Writing application                      : Lavf52.14.0
+Writing application                      : Lavf58.76.100
 
 Video
 ID                                       : 1
@@ -39,8 +36,6 @@
 Scan type                                : Progressive
 Bits/(Pixel*Frame)                       : 0.508
 Stream size                              : 50.0 MiB (81%)
-Encoded date                             : UTC 1970-01-01 00:00:00
-Tagged date                              : UTC 1970-01-01 00:00:00
 Codec configuration box                  : avcC
 
 Audio
@@ -49,15 +44,18 @@
 Format/Info                              : Advanced Audio Codec Low Complexity
 Codec ID                                 : mp4a-40-2
 Duration                                 : 9 min 56 s
-Bit rate mode                            : Variable
-Bit rate                                 : 160 kb/s
+Source duration                          : 9 min 56 s
+Bit rate mode                            : Constant
+Bit rate                                 : 161 kb/s
 Channel(s)                               : 2 channels
 Channel layout                           : L R
 Sampling rate                            : 48.0 kHz
 Frame rate                               : 46.875 FPS (1024 SPF)
 Compression mode                         : Lossy
-Stream size                              : 11.4 MiB (18%)
-Encoded date                             : UTC 1970-01-01 00:00:00
-Tagged date                              : UTC 1970-01-01 00:00:00
+Stream size                              : 11.4 MiB (19%)
+Source stream size                       : 11.4 MiB (19%)
+Default                                  : Yes
+Alternate group                          : 1
+mdhd_Duration                            : 596224

I.e. "Encoded date" and "Tagged date" are lost for both streams, Video and Audio.
Also, is it expected that the audio stream turns from "Bit rate mode: Variable" before watermarking into "Bit rate mode: Constant" after watermarking?

For the record, I've tried to adjust videowmark to preserve the metadata by adding -movflags +faststart+use_metadata_tags -map_metadata 0 to the audio+video merging step. But that also tends to rename metadata tags (e.g. "Movie name"->"title" or "Performer"->"artist") without preserving any dates, so it is not really an improvement.

As for the above metadata loss, is that purely due to the different codecs used during re-encoding, or can metadata loss go beyond that?
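
For reference, the kind of merging step described above would look roughly like this (the file names, stream order and flag set are my assumptions, not what videowmark currently does; as noted, ffmpeg then tends to rename the tags rather than keep the original ones):

  $ ffmpeg -i watermarked_audio.wav -i BigBuckBunny_320x180.mp4 \
      -map 1:v -map 0:a -c:v copy -c:a aac \
      -map_metadata 1 -movflags +faststart+use_metadata_tags \
      BigBuckBunny_WM.mp4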

videowmark, audiowmark add can't handle 6-channel, 3 h 28 min file

I have bbb.mkv,
mediainfo bbb.mkv
Audio
ID : 1
Format : AC-3
Format/Info : Audio Coding 3
Commercial name : Dolby Digital
Codec ID : A_AC3
Duration : 3 h 28 min
Bit rate mode : Constant
Bit rate : 384 kb/s
Channel(s) : 6 channels
Channel layout : L R C LFE Ls Rs
Sampling rate : 48.0 kHz
Frame rate : 31.250 FPS (1536 SPF)
Compression mode : Lossy
Stream size : 573 MiB (11%)
Service kind : Complete Main
Default : Yes
Forced : No

First, videowmark can't output a video file for this input, so I changed the ffmpeg command line in the shell script to:

ffmpeg $FFMPEG_VERBOSE -y -i "$in_file" -f wav -rf64 auto "$orig_wav"

This gives bbb.wav, 7,211,114,634 bytes:

General
Complete name : bbb.wav
Format : Wave
Format profile : RF64
File size : 6.72 GiB
Duration : 3 h 28 min
Overall bit rate mode : Constant
Overall bit rate : 4 608 kb/s
Writing application : Lavf58.76.100

Audio
Format : PCM
Format settings : Little / Signed
Codec ID : 00000001-0000-0010-8000-00AA00389B71
Duration : 3 h 28 min
Bit rate mode : Constant
Bit rate : 4 608 kb/s
Channel(s) : 6 channels
Channel layout : L R C LFE Ls Rs
Sampling rate : 48.0 kHz
Bit depth : 16 bits
Stream size : 6.72 GiB (100%)

Second, I can't get a wav with the original duration, even though audiowmark returns 0. I used:

audiowmark add bbb.wav bbb_git.wav --key temp.key 0123456

The resulting bbb_git.wav is 7,211,114,540 bytes:
General
Complete name : bbb_git.wav
Format : Wave
File size : 6.72 GiB
Duration : 3 h 28 min
Overall bit rate mode : Constant
Overall bit rate : 4 608 kb/s

Audio
Format : PCM
Format settings : Little / Signed
Codec ID : 1
Duration : 3 h 28 min
Bit rate mode : Constant
Bit rate : 4 608 kb/s
Channel(s) : 6 channels
Sampling rate : 48.0 kHz
Bit depth : 16 bits
Stream size : 6.72 GiB (100%)

but ffprobe bbb_git.wav
Guessed Channel Layout for Input Stream #0.0 : 5.1
Input #0, wav, from 'bbb_git.wav':
Duration: 01:24:22.76, bitrate: 11394 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, 6 channels, s16, 4608 kb/s
Playing it with the MS media player also shows 01:24:23.
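
A possible explanation (my assumption, I have not inspected the header of bbb_git.wav): the output was written as a plain WAV rather than RF64, so its 32-bit size fields overflow for a file larger than 4 GiB. The sample data itself appears to be complete (input and output file sizes are almost identical), but readers that trust the header only see the wrapped-around length, which works out to almost exactly the 01:24:23 reported by ffprobe and the media player:

  #include <cstdint>
  #include <cstdio>

  int
  main()
  {
    const uint64_t file_size     = 7211114540ull;             // bbb_git.wav on disk
    const uint64_t wrapped       = file_size % (1ull << 32);  // what a 32-bit size field can hold
    const double   bytes_per_sec = 48000.0 * 6 * 2;           // 48 kHz, 6 channels, 16 bit

    printf ("%.2f seconds\n", wrapped / bytes_per_sec);
    // prints about 5062.76 seconds, i.e. roughly 1:24:23
  }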

Short Files

What is the minimum length supported by audiowmark? I have short audio files (sound effects) from 100 ms to 5 s in length; what are your thoughts on adding a watermark to these? Thanks for your time.

mp3 format not supported

Using the command audiowmark add in.mp3 - 0123456789abcdef0011223344556677 reports the error "audiowmark: error opening in.mp3: Format not recognised." @swesterfeld
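
As a workaround until mp3 input is supported directly, decoding with ffmpeg and piping wav data into audiowmark (the same approach used in the stdin examples above) should work, though the stdin duration issue reported above may then apply; the output file name is just an example:

  $ ffmpeg -i in.mp3 -f wav - | audiowmark add - out.wav 0123456789abcdef0011223344556677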

audiowmark hls-prepare causes segmentation fault for some files

The process populates the vs0prep directory with 11 out of 12 out.ts files, and an empty out.m3u8.

I compiled audiowmark from commit fa2a618

I think a.mp3 is copyrighted so I'm not going to attach it here, but I may be able to email it to someone if it would help.

$ ffprobe /tmp/a.mp3
<omitted for brevity>
Input #0, mp3, from '/tmp/a.mp3':
  Duration: 00:03:09.57, start: 0.025057, bitrate: 160 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 160 kb/s
    Metadata:
      encoder         : Lavc57.89

$ ffmpeg -i /tmp/a.mp3 -f hls -map 0:a -c:a aac -ar 44100 -b:a 192k -hls_list_size 0 -hls_time 15 -hls_segment_type mpegts -hls_playlist_type vod /tmp/hls/out.m3u8
<omitted>

$ gdb --args /usr/local/bin/audiowmark hls-prepare /tmp/hls /tmp/hls/vs0prep out.m3u8 /tmp/a.mp3
(gdb) r
Starting program: /usr/local/bin/audiowmark hls-prepare /tmp/hls /tmp/hls/vs0prep out.m3u8 /tmp/a.mp3
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Detaching after fork from child process 5867]
[Detaching after fork from child process 5876]
[Detaching after fork from child process 5877]
[Detaching after fork from child process 5878]
[Detaching after fork from child process 5879]
[Detaching after fork from child process 5880]
[Detaching after fork from child process 5881]
[Detaching after fork from child process 5882]
[Detaching after fork from child process 5883]
[Detaching after fork from child process 5884]
[Detaching after fork from child process 5885]
[Detaching after fork from child process 5886]
[Detaching after fork from child process 5887]
[Detaching after fork from child process 5888]
[Detaching after fork from child process 5889]
AAC Bitrate:  195141 (detected)
Segments:     13
[Detaching after fork from child process 5890]
[Detaching after fork from child process 5899]
[Detaching after fork from child process 5908]
[Detaching after fork from child process 5917]
[Detaching after fork from child process 5926]
[Detaching after fork from child process 5935]
[Detaching after fork from child process 5944]
[Detaching after fork from child process 5953]
[Detaching after fork from child process 5962]
[Detaching after fork from child process 5971]
[Detaching after fork from child process 5980]
[Detaching after fork from child process 5989]
[Detaching after fork from child process 5998]

Program received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:384
384     ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:384
#1  0x00005555555933f2 in std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<float> (
    __result=<optimized out>, __last=0x7fffeab29670, __first=<optimized out>)
    at /usr/include/c++/9/bits/stl_algobase.h:465
#2  std::__copy_move_a<false, float const*, float*> (__result=<optimized out>, __last=0x7fffeab29670,
    __first=<optimized out>) at /usr/include/c++/9/bits/stl_algobase.h:404
#3  std::__copy_move_a2<false, __gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > >, float*> (__result=<optimized out>, __last=..., __first=...) at /usr/include/c++/9/bits/stl_algobase.h:440
#4  std::copy<__gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > >, float*> (
    __result=<optimized out>, __last=..., __first=...) at /usr/include/c++/9/bits/stl_algobase.h:474
#5  std::__uninitialized_copy<true>::__uninit_copy<__gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > >, float*> (__result=<optimized out>, __last=..., __first=...)
    at /usr/include/c++/9/bits/stl_uninitialized.h:101
#6  std::uninitialized_copy<__gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > >, float*> (__result=<optimized out>, __last=..., __first=...) at /usr/include/c++/9/bits/stl_uninitialized.h:140
#7  std::__uninitialized_copy_a<__gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > >, float*, float> (__result=<optimized out>, __last=..., __first=...) at /usr/include/c++/9/bits/stl_uninitialized.h:307
#8  std::vector<float, std::allocator<float> >::_M_range_initialize<__gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > > > (__last=..., __first=..., this=0x7fffffffd200)
    at /usr/include/c++/9/bits/stl_vector.h:1582
#9  std::vector<float, std::allocator<float> >::vector<__gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > >, void> (__a=..., __last=..., __first=..., this=0x7fffffffd200)
    at /usr/include/c++/9/bits/stl_vector.h:654
#10 hls_prepare (in_dir="/tmp/hls", out_dir="/tmp/hls/vs0prep", filename=..., audio_master=...) at hls.cc:557
#11 0x0000555555561a6f in main (argc=<optimized out>, argv=<optimized out>)
    at /usr/include/c++/9/bits/stl_vector.h:1040

A few questions about the algorithm details.

Thank you for your awesome library, which provides great ease of use. I am interested in the principle behind it, and after reading the code, I have three questions:

  1. I notice that the bit information is embedded by adjusting amplitudes:
const float mag = abs(fft_data[i]);
if (mag > min_mag) {
    const float mag_factor = powf(mag, -Params::water_delta * data_bit_sign);
    fft_delta_spect[i] = fft_data[i] * (mag_factor - 1);
}

Are the positions that are increased and those that are decreased chosen randomly?

  2. In this issue you said "if (umag > dmag) we decode a 1 bit, otherwise 0."
    The mag_factor is computed without considering the difference between umag and dmag. Is the adjustment guaranteed to be large enough to encode the desired bit?
    For example, if we want to encode bit 1 but currently umag < dmag, can we be sure that umag > dmag after the adjustment?

  3. Blocks A and B each contain the complete information, and blocks A and B can be combined for further error correction. This is interesting. What is the principle behind this combined error correction? Could you please provide some reference material?

Thank you for your reply.
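
To make question 2 concrete, here is a rough sketch of the decoding rule quoted there ("if (umag > dmag) we decode a 1 bit, otherwise 0"); the function and variable names are mine and the band selection is simplified, so this is illustrative rather than the project's actual decoder:

  #include <cmath>
  #include <complex>
  #include <vector>

  // Illustrative patchwork decoding: a pseudo-random selection (derived from the
  // key) says which FFT bins were nudged up and which were nudged down for this
  // bit; the decoder compares the summed magnitudes of the two groups.
  int
  decode_bit (const std::vector<std::complex<float>>& fft_data,
              const std::vector<int>& up_bins,
              const std::vector<int>& down_bins)
  {
    float umag = 0, dmag = 0;

    for (int i : up_bins)
      umag += std::abs (fft_data[i]);

    for (int i : down_bins)
      dmag += std::abs (fft_data[i]);

    // A single frame's small nudge need not flip this comparison by itself;
    // robustness comes from accumulating the decision over many frames and
    // from the error correction layered on top.
    return umag > dmag ? 1 : 0;
  }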

sprintf is deprecated on macOS 13

make[2]: Entering directory `/opt/local/var/macports/build/_Users_runner_work_macports-ports_macports-ports_ports_audio_audiowmark/audiowmark/work/audiowmark-0.6.1/src'
    CXX      audiowmark.o
    CXX      utils.o
    CXX      random.o
    CXX      convcode.o
  utils.cc:117:7: warning: 'sprintf' is deprecated: This function is provided for compatibility reasons only.  Due to security concerns inherent in the design of sprintf(3), it is highly recommended that you use snprintf(3) instead. [-Wdeprecated-declarations]
        sprintf (buffer, "%02x", byte);
        ^
  /Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk/usr/include/stdio.h:188:1: note: 'sprintf' has been explicitly marked deprecated here
  __deprecated_msg("This function is provided for compatibility reasons only.  Due to security concerns inherent in the design of sprintf(3), it is highly recommended that you use snprintf(3) instead.")
  ^
  /Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk/usr/include/sys/cdefs.h:215:48: note: expanded from macro '__deprecated_msg'
          #define __deprecated_msg(_msg) __attribute__((__deprecated__(_msg)))
                                                        ^
  1 warning generated.

This is not critical, but Apple's decision to push everyone towards snprintf() this way is a bit odd.
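
For what it's worth, the bounded replacement is trivial; a sketch of the line in question (the buffer size here is an assumption about the local buffer in utils.cc):

  #include <cstdio>
  #include <string>

  // Same "%02x" formatting as in utils.cc, but with snprintf as the macOS 13 SDK suggests.
  std::string
  byte_to_hex (unsigned char byte)
  {
    char buffer[3];                                  // two hex digits + terminating NUL
    snprintf (buffer, sizeof buffer, "%02x", byte);
    return buffer;
  }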

"docker build -t audiomark" failed

docker build -t audiomark .
...
...
debconf: delaying package configuration, since apt-utils is not installed
...
...
Step 14/19 : RUN ./autogen.sh
---> Running in 7c57223f35a7
/bin/sh: 1: ./autogen.sh: not found
The command '/bin/sh -c ./autogen.sh' returned a non-zero code: 127

autogen failed

$ ./autogen.sh
Running: autoreconf -i && ./configure
aclocal: warning: couldn't open directory 'm4': No such file or directory
configure.ac:79: warning: macro 'AM_PATH_LIBGCRYPT' not found in library
glibtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, 'build-aux'.
glibtoolize: copying file 'build-aux/ltmain.sh'
glibtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
glibtoolize: copying file 'm4/libtool.m4'
glibtoolize: copying file 'm4/ltoptions.m4'
glibtoolize: copying file 'm4/ltsugar.m4'
glibtoolize: copying file 'm4/ltversion.m4'
glibtoolize: copying file 'm4/lt~obsolete.m4'
configure.ac:79: warning: macro 'AM_PATH_LIBGCRYPT' not found in library
configure.ac:79: error: possibly undefined macro: AM_PATH_LIBGCRYPT
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
autoreconf: /opt/local/bin/autoconf failed with exit status: 1

fails to compile with clang15

I've tried compiling with clang 15.0.7:

c++ --version
FreeBSD clang version 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6)
Target: x86_64-unknown-freebsd14.0
Thread model: posix
InstalledDir: /usr/bin

and got the following errors: audiowmark-0.6.1.log

Note: There are no issues with clang 14.0.5:

FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386ae247c)
Target: x86_64-unknown-freebsd13.2
Thread model: posix
InstalledDir: /usr/bin
