If you prefer, you can skip this reading and go directly to the Examples section.
You can also download the AWT1 User Guide (PDF), which contains detailed information about the product and its usage.

AWT1 at a glance

AWT1 implements a "strict watermarking" algorithm, i.e. the source, non-watermarked audio file is required in order to find and decode the watermark in the watermarked recording. Watermark extraction is performed by “comparing” the source and watermarked streams (see "Pros and cons" below).

The watermark produced by AWT1 is extremely, insanely robust. Read the details below or proceed to the Examples section.

The AWT1 encoder and decoder operate on PCM wave (.wav) audio files of almost any format: mono or stereo, with sampling rates from 8 to 192 kHz and amplitude resolutions of 8, 16, 24 or 32 bits. Additional audio formats (such as MP3, OGG, AMR, etc.) are supported by the decoder via an external tool, FFmpeg (http://ffmpeg.org).
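
The decoder handles non-WAV input through FFmpeg on its own, but it can be handy to convert a recording to PCM WAV by hand before inspecting it. The following is a minimal sketch that only builds such an FFmpeg command; the file names are hypothetical, and the chosen sample rate and channel count are assumptions, not AWT1 requirements.

```python
import shlex

def ffmpeg_to_wav_cmd(src, dst, sample_rate=44100, channels=2):
    """Build an FFmpeg command that decodes `src` to 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",            # overwrite dst without prompting
        "-i", src,                 # input in any format FFmpeg can read
        "-c:a", "pcm_s16le",       # 16-bit little-endian PCM samples
        "-ar", str(sample_rate),   # output sampling rate in Hz
        "-ac", str(channels),      # output channel count
        dst,
    ]

# Hypothetical file names, for illustration only.
cmd = ffmpeg_to_wav_cmd("recording.mp3", "recording.wav")
print(shlex.join(cmd))
# To actually run the conversion: subprocess.run(cmd, check=True)
```

The command is built as a list rather than a single string so that file names containing spaces are passed to FFmpeg unchanged.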

Virtually any watermarking payload size is supported (subject to limitations in different AWT1 packages). Feasible payload sizes range from 1 to 20 bytes. The recommended payload size, which ensures uncompromising robustness, is up to 8 bytes.

With default parameters, the watermarking data rate is 8 bps for a 1-byte payload, 12 bps for a 2-byte payload, 15 bps for a 4-byte payload, 18 bps for an 8-byte payload, etc. Thus, the data rate increases with the payload size, because each watermark copy carries some constant overhead. A special encoder parameter allows adjusting the data rate, making it higher or lower than the default.
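
The default rates can be turned into "copies per minute" with simple arithmetic. The sketch below assumes that the quoted rate counts payload bits only; under that assumption the 4-byte case gives about 28 copies per minute, which agrees with the figure quoted in the Examples section. The function names are illustrative and not part of AWT1.

```python
# Default data rates quoted in the text: payload size in bytes -> bits per second.
DEFAULT_RATE_BPS = {1: 8, 2: 12, 4: 15, 8: 18}

def copies_per_minute(payload_bytes, rate_bps):
    # Assumption: the rate counts payload bits only.
    return rate_bps * 60 / (payload_bytes * 8)

def seconds_per_copy(payload_bytes, rate_bps):
    # Time needed to embed one full copy of the payload.
    return payload_bytes * 8 / rate_bps

for size, rate in DEFAULT_RATE_BPS.items():
    print(f"{size}-byte payload: "
          f"{copies_per_minute(size, rate):.1f} copies/min, "
          f"{seconds_per_copy(size, rate):.2f} s per copy")
```

The time per payload bit drops as the payload grows (1.00 s for 8 bits versus about 3.56 s for 64 bits), which is what a fixed per-copy overhead would produce.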

The algorithm can be applied to all kinds of audio data. Typical examples: music (pop, jazz, classics, rock), speech recordings, instrument samples, etc.

Each copy of the AWT1 binaries, identified by its Serial Number (SN), contains a unique numeric identifier that is used during encoding to scramble the watermark payload. This security feature prevents one AWT1 user (with one SN) from extracting watermarks from files watermarked by another AWT1 user (with a different SN).

AWT1 is very fast: encoding runs at least 100 times faster than real time on a modest Intel Core 2 Duo E6750 @ 2.6 GHz using one core.


This watermarking technology is patented (U.S. Patent No. 8,116,514).

    

 

Watermark robustness and aural (im)perceptibility

On robustness…

The proposed watermarking scheme demonstrates very high robustness to almost all kinds of audio conversions. Here are some typical examples:

  • lossy transcoding using MP3, Ogg Vorbis and other audio codecs (including multiple transcoding at very low bitrates)
  • acoustic coupling (i.e. traveling of sound from D/A to loudspeaker, then to the microphone via air and then to A/D)
  • mixing with other signals, noise addition
  • signal cropping, cutting
  • sample rate conversion (even down to low sample rates such as 8 kHz, which is typical for phone networks); amplitude re-quantization
  • effect processing, from a simple EQ to an extreme dynamic range compression, reverberation, echo, spectral effects, etc.
  • waveform distortions such as limiting, clipping, slope manipulation, gain control
  • A/D - D/A conversion
  • transmission over radio waves
  • almost any combination of the above

A quick example: the watermark survives even after several dozen(!) rounds of transcoding to low-bitrate MP3 and back, followed by transducing the transcoded signal via air (i.e. reproducing it with a loudspeaker and recording it with a microphone) in the presence of loud background speech. See the “Examples” section for more details.

You may ask whether this scheme is robust against time scaling (stretching). The answer is both "yes" and "no". The decoder is unable to automatically extract the watermark from a time-stretched audio file. However, the watermark is still present in the stretched file, so it can be extracted by manually correcting the signal speed prior to decoding.
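
As an illustration of what such a manual speed correction might look like, here is a minimal sketch that resamples a signal back to its original length by linear interpolation. It covers only the simplest case, a plain speed change by a known factor; how to estimate that factor is not described here, and a production tool would use a properly filtered resampler. The function name and the `stretch_factor` parameter are hypothetical.

```python
def correct_speed(samples, stretch_factor):
    """Resample so a signal stretched by `stretch_factor` regains its original length.

    stretch_factor > 1 means the file plays slower (is longer) than the source.
    """
    if stretch_factor <= 0:
        raise ValueError("stretch_factor must be positive")
    if not samples:
        return []
    out_len = max(1, round(len(samples) / stretch_factor))
    out = []
    for i in range(out_len):
        pos = i * stretch_factor          # position in the stretched signal
        j = int(pos)
        if j >= len(samples) - 1:         # past the last interpolation interval
            out.append(float(samples[-1]))
            continue
        frac = pos - j                    # linear interpolation weight
        out.append(samples[j] * (1.0 - frac) + samples[j + 1] * frac)
    return out

# A ramp that plays at half speed: 0, 1, ..., 9.
stretched = [float(n) for n in range(10)]
restored = correct_speed(stretched, 2.0)
print(restored)  # [0.0, 2.0, 4.0, 6.0, 8.0]
```

The corrected samples would then be written back to a WAV file and passed to the decoder together with the source file as usual.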

On imperceptibility…

With default parameters, the proposed watermarking algorithm produces a practically indistinguishable watermark that is transparent to an average listener on most audio content, with audio equipment of any quality. To be fair, it should be noted that (as with any other real-world technology) very specific audio samples may reveal some watermarking artifacts when compared to the original non-watermarked audio; however, in these cases the effects are rather minor and may be noticeable only to an experienced listener. Depending on the target needs, the user may adjust the encoding parameters (namely, watermarking “density” and “aggressiveness”) to achieve the optimal balance of aural transparency and robustness.

 

How the watermarking algorithm works

...no, it is not another spread spectrum or echo hiding technique...

A high-level description of the patented watermarking algorithm implemented in AWT1 is given in the AWT1 User Guide (PDF), section 2.

 

Pros and cons

The main apparent disadvantage of the scheme is the requirement to have the original audio stream in order to decode the watermark. However, this requirement is also a security feature, because it prevents third parties who do not have the original source from extracting or even detecting the watermark in a watermarked file. Additionally, such a scheme demonstrates extreme robustness that is unreachable for “blind watermarking” schemes, which do not require the source.

Because the signals must be accurately synchronized, decoding generally takes longer than encoding, and the decoding time is proportional to the audio file size. Fortunately, watermark extraction is generally a less frequent procedure than embedding. Also, you typically do not need to process a whole 3-5 minute audio recording to extract the watermark, as 15-40 seconds of audio will generally suffice (depending on the payload size used).

The watermarking data rate is quite high (12-30 bits per second, depending on parameters).

Encoding is very fast.

Thanks to the simplicity of the idea behind the algorithm, it demonstrates extreme, almost insane watermark robustness that puts it a step ahead of competing watermarking solutions.

 

Examples

To demonstrate the robustness and imperceptibility of AWT1 watermarks, I have provided audio samples that you can play with during your evaluation of AWT1. Please do not forget to download the demo package of AWT1, which includes the AWT1 encoder, decoder, a convenient GUI tool and documentation.

Here are several source audio signals (WAV PCM, 44.1 kHz, 16 bit) that are used in this demonstration:

  • bach-in.wav
  • brahms-in.wav
  • gazebo-in.wav
  • jarre-in.wav
  • speech-in.wav
  • yello-in.wav

These are audio recordings of different types: music (pop, electronic, classical) and speech. Note that these are the source, non-watermarked files that will be used for encoding in this demonstration.

All of the above source files have been encoded using the AWT1 encoder, and the watermark 0xABCDEF12 (4 bytes) has been embedded into each of them. The encoding was done by running the encoder:

    awt1_enc source.wav output.wav 0xABCDEF12 -aggressiveness=1.0 -density=1.0

With these parameters the watermarking data rate is approximately 15 bits per second, which results in embedding 28 copies of the watermarking payload per minute of audio.
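
To repeat the encoding for all six demo files, the encoder command can be generated in a loop. This is a minimal sketch that only builds the command lines shown above; it assumes that awt1_enc is on the PATH and that the files follow the "-in.wav" / "-out.wav" naming used in this demonstration.

```python
import shlex

# Base names of the six demo recordings.
SOURCES = ["bach", "brahms", "gazebo", "jarre", "speech", "yello"]

def encode_cmd(name, payload="0xABCDEF12", aggressiveness=1.0, density=1.0):
    """Build the awt1_enc command line for one demo file."""
    return [
        "awt1_enc", f"{name}-in.wav", f"{name}-out.wav", payload,
        f"-aggressiveness={aggressiveness}", f"-density={density}",
    ]

commands = [encode_cmd(name) for name in SOURCES]
for cmd in commands:
    print(shlex.join(cmd))
# To actually run each one: subprocess.run(cmd, check=True)
```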

Below is a table of input and output (watermarked) files together with their distorted copies (transcoded, air transduced, mixed with speech, etc.). You may download and listen to them in order to:

  • check the aural transparency of the watermark (by comparing the quality of the source and watermarked files)
  • get an impression of the distortions introduced into the original recordings and verify that the watermarks are still detectable by the AWT1 decoder even in such heavily distorted files

You can also edit or distort these files even further to test the robustness of the watermark.

To decode the watermark you need to run:

    awt1_dec source.wav output.wav 4 -aggressiveness=1.0 -density=1.0
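
The same decoder command applies to every distorted variant of a recording, always paired with the original source file. The sketch below builds those commands for one recording. It assumes that awt1_dec is on the PATH and that the third positional argument (4 above) is the payload size in bytes, which matches the 4-byte watermark used here but is not spelled out in this text. The decoder's output format is not shown, so nothing is parsed.

```python
import shlex

# File-name suffixes of the watermarked variants used in this demonstration.
SUFFIXES = [
    "-out.wav",
    "-out-hardtranscoded.wav",
    "-out-hardtranscoded-airtransduced-voiced_part.wav",
]

def decode_cmd(source, candidate, payload_bytes=4, aggressiveness=1.0, density=1.0):
    """Build the awt1_dec command line for one source/candidate pair."""
    return [
        "awt1_dec", source, candidate, str(payload_bytes),  # assumed: payload size in bytes
        f"-aggressiveness={aggressiveness}", f"-density={density}",
    ]

def decode_cmds_for(name):
    # Every variant is decoded against the same non-watermarked source.
    return [decode_cmd(f"{name}-in.wav", name + suffix) for suffix in SUFFIXES]

for cmd in decode_cmds_for("yello"):
    print(shlex.join(cmd))
```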

Below is the table containing all input and output files. You can download the wave files or just listen to them in place using the embedded audio player.

| Source (input) file (original, without watermark) | Output file watermarked with 0xABCDEF12 (to ensure that the watermark is indeed inaudible) | Transcoded watermarked file (*) (to test watermark robustness using the AWT1 decoder) | Transcoded (*) and then air-transduced partial watermarked file in the presence of loud background speech (**) (to test watermark robustness using the AWT1 decoder) |
|---|---|---|---|
| bach-in.wav (50 sec) | bach-out.wav (50 sec) | bach-out-hardtranscoded.wav (50 sec) | bach-out-hardtranscoded-airtransduced-voiced_part.wav (24 sec) |
| brahms-in.wav (50 sec) | brahms-out.wav (50 sec) | brahms-out-hardtranscoded.wav (50 sec) | brahms-out-hardtranscoded-airtransduced-voiced_part.wav (34 sec) |
| gazebo-in.wav (60 sec) | gazebo-out.wav (60 sec) | gazebo-out-hardtranscoded.wav (60 sec) | gazebo-out-hardtranscoded-airtransduced-voiced_part.wav (33 sec) |
| jarre-in.wav (60 sec) | jarre-out.wav (60 sec) | jarre-out-hardtranscoded.wav (60 sec) | jarre-out-hardtranscoded-airtransduced-voiced_part.wav (31 sec) |
| speech-in.wav (30 sec) | speech-out.wav (30 sec) | speech-out-hardtranscoded.wav (30 sec) | speech-out-hardtranscoded-airtransduced-voiced_part.wav (22 sec) |
| yello-in.wav (60 sec) | yello-out.wav (60 sec) | yello-out-hardtranscoded.wav (60 sec) | yello-out-hardtranscoded-airtransduced-voiced_part.wav (33 sec) |

(*) Transcoded files were created by repeated lossy encoding and decoding of the watermarked files using the MP3 (Lame) and Ogg Vorbis codecs (18 successive encodings at different bitrates: source WAV -> MP3 256 kbps -> MP3 192 kbps -> MP3 192 kbps -> MP3 128 kbps -> MP3 128 kbps -> MP3 128 kbps -> ...). The batch file that was used for transcoding is available for download.
(**) Air transducing (acoustic coupling) was performed by playing the lossy transcoded output files (*) through multimedia loudspeakers and recording the signal with a microphone placed 30 cm from one of the loudspeakers. Additionally, loud background speech disturbed the recording throughout, and the recording was then cropped.
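
A similar transcoding chain can be approximated without the original batch file. The sketch below uses FFmpeg with its libmp3lame encoder as a stand-in for the Lame tool, and it covers only the six MP3 steps spelled out in the footnote; the remaining steps and the Ogg Vorbis bitrates are not listed there, so they are left out. The intermediate file names are hypothetical.

```python
import shlex

# Only the steps listed in the footnote; the rest of the 18-step chain is elided there.
LISTED_STEPS_KBPS = [256, 192, 192, 128, 128, 128]

def transcode_chain_cmds(wav_in, steps_kbps=LISTED_STEPS_KBPS):
    """Build encode/decode command pairs, each fed by the previous step's WAV."""
    cmds, current = [], wav_in
    for n, kbps in enumerate(steps_kbps, start=1):
        mp3 = f"step{n:02d}.mp3"   # hypothetical intermediate names
        wav = f"step{n:02d}.wav"
        # Lossy encode the current WAV to MP3 at the given bitrate...
        cmds.append(["ffmpeg", "-y", "-i", current,
                     "-c:a", "libmp3lame", "-b:a", f"{kbps}k", mp3])
        # ...then decode it back to WAV for the next generation.
        cmds.append(["ffmpeg", "-y", "-i", mp3, wav])
        current = wav
    return cmds

for cmd in transcode_chain_cmds("bach-out.wav"):
    print(shlex.join(cmd))
```

The last WAV file in the chain is the one to pass to the decoder together with the non-watermarked source.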

In conclusion: the watermark is still detectable even in recordings made via air from poor-quality, 18-times-transcoded MP3 outputs in the presence of loud background speech, and with only a partial recording available! Use the AWT1 decoder to verify this yourself.

Just for fun, I ran another experiment: I took yello-out-hardtranscoded-airtransduced-voiced_part.wav, introduced additional distortion by severely clipping (limiting) its waveform over the entire signal duration, and then cropped the signal, leaving only 14 seconds of it. The result is here: yello-out-hardtranscoded-airtransduced_voiced_cropped_clipped.wav. You can download this file and verify that the watermark is still detectable even in it!
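
If you want to produce a comparable distortion yourself, hard clipping followed by cropping takes only a few lines. This is a minimal sketch operating on a list of integer samples; the clip level used for the file above and the position of the retained 14 seconds are not stated, so the values below are purely illustrative.

```python
def clip_and_crop(samples, sample_rate, clip_level, keep_seconds):
    """Hard-clip samples to +/- clip_level, then keep the first keep_seconds."""
    keep = int(sample_rate * keep_seconds)
    return [max(-clip_level, min(clip_level, s)) for s in samples[:keep]]

# Toy signal at 2 samples per second, so 2 seconds keeps 4 samples.
toy = [0, 20000, -30000, 5000, 100]
print(clip_and_crop(toy, sample_rate=2, clip_level=8000, keep_seconds=2))
# [0, 8000, -8000, 5000]
```

For a real recording the samples would be read from and written back to a WAV file, for example with Python's standard wave module, using sample_rate=44100 and keep_seconds=14.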

 

 
