Software System for Audio Recording Editing Detection and Localization

The Silentium software package automatically detects sections of an audio recording that show signs of editing and localizes the altered cut-ins with a certain error probability.

Silentium User Guide v1.0

Key Points

Detects audio tampering and localizes the altered sections

Reliable results for audio recordings as short as 30 ms

Available as a standalone desktop application


To identify the distinctive features of an audio recording, a special model was developed based on a deep learning neural network. For each section containing a pause, the model indicates with sufficient reliability whether the section has been altered. Silentium automatically detects pauses that show signs of editing and localizes potential cut-ins within those pauses with a certain error probability.

A purpose-built software package analysed a set of phonograms (audio recordings) containing pauses of various lengths. These pauses were automatically cut out of the phonograms to form the primary database of pauses. Next, a secondary database was formed from millions of 20 ms pause fragments, some with editing and some without. The fragments with editing were created by a custom software module that randomly cut small pieces out of different pauses.
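The construction of the secondary database can be illustrated with a minimal sketch. It is not the Silentium module itself: the function names, the representation of a pause as a plain list of samples, and the choice to build each edited fragment by joining two half-fragments taken from two different pauses are all assumptions made for illustration. Only the 20 ms fragment length comes from the description above.

```python
import random

FRAGMENT_MS = 20  # fragment length stated in the text


def fragment_size(sample_rate, fragment_ms=FRAGMENT_MS):
    """Number of samples in one fragment at the given sample rate."""
    return sample_rate * fragment_ms // 1000


def split_into_fragments(pause, sample_rate):
    """Split one pause (a list of samples) into non-overlapping fragments."""
    size = fragment_size(sample_rate)
    return [pause[i:i + size] for i in range(0, len(pause) - size + 1, size)]


def make_edited_fragment(pauses, sample_rate, rng):
    """Simulate an edit (assumed scheme): join pieces of two different pauses."""
    size = fragment_size(sample_rate)
    half = size // 2
    first, second = rng.sample(pauses, 2)  # two distinct pauses
    a = rng.randrange(len(first) - half + 1)
    b = rng.randrange(len(second) - (size - half) + 1)
    return first[a:a + half] + second[b:b + (size - half)]


def build_dataset(pauses, sample_rate, n_edited, seed=0):
    """Return (fragment, label) pairs: 0 = without editing, 1 = with editing."""
    rng = random.Random(seed)
    clean = [(f, 0) for p in pauses for f in split_into_fragments(p, sample_rate)]
    edited = [(make_edited_fragment(pauses, sample_rate, rng), 1)
              for _ in range(n_edited)]
    return clean + edited
```

Each pause passed to `build_dataset` must be at least one fragment long, otherwise the random offsets cannot be drawn.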

This secondary database was used as the training dataset for the deep learning neural network. The network was trained as a binary classifier that separates pause fragments with editing from pause fragments without editing. The resulting pause-fragment classification model is the basis of the system.
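One way a fragment-level classifier could be applied to a whole pause is sketched below. This is an assumption about usage, not a description of Silentium's internals: `score_fragment` stands in for the trained model (any callable that returns a probability of editing for one fragment), and the 5 ms hop and 0.5 threshold are illustrative values only.

```python
def localize_edit(pause, sample_rate, score_fragment,
                  threshold=0.5, fragment_ms=20, hop_ms=5):
    """Slide a fragment-sized window over a pause and report the likeliest edit.

    Returns (offset_ms, score) for the highest-scoring window, or None if no
    window reaches the threshold. `score_fragment` is a hypothetical stand-in
    for the trained binary classifier.
    """
    size = sample_rate * fragment_ms // 1000
    hop = max(1, sample_rate * hop_ms // 1000)
    best_score, best_start = 0.0, None
    for start in range(0, len(pause) - size + 1, hop):
        score = score_fragment(pause[start:start + size])
        if score > best_score:
            best_score, best_start = score, start
    if best_start is None or best_score < threshold:
        return None
    return best_start * 1000 / sample_rate, best_score


# Toy scorer for demonstration only: flags a fragment whose first and last
# samples differ. A real model would return a learned probability.
def toy_scorer(fragment):
    return 1.0 if fragment[0] != fragment[-1] else 0.0
```

With the toy scorer, a pause that changes level halfway through is flagged near the change point, while a constant pause returns `None`.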

Operational speed: on a PC with a dual-core processor, audio authentication takes about 1 minute per minute of recording.

Interface localizations: English, Russian and Ukrainian.

Custom localization is available upon request.


An important technological feature of the Silentium system is that its editing detection and localization results are accompanied by probabilistic characteristics.

All probabilistic characteristics, in the form of Type I and Type II error curves (errors of the first and second kind), were determined through extensive tests on previously prepared experimental data. The Silentium system uses these error curves to detect and localize tampering in the pauses of phonograms.
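Error curves of this kind can be computed from classifier scores on labelled test data. The sketch below is a generic illustration, not the procedure used to calibrate Silentium. It assumes that a higher score means "more likely edited", that a Type I error is an unedited fragment flagged as edited, and that a Type II error is an edited fragment that goes undetected. The function names are hypothetical.

```python
def error_rates(scores_unedited, scores_edited, threshold):
    """Return (type_1, type_2) error rates at one decision threshold.

    type_1: share of unedited fragments wrongly flagged as edited.
    type_2: share of edited fragments that were missed.
    """
    type_1 = sum(s >= threshold for s in scores_unedited) / len(scores_unedited)
    type_2 = sum(s < threshold for s in scores_edited) / len(scores_edited)
    return type_1, type_2


def error_curves(scores_unedited, scores_edited, thresholds):
    """Tabulate both error rates over a range of thresholds."""
    return [(t, *error_rates(scores_unedited, scores_edited, t))
            for t in thresholds]
```

Plotting the two rates against the threshold gives a pair of curves from which an operating point with an acceptable balance of false alarms and misses can be read off.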

Source Audio Requirements:
  • Optimal duration: 40 ms and longer
  • Minimum duration: 30 ms
  • Maximum duration: unlimited
  • Supported file formats: .mp3, .ac3, .aac, .ogg, .wma, .aiff, .asf, .au, .flac, .mp2, .avi, .flv, .mp4, .m4a, .wav (at least 44100 Hz, 16-bit)

Hardware Requirements:
  • Processor: 2 GHz or faster
  • RAM: at least 4 GB (8 GB recommended)
  • OS: Windows 10 (64-bit)
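
The source audio requirements listed above can be expressed as a simple pre-check. This is an illustrative sketch only and not part of the Silentium application: the function name and message wording are hypothetical, and the check covers only the file extension and duration, not the sample rate or bit depth.

```python
from pathlib import Path

# Values copied from the Source Audio Requirements list.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".ac3", ".aac", ".ogg", ".wma", ".aiff", ".asf", ".au",
    ".flac", ".mp2", ".avi", ".flv", ".mp4", ".m4a", ".wav",
}
MIN_DURATION_MS = 30
OPTIMAL_DURATION_MS = 40


def check_source_audio(filename, duration_ms):
    """Return a list of problems; an empty list means the file looks suitable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {extension or '(none)'}")
    if duration_ms < MIN_DURATION_MS:
        problems.append(f"too short: {duration_ms} ms, minimum is {MIN_DURATION_MS} ms")
    elif duration_ms < OPTIMAL_DURATION_MS:
        problems.append(f"below optimal duration of {OPTIMAL_DURATION_MS} ms")
    return problems
```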