Confused about how to measure voice quality? Here is the answer: signal + noise + speech model + analytics = a state-of-the-art approach!
Today we would like to propose yet another concept for voice quality testing and estimation. We believe that a successor to PESQ and other methods should include a test signal generator that produces sound signals according to one of the sound flow models (we will explain them in future posts). The test signal can be either a specific, predefined set of sound signals or a signal produced by a statistical speech model. The generator's output can either be saved for later use or passed straight on for processing and estimation.

A bank of signals stores sound data produced by the signal generator or obtained from external sources (you can ask for sample speech models at http://www.sevana.fi). Accordingly, the input to the estimation block is either the generator's signal directly or a signal taken from the bank.

The test signal is fed both to the synchronizer and to the device under test, which can be, for example, a vocoder, compressed/decompressed audio, simply an audio/voice file, or a communication channel. The output of the device under test is fed to the synchronizer as well. The synchronizer aligns the original signal and the processed signal in time. The synchronized signals are then passed in chunks to the analytical module, which determines how similar they are and reports the quality estimate as a measure of similarity between the original and the processed signals.

Sounds hazy? The point is that a successful voice quality testing system should have:
- Test signals generator
- Noise signals generator
- Statistical speech model
- Signals synchronizer
- Analytical module
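
To make the data flow more concrete, here is a minimal Python sketch of the chain described above. It is only an illustration under our own assumptions, not the actual implementation: a linear chirp stands in for the statistical speech model, the device under test is a hypothetical function that delays, attenuates and adds noise, the synchronizer uses a plain cross-correlation search, and the analytical module averages a normalized correlation over chunks. All function names, the 8000 Hz sample rate and the parameter values are made up for the example.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; assumed value, typical for narrowband telephony


def generate_test_signal(n, f_start=300.0, f_end=3000.0):
    """Test signal generator: a linear chirp standing in for a speech model."""
    sweep = (f_end - f_start) / (n / SAMPLE_RATE)  # Hz per second
    return [math.sin(2 * math.pi * (f_start * t + 0.5 * sweep * t * t))
            for t in (i / SAMPLE_RATE for i in range(n))]


def generate_noise(n, level, rng):
    """Noise generator: Gaussian noise with standard deviation `level`."""
    return [rng.gauss(0.0, level) for _ in range(n)]


def device_under_test(signal, delay, gain, noise):
    """Hypothetical device: delays by `delay` samples, attenuates, adds noise."""
    delayed = [0.0] * delay + signal[:len(signal) - delay]
    return [gain * s + e for s, e in zip(delayed, noise)]


def find_delay(reference, processed, max_lag):
    """Synchronizer: the lag (in samples) that maximizes cross-correlation."""
    def xcorr(lag):
        return sum(r * p for r, p in zip(reference, processed[lag:]))
    return max(range(max_lag + 1), key=xcorr)


def similarity(a, b):
    """Normalized correlation of two chunks, in the range [-1, 1]."""
    energy = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / energy if energy else 0.0


def estimate_quality(reference, processed, max_lag=50, chunk=100):
    """Analytical module: mean chunk-wise similarity after alignment."""
    lag = find_delay(reference, processed, max_lag)
    aligned = processed[lag:]
    n = min(len(reference), len(aligned))
    scores = [similarity(reference[i:i + chunk], aligned[i:i + chunk])
              for i in range(0, n - chunk + 1, chunk)]
    return lag, sum(scores) / len(scores)


# Example run: 0.1 s of test signal through a mildly noisy, delaying device.
rng = random.Random(0)
reference = generate_test_signal(800)
processed = device_under_test(reference, delay=25, gain=0.8,
                              noise=generate_noise(800, 0.05, rng))
lag, score = estimate_quality(reference, processed)
print(f"estimated delay: {lag} samples, similarity score: {score:.3f}")
```

The score here is a raw similarity value, not a MOS-like rating. A real system would use speech-like test signals and a perceptual comparison in the analytical module, but the roles of the blocks stay the same.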
Stay tuned: we will describe each of these modules in our upcoming blog posts!

