AP - The Recognized Standard In Audio Test



APx500 supports popular perceptual audio tests for evaluation of speech quality. Obtain MOS results directly in APx projects.


APx500 now supports two popular perceptual audio tests used for speech quality evaluation: PESQ (Perceptual Evaluation of Speech Quality) and POLQA (Perceptual Objective Listening Quality Assessment). Both deliver Mean Opinion Score (MOS) results directly in APx projects.

PESQ is a widely used, enhanced perceptual measurement for voice quality in telecommunications. It is licensed from OPTICOM GmbH and forms the basis of ITU-T Recommendation P.862. PESQ is specifically designed for testing voice quality on low bandwidth devices, like mobile phones and smartphones. MOS results from PESQ can achieve a very high correlation with results obtainable using human subjects. 

POLQA is licensed from OPTICOM as a successor to PESQ, and specifically targets changes in the communications landscape with support for HD Voice, 3G, 4G/LTE and VoIP technologies. Like PESQ, POLQA delivers results with a very high correlation to tests with human subjects.

When regular audio tests are not enough

Many voice communications devices, notably mobile phones and VoIP networks, employ limited bandwidth and/or bitrate, and use codecs that significantly alter sound. Despite these accommodations, people can usually understand one another quite well over these systems; this is because the compromises made are selected to allow adequate speech perception despite measurable shortcomings in distortion, frequency response, and other traditional audio metrics.

PESQ and POLQA simulate the likely results of real human subjects with real speech samples, allowing designers of speech-delivery systems to ensure that real-world customers achieve high levels of product satisfaction.

PESQ/POLQA methodology

In traditional perceptual measurements, a group of people is assembled and asked to judge the sound quality of various audio clips, typically on a scale from 1 to 5. When all the individual scores are tallied, the result is called the Mean Opinion Score (MOS).

Tests with real human subjects are expensive and time consuming, but these limitations can be overcome by using tools that incorporate algorithms based upon psychoacoustic modeling, such as PESQ or POLQA. When used properly, these models closely match results from human groups. This automation allows for rapid iterations in development as well as efficient production line verification. Measurements obtained are classified as objective because they are unaffected by human temperament or test conditions, and the results are completely repeatable.


PESQ and POLQA are based upon ITU-T P.862 and P.863 respectively, and licensed by OPTICOM GmbH.

How Perceptual Audio tests work

PESQ Block Diagram Perceptual audio test block diagram (courtesy of OPTICOM GmbH)

Both reference and degraded signals are individually level aligned and filtered with the transfer characteristics of a receiving device (handset). The two signals are time aligned in order to compensate for small time shifts that can occur in voice networks due to delay, jitter and coding. The model transforms the two aligned and filtered signals from the time-amplitude domain into a frequency-loudness domain (auditory transform), accounting for the distortions that are perceptible by a human listener.

The two signal representations are subtracted and an estimate of the audible differences is derived. The audible differences are accumulated over time, weighted differently depending on whether a distortion was added to the signal or if parts of the signal were missing after the transmission (e.g. drop-outs).

Finally after the analysis a single Mean Opinion Score (MOS) is generated. The MOS is commonly used to describe the voice quality on a scale from 1 (bad quality) to 5 (excellent quality).

MOS Score Summary
MOS Quality Impairment
5 Excellent Imperceptible
4 Good Perceptible but not annoying
3 Fair Slightly annoying
2 Poor Annoying
1 Bad Very annoying


PESQ and POLQA are similar in goal, but differ in several key applications. PESQ is an older, but very widely used tool for evaluating speech quality on communications networks with restricted bandwidth, while POLQA is newer and handles variations such as wide-band audio, acoustic transducers, DSP and level variations. Many products specify one or the other as part of a testing and approval regimen.

PESQ and POLQA differences are summarized in the table below.

Measurements with acoustic transducers No Yes
Correct scoring with high background noise No Yes
AMR vs. EVRC codec comparison No Yes
Effects of speech level in samples No Yes
Narrowband (300 Hz - 3400 Hz) Yes Yes
Wideband (100 Hz - 7000 Hz) Yes Yes
Super Wideband (50 Hz - 14000 Hz) No Yes
Linear Frequency distortion sensitivity No Yes

Perceptual audio testing in APx500

AP offers PESQ and POLQA as separate software options that operate in similar fashion, offering standard MOS results for a single speech sample and averaged measurements for accumulated samples.

The PESQ/POLQA measurement returns overall perceptual quality in MOS or PESQ units after playing each voice sample, along with a quality vs. time display to help pinpoint specific problems such as clipped words or dropouts. Additional results show average delay, delay vs. time, and the reference and the acquired waveforms.

MOS per Time

The PESQ/POLQA Average measurement allows you to run a collection of different voice samples and then display the resulting overall score.

Both measurements let you choose to analyze the entire signal, or to look at only the active speech or silence.

Fully integrated with APx projects

Audio Precision’s PESQ and POLQA software options appear in projects exactly like any other measurement, with the same ease of use that makes APx500 software fast, productive and intuitive. All standard and optional I/O modules may be used, including unbalanced/balanced analog, unbalanced/balanced/optical digital, serial digital, Bluetooth, HDMI, and PDM.


Key features

  • Perceptual analysis of degradations in voice signals now directly integrated with APx500 software.
  • Reliable, well-established and repeatable results.
  • High correlation with subjective MOS
  • Automatic time alignment
  • Use with any APx audio interfaces: PDM, analog balanced/unbalanced, AES3 and S/PDIF digital, HDMI, Bluetooth, and serial digital.
  • Incorporate into any APx project or report
  • Compatible with APx515, 525, 582 and 585.

  • ITU-T P.862.x Perceptual Evaluation of Speech Quality
    with mapping to MOS scale
  • Narrowband and Wideband operation
  • Input
    16 bit linear audio sampled at 8 kHz or 16 kHz
    Input voice files 6 to 20 seconds in length
  • Measurements
    Instantaneous MOS vs time
    Average MOS (Active Speech and Silence)
    MOS per sample file
  • ITU-T P.863 Perceptual Evaluation of Speech Quality with mapping to MOS scale
  • Narrowband, Wideband and Super Wideband operation
  • Use with acoustical interfaces
  • Enhanced accuracy
  • Input
    16 bit linear audio sampled at 8, 16 or 48 kHz
    Input voice files 6 to 12 seconds in length
  • Measurements
    Instantaneous MOS vs time
    Average MOS (Active Speech and Silence)
    MOS per sample file

Back To Top