MOS (Mean Opinion Score) models for evaluating audio quality.

AECMOS, DNSMOS, PLCMOS

We release the AECMOS, DNSMOS, and PLCMOS models that we have developed for evaluating audio degradations due to echo, noise, packet loss and other sources.

Prerequisites

Python 3.7 and above
librosa 0.9.1
numpy 1.21.5
onnxruntime 1.10.0
pandas
tqdm

Usage:

from speechmos import aecmos, dnsmos, plcmos

aecmos.run(sample, sr, talk_type, **kwargs)

dnsmos.run(sample, sr, **kwargs)

plcmos.run(sample, sr, **kwargs)

sample is one of the following:
- For AECMOS: dictionary of the form {'lpb': lpb, 'mic': mic, 'enh': enh} corresponding to the loopback, microphone, and enhanced audio as type np.ndarray or paths to audio files of type supported by librosa.
- For DNSMOS and PLCMOS: np.ndarray or a path to an audio file of type supported by librosa.
All audio should be single channel (mono) audio.
Alternatively, sample can be a list of items of one of the above types.
sr denotes the sampling rate, which must be either 16000 or 48000. AECMOS is available at 48 kHz and, as the scenarioless model described below, at 16 kHz; all other models are available at 16 kHz. All audio must be provided at the sampling rate of the model being used.
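
The input rules above can be checked before any model is called. The following is a minimal sketch and not part of speechmos: `check_inputs` is a hypothetical helper that assumes only what this section states, namely that sr is 16000 or 48000, that a sample may be a single item or a list, and that AECMOS samples are dictionaries with the keys 'lpb', 'mic', and 'enh'.

```python
# Hypothetical pre-flight check; speechmos itself does not provide this helper.
AECMOS_KEYS = {"lpb", "mic", "enh"}
SUPPORTED_SR = {16000, 48000}


def check_inputs(sample, sr, is_aecmos=False):
    """Return the samples as a list, or raise ValueError if a rule is broken."""
    if sr not in SUPPORTED_SR:
        raise ValueError(f"sr must be 16000 or 48000, got {sr}")
    # A list of samples is allowed; treat a single sample as a one-item list.
    items = sample if isinstance(sample, list) else [sample]
    if is_aecmos:
        for item in items:
            # Each AECMOS sample must be a dict holding all three signals.
            if not isinstance(item, dict) or not AECMOS_KEYS <= set(item):
                raise ValueError("AECMOS samples need the keys 'lpb', 'mic', 'enh'")
    return items


checked = check_inputs(
    {"lpb": "lpb.wav", "mic": "mic.wav", "enh": "enh.wav"}, 16000, is_aecmos=True
)
print(len(checked))
```

The helper only inspects the structure of the arguments; it does not open audio files or verify that they are mono.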

For AECMOS:

talk_type specifies the scenario, if known: 'st' (far-end single talk), 'nst' (near-end single talk), or 'dt' (double talk). talk_type can be None, in which case the 16 kHz scenarioless model is used. Its correlation with the ground truth is about 2% lower than that of the scenario-based model.
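
As a small illustration of the accepted values, the sketch below maps each talk_type code to the description given above. `TALK_TYPES` and `describe_talk_type` are hypothetical names introduced here for illustration; they are not part of speechmos.

```python
# Scenario codes as described above; None means the scenario is unknown.
TALK_TYPES = {
    "st": "far-end single talk",
    "nst": "near-end single talk",
    "dt": "double talk",
    None: "unknown scenario (scenarioless model)",
}


def describe_talk_type(talk_type):
    """Return a readable label for a talk_type value, or raise ValueError."""
    if talk_type not in TALK_TYPES:
        raise ValueError(
            f"talk_type must be 'st', 'nst', 'dt', or None, got {talk_type!r}"
        )
    return TALK_TYPES[talk_type]


print(describe_talk_type("dt"))  # double talk
```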

For DNSMOS:

model_type controls which DNSMOS model to use: 'dnsmos' or 'dnsmos_personalized'. The default is 'dnsmos'.

Additional arguments:

return_df controls whether a pandas dataframe is returned containing sample information and MOS scores when evaluating a list of samples. The default is return_df = True. If set to False, a list of dictionaries is returned instead.
verbose controls whether more details are printed on the screen. The default is verbose = False.
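
With return_df = False the result is a list of dictionaries, which can be summarized without pandas. The sketch below is one possible way to do that, not something speechmos provides: `mean_scores` is a hypothetical helper, and the records are copied from the AECMOS list example in this README with the path fields omitted.

```python
from statistics import mean

# Records copied from the AECMOS list example in this README (paths omitted).
results = [
    {"echo_mos": 3.2400383949279785, "deg_mos": 3.4087774753570557, "talk_type": "dt"},
    {"echo_mos": 3.2400383949279785, "deg_mos": 3.4087774753570557, "talk_type": "dt"},
    {"echo_mos": 3.2400383949279785, "deg_mos": 3.4087774753570557, "talk_type": "dt"},
]


def mean_scores(records, keys=("echo_mos", "deg_mos")):
    """Average the selected MOS fields over a list of result dictionaries."""
    return {key: mean(record[key] for record in records) for key in keys}


summary = mean_scores(results)
print(summary)
```

For DNSMOS or PLCMOS results, the same helper could be called with the score keys shown in their outputs, for example keys=("ovrl_mos", "sig_mos", "bak_mos").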

Usage examples:

AECMOS usage example with `sample` as a dictionary of numpy arrays and unknown `talk_type`:

import librosa
from speechmos import aecmos

lpb, _ = librosa.load("d:/data/example/lpb.wav", sr=16000)
mic, _ = librosa.load("d:/data/example/mic.wav", sr=16000)
enh, _ = librosa.load("d:/data/example/enh.wav", sr=16000)

sample = {'lpb': lpb, 'mic': mic, 'enh': enh}

aecmos.run(sample, sr=16000, verbose=True)

Output:

Model version aecmos_scenarioless_16kHz.
The model sampling rate is 16000.
{'echo_mos': 4.9999470710754395, 'deg_mos': 3.4854962825775146, 'talk_type': None, 'model_name': 'aecmos_scenarioless_16kHz'}

AECMOS usage example with `sample` as a list of dictionaries of paths to audio files:

from speechmos import aecmos

# sample_list is a list of {'lpb': path, 'mic': path, 'enh': path} dictionaries.
aecmos.run(sample_list, sr=48000, talk_type='dt', verbose=True)

Output:

Using model aecmos_48kHz to evaluate 3 samples.
Model sampling rate is 48000.
0it [00:00, ?it/s]
1it [00:00,  8.59it/s]
3it [00:00, 25.77it/s]
{'lpb_path': 'D:/data/example/lpb.wav', 'mic_path': 'D:/data/example/mic.wav', 'enh_path': 'D:/data/example/enh.wav', 'echo_mos': 3.2400383949279785, 'deg_mos': 3.4087774753570557, 'talk_type': 'dt', 'model_name': 'aecmos_48kHz'}
{'lpb_path': 'D:/data/example/lpb.wav', 'mic_path': 'D:/data/example/mic.wav', 'enh_path': 'D:/data/example/enh.wav', 'echo_mos': 3.2400383949279785, 'deg_mos': 3.4087774753570557, 'talk_type': 'dt', 'model_name': 'aecmos_48kHz'}
{'lpb_path': 'D:/data/example/lpb.wav', 'mic_path': 'D:/data/example/mic.wav', 'enh_path': 'D:/data/example/enh.wav', 'echo_mos': 3.2400383949279785, 'deg_mos': 3.4087774753570557, 'talk_type': 'dt', 'model_name': 'aecmos_48kHz'}
       echo_mos   deg_mos
count  3.000000  3.000000
mean   3.240038  3.408777
std    0.000000  0.000000
min    3.240038  3.408777
25%    3.240038  3.408777
50%    3.240038  3.408777
75%    3.240038  3.408777
max    3.240038  3.408777
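
The example above does not show how `sample_list` is built. Judging from the printed output, which lists the same three paths for each of the three samples, it could have been constructed roughly as follows. This is an assumption based on that output, not the authors' original code, and the call to aecmos.run is left as a comment because it needs the package and the audio files.

```python
# Paths taken from the printed output above.
paths = {
    "lpb": "D:/data/example/lpb.wav",
    "mic": "D:/data/example/mic.wav",
    "enh": "D:/data/example/enh.wav",
}

# Three samples pointing at the same files, matching "evaluate 3 samples".
sample_list = [dict(paths) for _ in range(3)]

# With speechmos installed and the files present, the call would then be:
# from speechmos import aecmos
# aecmos.run(sample_list, sr=48000, talk_type='dt', verbose=True)
print(len(sample_list))  # 3
```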

DNSMOS usage example with `sample` as a numpy array:

import librosa
from speechmos import dnsmos

audio, _ = librosa.load("D:/data/example/enh.wav", sr=16000)
dnsmos.run(audio, sr=16000)

Output:

{'filename': 'D:/data/example/enh.wav',
 'ovrl_mos': 2.2067626609880104,
 'sig_mos': 3.290418848414798,
 'bak_mos': 2.141338429075571,
 'p808_mos': 3.0722866}

PLCMOS usage example with `sample` as a path to an audio file:

from speechmos import plcmos

# A file path is passed directly, so librosa does not need to be imported here.
plcmos.run("D:/data/example/enh.wav", sr=16000)

Output:

{'filename': 'D:/data/example/enh.wav',
 'plcmos': 2.5210512320200604,
 'model': 'plcmos_v2'}

Citation:

C. K. A. Reddy, V. Gopal and R. Cutler, "DNSMOS P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 2022, pp. 886-890, doi: 10.1109/ICASSP43922.2022.9746108.

L. Diener, M. Purin, S. Sootla, A. Saabas, R. Aichner and R. Cutler, "PLCMOS - A Data-driven Non-intrusive Metric for the Evaluation of Packet Loss Concealment Algorithms," arXiv preprint arXiv:2305.15127, 2023.

M. Purin, S. Sootla, M. Sponza, A. Saabas and R. Cutler, "AECMOS: A Speech Quality Assessment Metric for Echo Impairment," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 2022, pp. 901-905, doi: 10.1109/ICASSP43922.2022.9747836.

Release history

0.0.1.1 (this version): Jul 13, 2023
0.0.1: Jun 30, 2023

Download files

Source distribution: speechmos-0.0.1.1.tar.gz (9.4 MB), uploaded Jul 13, 2023

Hashes for speechmos-0.0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`f65040b408a5114b808fba8abcbd22aa09dc77e60bea0c2d2b060efebca53dec`
MD5	`a9e5ccad5faf0df8aff2e753368c75e9`
BLAKE2b-256	`1e03b9f7fb53094b7919feb7a37d0dbc445f20276cff0743f518cd8d2726074a`

Built distribution: speechmos-0.0.1.1-py3-none-any.whl (9.4 MB, Python 3), uploaded Jul 13, 2023

Hashes for speechmos-0.0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`31c4c9d3234f6ee10102edff74333014c50006a3f389daf7ecacae34e68ebbf7`
MD5	`8e2e17753b417b1e60a6e51abc50c9d4`
BLAKE2b-256	`300e08369f3574447acfe78a7678f9a5e2b9c6629888be25a5e2407d616a6c02`
