{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Split utterances using VAD" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's say you have a long audio sample and want to cut it into smaller samples based on utterances. Malaya-speech can help you!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "| Model | Size (MB) | Quantized Size (MB) | Accuracy |\n", "
|---|---|---|---|\n", "
| vggvox-v1 | 70.8 | 17.70 | 0.9500 |\n", "
| vggvox-v2 | 31.1 | 7.92 | 0.9594 |\n", "
| speakernet | 20.3 | 5.18 | 0.9000 |\n", "