{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Speech-to-Text CTC + pyctcdecode + MLM" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Encoder model + CTC loss + pyctcdecode with a Masked Language Model (MLM)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
| | Size (MB) | Quantized Size (MB) | malay-malaya | Language |\n", "
|---|---|---|---|---|\n", "
| hubert-conformer-tiny | 36.6 | 10.3 | {'WER': 0.238714008166, 'CER': 0.060899814, 'W... | [malay] |\n", "
| hubert-conformer | 115 | 31.1 | {'WER': 0.2387140081, 'CER': 0.06089981404, 'W... | [malay] |\n", "
| hubert-conformer-large | 392 | 100 | {'WER': 0.2203140421, 'CER': 0.0549270416, 'WE... | [malay] |\n", "