{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Force Alignment using Transducer\n", "\n", "Forced alignment is a technique to take an orthographic transcription of an audio file and generate a time-aligned version. In this example, I am going to use Malay and Singlish models." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " | Size (MB) | \n", "Quantized Size (MB) | \n", "Language | \n", "
---|---|---|---|
conformer-transducer | \n", "120 | \n", "32.3 | \n", "[malay] | \n", "
conformer-transducer-mixed | \n", "120 | \n", "32.3 | \n", "[malay, singlish] | \n", "
conformer-transducer-singlish | \n", "120 | \n", "32.3 | \n", "[singlish] | \n", "