GPU Environment PyTorch

This tutorial is available as an IPython notebook at malaya-speech/example/gpu-environment-pytorch.

[1]:

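import os

# Expose only the GPU with index 1 to this process.
# Set this before CUDA is initialized, i.e. before any GPU work starts.
os.environ['CUDA_VISIBLE_DEVICES'] = '1'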

[2]:

%%time

import malaya_speech
import logging
logging.basicConfig(level = logging.INFO)

`pyaudio` is not available, `malaya_speech.streaming.stream` is not able to use.

CPU times: user 3.29 s, sys: 3.78 s, total: 7.08 s
Wall time: 3.09 s

List available GPUs

You must install the GPU (CUDA) build of PyTorch first to enable GPU hardware acceleration.

[3]:

import torch

torch.cuda.device_count()

[3]:
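
For a slightly more informative check than the device count alone, the following is a minimal sketch that also reports device names. The helper name list_cuda_devices is hypothetical and is not part of malaya-speech. It assumes only the standard torch.cuda calls and returns an empty list when PyTorch is missing, is a CPU-only build, or sees no GPU.

```python
import importlib.util


def list_cuda_devices():
    """Return the names of CUDA devices visible to PyTorch, or [] if none."""
    # Hypothetical helper for illustration; not part of malaya-speech.
    if importlib.util.find_spec("torch") is None:
        return []  # PyTorch is not installed at all
    import torch

    if not torch.cuda.is_available():
        return []  # CPU-only build, missing driver, or no visible GPU
    return [
        torch.cuda.get_device_name(i)
        for i in range(torch.cuda.device_count())
    ]


devices = list_cuda_devices()
print(f"{len(devices)} CUDA device(s) visible:", devices)
```

Because CUDA_VISIBLE_DEVICES was set to a single index at the top of this notebook, at most one device should be listed, and PyTorch addresses it as device 0 inside this process.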

Run model on GPU

Once you call the cuda method on the model object, all inputs are automatically cast to CUDA.
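
To make the input cast concrete, here is a minimal sketch of the manual step in plain PyTorch, using a toy linear layer rather than a malaya-speech model. It does not show the library's internals. The function name forward_on_best_device is hypothetical, and the CPU fallback is an assumption added so the sketch also runs on machines without a GPU.

```python
try:
    import torch
except ImportError:  # allow the sketch to load without PyTorch installed
    torch = None


def forward_on_best_device():
    """Run a tiny linear layer on the GPU if one is usable, else on the CPU."""
    # Hypothetical helper for illustration; not part of malaya-speech.
    if torch is None:
        return None
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    layer = torch.nn.Linear(4, 2).to(device)  # move the weights to the device
    x = torch.randn(3, 4)                     # inputs start on the CPU
    with torch.no_grad():
        out = layer(x.to(device))             # explicit cast to the same device
    return out.device.type, tuple(out.shape)


result = forward_on_best_device()
print(result)
```

In plain PyTorch, passing a CPU tensor to a module whose weights are on the GPU raises a device-mismatch error, which is why the explicit x.to(device) is needed here. The automatic cast described above spares you that step when calling predict.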

[4]:

malaya_speech.stt.ctc.available_huggingface()

INFO:malaya_speech.stt:for `malay-fleur102` language, tested on FLEURS102 `ms_my` test set, https://github.com/huseinzol05/malaya-speech/tree/master/pretrained-model/prepare-stt
INFO:malaya_speech.stt:for `malay-malaya` language, tested on malaya-speech test set, https://github.com/huseinzol05/malaya-speech/tree/master/pretrained-model/prepare-stt
INFO:malaya_speech.stt:for `singlish` language, tested on IMDA malaya-speech test set, https://github.com/huseinzol05/malaya-speech/tree/master/pretrained-model/prepare-stt

[4]:

	Size (MB)	malay-malaya	malay-fleur102	singlish	Language
mesolitica/wav2vec2-xls-r-300m-mixed	1180	{'WER': 0.194655128, 'CER': 0.04775798, 'WER-L...	{'WER': 0.2373861259, 'CER': 0.07055478, 'WER-...	{'WER': 0.127588595, 'CER': 0.0494924979, 'WER...	[malay, singlish]
mesolitica/wav2vec2-xls-r-300m-mixed-v2	1180	{'WER': 0.154782923, 'CER': 0.035164031, 'WER-...	{'WER': 0.2013994374, 'CER': 0.0518170369, 'WE...	{'WER': 0.2258822139, 'CER': 0.082982312, 'WER...	[malay, singlish]
mesolitica/wav2vec2-xls-r-300m-12layers-ms	657	{'WER': 0.1494983789, 'CER': 0.0342059992, 'WE...	{'WER': 0.217107489, 'CER': 0.0546614199, 'WER...	NaN	[malay]
mesolitica/wav2vec2-xls-r-300m-6layers-ms	339	{'WER': 0.22481538553, 'CER': 0.0484392694, 'W...	{'WER': 0.38642364985, 'CER': 0.0928960677, 'W...	NaN	[malay]

[8]:

model = malaya_speech.stt.ctc.huggingface(model = 'mesolitica/wav2vec2-xls-r-300m-6layers-ms')

[7]:

_ = model.cuda()

[10]:

y, _ = malaya_speech.load('speech/example-speaker/husein-zolkepli.wav')

[12]:

model.predict([y])

[12]:

['testing nama saya husin bin zokapli']