GPU Environment PyTorch

This tutorial is available as an IPython notebook at malaya-speech/example/gpu-environment-pytorch.

[1]:
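import os

# Expose only the GPU with physical index 1 to this process.
# Set this before CUDA is initialized (safest: before importing torch).
os.environ['CUDA_VISIBLE_DEVICES'] = '1'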
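The cell above restricts the process to a single GPU through the CUDA_VISIBLE_DEVICES environment variable. A minimal sketch of the same idea follows, assuming a hypothetical helper named restrict_to_gpus (it is not part of malaya-speech or PyTorch). It also shows that the visible devices are renumbered from 0 inside the process, which is why only one device is reported later in this tutorial.

```python
import os


def restrict_to_gpus(physical_indices):
    """Hypothetical helper: expose only the given physical GPU indices."""
    value = ','.join(str(i) for i in physical_indices)
    os.environ['CUDA_VISIBLE_DEVICES'] = value
    return value


# Must run before the first CUDA call (safest: before importing torch).
restrict_to_gpus([1])

# Inside this process the visible devices are renumbered from 0,
# so physical GPU 1 is addressed as 'cuda:0'.
visible = os.environ['CUDA_VISIBLE_DEVICES'].split(',')
print({f'cuda:{i}': f'physical GPU {p}' for i, p in enumerate(visible)})
```

Setting the variable after CUDA has already been initialized in the process has no effect, so it belongs at the very top of the notebook.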
[2]:
%%time

import malaya_speech
import logging
logging.basicConfig(level = logging.INFO)
`pyaudio` is not available, `malaya_speech.streaming.stream` is not able to use.
CPU times: user 3.29 s, sys: 3.78 s, total: 7.08 s
Wall time: 3.09 s

List available GPUs

You must first install a CUDA-enabled build of PyTorch to enable GPU hardware acceleration.

[3]:
import torch

torch.cuda.device_count()
[3]:
1
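The count of 1 reflects the CUDA_VISIBLE_DEVICES setting made at the top of the notebook. As a slightly more defensive check, the sketch below reports the name of each visible device and returns an empty list when PyTorch is missing or has no usable CUDA device. The function name describe_gpus is an assumption made for illustration; the torch.cuda calls are standard PyTorch.

```python
import importlib.util


def describe_gpus():
    """Return [(index, name), ...] for each CUDA device visible to PyTorch."""
    if importlib.util.find_spec('torch') is None:
        return []  # PyTorch is not installed
    import torch
    if not torch.cuda.is_available():
        return []  # CPU-only build, missing driver, or no visible GPU
    return [
        (i, torch.cuda.get_device_name(i))
        for i in range(torch.cuda.device_count())
    ]


gpus = describe_gpus()
print(gpus if gpus else 'No CUDA device visible to PyTorch')
```

An empty result usually means either a CPU-only PyTorch build or a CUDA_VISIBLE_DEVICES value that does not match any installed GPU.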

Run the model on the GPU

Once you call the cuda() method on the model object, all inputs are automatically cast to CUDA, so they can be passed to predict as ordinary arrays.

[4]:
malaya_speech.stt.ctc.available_huggingface()
INFO:malaya_speech.stt:for `malay-fleur102` language, tested on FLEURS102 `ms_my` test set, https://github.com/huseinzol05/malaya-speech/tree/master/pretrained-model/prepare-stt
INFO:malaya_speech.stt:for `malay-malaya` language, tested on malaya-speech test set, https://github.com/huseinzol05/malaya-speech/tree/master/pretrained-model/prepare-stt
INFO:malaya_speech.stt:for `singlish` language, tested on IMDA malaya-speech test set, https://github.com/huseinzol05/malaya-speech/tree/master/pretrained-model/prepare-stt
[4]:
Model | Size (MB) | malay-malaya | malay-fleur102 | singlish | Language
mesolitica/wav2vec2-xls-r-300m-mixed | 1180 | {'WER': 0.194655128, 'CER': 0.04775798, 'WER-L... | {'WER': 0.2373861259, 'CER': 0.07055478, 'WER-... | {'WER': 0.127588595, 'CER': 0.0494924979, 'WER... | [malay, singlish]
mesolitica/wav2vec2-xls-r-300m-mixed-v2 | 1180 | {'WER': 0.154782923, 'CER': 0.035164031, 'WER-... | {'WER': 0.2013994374, 'CER': 0.0518170369, 'WE... | {'WER': 0.2258822139, 'CER': 0.082982312, 'WER... | [malay, singlish]
mesolitica/wav2vec2-xls-r-300m-12layers-ms | 657 | {'WER': 0.1494983789, 'CER': 0.0342059992, 'WE... | {'WER': 0.217107489, 'CER': 0.0546614199, 'WER... | NaN | [malay]
mesolitica/wav2vec2-xls-r-300m-6layers-ms | 339 | {'WER': 0.22481538553, 'CER': 0.0484392694, 'W... | {'WER': 0.38642364985, 'CER': 0.0928960677, 'W... | NaN | [malay]
[8]:
# Load the smallest model listed above (339 MB).
model = malaya_speech.stt.ctc.huggingface(model = 'mesolitica/wav2vec2-xls-r-300m-6layers-ms')
[7]:
# Move the model to the GPU; assigning to _ suppresses the printed return value.
_ = model.cuda()
[10]:
# The second return value is not needed here, so it is discarded.
y, _ = malaya_speech.load('speech/example-speaker/husein-zolkepli.wav')
[12]:
model.predict([y])
[12]:
['testing nama saya husin bin zokapli']
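For comparison, in plain PyTorch the caller is responsible for placing inputs on the same device as the model, which is the step the automatic cast described in this section takes care of. The sketch below illustrates that manual step with a stand-in torch.nn.Linear layer; it is not a malaya-speech model and makes no claim about how malaya-speech implements the cast internally. It falls back to the CPU when no GPU is visible.

```python
import torch

# Use the GPU when one is visible, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Stand-in model for illustration only (not a malaya-speech model).
model = torch.nn.Linear(4, 2).to(device)

# In plain PyTorch the input has to be moved to the model's device by hand;
# passing a CPU tensor to a model on the GPU raises a device-mismatch error.
x = torch.randn(3, 4)  # created on the CPU
with torch.no_grad():
    out = model(x.to(device))

print(out.shape, out.device)
```

Results computed on the GPU stay there until moved back explicitly, for example with out.cpu().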