GPU Environment#
This tutorial is available as an IPython notebook at malaya-speech/example/gpu-environment.
[1]:
%%time
import malaya_speech
import logging
logging.basicConfig(level = logging.INFO)
CPU times: user 4.7 s, sys: 1.5 s, total: 6.19 s
Wall time: 14.1 s
Limit GPU memory#
By default, Malaya-Speech will not set a maximum cap on GPU memory. To put a cap, override the gpu_limit parameter in any load model API. gpu_limit must satisfy 0 < gpu_limit < 1; if gpu_limit = 0.3, the model will not use more than 30% of GPU memory.
malaya_speech.vad.deep_model(gpu_limit = 0.3)
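As a sketch, the 0 < gpu_limit < 1 constraint can be checked before loading a model. The helper below is hypothetical, for illustration only, and not part of the Malaya-Speech API:

```python
def validate_gpu_limit(gpu_limit):
    """Hypothetical helper: check that gpu_limit is a valid GPU memory fraction.

    Malaya-Speech expects 0 < gpu_limit < 1, interpreted as the maximum
    fraction of total GPU memory the model may use.
    """
    if not (0 < gpu_limit < 1):
        raise ValueError('gpu_limit must satisfy 0 < gpu_limit < 1')
    return gpu_limit

# gpu_limit = 0.3 caps the model at 30% of GPU memory
validate_gpu_limit(0.3)
```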
Not all operations supported by GPU#
Yes, some models might run faster on CPU because of the overhead of transferring data between CPU and GPU too frequently.
N Models to N GPUs#
To allocate a model to another GPU, set the device parameter to a different GPU, e.g., GPU:1; the default is GPU:0.
malaya_speech.emotion.deep_model(gpu_limit = 0.5, device = 'GPU:0')
malaya_speech.language_detection.deep_model(gpu_limit = 0.5, device = 'GPU:1')
malaya_speech.noise_reduction.deep_model(gpu_limit = 0.5, device = 'GPU:2')
malaya_speech.vad.deep_model(gpu_limit = 0.5, device = 'GPU:3')
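To spread many models across a fixed number of GPUs, a simple round-robin assignment works. The device strings below match the format of the device parameter above, but the assignment helper itself is an illustrative sketch, not part of Malaya-Speech:

```python
def assign_devices(n_models, n_gpus):
    """Assign models to GPUs round-robin, returning device strings
    in the format accepted by the `device` parameter, e.g. 'GPU:1'."""
    return ['GPU:%d' % (i % n_gpus) for i in range(n_models)]

# 4 models spread across 2 GPUs
assign_devices(4, 2)  # -> ['GPU:0', 'GPU:1', 'GPU:0', 'GPU:1']
```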
GPU Rules#
Malaya-Speech will not consume all available GPU memory; instead, allocation grows slowly based on batch size. This growth only goes in the positive direction (using more GPU memory) and memory will not be released when you feed a smaller batch size. Use malaya_speech.utils.clear_session to clear the session of unused models, but this will not free GPU memory.
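The grow-only behaviour can be pictured as a high-water mark: allocation tracks the largest batch seen so far and never shrinks. The class below is a toy illustration of that behaviour, not library code:

```python
class GrowOnlyAllocator:
    """Toy model of grow-only GPU memory: usage follows the largest
    batch seen so far and is never released for smaller batches."""

    def __init__(self):
        self.allocated = 0

    def feed_batch(self, batch_size):
        # Grow if this batch needs more memory than any previous one...
        self.allocated = max(self.allocated, batch_size)
        # ...but never shrink for a smaller batch.
        return self.allocated

alloc = GrowOnlyAllocator()
alloc.feed_batch(8)   # allocation grows to 8
alloc.feed_batch(32)  # grows to 32
alloc.feed_batch(4)   # stays at 32; memory is not released
```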