GPU Environment#

This tutorial is available as an IPython notebook at malaya-speech/example/gpu-environment.

[1]:
%%time

import malaya_speech
import logging
logging.basicConfig(level = logging.INFO)
CPU times: user 4.7 s, sys: 1.5 s, total: 6.19 s
Wall time: 14.1 s

List available GPU#

[2]:
malaya_speech.utils.available_gpu()
[2]:
[]

Limit GPU memory#

By default Malaya-Speech will not set a max cap for GPU memory. To put a cap, override the gpu_limit parameter in any load model API. gpu_limit must satisfy 0 < gpu_limit < 1. If gpu_limit = 0.3, the model will not use more than 30% of GPU memory.

malaya_speech.vad.deep_model(gpu_limit = 0.3)
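Since an out-of-range gpu_limit only fails once the model loads, it can help to check the bound up front. A minimal sketch, using a hypothetical helper that is not part of Malaya-Speech:

```python
# Hypothetical helper (not a Malaya-Speech API): validate a gpu_limit
# value before passing it to any load model call.
def validate_gpu_limit(gpu_limit):
    """Return gpu_limit if it satisfies 0 < gpu_limit < 1, else raise."""
    if not 0 < gpu_limit < 1:
        raise ValueError('gpu_limit must satisfy 0 < gpu_limit < 1')
    return gpu_limit

validate_gpu_limit(0.3)  # OK: cap the model at 30% of GPU memory
```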

Not all operations supported by GPU#

Yes, some models might run faster on CPU, due to the overhead cost of transitioning between CPU and GPU too frequently.

N models to N GPUs#

To allocate a model to another GPU, set device to a different GPU, eg, GPU:1; the default is GPU:0.

malaya_speech.emotion.deep_model(gpu_limit = 0.5, device = 'GPU:0')
malaya_speech.language_detection.deep_model(gpu_limit = 0.5, device = 'GPU:1')
malaya_speech.noise_reduction.deep_model(gpu_limit = 0.5, device = 'GPU:2')
malaya_speech.vad.deep_model(gpu_limit = 0.5, device = 'GPU:3')
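When loading many models, the device strings above can be generated instead of hard-coded. A toy sketch (plain Python, not a Malaya-Speech API) that spreads model names round-robin over the available GPUs using the same 'GPU:n' convention:

```python
# Toy sketch: assign each model a device string like 'GPU:0', 'GPU:1', ...
# round-robin across n_gpus. The resulting strings can be passed as the
# device argument of the load model APIs shown above.
def assign_devices(model_names, n_gpus):
    """Map each model name to a 'GPU:n' device string, round-robin."""
    return {name: f'GPU:{i % n_gpus}' for i, name in enumerate(model_names)}

models = ['emotion', 'language_detection', 'noise_reduction', 'vad']
assign_devices(models, 4)
# → {'emotion': 'GPU:0', 'language_detection': 'GPU:1',
#    'noise_reduction': 'GPU:2', 'vad': 'GPU:3'}
```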

GPU Rules#

  1. Malaya-Speech will not consume all available GPU memory, but will slowly grow usage based on batch size. This growth only goes in the positive direction (using more GPU memory) dynamically; usage will not shrink if you feed a smaller batch size.

  2. Use malaya_speech.utils.clear_session to clear the session of unused models, but this will not free GPU memory.
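Rule 1 can be pictured as grow-only allocation: reserved memory tracks the largest batch seen so far and never shrinks. A toy model of that behaviour, not Malaya-Speech internals:

```python
# Toy model of rule 1: GPU memory usage grows with the largest batch
# fed so far and never shrinks for smaller batches.
class GrowOnlyMemory:
    def __init__(self):
        self.used_mb = 0  # MB currently reserved

    def feed_batch(self, batch_mb):
        # Reserve more memory only if this batch needs more than we hold.
        self.used_mb = max(self.used_mb, batch_mb)
        return self.used_mb

mem = GrowOnlyMemory()
mem.feed_batch(100)  # grows to 100 MB
mem.feed_batch(300)  # grows to 300 MB
mem.feed_batch(50)   # stays at 300 MB: small batches do not release memory
```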