{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Malaya-Speech Cache" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "This tutorial is available as an IPython notebook at [malaya-speech/example/caching](https://github.com/huseinzol05/malaya-speech/tree/master/example/caching).\n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**This module is only useful if you use the default Malaya-Speech repository; for Hugging Face models, read more at** https://huggingface.co/docs/datasets/v1.12.0/cache.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Default Cache location\n", "\n", "You can check where your default Malaya-Speech cache folder is. The cache folder stores any models, vocabs, and rules downloaded for specific modules." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import malaya_speech" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'/Users/huseinzolkepli/Malaya-Speech'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "malaya_speech.__home__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Change default Cache location\n", "\n", "To change the default cache location, set the `MALAYA_SPEECH_CACHE` environment variable before importing Malaya-Speech,\n", "\n", "```bash\n", "export MALAYA_SPEECH_CACHE=/Users/huseinzolkepli/Documents/malaya-speech-cache\n", "```\n", "\n", "Or set it in your shell startup file (for example `~/.bashrc`) to make it permanent." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# Must be set before `import malaya_speech` to take effect.\n", "os.environ['MALAYA_SPEECH_CACHE'] = '/Users/huseinzolkepli/Documents/malaya-speech-cache'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Cache subdirectories\n", "\n", "Malaya-Speech puts models in subdirectories; you can print them with," ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Malaya-Speech/\n", "├── gender/\n", "│ ├── deep-speaker/\n", "│ │ ├── model.pb\n", "│ │ └── version\n", "│ ├── vggvox-v2/\n", "│ │ ├── model.pb\n", "│ │ └── version\n", "│ └── vggvox-v2-quantized/\n", "│ ├── model.pb\n", "│ └── version\n", "├── language-model/\n", "│ ├── bahasa/\n", "│ │ ├── ctc-bahasa.json\n", "│ │ ├── model.trie.klm\n", "│ │ └── version\n", "│ └── bahasa-combined/\n", "│ ├── ctc-bahasa.json\n", "│ ├── model.trie.klm\n", "│ └── version\n", "├── multispeaker-separation-wav/\n", "│ ├── fastsep-4/\n", "│ │ ├── model.pb\n", "│ │ └── version\n", "│ └── fastsep-4-quantized/\n", "│ ├── model.pb\n", "│ └── version\n", "├── speaker-vector/\n", "│ └── vggvox-v2/\n", "│ ├── model.pb\n", "│ └── version\n", "├── speech-to-text/\n", "│ ├── conformer-mixed/\n", "│ │ ├── model.pb\n", "│ │ ├── transducer-mixed.subword.subwords\n", "│ │ └── version\n", "│ ├── conformer-mixed-quantized/\n", "│ │ ├── model.pb\n", "│ │ ├── transducer-mixed.subword.subwords\n", "│ │ └── version\n", "│ ├── small-conformer-mixed/\n", "│ │ ├── model.pb\n", "│ │ ├── transducer-mixed.subword.subwords\n", "│ │ └── version\n", "│ └── small-conformer-mixed-quantized/\n", "│ ├── model.pb\n", "│ ├── transducer-mixed.subword.subwords\n", "│ └── version\n", "├── speech-to-text-ctc/\n", "│ ├── wav2vec2-conformer/\n", "│ │ ├── ctc-bahasa.json\n", "│ │ ├── model.pb\n", "│ │ └── version\n", "│ └── wav2vec2-conformer-quantized/\n", "│ ├── ctc-bahasa.json\n", "│ ├── model.pb\n", "│ └── version\n", "├── 
super-resolution/\n", "│ ├── srgan-128/\n", "│ │ ├── model.pb\n", "│ │ └── version\n", "│ └── srgan-256/\n", "│ ├── model.pb\n", "│ └── version\n", "├── tts/\n", "│ ├── fastspeech2-female-singlish/\n", "│ │ ├── model.pb\n", "│ │ ├── quantized/\n", "│ │ │ ├── model.pb\n", "│ │ │ └── version\n", "│ │ └── version\n", "│ ├── stats/\n", "│ │ └── female-singlish.npy\n", "│ └── tacotron2-female-singlish/\n", "│ ├── model.pb\n", "│ ├── quantized/\n", "│ │ ├── model.pb\n", "│ │ └── version\n", "│ └── version\n", "├── vad/\n", "│ └── vggvox-v2/\n", "│ ├── model.pb\n", "│ └── version\n", "├── version\n", "└── vocoder-melgan/\n", " ├── female-singlish/\n", " │ ├── model.pb\n", " │ └── version\n", " ├── universal-1024/\n", " │ ├── model.pb\n", " │ └── version\n", " └── universal-1024-quantized/\n", " ├── model.pb\n", " └── version\n" ] } ], "source": [ "malaya_speech.utils.print_cache()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Deleting a specific model\n", "\n", "You can choose which specific model to delete by passing its path relative to the cache folder." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "malaya_speech.utils.delete_cache('language-model/bahasa-combined')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What happens if the directory does not exist?" 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "ename": "Exception", "evalue": "folder not exist, please check path from `malaya-speech.utils.print_cache()`", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mException\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmalaya_speech\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mutils\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdelete_cache\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'language-model/bahasa-combined2'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m~/Documents/tf-1.15/env/lib/python3.7/site-packages/malaya_boilerplate-0.0.3-py3.7.egg/malaya_boilerplate/utils.py\u001b[0m in \u001b[0;36mdelete_cache\u001b[0;34m(location)\u001b[0m\n\u001b[1;32m 188\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mos\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mexists\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlocation\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 189\u001b[0m raise Exception(\n\u001b[0;32m--> 190\u001b[0;31m \u001b[0;34mf'folder not exist, please check path from `{__package__}.utils.print_cache()`'\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 191\u001b[0m )\n\u001b[1;32m 192\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mos\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0misdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlocation\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mException\u001b[0m: folder not exist, please check path from 
`malaya-speech.utils.print_cache()`" ] } ], "source": [ "malaya_speech.utils.delete_cache('language-model/bahasa-combined2')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Purge cache\n", "\n", "You can delete all models and purge the cache entirely with," ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "malaya_speech.utils.delete_all_cache" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The cell above only references the function without calling it; calling `malaya_speech.utils.delete_all_cache()` would actually purge the cache. I am not going to run it, because I prefer to keep the cache for now." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }