|status| |documentation| |codeql| |license| |pypi| |pyversion| |downloads| |black|

.. |status| image:: http://www.repostatus.org/badges/latest/active.svg
  :target: http://www.repostatus.org/
  :alt: Project Status: Active – The project has reached a stable, usable state and is being actively developed.

.. |documentation| image:: https://readthedocs.com/projects/nvidia-nemo/badge/?version=main
  :alt: Documentation
  :target: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/

.. |license| image:: https://img.shields.io/badge/License-Apache%202.0-brightgreen.svg
  :target: https://github.com/NVIDIA/NeMo/blob/master/LICENSE
  :alt: NeMo core license and license for collections in this repo

.. |pypi| image:: https://badge.fury.io/py/nemo-toolkit.svg
  :target: https://badge.fury.io/py/nemo-toolkit
  :alt: Release version

.. |pyversion| image:: https://img.shields.io/pypi/pyversions/nemo-toolkit.svg
  :target: https://badge.fury.io/py/nemo-toolkit
  :alt: Python version

.. |downloads| image:: https://static.pepy.tech/personalized-badge/nemo-toolkit?period=total&units=international_system&left_color=grey&right_color=brightgreen&left_text=downloads
  :target: https://pepy.tech/project/nemo-toolkit
  :alt: PyPi total downloads

.. |codeql| image:: https://github.com/nvidia/nemo/actions/workflows/codeql.yml/badge.svg?branch=main&event=push
  :target: https://github.com/nvidia/nemo/actions/workflows/codeql.yml
  :alt: CodeQL

.. |black| image:: https://img.shields.io/badge/code%20style-black-000000.svg
  :target: https://github.com/psf/black
  :alt: Code style: black
|
|
|
.. _main-readme: |
|
|
|
**NVIDIA NeMo** |
|
=============== |
|
|
|
Introduction |
|
------------ |
|
|
|
NVIDIA NeMo is a conversational AI toolkit built for researchers working on automatic speech recognition (ASR), |
|
text-to-speech synthesis (TTS), large language models (LLMs), and |
|
natural language processing (NLP). |
|
The primary objective of NeMo is to help researchers from industry and academia reuse prior work (code and pretrained models)
and make it easier to create new `conversational AI models <https://developer.nvidia.com/conversational-ai>`_.
|
|
|
All NeMo models are trained with `Lightning <https://github.com/Lightning-AI/lightning>`_, and
training scales automatically to thousands of GPUs.
Additionally, NeMo Megatron LLM models can be trained with up to 1 trillion parameters using tensor and pipeline model parallelism.
NeMo models can be optimized for inference and deployed for production use cases with `NVIDIA Riva <https://developer.nvidia.com/riva>`_.
|
|
|
Getting started with NeMo is simple. |
|
State-of-the-art pretrained NeMo models are freely available on `HuggingFace Hub <https://huggingface.co/models?library=nemo&sort=downloads&search=nvidia>`_ and
|
`NVIDIA NGC <https://catalog.ngc.nvidia.com/models?query=nemo&orderBy=weightPopularDESC>`_. |
|
These models can be used to transcribe audio, synthesize speech, or translate text in just a few lines of code. |
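
As a rough illustration of the "few lines of code" claim, the sketch below loads a pretrained ASR checkpoint and transcribes audio files.
It assumes NeMo is installed with its ASR collection. The checkpoint name ``stt_en_conformer_ctc_large`` and the helper ``transcribe_files`` are illustrative choices rather than anything prescribed here, and the exact return type of ``transcribe`` may differ between NeMo releases.

.. code-block:: python

    import sys
    from typing import List


    def transcribe_files(paths: List[str], model_name: str = "stt_en_conformer_ctc_large"):
        """Transcribe audio files with a pretrained NeMo ASR model (hypothetical helper)."""
        # Imported lazily so this module can be loaded without NeMo installed.
        import nemo.collections.asr as nemo_asr

        # Downloads the checkpoint on first use, then restores it from the local cache.
        asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
        # Produces one result per input file, in the same order.
        return asr_model.transcribe(paths)


    if __name__ == "__main__" and len(sys.argv) > 1:
        # Usage: python transcribe.py first.wav second.wav
        for result in transcribe_files(sys.argv[1:]):
            print(result)

Any other pretrained ASR checkpoint name can be passed as ``model_name``.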
|
|
|
We have extensive `tutorials <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/starthere/tutorials.html>`_ that |
|
can all be run on `Google Colab <https://colab.research.google.com>`_. |
|
|
|
For advanced users who want to train NeMo models from scratch or fine-tune existing NeMo models,
we have a full suite of `example scripts <https://github.com/NVIDIA/NeMo/tree/main/examples>`_ that support multi-GPU/multi-node training.
|
|
|
For scaling NeMo LLM training on Slurm clusters or public clouds, please see the `NVIDIA NeMo Megatron Launcher <https://github.com/NVIDIA/NeMo-Megatron-Launcher>`_.
The launcher has extensive recipes, scripts, utilities, and documentation for training NeMo LLMs and also has an `Autoconfigurator <https://github.com/NVIDIA/NeMo-Megatron-Launcher>`_,
which can be used to find the optimal model parallel configuration for training on a specific cluster.
|
|
|
Also see our `introductory video <https://www.youtube.com/embed/wBgpMf_KQVw>`_ for a high-level overview of NeMo.
|
|
|
Key Features |
|
------------ |
|
|
|
* Speech processing

  * `HuggingFace Space for Audio Transcription (File, Microphone and YouTube) <https://huggingface.co/spaces/smajumdar/nemo_multilingual_language_id>`_
  * `Automatic Speech Recognition (ASR) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/intro.html>`_

    * Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC, FastConformer-CTC, FastConformer-Transducer...
    * Supports CTC and Transducer/RNNT losses/decoders
    * NeMo Original `Multi-blank Transducers <https://arxiv.org/abs/2211.03541>`_
    * Beam Search decoding
    * `Language Modelling for ASR <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/asr_language_modeling.html>`_: N-gram LM in fusion with Beam Search decoding, Neural Rescoring with Transformer
    * Streaming and Buffered ASR (CTC/Transducer) - `Chunked Inference Examples <https://github.com/NVIDIA/NeMo/tree/stable/examples/asr/asr_chunked_inference>`_
    * `Support of long audios for Conformer with memory efficient local attention <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/results.html>`_

  * `Speech Classification, Speech Command Recognition and Language Identification <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/speech_classification/intro.html>`_: MatchboxNet (Command Recognition), AmberNet (LangID)
  * `Voice Activity Detection (VAD) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/speech_classification/models.html>`_

    * ASR with VAD Inference - `Example <https://github.com/NVIDIA/NeMo/tree/stable/examples/asr/asr_vad>`_

  * `Speaker Recognition <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/speaker_recognition/intro.html>`_: TitaNet, ECAPA_TDNN, SpeakerNet
  * `Speaker Diarization <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/speaker_diarization/intro.html>`_

    * Clustering Diarizer: TitaNet, ECAPA_TDNN, SpeakerNet
    * Neural Diarizer: MSDD (Multi-scale Diarization Decoder)

  * `Speech Intent Detection and Slot Filling <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/speech_intent_slot/intro.html>`_: Conformer-Transformer
  * `Pretrained models on different languages. <https://ngc.nvidia.com/catalog/collections/nvidia:nemo_asr>`_: English, Spanish, German, Russian, Chinese, French, Italian, Polish, ...
  * `NGC collection of pre-trained speech processing models. <https://ngc.nvidia.com/catalog/collections/nvidia:nemo_asr>`_

* Natural Language Processing

  * `NeMo Megatron pre-training of Large Language Models <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/nemo_megatron/intro.html>`_
  * `Neural Machine Translation (NMT) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/machine_translation/machine_translation.html>`_
  * `Punctuation and Capitalization <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/punctuation_and_capitalization.html>`_
  * `Token classification (named entity recognition) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/token_classification.html>`_
  * `Text classification <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/text_classification.html>`_
  * `Joint Intent and Slot Classification <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/joint_intent_slot.html>`_
  * `Question answering <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/question_answering.html>`_
  * `GLUE benchmark <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/glue_benchmark.html>`_
  * `Information retrieval <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/information_retrieval.html>`_
  * `Entity Linking <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/entity_linking.html>`_
  * `Dialogue State Tracking <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/sgd_qa.html>`_
  * `Prompt Learning <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/nemo_megatron/prompt_learning.html>`_
  * `NGC collection of pre-trained NLP models. <https://ngc.nvidia.com/catalog/collections/nvidia:nemo_nlp>`_
  * `Synthetic Tabular Data Generation <https://developer.nvidia.com/blog/generating-synthetic-data-with-transformers-a-solution-for-enterprise-data-challenges/>`_

* `Speech synthesis (TTS) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/tts/intro.html>`_

  * Spectrogram generation: Tacotron2, GlowTTS, TalkNet, FastPitch, FastSpeech2, Mixer-TTS, Mixer-TTS-X
  * Vocoders: WaveGlow, SqueezeWave, UniGlow, MelGAN, HiFiGAN, UnivNet
  * End-to-end speech generation: FastPitch_HifiGan_E2E, FastSpeech2_HifiGan_E2E, VITS
  * `NGC collection of pre-trained TTS models. <https://ngc.nvidia.com/catalog/collections/nvidia:nemo_tts>`_

* `Tools <https://github.com/NVIDIA/NeMo/tree/stable/tools>`_

  * `Text Processing (text normalization and inverse text normalization) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/text_normalization/intro.html>`_
  * `CTC-Segmentation tool <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/tools/ctc_segmentation.html>`_
  * `Speech Data Explorer <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/tools/speech_data_explorer.html>`_: a dash-based tool for interactive exploration of ASR/TTS datasets
|
|
|
|
|
Built for speed, NeMo can utilize NVIDIA's Tensor Cores and scale out training to multiple GPUs and multiple nodes. |
|
|
|
Requirements |
|
------------ |
|
|
|
1) Python 3.8 or above |
|
2) PyTorch 1.10.0 or above
|
3) NVIDIA GPU for training |
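
The requirements above can be checked programmatically. The following is a minimal sketch, not an official NeMo utility: the helper names ``parse_version`` and ``check_requirements`` are hypothetical, and the GPU check assumes that ``torch.cuda.is_available()`` is an adequate proxy for "NVIDIA GPU for training".

.. code-block:: python

    import re
    import sys
    from typing import List, Tuple

    MIN_PYTHON = (3, 8)
    MIN_TORCH = (1, 10, 0)


    def parse_version(text: str) -> Tuple[int, ...]:
        """Return the leading numeric components of a version string.

        Local suffixes such as "+cu118" are ignored.
        """
        match = re.match(r"\d+(?:\.\d+)*", text)
        if match is None:
            raise ValueError(f"unrecognized version string: {text!r}")
        return tuple(int(part) for part in match.group(0).split("."))


    def check_requirements() -> List[str]:
        """Collect human-readable problems; an empty list means all checks passed."""
        problems = []
        if sys.version_info[:2] < MIN_PYTHON:
            problems.append("Python 3.8 or above is required")
        try:
            import torch
        except ImportError:
            problems.append("PyTorch is not installed")
            return problems
        if parse_version(torch.__version__) < MIN_TORCH:
            problems.append("PyTorch 1.10.0 or above is required")
        if not torch.cuda.is_available():
            problems.append("no CUDA-capable GPU is visible (needed for training)")
        return problems


    if __name__ == "__main__":
        for problem in check_requirements():
            print("WARNING:", problem)

Running the script prints one warning per unmet requirement and nothing when all checks pass.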
|
|
|
Documentation |
|
------------- |
|
|
|
.. |main| image:: https://readthedocs.com/projects/nvidia-nemo/badge/?version=main
  :alt: Documentation Status
  :scale: 100%
  :target: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/

.. |stable| image:: https://readthedocs.com/projects/nvidia-nemo/badge/?version=stable
  :alt: Documentation Status
  :scale: 100%
  :target: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/
|
|
|
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Version | Status      | Description                                                                                                                              |
+=========+=============+==========================================================================================================================================+
| Latest  | |main|      | `Documentation of the latest (i.e. main) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/>`_                  |
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Stable  | |stable|    | `Documentation of the stable (i.e. most recent release) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/>`_ |
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+
|
|
|
Tutorials |
|
--------- |
|
A great way to start with NeMo is by checking `one of our tutorials <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/starthere/tutorials.html>`_. |
|
|
|
Getting help with NeMo |
|
---------------------- |
|
FAQ can be found on NeMo's `Discussions board <https://github.com/NVIDIA/NeMo/discussions>`_. You are welcome to ask questions or start discussions there. |
|
|
|
|
|
Installation |
|
------------ |
|
|
|
Conda |
|
~~~~~ |
|
|
|
We recommend installing NeMo in a fresh Conda environment. |
|
|
|
.. code-block:: bash

    conda create --name nemo python==3.8.10
    conda activate nemo
|
|
|
Install PyTorch using their `configurator <https://pytorch.org/get-started/locally/>`_. |
|
|
|
.. code-block:: bash

    conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
|
|
|
The command used to install PyTorch may depend on your system. Please use the configurator linked above to find the right command for your system. |
|
|
|
Pip |
|
~~~ |
|
Use this installation mode if you want the latest released version. |
|
|
|
.. code-block:: bash

    apt-get update && apt-get install -y libsndfile1 ffmpeg
    pip install Cython
    pip install nemo_toolkit['all']
|
|
|
Depending on the shell used, you may need to use ``"nemo_toolkit[all]"`` instead in the above command. |
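
The quoting matters because many shells treat square brackets as a filename pattern. The sketch below uses only the Python standard library to illustrate the point: as a glob, ``[all]`` is a character class matching a single ``a`` or ``l``, and ``shlex.quote`` produces a form that a POSIX shell passes through literally. It is an illustration of shell behavior, not part of the NeMo installation procedure.

.. code-block:: python

    import fnmatch
    import shlex

    requirement = "nemo_toolkit[all]"

    # As a glob pattern, "[all]" matches exactly one character: "a" or "l".
    print(fnmatch.fnmatchcase("nemo_toolkita", requirement))  # True
    # The literal requirement string does not match its own pattern.
    print(fnmatch.fnmatchcase(requirement, requirement))  # False

    # Quoting keeps the brackets literal when the command runs in a POSIX shell.
    command = "pip install " + shlex.quote(requirement)
    print(command)  # pip install 'nemo_toolkit[all]'

Shells that abort on an unmatched pattern (zsh does so by default) are the usual reason the unquoted form fails.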
|
|
|
Pip from source |
|
~~~~~~~~~~~~~~~ |
|
Use this installation mode if you want the version from a particular GitHub branch (e.g., main). Replace ``{BRANCH}`` in the command with the branch name.
|
|
|
.. code-block:: bash

    apt-get update && apt-get install -y libsndfile1 ffmpeg
    pip install Cython
    python -m pip install git+https://github.com/NVIDIA/NeMo.git@{BRANCH}
|
|
|
|
|
From source |
|
~~~~~~~~~~~ |
|
Use this installation mode if you are contributing to NeMo. |
|
|
|
.. code-block:: bash

    apt-get update && apt-get install -y libsndfile1 ffmpeg
    git clone https://github.com/NVIDIA/NeMo
    cd NeMo
    ./reinstall.sh
|
|
|
If you only want the toolkit without additional conda-based dependencies, you may replace ``reinstall.sh`` |
|
with ``pip install -e .`` when your PWD is the root of the NeMo repository. |
|
|
|
RNNT |
|
~~~~ |
|
Note that RNNT requires numba to be installed from conda. |
|
|
|
.. code-block:: bash

    conda remove numba
    pip uninstall numba
    conda install -c conda-forge numba
|
|
|
NeMo Megatron |
|
~~~~~~~~~~~~~ |
|
NeMo Megatron training requires NVIDIA Apex to be installed. |
|
Install it manually if not using the NVIDIA PyTorch container. |
|
|
|
.. code-block:: bash

    git clone https://github.com/NVIDIA/apex.git
    cd apex
    git checkout 03c9d80ed54c0eaa5b581bf42ceca3162f085327
    pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--fast_layer_norm" --global-option="--distributed_adam" --global-option="--deprecated_fused_adam" ./
|
|
|
It is highly recommended to use the NVIDIA PyTorch or NeMo container if you have issues installing Apex or any other dependencies.

While installing Apex, it may raise an error if the CUDA version on your system does not match the CUDA version torch was compiled with.
This error can be avoided by commenting out the check in Apex's ``setup.py``: https://github.com/NVIDIA/apex/blob/master/setup.py
|
|
|
cuda-nvprof is needed to install Apex. The version should match the CUDA version that you are using: |
|
|
|
.. code-block:: bash

    conda install -c nvidia cuda-nvprof=11.8
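
To see whether the system CUDA toolkit matches the CUDA version PyTorch was built with, the two can be compared before building Apex. This is a hedged sketch rather than the check Apex itself performs: the helpers ``parse_nvcc_release`` and ``cuda_versions_match`` are hypothetical, and the parsing assumes that ``nvcc --version`` prints a line containing ``release <major>.<minor>``.

.. code-block:: python

    import re
    import subprocess
    from typing import Optional, Tuple


    def parse_nvcc_release(nvcc_output: str) -> Optional[Tuple[int, int]]:
        """Extract (major, minor) from the text printed by ``nvcc --version``."""
        match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
        if match is None:
            return None
        return int(match.group(1)), int(match.group(2))


    def cuda_versions_match(torch_cuda: Optional[str], nvcc_output: str) -> bool:
        """Compare torch.version.cuda (e.g. "11.8") with the nvcc release."""
        if torch_cuda is None:  # CPU-only PyTorch builds report None
            return False
        major, minor = torch_cuda.split(".")[:2]
        return parse_nvcc_release(nvcc_output) == (int(major), int(minor))


    if __name__ == "__main__":
        try:
            import torch

            nvcc_output = subprocess.run(
                ["nvcc", "--version"], capture_output=True, text=True, check=True
            ).stdout
        except (ImportError, OSError, subprocess.CalledProcessError) as error:
            print("Could not query versions:", error)
        else:
            print("CUDA versions match:", cuda_versions_match(torch.version.cuda, nvcc_output))

The major and minor numbers reported by PyTorch are also the ones to use when picking the ``cuda-nvprof`` version.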
|
|
|
packaging is also needed: |
|
|
|
.. code-block:: bash

    # pip install has no -y flag (it belongs to pip uninstall), so it is omitted here
    pip install packaging
|
|
|
|
|
Transformer Engine |
|
~~~~~~~~~~~~~~~~~~ |
|
NeMo Megatron GPT has been integrated with `NVIDIA Transformer Engine <https://github.com/NVIDIA/TransformerEngine>`_.
Transformer Engine enables FP8 training on NVIDIA Hopper GPUs.
`Install <https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/installation.html>`_ it manually if not using the NVIDIA PyTorch container.
|
|
|
.. code-block:: bash

    pip install --upgrade git+https://github.com/NVIDIA/TransformerEngine.git@stable
|
|
|
It is highly recommended to use the NVIDIA PyTorch or NeMo container if you have issues installing Transformer Engine or any other dependencies.
|
|
|
Transformer Engine requires PyTorch to be built with CUDA 11.8. |
|
|
|
NeMo Text Processing |
|
~~~~~~~~~~~~~~~~~~~~ |
|
NeMo Text Processing, specifically (Inverse) Text Normalization, is now a separate repository `https://github.com/NVIDIA/NeMo-text-processing <https://github.com/NVIDIA/NeMo-text-processing>`_. |
|
|
|
Docker containers
~~~~~~~~~~~~~~~~~
|
We release NeMo containers alongside NeMo releases. For example, NeMo ``r1.16.0`` comes with container ``nemo:23.01``. You can find more details about released containers on the `releases page <https://github.com/NVIDIA/NeMo/releases>`_.

To use a pre-built container, run:
|
|
|
.. code-block:: bash

    docker pull nvcr.io/nvidia/nemo:23.01
|
|
|
To build a NeMo container from a branch using its Dockerfile, run:
|
|
|
.. code-block:: bash

    DOCKER_BUILDKIT=1 docker build -f Dockerfile -t nemo:latest .
|
|
|
|
|
If you choose to work with the main branch, we recommend using NVIDIA's PyTorch container version 23.02-py3 and then installing from GitHub.
|
|
|
.. code-block:: bash

    docker run --gpus all -it --rm -v <nemo_github_folder>:/NeMo --shm-size=8g \
        -p 8888:8888 -p 6006:6006 --ulimit memlock=-1 --ulimit stack=67108864 \
        --device=/dev/snd nvcr.io/nvidia/pytorch:23.02-py3
|
|
|
Examples |
|
-------- |
|
|
|
Many examples can be found under the `"Examples" <https://github.com/NVIDIA/NeMo/tree/stable/examples>`_ folder. |
|
|
|
|
|
Contributing |
|
------------ |
|
|
|
We welcome community contributions! Please refer to `CONTRIBUTING.md <https://github.com/NVIDIA/NeMo/blob/stable/CONTRIBUTING.md>`_ for the process.
|
|
|
Publications |
|
------------ |
|
|
|
We provide an ever-growing list of publications that utilize the NeMo framework. Please refer to `PUBLICATIONS.md <https://github.com/NVIDIA/NeMo/tree/stable/PUBLICATIONS.md>`_. We welcome the addition of your own articles to this list!
|
|
|
License |
|
------- |
|
NeMo is released under the `Apache 2.0 license <https://github.com/NVIDIA/NeMo/blob/stable/LICENSE>`_.
|
|