NeMo Framework


NVIDIA NeMo is an advanced end-to-end platform for building and customizing generative AI models. It offers a robust set of tools for tasks such as model training, fine-tuning, retrieval-augmented generation, guardrailing, data curation, and more. With NeMo, you can seamlessly scale AI models from research to production.

For more detailed information about NeMo, refer to the official user guide.

For basic usage of the integrated notebook environment, check the JupyterLab application.

Initialization

To learn how to use the Initialization parameter, please refer to the following documentation sections:

Batch Mode

This option allows you to submit a Bash script (*.sh) that will be executed once the job starts. The job will stop after the script execution completes. This is useful for running predefined tasks without manual intervention.
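A batch script for this mode is an ordinary Bash script. The sketch below is a hypothetical example (the workload line is a placeholder; replace it with your actual task, such as a NeMo training run):

```shell
#!/bin/bash
# Hypothetical batch script (e.g. run_job.sh) submitted via Batch Mode.
set -euo pipefail

echo "Job started on $(hostname) at $(date)"

# Log GPU status if a GPU is attached (ignored otherwise).
nvidia-smi || true

# Placeholder workload: replace with the real task, for example:
# python /work/scripts/pretrain.py --config /work/configs/gpt.yaml
sleep 1

echo "Job finished"
```

The job terminates as soon as the script exits, so long-running workloads should be launched in the foreground rather than backgrounded.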

Enabling SSH Access

The app supports optional SSH access from an external client. To enable SSH:

  1. Upload an SSH public key through the panel located in the Resources section of the UCloud navigation menu.

  2. Select Enable SSH server. A random port will be assigned for connection.

  3. The connection details will appear on the job progress view page.
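From your local machine, the connection is then a standard SSH command. The hostname and port below are placeholders; use the exact values shown on the job's progress view page:

```shell
# Hypothetical example: replace the port with the one assigned to your job.
# The private key must match the public key uploaded to UCloud.
ssh ucloud@ssh.cloud.sdu.dk -p 2465
```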

Training LLMs

NeMo is equipped to train Large Language Models (LLMs) of all sizes, from a few billion parameters to hundreds of billions or even trillions. Training models of varying sizes presents unique challenges, which NeMo addresses with:

  • Optimized and scalable data loaders.

  • Model parallelism techniques.

  • Memory optimizations.

  • Comprehensive training recipes.

These features ensure efficient and effective training across different model scales. NeMo also integrates seamlessly with several community LLMs, providing tools from training to deployment.

To get the full list of supported models, run the following command in the app's integrated terminal:

$ ngc registry model list nvidia/nemo/*

To download a specific model using the NGC command line interface, use:

$ ngc registry model download-version "nvidia/nemo/llama-3_1-8b-instruct-nemo:1.0"

For more information on all supported models, visit this page.

SFT

NeMo provides tools for Full-Parameter Fine-Tuning, also referred to as Supervised Fine-Tuning (SFT). In SFT, all of the model's parameters are updated so that its outputs adapt to the target task.
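The defining property of SFT is that every parameter receives gradients and is updated by the optimizer. A minimal sketch of this idea in plain PyTorch (a toy model and toy data, not the NeMo API):

```python
# Full-parameter (supervised) fine-tuning sketch: all weights are trainable.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

# In SFT, no parameters are frozen: the optimizer sees all of them.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(4, 8)            # toy "inputs"
y = torch.tensor([0, 1, 0, 1])   # toy supervised labels

for _ in range(3):               # a few supervised training steps
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()              # gradients flow to every parameter
    optimizer.step()
```

At LLM scale this is the most expensive option in memory and compute, which is what motivates the parameter-efficient alternatives below.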

For a practical example, refer to this tutorial.

PEFT

Parameter-Efficient Fine-Tuning (PEFT), by contrast, tunes a much smaller number of parameters, which are inserted into the base model at strategic locations. When fine-tuning with PEFT, the base model weights remain frozen, and only the adapter modules are trained. As a result, the number of trainable parameters is significantly reduced (<< 1%).
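As an illustration of the idea, the sketch below implements a LoRA-style adapter (one common PEFT method) in plain PyTorch: the base linear weight is frozen and only two small low-rank matrices are trainable. Layer sizes are toy values chosen for the demonstration; this is not NeMo code.

```python
# LoRA-style PEFT sketch: frozen base weights plus a small trainable adapter.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_f, out_f, rank=8):
        super().__init__()
        self.base = nn.Linear(in_f, out_f)
        self.base.weight.requires_grad_(False)   # base weights stay frozen
        self.base.bias.requires_grad_(False)
        # Only A and B are trained; B starts at zero so the adapter
        # initially leaves the base model's behavior unchanged.
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T

layer = LoRALinear(4096, 4096, rank=8)
total = sum(p.numel() for p in layer.parameters())
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable fraction: {trainable / total:.2%}")
```

Passing only the trainable parameters to the optimizer (e.g. `filter(lambda p: p.requires_grad, layer.parameters())`) keeps optimizer state proportionally small as well.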

NeMo supports five PEFT tuning methods:

For relevant tutorials, check the following links: