Distributed and Stable LLM Training on a Large-Scale Cluster
Author: C-DAC
Uploaded: 2025-09-03
Views: 257
Third session in the webinar series jointly organized by @NVIDIA and @CDACOfficial Pune, focused on training large language models (LLMs) from scratch.
In this session, we explored parallelism techniques (data, tensor, and pipeline), how they work together to scale large models, and the role of mixed-precision training in improving efficiency. The discussion highlighted best practices and demonstrated how frameworks like NeMo and Megatron-LM support reliable large-scale training.
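As a minimal, illustrative sketch only (not the session's actual code), the snippet below combines data parallelism via PyTorch's DistributedDataParallel with mixed-precision training via torch.cuda.amp; NeMo and Megatron-LM build on these primitives and add tensor and pipeline parallelism on top. The function and variable names are assumptions made for the example.

```python
# Minimal illustrative sketch: data parallelism + mixed precision in plain PyTorch.
# NeMo and Megatron-LM wrap these primitives and add tensor/pipeline parallelism.
import torch
import torch.distributed as dist
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP

def train(model, loader, optimizer, device):
    dist.init_process_group("nccl")            # one process per GPU, launched e.g. via torchrun
    model = DDP(model.to(device))              # data parallelism: replicate model, all-reduce gradients
    scaler = torch.cuda.amp.GradScaler()       # loss scaling guards against fp16 gradient underflow

    for tokens, labels in loader:
        optimizer.zero_grad(set_to_none=True)
        with torch.cuda.amp.autocast():        # run the forward pass and loss in reduced precision where safe
            loss = F.cross_entropy(model(tokens.to(device)), labels.to(device))
        scaler.scale(loss).backward()          # backward on the scaled loss
        scaler.step(optimizer)                 # unscales gradients, then takes the optimizer step
        scaler.update()                        # adapt the loss scale for the next iteration
```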
For any queries, please contact: [email protected]
#NPSF #GPU #CDACPune #HPCAI #AI #PARAMSiddhiAI #LLM