NVIDIA Brings Large Language AI Models to Enterprises Worldwide

NVIDIA NeMo Megatron Framework, Megatron 530B Customizable Large Language Model, and Multi-GPU, Multinode Triton Inference Server Empower Development and Deployment of Language-Based AI to Advance Industries and Science

NVIDIA opened the door for enterprises worldwide to develop and deploy large language models (LLMs), enabling them to build their own domain-specific chatbots, personal assistants and other AI applications that understand language with unprecedented levels of subtlety and nuance.

The company unveiled the NVIDIA NeMo Megatron framework for training language models with trillions of parameters, the Megatron 530B customizable LLM that can be trained for new domains and languages, and NVIDIA Triton Inference Server with multi-GPU, multinode distributed inference functionality.

Combined with NVIDIA DGX systems, these tools provide a production-ready, enterprise-grade solution that simplifies the development and deployment of large language models.

“Large language models have proven to be flexible and capable, able to answer deep domain questions, translate languages, comprehend and summarize documents, write stories and compute programs, all without specialized training or supervision,” said Bryan Catanzaro, vice president of Applied Deep Learning Research at NVIDIA. “Building large language models for new languages and domains is likely the largest supercomputing application yet, and now these capabilities are within reach for the world’s enterprises.”
NVIDIA NeMo Megatron and Megatron 530B Speed LLM Development

The NeMo Megatron framework enables enterprises to overcome the challenges of training sophisticated natural language processing models. It is optimized to scale out across the large-scale accelerated computing infrastructure of NVIDIA DGX SuperPOD.

NeMo Megatron automates the complexity of LLM training with data processing libraries that ingest, curate, organize and clean data. Using advanced technologies for data, tensor and pipeline parallelism, it enables the training of large language models to be distributed efficiently across thousands of GPUs. Enterprises can use the NeMo Megatron framework to train LLMs for their specific domains and languages.
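The announcement does not include code, but the combination of data, tensor and pipeline parallelism it describes can be illustrated with a small partitioning sketch. The function below is a hypothetical illustration, not part of NeMo Megatron's actual API, and the example model shape (24 layers, 16 attention heads) is an assumption chosen for readability.

```python
# Illustrative sketch (not NeMo Megatron's API): factoring a fixed pool of
# GPUs into data-, tensor- and pipeline-parallel groups, and showing which
# slice of a transformer each GPU would hold under that factorization.

def partition_model(total_gpus, tensor_parallel, pipeline_parallel,
                    num_layers, num_heads):
    """Return (data_parallel_degree, per-GPU shard descriptions)."""
    assert total_gpus % (tensor_parallel * pipeline_parallel) == 0
    data_parallel = total_gpus // (tensor_parallel * pipeline_parallel)
    assert num_layers % pipeline_parallel == 0
    assert num_heads % tensor_parallel == 0

    layers_per_stage = num_layers // pipeline_parallel  # pipeline split
    heads_per_rank = num_heads // tensor_parallel       # tensor split

    shards = []
    for gpu in range(total_gpus):
        tp_rank = gpu % tensor_parallel
        pp_rank = (gpu // tensor_parallel) % pipeline_parallel
        dp_rank = gpu // (tensor_parallel * pipeline_parallel)
        shards.append({
            "gpu": gpu,
            "data_parallel_rank": dp_rank,   # which model replica
            "pipeline_stage": pp_rank,       # which contiguous block of layers
            "layers": list(range(pp_rank * layers_per_stage,
                                 (pp_rank + 1) * layers_per_stage)),
            "tensor_parallel_rank": tp_rank, # which slice of each layer
            "heads": list(range(tp_rank * heads_per_rank,
                                (tp_rank + 1) * heads_per_rank)),
        })
    return data_parallel, shards

# Example: 16 GPUs with 2-way tensor and 4-way pipeline parallelism leaves
# 2-way data parallelism, i.e. two full replicas of the model, each spread
# across 8 GPUs.
dp, shards = partition_model(total_gpus=16, tensor_parallel=2,
                             pipeline_parallel=4, num_layers=24, num_heads=16)
print(dp)                    # → 2
print(shards[3]["layers"])   # → [6, 7, 8, 9, 10, 11]
```

The design point being illustrated is that the three degrees multiply: scaling to thousands of GPUs means choosing a factorization where tensor parallelism stays within a node's fast interconnect, pipeline stages span nodes, and data parallelism supplies the remaining factor.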
NVIDIA Triton Inference Server Powers Real-Time LLM Inference

With Triton Inference Server, Megatron 530B can run on two NVIDIA DGX systems, shortening processing time from over a minute on a CPU server to half a second and making it possible to deploy LLMs for real-time applications.
Massive Custom Language Models Developed Worldwide

SiDi, one of Brazil’s largest AI research and development institutes, has adapted the Samsung virtual assistant for use by the nation’s 200 million Brazilian Portuguese speakers.

“The SiDi team has extensive experience developing AI virtual assistants and chatbots, which require both powerful AI performance and specialized software that is trained and adapted to the changing nuances of human language,” said John Yi, CEO of SiDi. “NVIDIA DGX SuperPOD is ideal for powering the advanced work of our team to help us bring world-leading AI services to Portuguese speakers in Brazil.”

JD Explore Academy, the research and development division of JD.com, a leading supply chain-based technology and service provider, is using NVIDIA DGX SuperPOD to develop NLP for applications in smart customer service, smart retail, smart logistics, IoT, healthcare and more.

VinBrain, a Vietnam-based healthcare AI company, has used a DGX SuperPOD to develop and deploy a clinical language model for radiologists and telehealth in 100 hospitals, where it is used by over 600 healthcare practitioners.

Source: NVIDIA media announcement