New NVIDIA microservices boost sovereign AI

To reflect local values and regulations, nations are increasingly pursuing sovereign AI strategies, developing AI with their own infrastructure, data, and expertise. NVIDIA supports this trend with the launch of four new NVIDIA NIM microservices.

These microservices simplify building and deploying generative AI applications and support regionally tailored community models. By improving understanding of local languages and cultural nuances, they deliver more accurate and relevant responses and deeper user engagement.

This development comes as the Asia-Pacific generative AI software market is expected to boom: ABI Research predicts revenue will surge from $5 billion this year to $48 billion by 2030.

Among the new offerings are two regional language models: Llama-3-Swallow-70B, trained on Japanese data, and Llama-3-Taiwan-70B, optimized for Mandarin. These models are designed to better understand local laws, regulations, and cultural nuances.

The RakutenAI 7B model family further strengthens the Japanese language offering. Based on Mistral-7B and trained on English and Japanese datasets, these models are available as two separate NIM microservices for Chat and Instruct functions. Notably, Rakuten’s models achieved the highest average score among open Japanese large language models in the LM Evaluation Harness benchmark from January to March 2024.

Training LLMs on regional languages is essential for improving the quality of their outputs. By accurately reflecting cultural and linguistic subtleties, these models enable more precise and nuanced communication. Compared with base models like Llama 3, the regional variants show superior performance in understanding Japanese and Mandarin, handling regional legal tasks, and translating and summarizing text.

The global push for sovereign AI infrastructure is reflected in significant investments from countries such as Singapore, UAE, South Korea, Sweden, France, Italy, and India.

“LLMs are not mechanical tools that provide uniform benefits; rather, they interact with human culture and creativity. The impact is reciprocal: models are shaped by the data they are trained on, and our culture and data are influenced by LLMs,” said Rio Yokota, professor at the Global Scientific Information and Computing Center at the Tokyo Institute of Technology.

“Therefore, developing sovereign AI models that adhere to our cultural norms is crucial. The availability of Llama-3-Swallow as an NVIDIA NIM microservice will allow developers to easily access and deploy the model for Japanese applications across various industries.”

NVIDIA’s NIM microservices enable businesses, government bodies, and universities to host native LLMs within their own environments. Developers can create advanced copilots, chatbots, and AI assistants with these services, which are optimized for inference using the open-source NVIDIA TensorRT-LLM library, promising improved performance and deployment speed.
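NIM microservices expose an OpenAI-compatible REST API, so a self-hosted model can be queried with plain HTTP. The sketch below, using only the Python standard library, illustrates the pattern; the base URL and model identifier are assumptions for a hypothetical local deployment, not values from this article.

```python
# Minimal sketch of querying a self-hosted NIM microservice through its
# OpenAI-compatible chat-completions endpoint. NIM_BASE_URL and MODEL_NAME
# are placeholders -- substitute the values for your own deployment.
import json
import urllib.request

NIM_BASE_URL = "http://localhost:8000/v1"   # assumed local endpoint
MODEL_NAME = "llama-3-swallow-70b"          # assumed model identifier

def build_chat_request(prompt: str, model: str = MODEL_NAME) -> dict:
    """Build an OpenAI-style chat-completions request payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.2,
    }

def chat(prompt: str) -> str:
    """POST the prompt to the microservice and return the reply text."""
    req = urllib.request.Request(
        f"{NIM_BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Standard OpenAI-style response shape: choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

Because the API follows the OpenAI chat-completions convention, existing client libraries and copilot frameworks can usually be pointed at the local endpoint with no code changes beyond the base URL.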
