Bringing AI-RAN to a Telco Near You

Inferencing for generative AI and AI agents will drive the need for AI compute infrastructure to be distributed from edge to central clouds. IDC predicts that “Business AI (consumer excluded) will contribute $19.9 trillion to the global economy and account for 3.5% of GDP by 2030.”

5G networks must also evolve to serve this new incoming AI traffic. At the same time, there is an opportunity for telcos to become the local AI compute infrastructure for hosting enterprise AI workloads, independent of network connectivity while meeting their data privacy and sovereignty requirements. This is where an accelerated computing infrastructure shines – with the ability to accelerate both Radio signal processing and AI workloads. And most importantly, the same compute infrastructure can be used to process AI and radio access network (RAN) services. This combination has been called AI-RAN by the telecoms industry.

NVIDIA is introducing Aerial RAN Computer-1, the world’s first AI-RAN deployment platform, that can serve AI and RAN workloads concurrently, on a common accelerated infrastructure.

Following the launch of the AI-RAN Innovation Center by T-Mobile, the Aerial RAN Computer-1 turns AI-RAN into reality with a deployable platform that telcos can adopt globally. It can be used in small, medium, or large configurations for deployment at cell sites, distributed or centralized sites, effectively turning the network into a multi-purpose infrastructure that serves voice, video, data, and AI traffic.

This is a transformative solution that reimagines wireless networks for AI, with AI. It is also a huge opportunity for telcos to fuel the AI flywheel, leveraging their distributed network infrastructure, low latency, guaranteed quality of service, massive scale, and ability to preserve data privacy, security, and localization – all key requirements for AI inferencing and agentic AI applications.

GPU Compute	40 PFLOPS FP4 \| 20 PFLOPS FP8/FP6 10x GH200
GPU Memory	Up to 384 GB
CPU	144 Core ARMv9, 960 GB LPDDR5, 1.4x perf & 30% lower power than 2x SPR
CPU to GPU NVLink C2C	Per GPU 900 GB/s bi-dir. & cache-coherent
GPU to GPU NVLink	1,800 GB/s bi-dir., NVLink
Scale-Out	Spectrum-X Ethernet or InfiniBand Connect-X or BlueField
OS	Single OS with unified address space covering 2 CPU + 2 GPU
System Power	Full System ~3,500W, configurable
Schedule	Sample: Q4 2024 MP: Q1 2025

Bringing AI-RAN to a Telco Near You

AI-RAN, AI Aerial, and Aerial RAN Computer-1

Benefits of Aerial RAN Computer-1

Building blocks of Aerial RAN Computer-1

NVIDIA GB200 NVL2

NVIDIA Blackwell GPU

NVIDIA Grace CPU

NVLink2 C2C

Fifth-generation NVIDIA NVLink

Key-value caching

MGX reference architecture

Real-time mainstream LLM inference

Supporting hardware for Aerial RAN Computer-1

NVIDIA BlueField-3

NVIDIA networking Spectrum-X

Software stacks on Aerial RAN Computer-1

NVIDIA Aerial CUDA-Accelerated RAN

NVIDIA AI Enterprise and NVIDIA NIM

NVIDIA Cloud Functions

Deployment options and performance

Conclusion

10 Best Project Portfolio Management (PPM) Software for 2024

Cybersecurity Awareness Month: A Wake-Up Call for Modern Threats in the Age of AI

OneSpan’s New VISION FX Solution Ups the Industry’s Bar for Phishing-Resistant Banking Security

DataStax Announces New AI Development Platform, Built with NVIDIA AI

Enterprise IT Facing Imminent AI Agent Revolution

Related articles

Microsoft Cuts Off Azure OpenAI Access for Chinese Developers

Meta Cracks Down on Sextortion on Instagram: No More Screenshots in DMs

Supermicro Launches NVIDIA BlueField-Powered JBOF to Optimize AI Storage

Microchip Demonstrates Flashtec 5016 Enterprise SSD Controller