March 2, 2026
DOCOMO and NTT Successfully Demonstrate Low-Latency AI Video Analytics Using
In-Network Computing with Remote GPU Resources
— Advancing Toward the Realization of Networks That Enable AI and Robots to Fully Unlock Their Potential in the 6G Era —
Key Highlights:
- Implemented In-Network Computing Edge (INC Edge)*1 by connecting distributed GPU resources and the 5G network via IOWN All-Photonics Network (IOWN APN),*2 establishing technology to control AI inference processing*3 directly from the network alongside communications.
- Demonstrated that In-Network Computing with remote GPU resources can meet latency requirements assumed for remote robot control in the 6G era.*4
TOKYO, JAPAN, March 2, 2026 --- NTT DOCOMO, INC. and NTT, Inc. have successfully demonstrated low-latency AI video analysis using In-Network Computing (INC)*5 with INC Edge, which connects remotely distributed GPU resources and 5G networks via IOWN APN.
In this demonstration, AI inference was controlled directly from the network through INC Edge implemented on the 5G core network. This setup enabled video data sent from devices to be analyzed with minimal delay using remotely connected GPU resources via IOWN APN.
The results will be exhibited at MWC Barcelona 2026. DOCOMO and NTT will continue testing and standardizing INC technology to support widespread use of simplified devices, aiming to realize a network that enables AI and robots to fully unlock their potential in the 6G era.
Background
In the 6G era, new services leveraging immersive XR, AI, and robotics are expected to emerge. These services often require high-volume, low-latency data transfer and large-scale data processing. For example, an autonomous robot may need to capture surrounding video and sensor data, analyze obstacles with AI, and provide real-time feedback for navigation. For applications running AI inference on small robots or lightweight wearable devices, maintaining a seamless user experience requires not only device-side processing but also the ability to process large volumes of data in real time outside the device. 6G networks are thus expected to handle both communication and service data processing to ensure service quality.
Traditionally, distributed AI inference has been controlled by applications or servers, with networks mainly serving as data transport. This made service quality highly dependent on the location of GPU resources and network latency, often requiring nearby computational resources to minimize delays, limiting flexible resource usage.
To address these challenges, DOCOMO and NTT have been developing INC as a key technology for 6G networks. INC distributes computing resources, including GPUs, across the network and controls both communication and service computation centrally, enabling high-quality delivery of AI and other advanced services.
Demonstration Overview
In this demonstration, distributed remote GPU resources were connected to the 5G network via INC Edge over the IOWN APN to test AI inference processing. Video data sent from devices was integrally controlled with communication processing through INC Edge, then transmitted to remote GPU resources via IOWN APN, and processed by AI video analysis applications.
Normally, distributed AI inference assumes GPUs are located nearby, as communication delays between distant resources can significantly slow overall processing. This demonstration tested whether INC Edge, controlling AI inference in-network alongside communication, could maintain high inference performance even using geographically distant GPU resources.
For this demonstration, the INC Edge was newly implemented with two key network functions: a feature to connect the IOWN APN with the mobile network, and a mechanism to split AI inference into pre-processing and execution stages. Pre-processed data was transmitted and distributed to remote GPUs with low latency using network functions implemented on INC Edge. Priority control in DOCOMO's commercial 5G core network on AWS ensured high-bandwidth, low-latency transmission. Combining these capabilities with INC Edge enabled fast, in-network AI video analysis. In this demonstration, the combined end-to-end latency of communication and AI video analysis was confirmed to be within the latency requirements assumed for autonomous robot operation in close proximity to humans. These results indicate that sufficiently low latency can be achieved to enable remote robot control in the 6G era.
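The two-stage split described above — pre-processing at the network edge, execution on a remote GPU — can be illustrated with a minimal sketch. This is not DOCOMO/NTT code; all function and class names below are hypothetical, and a trivial threshold stands in for a real obstacle-detection model:

```python
# Illustrative sketch (hypothetical names): splitting AI video inference into
# a pre-processing stage at the network edge and an execution stage on a
# remote GPU, mirroring the role of INC Edge's AI Proxy described above.
from dataclasses import dataclass
from typing import List

@dataclass
class Frame:
    """A raw video frame from a device, as rows of pixel intensities (0-255)."""
    pixels: List[List[int]]

def edge_preprocess(frame: Frame, size: int = 2) -> List[float]:
    """Stage 1 (edge): downsample and normalize the frame so that only a
    compact feature tensor, not raw video, crosses the network to the GPU."""
    flat = [p for row in frame.pixels for p in row]
    step = max(1, len(flat) // (size * size))
    sampled = flat[::step][: size * size]
    return [p / 255.0 for p in sampled]  # normalized features

def remote_gpu_infer(features: List[float]) -> str:
    """Stage 2 (remote GPU): run the model on the pre-processed tensor.
    A simple brightness threshold stands in for a real detector."""
    score = sum(features) / len(features)
    return "obstacle" if score > 0.5 else "clear"

# End-to-end path: device frame -> edge pre-processing -> remote inference.
frame = Frame(pixels=[[200, 220], [210, 230]])
features = edge_preprocess(frame)
print(remote_gpu_infer(features))  # bright frame -> "obstacle"
```

The point of the split is that the edge stage shrinks the payload before it traverses the long-haul link, which is one way the latency cost of a geographically distant GPU can be kept within bounds.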
Figure 1: System Configuration of the Demonstration
Roles of Each Company
DOCOMO
- Planned and managed the overall demonstration.
- Provided commercial 5G SA environment and expertise, including core network and radio base station equipment.
- Designed and deployed the IOWN APN for the demonstration.
- Developed implementation methods and network configuration.
NTT
- Provided the INC platform.
- Supplied INC Edge, connecting and integrating the 5G core network with INC via IOWN APN to enable distributed inference.
- Developed implementation methods and network configuration.
Outlook
The results of this demonstration are expected to be applicable to data transmission and processing for AI and robotics in the 6G era. Going forward, DOCOMO and NTT will continue advancing INC as a core 6G network technology by promoting further research, validation, and international standardization. Through comprehensive integration of communication and data processing, they aim to support the widespread adoption of simplified devices and realize a network that enables AI and robots to fully unlock their potential in the 6G era.
Notes
*1 An edge function implemented on the 5G core network that connects the 5G network with IOWN APN and enables network-controlled AI inference integrated with communication. INC Edge consists of a DPU-offloaded distributed UPF (dUPF), which connects and integrates the 5G network, IOWN APN, and INC, and an AI Proxy, which divides AI inference into pre-processing and execution stages and transfers pre-processed data to remote GPU resources via IOWN APN to enable distributed inference.
*2 An optical network infrastructure based on the IOWN concept, offering ultra-low latency, high bandwidth, and low power consumption. In this demonstration, it was used to connect distributed remote GPU resources to the 5G network with low latency and high stability.
*3 The process in which AI applies learned knowledge to new data, such as images or audio, to generate predictions or decisions.
*4 As of March 2, 2026. Based on research conducted by DOCOMO and NTT. The demonstration was successfully completed within the required latency thresholds specified under certain conditions, such as the distance between a robot and a human and the human movement speed, as defined in ISO/TS 15066, which outlines safety requirements for collaborative robots.
*5 A technology concept that integrates application-layer processing with network data transfer control to deliver high-performance services while reducing latency and device power consumption. The network actively manages where and how processing is executed, offloading workloads to accelerators and computing resources within the network to reduce device burden.