Real-Time Surgical Guidance by Fusing Multi-Modal Imaging with NVIDIA Holoscan

Developers in the fields of image-guided surgery and surgical vision face unique challenges in creating systems and applications that can significantly improve surgical workflows. One such challenge is efficiently combining multi-modal imaging data, such as preoperative 3D patient images with intra-operative video. This is key to providing surgeons with real-time, accurate guidance during minimally invasive or robotic-assisted procedures. 

In this post, we walk through state-of-the-art AI and imaging techniques, highlighting ImFusion’s integration of NVIDIA Holoscan for real-time sensor processing, AI, and I/O. We explore how NVIDIA Holoscan enabled us to double the pipeline performance, and we explain how combining two imaging modalities can contribute to enhanced surgical accuracy, reduced complications, and better outcomes.

Challenges in combining multi-modal surgical data

During minimally invasive or robotic image-assisted surgery, accurate navigation and detailed understanding of the patient’s anatomy are crucial to the success of a surgical procedure. 

During preoperative planning, surgeons often rely on multi-modal imaging techniques, including 3D diagnostic imaging modalities, such as computed tomography (CT) scans, to identify abnormalities, designate target zones, and pinpoint critical structures such as blood vessels. 

However, combining these preoperative 3D image datasets with intra-operative video seamlessly during live surgical procedures remains a significant challenge, as surgeons often lack adequate access to this valuable preoperative data during surgery. 

The next generation of medical devices serving as procedural guidance in the operating room requires applications that use pre- and intra-operative multi-modal data. These applications must simultaneously execute multiple computationally intensive pipelines and perform the following functions:

  • Track anatomical structures in real-time: Accurately monitor the surface of targeted tissues and organs during the procedure.
  • Fuse preoperative and intra-operative data: Seamlessly blend 3D preoperative images with live surgical video feeds.
  • Provide low-latency visualization: Deliver fused, multi-modal information during surgery with sub-100 ms end-to-end latency for adequate hand-eye coordination and real-time decision-making. 

This requires a unique combination of AI, accelerated computing, and advanced visualization capabilities. 

NVIDIA Holoscan is a domain-specific computing platform that delivers the accelerated, full-stack infrastructure required for scalable, software-defined, real-time processing of streaming data at the clinical edge. It includes a library of reference applications that help developers jump-start building and optimizing their own AI applications for production deployment.

NVIDIA partners contribute to this library so that the Holoscan AI sensor processing community can share applications. This lets you reuse and contribute components and sample applications, fostering innovation and accelerating the development of advanced medical devices.

Real-time 3D surgical guidance fusing pre-op and live data 

To address the live data limitation, ImFusion, a Munich-based company and NVIDIA Connect Program member, used NVIDIA Holoscan to create a system that can integrate preoperative data with real-time intra-operative imaging.

The novel system tracks the surface of a targeted anatomical structure in the form of a 3D mesh—a digital model that accurately depicts the structure’s shape and contours—and blends it smoothly into the surgeon’s view. The mesh is extracted from a preoperative CT scan and then overlaid in near real-time onto the live video feed from a laparoscopic camera. 

This enables surgeons to visualize the blended intra-operative and preoperative patient information and make more informed real-time decisions.
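To make the overlay step concrete, here is a minimal sketch of how mesh vertices could be projected into a camera image with a standard pinhole model. This illustrates the general technique, not ImFusion's implementation: the function name `project_vertices`, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the sample values are all assumptions, and the vertices are assumed to already be expressed in the camera's coordinate frame (that is, after registration).

```python
from typing import List, Optional, Sequence, Tuple

Point3 = Tuple[float, float, float]  # camera-space point in metres (z forward)
Pixel = Tuple[float, float]          # image coordinates in pixels


def project_vertices(
    vertices: Sequence[Point3],
    fx: float, fy: float, cx: float, cy: float,
    width: int, height: int,
) -> List[Optional[Pixel]]:
    """Project camera-space mesh vertices to pixels with a pinhole model.

    Returns None for vertices behind the camera or outside the image.
    """
    pixels: List[Optional[Pixel]] = []
    for x, y, z in vertices:
        if z <= 0.0:  # behind the camera plane: cannot be drawn
            pixels.append(None)
            continue
        u = fx * x / z + cx
        v = fy * y / z + cy
        inside = 0.0 <= u < width and 0.0 <= v < height
        pixels.append((u, v) if inside else None)
    return pixels


if __name__ == "__main__":
    # Hypothetical intrinsics for a 1920x1080 laparoscopic camera.
    mesh = [(0.0, 0.0, 0.10), (0.01, 0.0, 0.10), (0.0, 0.0, -0.05)]
    print(project_vertices(mesh, 1000.0, 1000.0, 960.0, 540.0, 1920, 1080))
```

A real renderer would rasterize the mesh triangles on the GPU and account for lens distortion. The point here is only the geometric relation between the 3D mesh and the 2D video frame.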

Using NVIDIA Holoscan and NVIDIA IGX for real-time surgical data fusion

ImFusion’s solution is built on their proprietary ImFusion SDK, which bundles algorithms for image processing, registration, analysis, and visualization. Integrating NVIDIA Holoscan into the ImFusion SDK unlocked new levels of performance, efficiency, and flexibility. 

As Alexander Ladikos, head of Computer Vision at ImFusion, explained, “Integrating Holoscan into our ImFusion SDK has helped us achieve near real-time performance, crucial for surgical applications. It has accelerated our development process, saving us time and allowing us to reuse existing and custom operators for future projects.”

This integration enabled ImFusion to build and run low-latency, AI-enhanced, sensor-streaming applications, setting the stage for next-generation software as a medical device (SaMD) that enables surgeons to simultaneously view live and fused preoperative data. 

At the core of ImFusion’s system are three key neural networks, each using Holoscan acceleration capabilities:

  • Stereo Depth Estimation: This network generates depth information from endoscopic stereo video frames, using a state-of-the-art CNN-based model trained on synthetic data. Holoscan’s real-time processing capabilities enabled instant depth estimation from video streams, providing crucial spatial information for surgical guidance. 
  • Optical Flow Estimation: Calculating 2D pixel displacements between frames, this network ensures robust performance across various surgical scenarios. Holoscan enabled rapid 2D flow estimation for subsequent projection into 3D space, enhancing the system’s ability to track movement within the surgical field. 
  • Segmentation: Developed by the ORSI Academy, a leading global robotic surgery training institute based in Belgium, this deep learning segmentation model identifies surgical instruments and target tissue, which is crucial for accurate tracking and overlay. Holoscan enabled quick analysis of 3D flow estimates in segmented tissue regions, so the system could precisely identify and track specific anatomical structures and instruments in real time.
Figure 1. Three models estimate depth, optical flow, and segmentation of the target tissue
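The way depth and optical flow combine into 3D motion can be sketched with textbook stereo geometry. The example below is an illustration under stated assumptions, not the networks described above: it assumes a rectified stereo pair (so depth follows from disparity as focal length times baseline divided by disparity), a pinhole camera, and hypothetical helper names (`depth_from_disparity`, `backproject`, `lift_flow_to_3d`). A learned model may output depth directly instead of disparity.

```python
from typing import Tuple

Intrinsics = Tuple[float, float, float, float]  # fx, fy, cx, cy in pixels
Point3 = Tuple[float, float, float]


def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0.0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def backproject(u: float, v: float, depth_m: float, k: Intrinsics) -> Point3:
    """Lift pixel (u, v) with known depth to a camera-space 3D point."""
    fx, fy, cx, cy = k
    return ((u - cx) * depth_m / fx, (v - cy) * depth_m / fy, depth_m)


def lift_flow_to_3d(
    u: float, v: float, flow_uv: Tuple[float, float],
    depth_before_m: float, depth_after_m: float, k: Intrinsics,
) -> Point3:
    """3D displacement of the surface point seen at (u, v) in the first frame.

    flow_uv is the 2D pixel displacement to the next frame; the two depths
    are sampled at the start pixel and at the displaced pixel respectively.
    """
    du, dv = flow_uv
    p0 = backproject(u, v, depth_before_m, k)
    p1 = backproject(u + du, v + dv, depth_after_m, k)
    return (p1[0] - p0[0], p1[1] - p0[1], p1[2] - p0[2])


if __name__ == "__main__":
    k = (1000.0, 1000.0, 960.0, 540.0)  # hypothetical intrinsics
    z = depth_from_disparity(40.0, focal_px=1000.0, baseline_m=0.004)
    print(z, lift_flow_to_3d(960.0, 540.0, (10.0, 0.0), z, z, k))
```

Restricting this per-pixel computation to pixels labeled as target tissue by the segmentation model is what keeps instrument motion from being mistaken for tissue motion.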

Building upon these three neural networks, ImFusion’s system achieves impressive real-time performance. 

A surface mesh from a preoperative CT is manually registered with the underlying anatomical structure and then tracked automatically throughout the procedure. Using an NVIDIA IGX Developer Kit equipped with an NVIDIA RTX 6000 Ada GPU, the system achieves a median frame rate of ~13.5 Hz and an end-to-end latency below 75 ms.

Although latency is still being optimized, this already represents a significant performance improvement: Holoscan flow benchmarking shows a 50% reduction in end-to-end latency compared with the earlier setup, which ran on previous hardware configurations and did not yet use NVIDIA TensorRT to optimize AI model inference.
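For readers who want to compute comparable figures for their own pipeline, the following is a minimal standalone sketch of the two metrics quoted here: a median frame rate from frame timestamps and a check against a latency budget. It is not the Holoscan flow benchmarking tool. The function names and sample numbers are hypothetical, and the 100 ms default simply mirrors the sub-100 ms target stated earlier. For reference, a 13.5 Hz frame rate corresponds to a frame period of roughly 74 ms.

```python
from statistics import median
from typing import Sequence


def median_frame_rate_hz(frame_times_s: Sequence[float]) -> float:
    """Median frame rate from increasing frame timestamps (seconds).

    Using the median of the inter-frame intervals makes the figure
    robust to occasional stalls.
    """
    if len(frame_times_s) < 2:
        raise ValueError("need at least two timestamps")
    intervals = [b - a for a, b in zip(frame_times_s, frame_times_s[1:])]
    return 1.0 / median(intervals)


def within_latency_budget(latencies_ms: Sequence[float], budget_ms: float = 100.0) -> bool:
    """True if every measured end-to-end latency is below the budget."""
    return all(latency < budget_ms for latency in latencies_ms)


if __name__ == "__main__":
    # Hypothetical measurements: one long stall among regular frames.
    stamps = [0.0, 0.074, 0.148, 0.222, 0.5]
    print(round(median_frame_rate_hz(stamps), 2))
    print(within_latency_budget([70.0, 74.9, 60.0]))
```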

Screenshot of the Holoscan tissue tracking application running within the ImFusion application. The surface mesh is colored magenta, does not occlude any surgical instruments, and fits the underlying target tissue.
Figure 2. Surface mesh tracking in the endoscopic video feed
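One plausible way to get an overlay that never hides the instruments is to alpha-blend the mesh color only where the rendered mesh covers a pixel and the segmentation does not label that pixel as an instrument. The sketch below shows that idea on plain Python lists. The function `overlay_mesh`, the mask layout, and the blend factor are assumptions made for illustration, not the application's actual rendering code, which would run on the GPU.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
MAGENTA: RGB = (255, 0, 255)


def overlay_mesh(
    frame: List[List[RGB]],
    mesh_mask: List[List[bool]],
    instrument_mask: List[List[bool]],
    color: RGB = MAGENTA,
    alpha: float = 0.5,
) -> List[List[RGB]]:
    """Blend `color` into the frame where the mesh is visible.

    Pixels flagged in `instrument_mask` are left untouched, so the
    overlay never occludes a surgical instrument.
    """
    out: List[List[RGB]] = []
    for row, mesh_row, inst_row in zip(frame, mesh_mask, instrument_mask):
        out_row: List[RGB] = []
        for pixel, on_mesh, on_instrument in zip(row, mesh_row, inst_row):
            if on_mesh and not on_instrument:
                pixel = tuple(
                    round((1.0 - alpha) * p + alpha * c)
                    for p, c in zip(pixel, color)
                )
            out_row.append(pixel)
        out.append(out_row)
    return out


if __name__ == "__main__":
    frame = [[(101, 200, 101)] * 3]
    mesh = [[True, True, False]]          # mesh covers the first two pixels
    instruments = [[False, True, False]]  # an instrument sits on the second
    print(overlay_mesh(frame, mesh, instruments))
```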

This high level of performance is crucial for live surgical applications, as it enables surgeons to receive instantaneous visual feedback and provides an unprecedented view of the surgical scene. 

ORSI Academy, Europe’s largest robotic surgery training center, contributed to this advancement by partnering with both NVIDIA and ImFusion to guide the development and strengthen its clinical relevance. 

Dr. Pieter De Backer, engineer and surgical resident leading Orsi Innotech, the surgical AI department of ORSI Academy, says, “Seamlessly blending live video feeds with overlaid 3D mesh projections can enhance our ability as surgeons to navigate complex anatomical structures during minimally-invasive or robotic-assisted procedures. In challenging renal surgery cases, for example, live visualization of the endophytic tumor surface mesh can enhance tumor delineation, and minimize damage to healthy tissue.”

Integrating the Holoscan SDK for low-latency tasks and AI inferencing workloads accelerates the development of AI-enhanced SaMD. Its compatibility with domain-specific frameworks such as the ImFusion SDK creates a powerful development environment that shortens development time and reduces costs.

Ecosystem collaboration and open-source contributions

The collaboration between ImFusion and NVIDIA Holoscan is advancing what is possible in minimally invasive and robotic-assisted procedures, combining AI, accelerated computing, and domain specificity to enhance precision, performance, and safety. ImFusion’s contributions to Holoscan reference applications can be integrated and built upon by its medtech customers, and its multimodal data fusion application will be available soon.

We invite you to explore and contribute to the Holoscan reference application repository to expand the ecosystem, accelerate the development of AI-enhanced medical devices, and advance real-time sensor fusion for surgical guidance.
