Reconstruction of endoscopic scenes is crucial for various medical applications, from post-surgery analysis to educational training. Neural rendering has recently shown promise in reconstructing endoscopic scenes with deforming tissue. However, existing methods are limited by static endoscopes, restricted deformation, or dependence on external tracking devices for camera pose information.
In this paper, we present Flow-optimized Local Hexplanes (FLex), a novel approach addressing the challenges of a moving stereo endoscope in a highly dynamic environment. FLex implicitly separates the scene into multiple overlapping 4D neural radiance fields (NeRFs) and employs a progressive optimization scheme for joint reconstruction and camera pose estimation from scratch. Tested on sequences of up to 5,000 frames, five times the length handled in the experiments of previous methods, FLex is substantially more usable in practice: it scales highly detailed reconstruction to significantly longer surgical videos without requiring external tracking information.
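To make the idea of multiple overlapping local models concrete, the following is a minimal sketch of one way a long video could be covered by overlapping temporal windows, with each frame blended between the windows that contain it. It is not the FLex implementation: the abstract does not state how local models are allocated or combined, so the fixed window length, the fixed overlap, the purely time-based split, and the linear blending are all assumptions made for illustration, and the function names `overlapping_windows` and `blend_weights` are hypothetical.

```python
def overlapping_windows(num_frames, window, overlap):
    """Cover frame indices [0, num_frames) with windows sharing `overlap` frames.

    Assumption: fixed-size, time-based windows. A real system may instead
    allocate local models based on camera motion.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    stride = window - overlap
    windows = []
    start = 0
    while True:
        end = min(start + window, num_frames)  # half-open interval [start, end)
        windows.append((start, end))
        if end == num_frames:
            break
        start += stride
    return windows


def blend_weights(frame, windows):
    """Return one normalized weight per window for the given frame.

    Assumption: a window's weight grows linearly with the frame's distance
    to the nearest window border, so overlapping windows cross-fade.
    """
    raw = []
    for start, end in windows:
        if start <= frame < end:
            # +1 keeps the weight nonzero on the first and last frame of a window
            raw.append(min(frame - start, end - 1 - frame) + 1)
        else:
            raw.append(0)
    total = sum(raw)
    if total == 0:
        raise ValueError("frame is not covered by any window")
    return [w / total for w in raw]


if __name__ == "__main__":
    wins = overlapping_windows(num_frames=1000, window=400, overlap=100)
    print(wins)
    print(blend_weights(350, wins))
```

In this sketch, a frame inside an overlap region receives nonzero weight from two neighbouring windows, while a frame covered by a single window is assigned to it entirely.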
Qualitative results showing two images at two different timesteps from a 1,000-frame scene with breathing deformations and camera motion. The reference image is the ground truth, and the images framed in dark blue are zoomed-in sections of the images above them. The zoomed-in sections in particular show finer image quality for both FLex variants than for HexPlane and ForPlane, the two strongest baselines.
@article{stilz2024flex,
title={FLex: Joint Pose and Dynamic Radiance Fields Optimization for Stereo Endoscopic Videos},
author={Stilz, Florian Philipp and Karaoglu, Mert Asim and Tristram, Felix and Navab, Nassir and Busam, Benjamin and Ladikos, Alexander},
journal={arXiv preprint arXiv:2403.12198},
year={2024}
}