Stereotactic neurosurgeries require submillimeter precision because of the potential neurological risks involved. While robotic assistance provides the accuracy necessary for procedures such as deep brain stimulation or stereoelectroencephalography, current medical robots remain limited to straight, one-dimensional trajectories, restricting their ability to avoid critical brain regions. To overcome this limitation, Robeauté is developing a microrobot capable of autonomous three-dimensional neuronavigation. The microrobot is integrated within a dedicated localization workflow: its real-time three-dimensional position is obtained through an ultrasound tracking system, while pre-operative CT and MRI provide the cranial geometry and detailed neuroanatomy. Because non-linear MRI distortions are non-negligible at the microrobot scale, the MRI volume is non-rigidly registered to the CT, which serves as a distortion-free reference. Precise localization of the implant, based on the emission centers of its ultrasound emitters, embeds the tracking coordinate system into the CT frame with submillimetric uncertainty, preserving stereotactic safety margins. Furthermore, a voxel-wise registration error estimation (REE) method is required, targeting a relative error below 25%.
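As a rough illustration of this registration step, the sketch below performs a non-rigid B-spline registration of an MRI volume onto a distortion-free CT reference using SimpleITK; the file names, control-grid mesh size, and optimizer settings are illustrative assumptions, not the settings of the actual pipeline.

```python
import SimpleITK as sitk

# Hypothetical inputs: the CT is the fixed, distortion-free reference;
# the MRI is assumed already rigidly pre-aligned to it.
ct = sitk.ReadImage("ct.nii.gz", sitk.sitkFloat32)
mri = sitk.ReadImage("mri_rigid.nii.gz", sitk.sitkFloat32)

# Free-form deformation parameterized by a coarse B-spline control grid.
bspline = sitk.BSplineTransformInitializer(ct, transformDomainMeshSize=[8, 8, 8])

reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)  # multi-modal metric
reg.SetMetricSamplingStrategy(reg.RANDOM)
reg.SetMetricSamplingPercentage(0.1)
reg.SetInterpolator(sitk.sitkLinear)
reg.SetOptimizerAsLBFGSB(gradientConvergenceTolerance=1e-5, numberOfIterations=100)
reg.SetInitialTransform(bspline, inPlace=True)

tx = reg.Execute(ct, mri)  # estimate the non-linear MRI -> CT deformation
mri_warped = sitk.Resample(mri, ct, tx, sitk.sitkLinear, 0.0)
sitk.WriteImage(mri_warped, "mri_to_ct.nii.gz")
```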
Since existing techniques do not meet these precision, modality, and resolution requirements, we developed manual and semi-automatic baseline methods to measure registration error and localize the implant in preclinical tests. Building on these foundations, a two-step REE framework was introduced: first, a regression U-Net trained on synthetic B-spline deformations predicts the voxel-wise registration error of a mono-modal alignment; second, a generative model synthesizes a pseudo-CT from the MRI, enabling voxel-level REE between the pseudo-CT and the real CT volume and thereby extending the method to multi-modal registration. This pipeline produces voxel-wise non-linear error maps with a mean relative error below 10% for CT/MRI alignment.
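To make the first step concrete, here is a minimal sketch of how one training pair for the regression U-Net could be generated: a random B-spline deformation warps a volume, and the per-voxel displacement magnitude serves as the ground-truth error map the network regresses. The function name, mesh size, and displacement range are assumptions for illustration.

```python
import numpy as np
import SimpleITK as sitk

def synthetic_training_pair(image, mesh_size=(6, 6, 6), max_disp_mm=3.0, seed=0):
    """Warp `image` with a random B-spline field; return (warped, error_map).

    `error_map` holds the per-voxel displacement magnitude in mm, i.e. the
    ground-truth registration error that the regression U-Net learns to predict.
    """
    rng = np.random.default_rng(seed)
    tx = sitk.BSplineTransformInitializer(image, list(mesh_size))
    # Random control-point coefficients; max_disp_mm sets the rough displacement scale.
    coeffs = rng.uniform(-max_disp_mm, max_disp_mm, len(tx.GetParameters()))
    tx.SetParameters(coeffs.tolist())
    warped = sitk.Resample(image, image, tx, sitk.sitkLinear, 0.0)
    # Dense displacement field sampled on the image grid -> per-voxel magnitude.
    disp = sitk.TransformToDisplacementField(
        tx, sitk.sitkVectorFloat64,
        image.GetSize(), image.GetOrigin(),
        image.GetSpacing(), image.GetDirection())
    error_map = np.linalg.norm(sitk.GetArrayFromImage(disp), axis=-1)
    return warped, error_map
```

In the second, multi-modal step, the same mono-modal estimator would be applied between the real CT and the pseudo-CT synthesized from the MRI, reducing the multi-modal problem to the mono-modal one sketched above.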
For implant localization, an automatic detection strategy using a YOLO-based network, trained on preclinical data, reconstructs transducer emission centers in 3D with an absolute error below 1 mm. Future work aims to improve the robustness of the multi-modal model and extend detection to volumetric networks for clinical translation.
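A plausible sketch of this detection stage is shown below, assuming the ultralytics YOLO API, per-slice 2D detections on axial images, and a confidence-weighted fusion into a single 3D point; the weight file, the fusion rule, and the axis-aligned voxel-to-world conversion are all assumptions, not the published method.

```python
import numpy as np
from ultralytics import YOLO

model = YOLO("transducer_yolo.pt")  # hypothetical weights trained on preclinical data

def emission_center_3d(slices, spacing, origin):
    """Fuse per-slice 2D YOLO detections into one 3D emission center.

    `slices` is a list of axial images; `spacing` and `origin` describe the
    voxel grid, assumed axis-aligned with the scanner frame.
    """
    hits = []
    for z, img in enumerate(slices):
        res = model(img, verbose=False)[0]
        for box, conf in zip(res.boxes.xywh.cpu().numpy(),
                             res.boxes.conf.cpu().numpy()):
            hits.append((box[0], box[1], z, conf))  # (x, y, slice index, confidence)
    hits = np.asarray(hits)
    w = hits[:, 3]
    # Confidence-weighted centroid in voxel coordinates, then to millimeters.
    center_vox = (hits[:, :3] * w[:, None]).sum(axis=0) / w.sum()
    return np.asarray(origin) + center_vox * np.asarray(spacing)
```

A volumetric detection network, as envisaged in the future work above, would replace this slice-wise fusion with a single 3D prediction.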