Convolutional neural networks (CNNs) have recently proven highly effective at segmenting 2D cardiac ultrasound images. However, most attempts at full-sequence segmentation of cardiac ultrasound videos either rely on models trained only on end-diastole (ED) and end-systole (ES) images, or fail to extract temporal features directly from the image sequence and do not preserve the topology of the segmentation maps. To address these issues, we cast sequence segmentation of ultrasound video as a registration estimation problem and present a novel method for diffeomorphic image registration that uses neural ordinary differential equations (Neural ODEs) to extract temporal information from ultrasound video. In particular, we model the registration vector field between frames as a continuous trajectory governed by an ODE. The estimated registration field is then applied to the segmentation mask of the first frame to obtain a segmentation for the whole cardiac cycle. The proposed method, Echo-ODE, introduces several key improvements over the previous state of the art. First, by solving a continuous ODE, it achieves smoother segmentation and preserves the topology of the segmentation maps over the whole sequence (Hausdorff distance: 3.7–4.4). Second, it does not explicitly optimize for temporal consistency, yet achieves temporally consistent segmentation in 91% of the videos in the dataset. Third, it maintains the clinical accuracy of the segmentation maps (mean absolute error (MAE) of the left-ventricular ejection fraction (LVEF): 2.7–3.1). The results show that the proposed method surpasses the previous state of the art in multiple aspects, demonstrating the importance of spatio-temporal data processing when applying Neural ODEs to medical imaging applications.
These findings open up new research directions for solving echocardiography segmentation tasks.
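The idea of treating frame-to-frame registration as a continuous trajectory can be illustrated with a toy example. The sketch below is not the Echo-ODE model. It assumes a hypothetical hand-written velocity field (`velocity`, a uniform radial contraction) in place of a learned network, and it uses plain forward Euler steps in place of whatever ODE solver the method actually uses. Contour points from the first frame are moved along the flow dx/dt = v(x, t) to give the contour at a later time.

```python
import numpy as np


def velocity(points, t, center=(0.0, 0.0), rate=0.5):
    """Hypothetical velocity field: uniform radial contraction toward `center`.

    Stands in for a learned network; `t` is unused because this toy field
    is stationary.
    """
    return -rate * (points - np.asarray(center, dtype=float))


def integrate_trajectory(points, t0=0.0, t1=1.0, n_steps=100):
    """Forward-Euler integration of dx/dt = v(x, t) for every point."""
    x = np.asarray(points, dtype=float).copy()
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        x = x + dt * velocity(x, t)
        t += dt
    return x


# Toy "first frame" contour: 64 points on the unit circle.
theta = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
contour_ed = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# Contour after following the flow from t = 0 to t = 1.
contour_t1 = integrate_trajectory(contour_ed)
```

For this particular field the exact solution shrinks every radius by a factor of exp(-0.5), so the Euler result can be checked against it. Every point follows the same smooth flow, so the contour contracts without crossing itself. This is the intuition behind using a continuous flow to keep the topology of the mask intact.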
Performance of the methods on temporal consistency, anatomical accuracy, and clinical metrics. The Hausdorff distance (HD), Dice score, and mean absolute error of the ejection fraction (EF) are reported as the mean and standard deviation across 98 test sequences. The temporal error (Temp. Error) is the number of sequences that have at least one temporal inconsistency. The mDice is the mean of the Dice scores of the LV myocardium and LV epicardium. Dice LV$_{epi}$ and Dice LV$_{myo}$ are the Dice scores of the LV epicardium and LV myocardium, respectively. The best results are highlighted in bold.
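For readers unfamiliar with the metrics named in the caption, the following minimal sketch shows the textbook definitions of the Dice score, the symmetric Hausdorff distance between two point sets, and the mean absolute error of the ejection fraction. The function names are illustrative. The exact evaluation protocol behind the table (which points enter the Hausdorff distance, its units, and how ventricular volumes are estimated from masks) is not specified here and is not assumed.

```python
import numpy as np


def dice_score(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two (N, D) point sets."""
    p = np.asarray(points_a, dtype=float)
    q = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (len(p), len(q)).
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


def ejection_fraction(edv, esv):
    """Ejection fraction in percent from end-diastolic and end-systolic volumes."""
    return 100.0 * (edv - esv) / edv


def ef_mae(ef_pred, ef_true):
    """Mean absolute error between predicted and reference EF values."""
    diff = np.asarray(ef_pred, dtype=float) - np.asarray(ef_true, dtype=float)
    return float(np.mean(np.abs(diff)))


# Small worked example with two overlapping 2x2 squares on a 4x4 grid.
a = np.zeros((4, 4), dtype=bool)
b = np.zeros((4, 4), dtype=bool)
a[0:2, 0:2] = True
b[0:2, 1:3] = True
example_dice = dice_score(a, b)
```

In the worked example each mask has 4 pixels and they share 2, so the Dice score is 2·2 / (4 + 4) = 0.5.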