The U-net does its job – so what next?
The U-net is currently the second-most successful paper (in terms of citations) in the 21-year history of MICCAI. U-net based architectures have demonstrated very high performance in a wide range of medical image segmentation tasks, but a powerful segmentation architecture is only one part of building clinically applicable tools. In my talk I'll present three projects from the DeepMind Health Research team that address challenges beyond the architecture itself.
The first project, a collaboration with University College London Hospital, deals with the challenging task of precisely segmenting radiosensitive head and neck anatomy in CT scans, an essential input for radiotherapy planning [1]. With a 3D U-net we reach a performance similar to human experts on the majority of anatomical classes. Besides some minor architectural adaptations, e.g. to tackle the large imbalance of foreground to background voxels, a substantial focus of the project was on generating a high-quality test set [2] where each scan was manually segmented by two independent experts. Furthermore, we introduced a new surface-based performance metric, the surface DSC [3], designed to be a better proxy for the expected performance in a real-world radiotherapy setting than existing metrics.
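To make the idea of a surface-based metric concrete, here is a minimal sketch: it scores the fraction of the two segmentation surfaces that lie within a tolerance of each other. It is a simplified illustration, not the published implementation. It counts surface voxels instead of measuring surface area, uses brute-force distances (small volumes only), and the names `surface_voxels`, `surface_dice`, `tolerance` and `spacing` are hypothetical.

```python
import numpy as np


def surface_voxels(mask):
    """Coordinates of foreground voxels that touch background (face neighbours)."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    interior = np.ones_like(mask)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel is interior only if every face neighbour is foreground.
            interior &= np.roll(padded, shift, axis=axis)[core]
    return np.argwhere(mask & ~interior)


def surface_dice(mask_a, mask_b, tolerance, spacing=(1.0, 1.0, 1.0)):
    """Fraction of both surfaces lying within `tolerance` of the other surface.

    Assumption: voxel counts stand in for surface area.
    """
    pts_a = surface_voxels(mask_a) * np.asarray(spacing)
    pts_b = surface_voxels(mask_b) * np.asarray(spacing)
    if len(pts_a) == 0 and len(pts_b) == 0:
        return float("nan")  # undefined when both masks are empty
    if len(pts_a) == 0 or len(pts_b) == 0:
        return 0.0
    # Pairwise Euclidean distances between the two surface point sets.
    dists = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    a_close = np.sum(dists.min(axis=1) <= tolerance)
    b_close = np.sum(dists.min(axis=0) <= tolerance)
    return float(a_close + b_close) / (len(pts_a) + len(pts_b))


if __name__ == "__main__":
    cube = np.zeros((8, 8, 8), dtype=bool)
    cube[2:6, 2:6, 2:6] = True
    shifted = np.roll(cube, 1, axis=0)  # same cube, moved by one voxel
    print(surface_dice(cube, shifted, tolerance=0.0))
    print(surface_dice(cube, shifted, tolerance=1.0))
```

With a tolerance of one voxel, the one-voxel shift is fully forgiven, whereas a zero tolerance penalises it. This is the behaviour a tolerance-based surface metric is meant to capture: small deviations that an expert would not need to correct do not reduce the score.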
The second project, together with Moorfields Eye Hospital, developed a system that analyses 3D OCT (optical coherence tomography) eye scans to provide referral decisions for patients [4]. The performance was on par with world experts with over 20 years of experience. We use two network ensembles to decouple the variations induced by the imaging system from the patient-to-patient variations. The first ensemble of 3D U-nets creates clinically interpretable, device-independent tissue map hypotheses; the second (3D dense-net based) ensemble maps the tissue map hypotheses to the diagnoses and referral recommendation. Adaptation to a new scanning device type only needed sparse manual segmentations on 152 scans, while the diagnosis model (trained with 14,884 OCT scans) could be reused without changes.
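The decoupling described above can be sketched as a two-stage pipeline in which only the first stage depends on the scanner. The sketch below uses toy stand-ins for both networks (`make_segmenter` and `make_classifier` are hypothetical, not real models), and averaging over all segmenter/classifier pairs is an assumption about how the two ensembles might be combined.

```python
import numpy as np


def make_segmenter(seed, num_tissues=4):
    """Toy stand-in for one device-specific 3D U-net (hypothetical)."""
    rng = np.random.default_rng(seed)

    def segment(scan):
        # Noisy quantisation of intensities into tissue labels.
        noise = rng.normal(scale=0.05, size=scan.shape)
        labels = ((scan + noise) * num_tissues).astype(int)
        return np.clip(labels, 0, num_tissues - 1)

    return segment


def make_classifier(seed, num_tissues=4, num_referrals=4):
    """Toy stand-in for one device-independent classification network."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(num_tissues, num_referrals))

    def classify(tissue_map):
        # Use tissue fractions as features, then apply a softmax.
        hist = np.bincount(tissue_map.ravel(), minlength=num_tissues)
        logits = (hist / tissue_map.size) @ weights
        exp = np.exp(logits - logits.max())
        return exp / exp.sum()

    return classify


def referral_probabilities(scan, segmenters, classifiers):
    """Average predictions over every (segmenter, classifier) pair (assumed)."""
    preds = [clf(seg(scan)) for seg in segmenters for clf in classifiers]
    return np.mean(preds, axis=0)


if __name__ == "__main__":
    scan = np.linspace(0.0, 1.0, 64).reshape(4, 4, 4)
    classifiers = [make_classifier(s) for s in range(3)]
    device_a = [make_segmenter(s) for s in range(3)]
    device_b = [make_segmenter(s + 100) for s in range(3)]  # new device
    # Only the segmenters change; the classifiers are reused unchanged.
    print(referral_probabilities(scan, device_a, classifiers))
    print(referral_probabilities(scan, device_b, classifiers))
```

Because the classifiers only ever see tissue maps, supporting a new device in this sketch means swapping the list of segmenters while the classification ensemble stays as it is.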
The third project deals with the segmentation of ambiguous images [5,6]. This is of particular relevance in medical imaging, where ambiguities often cannot be resolved from the image context alone. We propose a combination of a U-net with a conditional variational autoencoder that is capable of efficiently producing an unlimited number of plausible segmentation map hypotheses for a given ambiguous image. We show that each hypothesis provides a globally consistent segmentation, and that the probabilities of these hypotheses are well calibrated.
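The sampling step can be illustrated with a minimal sketch: the expensive feature map is computed once per image, and each new hypothesis only requires drawing a latent vector and re-running a cheap per-pixel combination. Both networks are replaced by toy functions here (`unet_features`, `prior_net` and the random 1x1 weights are hypothetical stand-ins), so this shows the data flow only, not a trained model.

```python
import numpy as np


def unet_features(image, channels=8):
    """Toy stand-in for the U-net: a per-pixel feature map of shape (H, W, C)."""
    scales = np.linspace(0.5, 1.5, channels)
    return image[..., None] * scales


def prior_net(image, latent_dim=3):
    """Toy stand-in for the image-conditional prior: a diagonal Gaussian."""
    mu = np.full(latent_dim, image.mean())
    sigma = np.full(latent_dim, 0.5 + image.std())
    return mu, sigma


def sample_segmentations(image, num_samples, num_classes=2, latent_dim=3, seed=0):
    """Draw `num_samples` segmentation hypotheses for a single image."""
    rng = np.random.default_rng(seed)
    features = unet_features(image)           # computed once per image
    mu, sigma = prior_net(image, latent_dim)  # computed once per image
    # Random weights standing in for the learned 1x1 combination layers.
    weights = rng.normal(size=(features.shape[-1] + latent_dim, num_classes))
    hypotheses = []
    for _ in range(num_samples):
        z = mu + sigma * rng.normal(size=latent_dim)  # one latent sample
        # Broadcast z to every pixel and concatenate it with the features.
        z_map = np.broadcast_to(z, features.shape[:2] + (latent_dim,))
        combined = np.concatenate([features, z_map], axis=-1)
        logits = combined @ weights                   # per-pixel 1x1 "conv"
        hypotheses.append(logits.argmax(axis=-1))
    return np.stack(hypotheses)


if __name__ == "__main__":
    image = np.linspace(0.0, 1.0, 16).reshape(4, 4)
    print(sample_segmentations(image, num_samples=5).shape)
```

Since one latent vector is shared by all pixels of a hypothesis, each sample varies as a whole and not pixel by pixel, which is the intuition behind globally consistent hypotheses.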
[1] Nikolov et al. (2018) "Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy". arXiv:1809.04430. https://arxiv.org/abs/1809.04430
[2] Dataset available at https://github.com/deepmind/tcia-ct-scan-dataset
[3] Implementation available at https://github.com/deepmind/surface-distance
[4] De Fauw et al. (2018) "Clinically applicable deep learning for diagnosis and referral in retinal disease". Nature Medicine 24(9), 1342-1350. https://doi.org/10.1038/s41591-018-0107-6 (full text available from https://deepmind.com/blog/moorfields-major-milestone/)
[5] Kohl et al. (2018) "A Probabilistic U-Net for Segmentation of Ambiguous Images". NIPS 2018 (accepted). Preprint available at https://arxiv.org/abs/1806.05034
[6] Implementation available at https://github.com/SimonKohl/probabilistic_unet