Author:
Dorent R; Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA., Haouchine N; Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA., Kogl F; Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA., Joutard S; King's College London, London, United Kingdom., Juvekar P; Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA., Torio E; Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA., Golby A; Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA., Ourselin S; King's College London, London, United Kingdom., Frisken S; Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA., Vercauteren T; King's College London, London, United Kingdom., Kapur T; Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA., Wells WM 3rd; Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA.; Massachusetts Institute of Technology, Cambridge, MA, USA.
Abstract:
We introduce MHVAE, a deep hierarchical variational autoencoder (VAE) that synthesizes missing images from various modalities. Extending multi-modal VAEs with a hierarchical latent structure, we propose a probabilistic formulation that fuses multi-modal images into a common latent representation while retaining the flexibility to handle incomplete image sets as input. Moreover, adversarial learning is employed to generate sharper images. Extensive experiments are performed on the challenging problem of joint intra-operative ultrasound (iUS) and magnetic resonance (MR) image synthesis. Our model outperformed multi-modal VAEs, conditional GANs, and the current state-of-the-art unified method (ResViT) for synthesizing missing images, demonstrating the advantage of a hierarchical latent representation and a principled probabilistic fusion operation. Our code is publicly available.
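The abstract refers to a principled probabilistic fusion of modality-specific posteriors in a shared latent space that tolerates missing inputs. A common realization of this idea in multi-modal VAEs is a product of Gaussian experts (PoE), where missing modalities are handled by simply omitting their experts. The sketch below is a minimal illustration of that fusion for one level of a latent hierarchy; the PoE choice, function name, and tensor shapes are assumptions for illustration, not details confirmed by the abstract.

```python
import torch

def product_of_gaussian_experts(mus, logvars):
    """Fuse per-modality Gaussian posteriors N(mu_i, var_i) into one Gaussian.

    mus, logvars: lists of (batch, latent_dim) tensors, one entry per
    *available* modality. Incomplete image sets are handled by passing
    fewer experts. (Hypothetical sketch; not the authors' exact method.)
    """
    # Include a standard-normal prior expert N(0, I) so the fused
    # posterior is well defined even with a single modality.
    mus = [torch.zeros_like(mus[0])] + list(mus)
    logvars = [torch.zeros_like(logvars[0])] + list(logvars)

    # Precision (inverse variance) of each expert.
    precisions = [torch.exp(-lv) for lv in logvars]

    # A product of Gaussians is Gaussian: precisions add, and the mean
    # is the precision-weighted average of the expert means.
    fused_var = 1.0 / sum(precisions)
    fused_mu = fused_var * sum(m * p for m, p in zip(mus, precisions))
    return fused_mu, torch.log(fused_var)
```

Under this formulation, synthesizing a missing image amounts to encoding whichever modalities are observed, fusing their posteriors as above, sampling the shared latent, and decoding with the target modality's decoder.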