Paper: https://doi.org/10.1117/12.3046796
Highlights
- Novel anatomy normalization approach to standardize medical images for improved model generalization.
- High-resolution class activation mapping (CAM) for fine-grained spatial interpretability.
- Improved prediction accuracy through anatomically-aligned feature learning.
- Enhanced clinical interpretability enabling identification of critical anatomical regions contributing to xerostomia risk.
Abstract
Radiation-induced xerostomia remains a significant challenge in head and neck cancer radiotherapy. While deep learning models have shown promise in predicting treatment outcomes, their clinical adoption is limited by lack of interpretability and challenges in handling anatomical variations across patients. We propose a deep learning framework that incorporates anatomy normalization to standardize patient-specific anatomical variations and employs high-resolution class activation maps (CAM) to provide spatially-precise explanations of model predictions. The anatomy normalization module aligns anatomical structures across patients, enabling the model to learn more generalizable features. The high-resolution CAM provides fine-grained visualization of which anatomical regions contribute most to xerostomia risk, offering valuable insights for clinicians. Our approach achieves superior prediction performance while maintaining high interpretability, demonstrating the importance of combining domain knowledge with deep learning for medical outcome prediction.
Method
Our method has two key components: anatomy normalization and high-resolution class activation mapping.
The anatomy normalization module addresses the challenge of anatomical variation across patients. By aligning key anatomical structures before feature extraction, we enable the model to learn features that are robust to patient-specific anatomical differences. This normalization is performed using deformable registration guided by anatomical landmarks and segmentation masks.
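The resampling step at the core of any deformable registration can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a dense displacement field has already been estimated (for example from landmarks and segmentation masks), and the function name `warp_volume` and the array layout are hypothetical choices made for this example.

```python
import numpy as np
from scipy.ndimage import map_coordinates


def warp_volume(volume, displacement, order=1):
    """Resample a 3D volume at positions shifted by a dense displacement field.

    volume:       array of shape (D, H, W), e.g. a CT or dose volume
    displacement: array of shape (3, D, H, W), offsets in voxel units
                  (assumed to come from a separate registration step)
    order:        interpolation order (1 = linear, 0 = nearest neighbour)
    """
    # Identity sampling grid: grid[k][z, y, x] is the k-th coordinate of voxel (z, y, x).
    grid = np.meshgrid(*[np.arange(s) for s in volume.shape], indexing="ij")
    # Each output voxel reads the input at (its own position + displacement).
    coords = np.stack(grid).astype(np.float64) + displacement
    return map_coordinates(volume, coords, order=order, mode="nearest")
```

When warping label maps such as organ segmentations with a function like this, `order=0` avoids blending label values, while `order=1` is a common choice for intensity or dose volumes.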
The high-resolution CAM module provides detailed spatial explanations of model predictions. Unlike traditional CAM methods, which produce low-resolution activation maps, our approach generates high-resolution visualizations that precisely localize the anatomical regions contributing to the prediction. This is achieved through a specialized upsampling strategy that preserves spatial detail while maintaining semantic meaning.
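For reference, the traditional CAM baseline mentioned above can be sketched in a few lines. This shows the standard formulation only (a class-weighted sum of the final feature maps, followed by plain linear upsampling); it is not the specialized upsampling strategy of the paper, whose details are not given here. The function name `class_activation_map` and the 2D array shapes are assumptions made for this example.

```python
import numpy as np
from scipy.ndimage import zoom


def class_activation_map(features, class_weights, out_shape):
    """Standard CAM: weight the last feature maps by the classifier weights.

    features:      array of shape (C, h, w), final convolutional feature maps
    class_weights: array of shape (C,), classifier weights for the target class
    out_shape:     (H, W), spatial size of the input image
    """
    # Weighted sum over channels gives a coarse (h, w) activation map.
    cam = np.tensordot(class_weights, features, axes=1)
    # Keep only regions that contribute positively to the class score.
    cam = np.maximum(cam, 0.0)
    # Plain linear upsampling to the input size; this is the low-detail
    # step that a high-resolution CAM method would replace.
    factors = [o / s for o, s in zip(out_shape, cam.shape)]
    cam = zoom(cam, factors, order=1)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

Because the coarse map typically has a much lower spatial resolution than the input, interpolating it up in this way blurs the boundaries between structures, which is the limitation that motivates a higher-resolution alternative.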
Results
Our framework demonstrates superior performance in xerostomia prediction while providing clinically meaningful interpretations. The anatomy normalization significantly improves model generalization across diverse patient populations. The high-resolution CAMs successfully identify known risk factors such as parotid gland dose distributions and reveal novel spatial patterns associated with xerostomia risk.
Ablation studies confirm that both anatomy normalization and high-resolution CAM contribute to improved performance and interpretability. Clinical evaluation by radiation oncologists validates that the CAM visualizations align with clinical knowledge and provide actionable insights for treatment planning.
Conclusion
This page is intended only as a brief introduction to the work; see the paper for full details.
We present a deep learning framework for xerostomia prediction that combines anatomy normalization with high-resolution class activation mapping. The anatomy normalization module enables robust feature learning across patients with varying anatomical structures, improving model generalization. The high-resolution CAM provides fine-grained spatial interpretability, identifying specific anatomical regions contributing to xerostomia risk. Our approach achieves state-of-the-art prediction performance while maintaining clinical interpretability, demonstrating the value of incorporating medical domain knowledge into deep learning models. This work represents an important step toward clinically deployable AI systems for personalized radiation therapy planning.
