Abstract
Deep learning (DL)-based convolutional neural networks (CNNs) have achieved strong performance in image classification tasks. However, their limited transparency raises concerns about reliability, particularly in medical applications. High predictive accuracy alone does not guarantee anatomically meaningful decision making, especially in critical healthcare scenarios. This issue is evident in COVID-19 classification from chest X-ray (CXR) images, where models may produce correct predictions without attending to clinically relevant regions, motivating a systematic approach for evaluating whether model decisions align with meaningful anatomical structures. To address this gap, this thesis proposes a quantitative framework for evaluating model transparency by comparing Gradient-weighted Class Activation Mapping (Grad-CAM) explanations with radiologist-defined lung masks. Six pre-trained CNN architectures were trained using transfer learning on a local dataset of 1,300 CXR images. Classification performance was assessed using standard metrics, while interpretability was evaluated by binarizing the Grad-CAM heatmaps and computing Intersection-over-Union (IoU) and Dice scores against the lung masks. All models achieved test accuracies between approximately 91% and 96%. Xception achieved both the highest accuracy (95.90%) and the strongest anatomical alignment, whereas VGG16 and VGG19 showed minimal overlap with the lung regions despite competitive accuracy. Additional experiments on a reduced Xception model preserved accuracy but weakened anatomical alignment. These findings demonstrate that accuracy and interpretability capture distinct aspects of model behavior. The proposed framework provides a consistent method for assessing model transparency and supports the inclusion of interpretability metrics in the evaluation of medical artificial intelligence (AI) systems.
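
As a minimal sketch of the interpretability metric described above (not the thesis implementation), the following Python snippet binarizes a Grad-CAM heatmap and scores its overlap with a lung mask via IoU and Dice. The 0.5 threshold, the 224x224 resolution, the random stand-in heatmap and rectangular mask, and the helper names binarize_heatmap and iou_and_dice are illustrative assumptions; the abstract does not specify these details.

import numpy as np

def binarize_heatmap(heatmap: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Normalize a Grad-CAM heatmap to [0, 1] and threshold it into a binary map.
    The 0.5 threshold is an assumption for illustration."""
    h = heatmap.astype(np.float64)
    h = (h - h.min()) / (h.max() - h.min() + 1e-8)
    return (h >= threshold).astype(np.uint8)

def iou_and_dice(pred: np.ndarray, mask: np.ndarray) -> tuple[float, float]:
    """IoU and Dice between two binary maps of equal shape."""
    pred, mask = pred.astype(bool), mask.astype(bool)
    inter = np.logical_and(pred, mask).sum()
    union = np.logical_or(pred, mask).sum()
    total = pred.sum() + mask.sum()
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return float(iou), float(dice)

# Example: score a hypothetical heatmap against a stand-in lung mask.
heatmap = np.random.rand(224, 224)          # stand-in for a Grad-CAM output
lung_mask = np.zeros((224, 224), np.uint8)  # stand-in for the expert mask
lung_mask[40:180, 30:200] = 1
iou, dice = iou_and_dice(binarize_heatmap(heatmap), lung_mask)
print(f"IoU = {iou:.3f}, Dice = {dice:.3f}")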