ShapBPT: Image Feature Attributions using Data-Aware Binary Partition Trees

Published in AAAI-26 | 40th Annual AAAI Conference on Artificial Intelligence, 2026

Pixel-level feature attributions play a key role in Explainable Computer Vision (XCV) by revealing how visual features influence model predictions. While hierarchical Shapley methods based on the Owen formula offer a principled explanation framework, existing approaches overlook the multiscale and morphological structure of images, resulting in inefficient computation and weak semantic alignment.

To bridge this gap, we introduce ShapBPT, a data-aware XCV method that integrates hierarchical Shapley values with a Binary Partition Tree (BPT) representation of images. By assigning Shapley coefficients directly to a multiscale, image-adaptive hierarchy, ShapBPT produces explanations that align naturally with intrinsic image structures while significantly reducing computational cost. Experimental results demonstrate improved efficiency and structural faithfulness compared to existing XCV methods, and a 20-subject user study confirms that ShapBPT explanations are consistently preferred by humans.
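
As a rough illustration of the data-aware hierarchy, the sketch below builds a Binary Partition Tree by greedily merging the most similar adjacent regions of an initial partition (e.g., superpixels). It is a minimal sketch, not the paper's implementation: the function name build_bpt, the mean-colour similarity, and the 4-connectivity adjacency are illustrative assumptions.

```python
# Minimal BPT construction sketch (illustrative, not the authors' code):
# start from an initial labelling and repeatedly merge the two most similar
# adjacent regions, recording every merge as a new internal tree node.
import numpy as np

def build_bpt(labels, image):
    """labels: (H, W) int array of initial region ids 0..R-1.
       image:  (H, W, C) float array.
       Returns a dict mapping each internal node to its (left, right) children;
       the initial regions 0..R-1 are the leaves and have no entry."""
    n = labels.max() + 1
    # mean colour and pixel count of every initial region
    feats = [image[labels == r].mean(axis=0) for r in range(n)]
    sizes = [float((labels == r).sum()) for r in range(n)]
    # region adjacency graph from 4-connectivity
    adj = {r: set() for r in range(n)}
    for a, b in zip(labels[:, :-1].ravel(), labels[:, 1:].ravel()):
        if a != b:
            adj[a].add(b); adj[b].add(a)
    for a, b in zip(labels[:-1, :].ravel(), labels[1:, :].ravel()):
        if a != b:
            adj[a].add(b); adj[b].add(a)
    tree, alive, nxt = {}, set(range(n)), n
    while len(alive) > 1:
        # merge the pair of adjacent live regions with the most similar colour
        pairs = [(np.linalg.norm(feats[a] - feats[b]), a, b)
                 for a in alive for b in adj[a] if b in alive and a < b]
        _, a, b = min(pairs)
        tree[nxt] = (a, b)                      # record the merge as a new node
        w = sizes[a] + sizes[b]
        feats.append((sizes[a] * feats[a] + sizes[b] * feats[b]) / w)
        sizes.append(w)
        adj[nxt] = (adj[a] | adj[b]) - {a, b}   # the new node inherits the neighbours
        for c in adj[nxt]:
            adj[c].add(nxt)
        alive -= {a, b}
        alive.add(nxt)
        nxt += 1
    return tree                                 # the root is the largest node id
```

Leaves are the initial regions and each merge adds one internal node, so the hierarchy adapts to the image: homogeneous areas collapse early while strong boundaries survive into the upper levels of the tree.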

Contributions 📃

In this research, we introduce:

  1. A novel hierarchical, model-agnostic XCV method for images, named ShapBPT, that integrates an adaptive multi-scale partitioning algorithm with the Owen approximation of the Shapley coefficients. We repurpose the Binary Partition Tree (BPT) algorithm (Salembier and Garrido, 2000) to construct hierarchical structures suited to explainability, overcoming the inflexible hierarchies of state-of-the-art methods such as SHAP; a sketch of the resulting attribution recursion follows this list.
  2. An empirical assessment of the proposed method on natural color images, showcasing its efficacy across various scoring targets in comparison to established state-of-the-art XCV methods, together with a controlled human-subject study comparing the interpretability of the explanations across methods.
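
To make the Owen-style recursion concrete, here is a minimal sketch of how hierarchical Shapley coefficients can be propagated down a binary region hierarchy such as the one returned by build_bpt above. It is an illustrative simplification, not the paper's actual algorithm: the names hier_shap, leaves_under and value_fn are hypothetical, and the recursion conditions every split on the regions outside the current subtree being masked.

```python
# Hierarchical (Owen-style) attribution sketch: each internal node splits its
# parent's share between its two children via an exact two-player Shapley
# computation, so the leaf attributions always sum to v(all on) - v(all off).
import numpy as np

def leaves_under(tree, node):
    """Leaf region ids below `node` (leaves have no entry in `tree`)."""
    if node not in tree:
        return {node}
    left, right = tree[node]
    return leaves_under(tree, left) | leaves_under(tree, right)

def hier_shap(tree, root, n_leaves, value_fn):
    """value_fn(mask) -> model score when only regions with mask == True are visible."""
    phi = np.zeros(n_leaves)

    def split(node, context, share):
        if node not in tree:                     # leaf: keeps its final share
            phi[node] += share
            return
        left, right = tree[node]
        mask_L = context.copy()
        mask_L[list(leaves_under(tree, left))] = True
        mask_R = context.copy()
        mask_R[list(leaves_under(tree, right))] = True
        mask_LR = mask_L | mask_R
        v0, vL, vR, vLR = map(value_fn, (context, mask_L, mask_R, mask_LR))
        # exact Shapley values of the two-player game {left subtree, right subtree}
        phi_L = 0.5 * ((vL - v0) + (vLR - vR))
        phi_R = 0.5 * ((vR - v0) + (vLR - vL))
        total = vLR - v0
        scale = share / total if abs(total) > 1e-12 else 0.0
        split(left, context, phi_L * scale)      # rescale so children sum to `share`
        split(right, context, phi_R * scale)

    all_off = np.zeros(n_leaves, dtype=bool)
    all_on = np.ones(n_leaves, dtype=bool)
    split(root, all_off, value_fn(all_on) - value_fn(all_off))
    return phi
```

In practice, value_fn would wrap the model being explained together with a masking strategy for hidden regions (e.g., blurring or mean filling), and the recursion can be truncated at a chosen depth to trade spatial resolution for fewer model evaluations.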

Method Availability

Datasets and Models

  • Datasets: ImageNet, MS-COCO, MVTec, CelebA-HQ.
  • Models: ViT, SwinViT, ResNet-50, YOLO11, custom CNN, VAE-GAN.

Experiments Summary

ID | Dataset | Size | Model | Short Description
E1 | ImageNet-S50 | 574 | ResNet50 | Common ImageNet setup
E2 | ImageNet-S50 | 574 | Ideal | Linear ideal model
E3 | ImageNet-S50 | 621 | SwinViT | Vision Transformer
E4 | MS-COCO | 274 | YOLO11s | Object detection
E5 | CelebA | 400 | CNN | Facial attribute localization
E6 | MVTec | 280 | VAE-GAN | Anomaly detection
E7 | ImageNet-S50 | 593 | ViT-Base16 | Vision Transformer
E8 | - | - | - | User preference study using E1 saliency maps

Authors ✍️

Sr. No. | Author Name | Affiliation | Google Scholar
1. | Muhammad Rashid | University of Torino, Dept. of Computer Science, Torino, Italy | Muhammad Rashid
2. | Elvio G. Amparore | University of Torino, Dept. of Computer Science, Torino, Italy | Elvio G. Amparore
3. | Enrico Ferrari | Rulex Innovation Labs, Rulex Inc., Genova, Italy | Enrico Ferrari
4. | Damiano Verda | Rulex Innovation Labs, Rulex Inc., Genova, Italy | Damiano Verda

Keywords 🔍

Shapley Values · Binary Partition Trees · eXplainable AI · XAI · Image Feature Attributions

Recommended citation: