CHITRA: Clustering Hidden-layer Interpretations through Technical Ranking and Attribution
DOI:
https://doi.org/10.58445/rars.3136
Keywords:
Machine Learning, Artificial Intelligence, Cosine Similarity, Singular Value Decomposition, K-means Clustering, Neural Networks, Interpretability, Computer Science
Abstract
Developing transparent and reliable AI systems requires understanding how neural networks make decisions. This project introduces CHITRA, a novel algorithm for analyzing the hidden layers of neural networks. CHITRA combines cosine similarity, Singular Value Decomposition (SVD), and K-means clustering to identify and group similar neurons. It then tracks the activation paths of each cluster across different inputs to infer the cluster’s function.
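As a rough illustration of how these three components can fit together, the sketch below clusters neurons by their activation profiles. The abstract does not state the order in which CHITRA applies them, so the pipeline here (cosine similarity between neurons, then a truncated SVD of the similarity matrix, then K-means on the resulting embedding) is an assumption. The function name `cluster_neurons`, its parameters, and the toy activations are all hypothetical.

```python
import numpy as np

def cluster_neurons(activations, n_components=2, n_clusters=2, n_iters=50):
    """Group neurons with similar activation profiles (illustrative sketch only).

    activations: array of shape (n_inputs, n_neurons).
    Returns one cluster id per neuron.
    """
    # Describe each neuron by its activations across all inputs.
    profiles = activations.T.astype(float)              # (n_neurons, n_inputs)
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = profiles / np.maximum(norms, 1e-12)
    similarity = unit @ unit.T                          # pairwise cosine similarity

    # Truncated SVD of the similarity matrix: a low-dimensional embedding per neuron.
    u, s, _ = np.linalg.svd(similarity)
    embedding = u[:, :n_components] * s[:n_components]

    # Lloyd's k-means with a deterministic farthest-point initialization.
    centers = [embedding[0]]
    for _ in range(1, n_clusters):
        dists = np.min([np.linalg.norm(embedding - c, axis=1) for c in centers], axis=0)
        centers.append(embedding[np.argmax(dists)])
    centers = np.array(centers)
    for _ in range(n_iters):
        d = np.linalg.norm(embedding[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        new_centers = np.array([
            embedding[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data (made up): 8 inputs, 6 neurons. Neurons 0-2 follow one activation
# pattern and neurons 3-5 follow the complementary pattern, plus small noise.
rng = np.random.default_rng(0)
pattern_a = np.array([1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0])
pattern_b = 1.0 - pattern_a
activations = np.stack(
    [pattern_a + 0.01 * rng.standard_normal(8) for _ in range(3)]
    + [pattern_b + 0.01 * rng.standard_normal(8) for _ in range(3)],
    axis=1,
)
labels = cluster_neurons(activations)
print(labels)  # neurons 0-2 share one cluster id, neurons 3-5 share the other
```

In practice the activations would come from a hidden layer of a trained network, and the number of clusters and SVD components would need to be chosen for that layer.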
Existing interpretability techniques such as SHAP and LIME often struggle with scalability and consistency in larger, more complex networks. In contrast, experiments using CNNs and RNNs on the MNIST and CIFAR-10 datasets showed that CHITRA outperformed these methods on quantitative metrics (accuracy, precision, and recall) and on a human-centered interpretability score.
To quantitatively evaluate CHITRA’s performance in identifying neuron function, we established a ‘ground truth’ for a small, manually labeled subset of neuron clusters. This involved a detailed manual analysis of the features each cluster detects. For each cluster, we systematically presented the model with specific inputs and recorded the cluster’s activation patterns. For example, if a cluster consistently showed high activation exclusively for images of the digit ‘5’, we manually labeled it as a ‘5-detector’. This process produced a set of ground-truth labels against which we could compare CHITRA’s predictions.
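The labeling rule in the ‘5-detector’ example can be sketched as a simple selectivity check. The source describes this step as manual, so the code below only illustrates the criterion. The function `label_cluster`, the `ratio` threshold used to approximate "exclusively", and the activation values are all hypothetical.

```python
from collections import defaultdict

def label_cluster(cluster_activations, input_labels, ratio=3.0):
    """Return '<class>-detector' if one class dominates the cluster's mean activation, else None."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for act, lab in zip(cluster_activations, input_labels):
        sums[lab] += act
        counts[lab] += 1
    # Mean activation of the cluster for each input class.
    means = {lab: sums[lab] / counts[lab] for lab in sums}
    ranked = sorted(means, key=means.get, reverse=True)
    top = ranked[0]
    runner_up = means[ranked[1]] if len(ranked) > 1 else 0.0
    # "Exclusive" is approximated (assumption) by requiring the top class to
    # exceed the runner-up by a factor of `ratio`.
    if means[top] > 0 and means[top] >= ratio * max(runner_up, 1e-12):
        return f"{top}-detector"
    return None

# Made-up mean activations of one cluster for six input images.
digit_label = label_cluster([0.9, 0.85, 0.05, 0.1, 0.02, 0.95], [5, 5, 3, 7, 3, 5])
print(digit_label)  # 5-detector

# A cluster that responds to two digits about equally gets no label.
mixed_label = label_cluster([0.8, 0.7, 0.75, 0.6], [5, 3, 5, 3])
print(mixed_label)  # None
```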
Our analysis demonstrated that CHITRA’s predictions aligned with these ground-truth labels, showing a notable improvement in quantitative metrics over existing methods. The algorithm also performed well in qualitative evaluations, such as the human interpretability assessment. Given past incidents in which poorly trained models caused misdiagnoses, significant financial losses, and even fatalities, CHITRA gives researchers a tool for understanding their models more fully and thereby helping to prevent such incidents in the future.
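Comparing predicted cluster functions with ground-truth labels can be scored with accuracy, precision, and recall. The sketch below shows one way to compute them for a single target label. The function `evaluate` is hypothetical, and the label lists are invented for illustration; they are not results from the paper.

```python
def evaluate(predicted, truth, target):
    """Accuracy over all clusters, plus precision and recall for one target label."""
    pairs = list(zip(predicted, truth))
    tp = sum(p == target and t == target for p, t in pairs)  # correctly predicted target
    fp = sum(p == target and t != target for p, t in pairs)  # predicted target, truth differs
    fn = sum(p != target and t == target for p, t in pairs)  # missed target
    return {
        "accuracy": sum(p == t for p, t in pairs) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Invented labels for five clusters.
truth = ["5-detector", "5-detector", "3-detector", "7-detector", "3-detector"]
predicted = ["5-detector", "3-detector", "3-detector", "7-detector", "5-detector"]
scores = evaluate(predicted, truth, "5-detector")
print(scores)  # {'accuracy': 0.6, 'precision': 0.5, 'recall': 0.5}
```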
License
Copyright (c) 2025 Pranav Aravindan

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.