CHITRA: Clustering Hidden-layer Interpretations through Technical Ranking and Attribution

Pranav Aravindan

doi:10.58445/rars.3136

Authors:

Pranav Aravindan Student

DOI:

https://doi.org/10.58445/rars.3136

Keywords:

Machine Learning, Artificial Intelligence, Cosine Similarity, Singular Value Decomposition, K-means Clustering, Neural Networks, Interpretability, Computer Science

Abstract

Developing transparent and reliable AI systems requires understanding how neural networks make decisions. This project introduces CHITRA, a novel algorithm for analyzing the hidden layers of neural networks. CHITRA combines cosine similarity, Singular Value Decomposition (SVD), and K-means clustering to identify and group similar neurons. It then tracks each cluster's activation paths across different inputs to infer the function that cluster serves.
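As a rough illustration of how the three techniques named above could fit together, the following is a minimal sketch, not the author's implementation. The abstract does not say how the steps are ordered, so the composition here is an assumption: neurons are compared by cosine similarity of their activation vectors, the similarity matrix is compressed with a truncated SVD, and the resulting embedding is clustered with K-means. All function names, the `rank` parameter, and the farthest-point initialization are hypothetical choices made for the example.

```python
import numpy as np

def cosine_similarity_matrix(activations):
    """Pairwise cosine similarity between neurons.

    `activations` has shape (n_inputs, n_neurons); each column is one neuron.
    """
    neurons = np.asarray(activations, dtype=float).T
    norms = np.linalg.norm(neurons, axis=1, keepdims=True)
    unit = neurons / np.maximum(norms, 1e-12)  # guard against all-zero neurons
    return unit @ unit.T

def svd_embed(similarity, rank):
    """Keep the top-`rank` singular directions of the similarity matrix."""
    u, s, _ = np.linalg.svd(similarity)
    return u[:, :rank] * s[:rank]

def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[nearest.argmax()])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def cluster_neurons(activations, k, rank):
    """Assumed pipeline: cosine similarity -> truncated SVD -> K-means."""
    similarity = cosine_similarity_matrix(activations)
    return kmeans(svd_embed(similarity, rank), k)

# Toy data (made up): neurons 0-2 respond to one input pattern, 3-5 to another.
pattern_a = np.array([1.0, 0.0, 1.0, 0.0])
pattern_b = np.array([0.0, 1.0, 0.0, 1.0])
activations = np.column_stack(
    [pattern_a, 2 * pattern_a, 0.5 * pattern_a,
     pattern_b, 3 * pattern_b, 0.2 * pattern_b]
)
labels = cluster_neurons(activations, k=2, rank=2)
```

On this toy input, neurons 0 to 2 receive one cluster label and neurons 3 to 5 the other, because cosine similarity ignores the differing activation scales within each group.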

Existing interpretability techniques such as SHAP and LIME often struggle with scalability and consistency in larger, more complex networks. In contrast, in experiments with CNNs and RNNs on the MNIST and CIFAR-10 datasets, CHITRA outperformed these methods on quantitative metrics (accuracy, precision, and recall) as well as on a human-centered interpretability score.

To quantitatively evaluate CHITRA’s performance in identifying neuron function, we established a ‘ground truth’ for a small, manually labeled subset of neuron clusters. This involved a detailed, manual analysis of the features each cluster detects. For each cluster, we systematically presented the model with specific inputs and recorded the cluster’s activation patterns. For example, if a cluster consistently showed high activation exclusively for images of the digit ‘5’, we manually labeled it as a ‘5-detector’. This process produced a set of ground-truth labels against which we could compare CHITRA’s predictions.
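The labeling rule described above ("high activation exclusively for one class") can be made concrete with a small sketch. The abstract describes a manual process, so the automated rule below is only an illustration under stated assumptions: the function name `label_cluster` and the `ratio` threshold are hypothetical, and the toy numbers are made up rather than taken from the experiments.

```python
from collections import defaultdict

def label_cluster(activations, classes, ratio=3.0):
    """Return '<class>-detector' if one class dominates, else None.

    activations: mean activation of the cluster for each input
    classes:     class label of each input (same length)
    ratio:       hypothetical threshold; the top class must have a mean
                 activation at least `ratio` times the runner-up's
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, cls in zip(activations, classes):
        sums[cls] += value
        counts[cls] += 1
    means = {cls: sums[cls] / counts[cls] for cls in sums}
    ranked = sorted(means, key=means.get, reverse=True)
    if len(ranked) < 2:
        return None  # cannot judge exclusivity with a single class
    top, runner_up = ranked[0], ranked[1]
    if means[top] > 0 and means[top] >= ratio * max(means[runner_up], 0.0):
        return f"{top}-detector"
    return None

# Toy example: the cluster fires strongly on 5s and weakly on everything else.
selective = label_cluster([0.9, 0.8, 0.1, 0.05, 0.0, 0.1], [5, 5, 3, 3, 7, 7])
# A cluster that responds similarly to two classes gets no label.
unselective = label_cluster([0.5, 0.6, 0.5, 0.4], [1, 1, 2, 2])
```

Here `selective` is `"5-detector"` and `unselective` is `None`. A ratio test is only one possible definition of "exclusively"; a different threshold or a statistical test would be equally defensible.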

Our analysis showed that CHITRA’s predictions aligned with these ground-truth labels, with a notable improvement in quantitative metrics over existing methods. The algorithm also performed well in qualitative evaluations, such as the human interpretability assessment. Given past incidents in which poorly trained models contributed to misdiagnoses, significant financial losses, and even fatalities, CHITRA gives researchers a tool for understanding their models more fully, which may help prevent such incidents in the future.
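For readers unfamiliar with the metrics mentioned, the sketch below shows one way predicted cluster labels could be scored against ground-truth labels. It is an assumption about the evaluation setup, not the author's procedure: the function name `score_label` is hypothetical, and the label lists are invented for illustration and are not results from the paper.

```python
def score_label(predicted, truth, label):
    """Precision and recall for one label, plus overall accuracy."""
    pairs = list(zip(predicted, truth))
    tp = sum(p == label and t == label for p, t in pairs)  # true positives
    fp = sum(p == label and t != label for p, t in pairs)  # false positives
    fn = sum(p != label and t == label for p, t in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = sum(p == t for p, t in pairs) / len(pairs)
    return precision, recall, accuracy

# Invented labels for five clusters (illustration only).
truth = ["5-detector", "5-detector", "3-detector", "edge", "edge"]
predicted = ["5-detector", "3-detector", "3-detector", "edge", "5-detector"]
precision, recall, accuracy = score_label(predicted, truth, "5-detector")
```

For these invented lists, precision and recall for "5-detector" are both 0.5 and overall accuracy is 0.6.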

References

Steven L. Brunton and J. Nathan Kutz, Singular Value Decomposition, in Data-Driven Science and Engineering, 2019.

Alter O., Brown P. O., Botstein D., Singular value decomposition for genome-wide expression data. Proceedings of the National Academy of Sciences, 97(18), 10101–10106, 2000.

J. MacQueen, Some Methods for Classification and Analysis of Multivariate Observations. University of California Press, 1967.

H. Steinhaus, Sur la division des corps materiels en parties. Bull. Acad. Polon. Sci., 1956.

S. P. Lloyd, Least Squares Quantization in PCM. Bell Telephone Laboratories Paper, 1957.

Johannes Blömer, Kathrin Lammersen, Melanie Schmidt, Christian Sohler, Theoretical Analysis of the k-Means Algorithm – A Survey. arXiv preprint arXiv:1602.08254, 2016.

Megha Suyal and Savita Sharma, A Review on Analysis of K-Means Clustering Machine Learning Algorithm Based on Unsupervised Learning. Journal of Artificial Intelligence and Systems,

Chunjie Luo, Jun Zhan, Li Wang, and Qiang Yang, Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks. arXiv preprint arXiv:1702.05870, 2017.

Netflix Research, Is Cosine-Similarity of Embeddings Really About Similarity? Netflix Research Blog, 2023. https://research.netflix.com/publication/is-cosine-similarity-of-embeddings-really-about-similarity

Scott R. Milford, Bernice S. Elger, and David M. Shaw, Believe me! Why Tesla’s recent alleged malfunction further highlights the need for transparent dialogue. Frontiers in Future Transportation, 2023. https://doi.org/10.3389/ffutr.2023.1137469

Axios, Tesla Autopilot verdict sends a chill across the industry. Axios, August 2025. https://www.axios.com/2025/08/06/tesla-autopilot-verdict-safety

Wall Street Journal, Inside the WSJ’s Investigation of Tesla’s Autopilot Crash Risks. Wall Street Journal, 2024. https://www.wsj.com/business/autos/tesla-autopilot-crash-investigation-997b0129
