Position: An Inner Interpretability Framework for AI Inspired by Lessons from Cognitive Neuroscience
Preprint in arXiv (June 2024)
The most recent citing publications are shown below; Dimensions lists 19 publications that cite this research output.
Preprint in arXiv (June 2024)
Article in European Journal of Neuroscience (April 2024)
Article in Bulletin of Mathematical Biology (August 2023)