Suchir Salhan is a PhD Candidate in the Department of Computer Science & Technology at the University of Cambridge (Gonville & Caius College) researching Small Language Models and Cognitively-Inspired AI. He previously completed a BA and MEng in Computer Science & Linguistics at Gonville & Caius College, obtaining a “starred First” (Class I with Distinction) and a Distinction respectively.
Biography
2024 - current PhD Candidate in Computer Science, University of Cambridge (Gonville & Caius College)
2020 - 2024 Computer Science & Linguistics Triposes, University of Cambridge (Gonville & Caius College)
My academic work spans Machine Learning and Cognitive Science, with a focus on Explainable and Interpretable Machine Learning and on fundamental questions about the human capacity for natural language.
Before my PhD, I completed the Linguistics and Computer Science Triposes at the University of Cambridge. During that time, I held a funded internship in the ALTA Institute with Prof Paula Buttery, Dr Andrew Caines, Dr Russell Moore and Dr Thiemo Wambsganss, worked as a Research Assistant on a code-switching project with Dr Li Nguyen, and was a research student with Prof Nigel Collier. My past experience includes work on Multimodal Vision-Language Models in the Language Technology Lab with Prof Nigel Collier and Fangyu Liu (now at Google DeepMind). I have probed the semantic representations of vision-language models such as CLIP, and explored Nearest Neighbour Algorithms for Offline Imitation Learning (IL). I have also researched Explainable AI, Argumentation Mining, and Shortcut Learning in Natural Language Inference. Within Linguistics, my interests include Typology (and typological applications in multilingual NLP), Syntactic Theory (especially Neo-Emergentism and Biolinguistics), and Morphological and Phonological Theory.
Research
My research is primarily concerned with engineering more cognitively plausible Foundation Models. This emerging research paradigm aims to enhance the cognitive capabilities of cutting-edge computational systems within a cognitively plausible environment. In my PhD, supervised by Professor Paula Buttery, I am working towards cognitively-inspired computational systems. This includes building general-purpose Small-Scale Language Models (SSLMs) that can outperform larger models across several NLP tasks, and designing techniques to adapt SSLMs to domain-specific applications.
Publications
Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies.
Suchir Salhan, Richard Diehl-Martinez, Zebulon Goriely, Paula Buttery
Presented at EMNLP 2024 (Miami, FL, USA, November 2024)
On the Potential for Maximising Minimal Means in Transformer Language Models: A Dynamical Systems Perspective.
Suchir Salhan
In Cambridge Occasional Papers in Linguistics, Department of Theoretical & Applied Linguistics, 2023
Human-Validated Grammar Profiles for Language Model Evaluation.
Presented at a colloquium organised with Prof Detmar Meurers (Tübingen, Germany, March 2025)
Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies.
Presented at the HumanCLAIM Workshop organised by Prof Lisa Beinborn (Göttingen, Germany, March 2025)
LLMs “off-the-shelf” or Pretrain-from-Scratch? Recalibrating Biases and Improving Transparency using Small-Scale Language Models.
Suchir Salhan, Richard Diehl-Martinez, Zebulon Goriely, Andrew Caines, Paula Buttery
Learning & Human Intelligence Group, Department of Computer Science & Technology, 2024
On the Potential for Maximising Minimal Means in Transformer Language Models: A Dynamical Systems Perspective. (BA Dissertation)
Suchir Salhan
SyntaxLab (organised by Dr Theresa Biberauer), University of Cambridge, March 2023
Teaching and Supervisions
Guest Lecturer and Teaching Assistant for L95 (ACS/Part III) Introduction to Natural Language Syntax and Parsing (Prof Paula Buttery, Dr Fermin Moscoso del Prado Martin).
Teaching Assistant for Machine Learning & Real World Data (Part IA, Computer Science Tripos)
Guest Lecturer for MPhil in Advanced Computer Science – delivered a lecture on Language Model Evaluation and Mechanistic Interpretability (Nov 2024).
Supervisions
Machine Learning and Bayesian Inference (Part II, Computer Science Tripos)
Formal Models of Language (Part IB, Computer Science Tripos)
Artificial Intelligence (Part IB, Computer Science Tripos)
Probability (Part IA, Computer Science Tripos)
College Supervisor for Linguistics Tripos (Gonville & Caius College) – Linguistic Theory (Part IIB, Linguistics Tripos), Part I Linguistics Tripos.
Other
Co-organised a Phonological Theory Discussion Group with Prof Bert Vaux (2022-23)
Supervisor for MPhil in Advanced Computer Science Dissertation on Small Language Models (Vision-Language Models) and Learning Dynamics (2024 - 25)
Other Professional Activities
Organiser of the Natural Language & Information Processing (NLIP) Seminars 2024 - 25.
Reviewer for the BabyLM Shared Task (at EMNLP 2024).
ACL 2025 Emergency Reviewer.