
Cambridge Language Sciences

Interdisciplinary Research Centre


I am a computational semanticist.  I pursue research as an Academic Fellow at the Department of Computer Science and Technology, teach as a College Lecturer at Gonville & Caius College, and promote interdisciplinary work as an Executive Director of Cambridge Language Sciences. I also support revitalisation of the Hokkien language, and enjoy ballroom and Latin dancing.

I first arrived in Cambridge in 2009, to study mathematics as an undergraduate at Trinity College, before switching to a master's degree in computer science, with a focus on computational linguistics.  I then spent one year studying at Saarland University and working at DFKI (the German Research Centre for Artificial Intelligence), before returning to Cambridge to pursue a PhD under the supervision of Ann Copestake, which I completed in 2018.

I was born in Singapore and grew up in London.  I speak English (native), German (fluent), French (fluent), Hokkien (advanced), Mandarin (intermediate), and bits and pieces of others, including Greek, Georgian, Swedish, Dutch, and Rhine-Franconian.


The main focus of my research is on semantics.  How do people understand language, and how do people learn to do that?  I approach this with a foot in two worlds: the logical world of formal semantics, and the data-driven world of distributional semantics.  The aim of formal semantics is to develop mathematical models of meaning, with a particular focus on semantic composition and logical inference.  The aim of distributional semantics is to develop computational models of meaning, using algorithms that can be run on a corpus of text.  Combining the two opens up new opportunities.  From the computational perspective, formal semantic structure enables a model to learn and generalise more effectively.  From the formal perspective, a computational model allows us to tackle research questions that would be impossible to handle with pen and paper.

For a gentle introduction to my work, see the following one-hour talk I gave at the ILFC Seminar jointly organised by Université Paris Cité and Université du Québec à Montréal: "Learning meaning in a logically structured model: An introduction to Functional Distributional Semantics"

For an overview of how I see the connection between machine learning and language, see this article on "Language and AI".

I have more general research interests beyond semantics, including: machine learning (how can models work with structure?), philosophy of language (what does it mean to know a language?), morphosyntax (what are the components of language?), and NLP for low-resource languages (how can we make sure NLP works for everyone?).

I am also keen to support researchers in the humanities and social sciences who would like to use machine learning to further their work.  If you would like to discuss any ideas (whether you're at an exploratory stage, or looking for technical feedback), please don't hesitate to get in touch!


Key publications: 

Guy Emerson. 2018. "Functional Distributional Semantics: Learning Linguistically Informed Representations from a Precisely Annotated Corpus". PhD Thesis, University of Cambridge.

Honourable Mention (top 3) for the 2019 E.W. Beth Dissertation Prize.
Highly Commended (top 3) for the 2019 CPHC/BCS Distinguished Dissertation Award.

Subsequent papers building on the thesis:

Guy Emerson. 2020. "Autoencoding Pixies: Amortised Variational Inference with Graph Convolutions for Functional Distributional Semantics". In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL). (This paper is summarised in this virtual poster, presented at the June 2020 Language Sciences Symposium.)

Guy Emerson. 2020. "What are the Goals of Distributional Semantics?". In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL).

Guy Emerson. 2020. "Linguists Who Use Probabilistic Models Love Them: Quantification in Functional Distributional Semantics". In Proceedings of the Probability and Meaning Conference (PaM2020). (This paper is summarised in this video.)

Yinhong Liu and Guy Emerson. 2022. "Learning Functional Distributional Semantics with Visual Data". In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL).

Guy Emerson. 2023. "Probabilistic Lexical Semantics: From Gaussian Embeddings to Bernoulli Fields". In Probabilistic Approaches to Linguistic Theory, chapter 3, CSLI Publications.

Chun Hei Lo, Hong Cheng, Wai Lam, and Guy Emerson. 2023. "Functional Distributional Semantics at Scale". In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM).

Xuyou Cheng, Michael Sejr Schlichtkrull, and Guy Emerson. 2023. "Are Embedded Potatoes Still Vegetables? On the Limitations of WordNet Embeddings for Lexical Semantics". In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP).

Kin Chun (Bruce) Cheung and Guy Emerson. 2024. "Colour Me Uncertain: Representing Vagueness with Probabilistic Semantics". In Proceedings of the Third Workshop on Understanding Implicit and Underspecified Language (UnImplicit).

Chun Hei Lo, Wai Lam, Hong Cheng, and Guy Emerson. 2024. "Distributional Inclusion Hypothesis and Quantifications: Probing for Hypernymy in Functional Distributional Semantics". To appear in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL).

Other publications: 

Wenxi Li, Yutong Zhang, Guy Emerson, and Weiwei Sun. 2024. "UG-schematic Annotation for Event Nominals: A Case Study in Mandarin Chinese". In Computational Linguistics.

Fangyu Liu, Guy Emerson, and Nigel Collier. 2023. "Visual Spatial Reasoning". In Transactions of the Association for Computational Linguistics (TACL).

Olga Zamaraeva, Chris Curtis, Guy Emerson, Antske Fokkens, Michael Wayne Goodman, Kristen Howell, T.J. Trimble, and Emily M. Bender. 2022. "20 years of the Grammar Matrix: cross-linguistic hypothesis testing of increasingly complex interactions". In Journal of Language Modelling, volume 10, number 1.

Ștefania Preda and Guy Emerson. 2022. "Using dependency parsing for few-shot learning in distributional semantics". In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop.

James Hargreaves, Andreas Vlachos, and Guy Emerson. 2021. "Incremental Beam Manipulation for Natural Language Generation". In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL).

Emily M. Bender and Guy Emerson. 2021. "Computational linguistics and grammar engineering". In Head-Driven Phrase Structure Grammar: The handbook, chapter 25. Language Science Press.

Jun-Yen Leung, Guy Emerson, and Ryan Cotterell. 2020. "Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model". In Proceedings of the 25th Conference on Empirical Methods in Natural Language Processing (EMNLP).

Olga Zamaraeva and Guy Emerson. 2020. "Multiple Question Fronting without Relational Constraints: An Analysis of Russian as a Basis for Cross-Linguistic Modeling". In Proceedings of the 27th International Conference on Head-driven Phrase Structure Grammar (HPSG).

Sebastian Borgeaud and Guy Emerson. 2020. "Leveraging sentence similarity in natural language generation: Improving beam search using range voting". In Proceedings of the 4th Workshop on Neural Generation and Translation (WNGT).

Jeroen Van Hautte, Guy Emerson, and Marek Rei. 2019. "Bad Form: Comparing Context-Based and Form-Based Few-Shot Learning in Distributional Semantic Models". In Proceedings of the 2nd Workshop on Deep Learning for Low-Resource NLP.

Paula Czarnowska, Guy Emerson, and Ann Copestake. 2019. "Words are Vectors, Dependencies are Matrices: Learning Word Embeddings from Dependency Graphs". In Proceedings of the 13th International Conference on Computational Semantics (IWCS).

Guy Emerson and Ann Copestake. 2017. "Semantic Composition via Probabilistic Model Theory". In Proceedings of the 12th International Conference on Computational Semantics (IWCS).

Guy Emerson and Ann Copestake. 2017. "Variational Inference for Logical Inference". In Proceedings of the 2017 Conference on Logic and Machine Learning for Natural Language (LaML).

Guy Emerson and Ann Copestake. 2016. "Functional Distributional Semantics". In Proceedings of the ACL 2016 Workshop on Representation Learning for NLP (RepL4NLP). (see the poster here)

Ann Copestake, Guy Emerson, Michael Wayne Goodman, Matic Horvat, Alexander Kuhnle, and Ewa Muszyńska. 2016. "Resources for Building Applications with Dependency Minimal Recursion Semantics". In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC).

Guy Emerson and Ann Copestake. 2015. "Lacking Integrity: HPSG as a Morphosyntactic Theory". In Proceedings of the 22nd International Conference on Head-Driven Phrase Structure Grammar (HPSG).

Guy Emerson and Ann Copestake. 2015. "Leveraging a Semantically Annotated Corpus to Disambiguate Prepositional Phrase Attachment". In Proceedings of the 11th International Conference on Computational Semantics (IWCS). (watch the presentation here)

Guy Emerson and Thierry Declerck. 2014. "SentiMerge: Combining Sentiment Lexicons in a Bayesian Framework". In Proceedings of the 2014 COLING Workshop on Lexical and Grammatical Resources for Language Processing.

Guy Emerson, Liling Tan, Susanne Fertmann, Alexis Palmer, and Michaela Regneri. 2014. "SeedLing: Building and using a seed corpus for the Human Language Project". In Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages.

Guy Emerson. 2013. "Using Distributional Semantics to Improve Parse Ranking". Master's Thesis, University of Cambridge.

Teaching and Supervisions


I have supervised or lectured for the following courses:

  • Computer Science, Part IA
    • Machine Learning and Real-World Data
    • Introduction to Probability
    • Foundations of Computer Science
    • Algorithms
  • Computer Science, Part IB
    • Formal Models of Language
    • Computation Theory
    • Complexity Theory
  • Computer Science, Part II
    • Natural Language Processing
    • Data Science: Principles and Practice
    • Information Theory
  • Computer Science, Part III / MPhil
    • Machine Learning for Language Processing
    • Introduction to Computational Semantics
  • Linguistics, Part II
    • Computational Linguistics
  • Mathematics, Part II
    • Automata and Formal Languages

Research supervision:

For current Cambridge students: I supervise MPhil / Part III dissertations and also Part II dissertations.  See here for previous project suggestions, but feel free to get in touch to discuss any ideas that broadly fit with my research interests.

For prospective PhD students: I am looking for students to work with me on topics connected to Functional Distributional Semantics (see "Publications" tab above).  I would also be interested in supervising topics where there is a clear linguistic research question, and computational modelling is important for answering that question (see "Research" tab for examples).  Feel free to get in touch to discuss your ideas!  To show that you've read this page, please include "hypothetical rhubarb" in the subject line of your email.

Other Professional Activities

Member of DELPH-IN.

Committee member for the Beth Dissertation Prize.

Senior area chair for ACL Rolling Review, and former co-organiser of SemEval.

I have appeared as a guest on these podcasts:

You can also follow me on Mastodon:

Executive Director, Cambridge Language Sciences
College Lecturer and Bye-Fellow, Gonville & Caius College
Departmental Early-Career Academic Fellow, Department of Computer Science and Technology
Dr Guy Edward Toh Emerson
