Cambridge Language Sciences

Interdisciplinary Research Centre
 
Mr Martin Moore

Technology in language teaching


Haeng-A Kim

Diagnostic Assessment in English for Academic Purposes

- Validity by Design / Impact by Design

 

Dynamic Cognitive Assessment for Reading Comprehension Skills Needs Analysis

- Validation through the think-aloud method


Dr Mark Brenchley

Corpora/corpus linguistics; Lexico-grammar; First language acquisition; Language testing; Linguistic theory; Second language acquisition; Syntax; Writing Development


Dr Kevin Yet Fong Cheung

Psychometrics, Automated assessment of writing and speaking, Computer adaptive testing, Cognitive processes in writing, Identity development in writing, Plagiarism


Dr Marek Rei

Machine learning; neural network models; sequence labeling tasks; automated assessment

Conference proceedings

2019

  • Rei, M. and Søgaard, A., 2019. Jointly Learning to Label Sentences and Tokens Thirty-Third AAAI Conference on Artificial Intelligence / Thirty-First Innovative Applications of Artificial Intelligence Conference / Ninth AAAI Symposium on Educational Advances in Artificial Intelligence,
  • 2018 (Accepted for publication)

  • Rei, M. and Søgaard, A., 2018 (Accepted for publication). Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens
    Doi: http://doi.org/10.17863/CAM.35110
  • 2018

  • Rei, M., Gerz, D. and Vulić, I., 2018. Scoring lexical entailment with a supervised directional similarity network ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 2
    Doi: http://doi.org/10.18653/v1/p18-2101
  • Stathopoulos, YA., Baker, S., Rei, M. and Teufel, S., 2018. Variable typing: Assigning meaning to variables in mathematical text NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1
  • Barrett, M., Bingel, J., Hollenstein, N., Rei, M. and Søgaard, A., 2018. Sequence classification with human attention CoNLL 2018 - 22nd Conference on Computational Natural Language Learning, Proceedings,
    Doi: http://doi.org/10.18653/v1/k18-1030
  • 2017 (Accepted for publication)

  • Rei, M., Bulat, LT., Kiela, D. and Shutova, E., 2017 (Accepted for publication). Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection
  • 2017

  • Rei, M., 2017. Detecting Off-topic Responses to Visual Prompts
  • Rei, M. and Giannakoudaki, E., 2017. Auxiliary Objectives for Neural Error Detection Models
  • Giannakoudaki, E., Rei, M., Andersen, OE. and Yuan, Z., 2017. Neural Sequence-Labelling Models for Grammatical Error Correction Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, v. D17-1
    Doi: http://doi.org/10.18653/v1/D17-1297
  • Rei, M., Felice, M., Yuan, Z. and Briscoe, T., 2017. Artificial Error Generation with Machine Translation and Syntactic Patterns Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications,
  • Farag, Y., Rei, M. and Briscoe, T., 2017. An Error-Oriented Approach to Word Embedding Pre-Training Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications,
  • Rei, M., 2017. Semi-supervised multitask learning for sequence labeling ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 1
    Doi: http://doi.org/10.18653/v1/P17-1194
  • 2016 (Accepted for publication)

  • Rei, M. and Cao, K., 2016 (Accepted for publication). A Joint Model for Word Embedding and Word Morphology
  • 2016

  • Alikaniotis, D., Yannakoudakis, H. and Rei, M., 2016. Automatic text scoring using neural networks 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, v. 2
    Doi: http://doi.org/10.18653/v1/p16-1068
  • Rei, M. and Yannakoudakis, H., 2016. Compositional sequence labeling models for error detection in learner writing 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, v. 2
  • Rei, M. and Cummins, R., 2016. Sentence Similarity Measures for Fine-Grained Estimation of Topical Relevance in Learner Essays https://aclweb.org/anthology/volumes/proceedings-of-the-11th-workshop-on-innovative-use-of-nlp-for-building-educational-applications/,
    Doi: http://doi.org/10.18653/v1/W16-05
  • Rei, M., Crichton, GKO. and Pyysalo, S., 2016. Attending to characters in neural sequence labeling models COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers,
  • 2015

  • Rei, M., 2015. Online representation learning in recurrent neural language models Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing,
  • 2014

  • Rei, M. and Briscoe, T., 2014. Parser lexicalisation through self-learning NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference,
  • Rei, M. and Briscoe, T., 2014. Looking for Hyponyms in Vector Space. CoNLL,
  • 2011

  • Rei, M. and Briscoe, T., 2011. Unsupervised Entailment Detection between Dependency Graph Fragments
    Doi: http://doi.org/10.17863/CAM.21358
  • 2010

  • Rei, M. and Briscoe, T., 2010. Combining manual rules and supervised learning for hedge cue and scope detection Proceedings of the Fourteenth Conference on Computational Natural Language Learning: Shared Task,
  • Journal articles

    2017

  • Farag, Y., Rei, M. and Briscoe, T., 2017. An Error-Oriented Approach to Word Embedding Pre-Training. CoRR, v. abs/1707.06841
  • Rei, M., Felice, M., Yuan, Z. and Briscoe, T., 2017. Artificial Error Generation with Machine Translation and Syntactic Patterns. CoRR, v. abs/1707.05236
  • 2011

  • Briscoe, T., Harrison, K., Naish, A., Parker, A., Rei, M., Siddharthan, A., Sinclair, D., Slater, M. and Watson, R., 2011. Intelligent Information Access from Scientific Papers Current Challenges in Patent Information Retrieval,
  • Theses / dissertations

    2013

  • Rei, M., 2013. Minimally supervised dependency-based methods for natural language processing

    Jocelyn Wyburd

    Learning environments; curriculum design; language learner autonomy; use of technology to support language teaching and learning


    Mariano Felice

    Grammatical error detection and correction in non-native English text


    Dr Ardeshir Geranpayeh

    Psychometrics; automated assessment of writing and speaking; Multi-level Testing, Computer Adaptive Testing (CAT), cheating detection; Learning Oriented Assessment (LOA)


    Dr Andrew Caines

    Second language learning; first language acquisition; speech; corpus linguistics; language evolution

    Journal articles

    2024

  • Ahrenberg, L., Ainiala, T., Aldrin, E., Holdt, ŠA., Caines, A., Dalianis, H., Dannélls, D., Dobnik, S., Grouin, C., Hämäläinen, L., Henriksson, A., Kokkinakis, D., Lassus, J., Tiedemann, TL., Lison, P., Lindén, K., Ljunglöf, P., Sánchez, RM., Nelson, B., Nordman, L., Pilán, I., Raheja, V., Scheffler, T., Torra, V., Vakili, T., Vydiswaran, VGV., Volodina, E. and Vu, XS., 2024. Introduction CALD-pseudo 2024 - Workshop on Computational Approaches to Language Data Pseudonymization, Proceedings of the Workshop,
  • Davis, C., Caines, A., Andersen, Ø., Taslimipoor, S., Yannakoudakis, H., Yuan, Z., Bryant, C., Rei, M. and Buttery, P., 2024. Prompting open-source and commercial language models for grammatical error correction of English learner text Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • 2023

  • Benedetto, L., Cremonesi, P., Caines, A., Buttery, P., Cappelli, A., Giussani, A. and Turrin, R., 2023. A Survey on Recent Approaches to Question Difficulty Estimation from Text ACM Computing Surveys, v. 55
    Doi: 10.1145/3556538
  • Zhou, L., Caines, A., Pete, I. and Hutchings, A., 2023. Automated hate speech detection and span extraction in underground hacking and extremist forums Natural Language Engineering, v. 29
    Doi: 10.1017/S1351324922000262
  • Goodman, JR., Caines, A. and Foley, RA., 2023. Shibboleth: An agent-based model of signalling mimicry. PLoS One, v. 18
    Doi: http://doi.org/10.1371/journal.pone.0289333
  • Goriely, Z., Caines, A. and Buttery, P., 2023. Word segmentation from transcriptions of child-directed speech using lexical and sub-lexical cues. J Child Lang,
    Doi: http://doi.org/10.1017/S0305000923000491
  • 2021

  • Katushemererwe, F., Caines, A. and Buttery, P., 2021. Building natural language processing tools for Runyakitara Applied Linguistics Review, v. 12
    Doi: http://doi.org/10.1515/applirev-2020-2004
  • 2019

  • Caines, A., Altmann-Richer, E. and Buttery, P., 2019. The cross-linguistic performance of word segmentation models over time. J Child Lang, v. 46
    Doi: http://doi.org/10.1017/S0305000919000485
  • 2018

  • Caines, A., Pastrana, S., Hutchings, A. and Buttery, PJ., 2018. Automatically identifying the function and intent of posts in underground forums Crime Science, v. 7
    Doi: http://doi.org/10.1186/s40163-018-0094-4
  • Conference proceedings

    2024

  • Velentzas, G., Caines, A., Borgo, R., Pacquetet, E., Hamilton, C., Arnold, T., Nicholls, D., Buttery, P., Gaillat, T., Yannakoudakis, H. and Ballier, N., 2024. Logging Keystrokes in Writing by English Learners 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings,
  • Chan, KWH., Bryant, C., Nguyen, L., Caines, A. and Yuan, Z., 2024. Grammatical Error Correction for Code-Switched Sentences by Learners of English 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings,
  • Moore, R., Caines, A. and Buttery, P., 2024. Recurrent Neural Collaborative Filtering for Knowledge Tracing Communications in Computer and Information Science, v. 2150 CCIS
    Doi: http://doi.org/10.1007/978-3-031-64315-6_36
  • 2023

  • Caines, A., Benedetto, L., Taslimipoor, S., Davis, C., Gao, Y., Andersen, Ø., Yuan, Z., Elliott, M., Moore, R., Bryant, C., Rei, M., Yannakoudakis, H., Mullooly, A., Nicholls, D. and Buttery, P., 2023. On the application of Large Language Models for language teaching and assessment technology CEUR Workshop Proceedings, v. 3487
  • Diehl Martinez, R., Goriely, Z., McGovern, H., Davis, C., Caines, A., Buttery, P. and Beinborn, L., 2023. CLIMB – Curriculum Learning for Infant-inspired Model Building CoNLL 2023 - BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, Proceedings,
  • 2022

  • Wambsganss, T., Caines, A. and Buttery, P., 2022. ALEN App: Persuasive Writing Support To Foster English Language Learning BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,
  • Tyen, G., Brenchley, M., Caines, A. and Buttery, P., 2022. Towards an open-domain chatbot for language practice BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,
    Doi: 10.18653/v1/2022.bea-1.28
  • Rietsche, R., Caines, A., Schramm, C., Pfütze, D. and Buttery, P., 2022. The Specificity and Helpfulness of Peer-to-Peer Feedback in Higher Education BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,
  • Pete, I., Hughes, J., Caines, A., Vu, AV., Gupta, H., Hutchings, A., Anderson, R. and Buttery, P., 2022. PostCog: A tool for interdisciplinary research into underground forums at scale Proceedings - 7th IEEE European Symposium on Security and Privacy Workshops, Euro S and PW 2022,
    Doi: http://doi.org/10.1109/EuroSPW55150.2022.00016
  • Davis, C., Bryant, C., Caines, A., Rei, M. and Buttery, P., 2022. Probing for targeted syntactic knowledge through grammatical error detection CoNLL 2022 - 26th Conference on Computational Natural Language Learning, Proceedings of the Conference,
  • Chua, H., Caines, A. and Yannakoudakis, H., 2022. A unified framework for cross-domain and cross-task learning of mental health conditions NLP4PI 2022 - 2nd Workshop on NLP for Positive Impact, Proceedings of the Workshop,
    Doi: 10.18653/v1/2022.nlp4pi-1.1
  • 2020

  • Hughes, J., Aycock, S., Caines, A., Buttery, P. and Hutchings, A., 2020. Detecting Trending Terms in Cybersecurity Forum Discussions Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020),
    Doi: 10.18653/v1/2020.wnut-1.15
  • Caines, A., Bentz, C., Knill, K., Rei, M. and Buttery, P., 2020. Grammatical error detection in transcriptions of spoken English COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference,
  • Caines, A. and Buttery, P., 2020. REPROLANG 2020: Automatic proficiency scoring of Czech, English, German, Italian, and Spanish learner essays LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings,
  • MacSween, R., Caines, A. and Buttery, P., 2020. An Expectation Maximisation Algorithm for Automated Cognate Detection CoNLL 2020 - 24th Conference on Computational Natural Language Learning, Proceedings of the Conference,
  • Zaidi, A., Caines, A., Moore, R., Buttery, P. and Rice, A., 2020. Adaptive Forgetting Curves for Spaced Repetition Language Learning Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 12164 LNAI
    Doi: http://doi.org/10.1007/978-3-030-52240-7_65
  • Craighead, H., Caines, A., Buttery, P. and Yannakoudakis, H., 2020. Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions Proceedings of the Annual Meeting of the Association for Computational Linguistics,
    Doi: 10.18653/v1/2020.acl-main.206
  • 2019

  • Baur, C., Caines, A., Chua, C., Gerlach, J., Qian, M., Rayner, M., Russell, M., Strik, H. and Wei, X., 2019. Overview of the 2019 Spoken CALL Shared Task 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19,
    Doi: http://doi.org/10.21437/SLaTE.2019-1
  • Knill, K., Gales, M., Manakul, P. and Caines, A., 2019. Automatic grammatical error detection of non-native spoken learner English ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
    Doi: 10.1109/icassp.2019.8683755
  • Aglionby, G., Davis, C., Mishra, P., Caines, A., Yannakoudakis, H., Rei, M., Shutova, E. and Buttery, P., 2019. CAMsterdam at SemEval-2019 task 6: Neural and graph-based feature extraction for the identification of offensive tweets NAACL HLT 2019 - International Workshop on Semantic Evaluation, SemEval 2019, Proceedings of the 13th Workshop,
  • Moore, R., Caines, A., Rice, A. and Buttery, P., 2019. Behavioural cloning of teachers for automatic homework selection Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 11625 LNAI
    Doi: http://doi.org/10.1007/978-3-030-23204-7_28
  • Knill, KM., Gales, MJF., Manakul, PP. and Caines, AP., 2019. Automatic Grammatical Error Detection of Non-native Spoken Learner English ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, v. 2019-May
    Doi: 10.1109/ICASSP.2019.8683080
  • Moore, R., Caines, A., Elliott, M., Zaidi, A., Rice, A. and Buttery, P., 2019. Skills embeddings: A neural approach to multicomponent representations of students and tasks EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining,
  • Zaidi, AH., Caines, A., Davis, C., Moore, R., Buttery, P. and Rice, A., 2019. Accurate modelling of language learning tasks and students using representations of grammatical proficiency EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining,
  • 2018

  • Caines, A., Pastrana, S., Hutchings, A. and Buttery, P., 2018. Aggressive language in an online hacking forum 2nd Workshop on Abusive Language Online - Proceedings of the Workshop, co-located with EMNLP 2018,
  • Pastrana, S., Hutchings, A., Caines, A. and Buttery, P., 2018. Characterizing Eve: Analysing cybercrime actors in a large underground forum Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 11050 LNCS
    Doi: http://doi.org/10.1007/978-3-030-00470-5_10
  • Knill, KM., Gales, MJF., Kyriakopoulos, K., Malinin, A., Ragni, A., Wang, Y. and Caines, AP., 2018. Impact of ASR performance on free speaking language assessment Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2018-September
    Doi: http://doi.org/10.21437/Interspeech.2018-1312
  • Baur, C., Caines, A., Chua, C., Gerlach, J., Qian, M., Rayner, M., Russell, M., Strik, H. and Wei, X., 2018. Overview of the 2018 Spoken CALL Shared Task Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2018-September
    Doi: http://doi.org/10.21437/Interspeech.2018-97
  • 2017

  • Caines, A., 2017. Spoken CALL Shared Task system description 7th ISCA Workshop on Speech and Language Technology in Education, SLaTE 2017,
    Doi: http://doi.org/10.21437/SLaTE.2017-14
  • Flint, E., Ford, E., Thomas, O., Caines, A. and Buttery, P., 2017. A Text Normalisation System for Non-Standard English Words 3rd Workshop on Noisy User-Generated Text, W-NUT 2017 - Proceedings of the Workshop,
  • Caines, A., Flint, E. and Buttery, P., 2017. Collecting fluency corrections for spoken learner English EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop,
  • Caines, A., McCarthy, M. and Buttery, P., 2017. Parsing transcripts of speech EMNLP 2017 - 1st Workshop on Speech-Centric Natural Language Processing, SCNLP 2017 - Proceedings of the Workshop,
  • 2016

  • Moore, R., Caines, A., Graham, C. and Buttery, P., 2016. Automated speech-unit delimitation in spoken learner English COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers,
  • Caines, A., Bentz, C., Graham, C., Polzehl, T. and Buttery, P., 2016. Crowdsourcing a multilingual speech corpus: Recording, transcription and annotation of the CROWDED corpus Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016,
  • Zhang, W., Caines, A., Alikaniotis, D. and Buttery, P., 2016. Predicting author age from Weibo microblog posts Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016,
  • Caines, A., Bentz, C., Alikaniotis, D., Katushemererwe, F. and Buttery, P., 2016. The Glottolog Data Explorer: Mapping the world’s languages Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16). Workshop proceedings,
  • 2015

  • Moore, R., Caines, A., Graham, C. and Buttery, P., 2015. Incremental dependency parsing and disfluency detection in spoken learner English Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 9302
    Doi: http://doi.org/10.1007/978-3-319-24033-6_53
  • 2014

  • Caines, A. and Buttery, P., 2014. The effect of disfluencies and learner errors on the parsing of spoken learner language Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages (SPMRL-SANCL 2014), Co-located with COLING,
  • 2012

  • Buttery, P. and Caines, A., 2012. Reclassifying subcategorization frames for experimental analysis and stimulus generation Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012,
  • Caines, A. and Buttery, P., 2012. Annotating progressive aspect constructions in the spoken section of the British National Corpus Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012,
  • 2010

  • Caines, A. and Buttery, P., 2010. ‘You talking to me?’ A predictive model for zero auxiliary constructions Proceedings of the Workshop on Natural Language Processing and Linguistics, Finding the Common Ground, Annual Meeting of the Association for Computational Linguistics,
  • Book chapters

    2024

  • Benedetto, L., Taslimipoor, S., Caines, A., Galvan-Sosa, D., Dueñas, G., Loukina, A. and Zesch, T., 2024. Workshop on Automatic Evaluation of Learning and Assessment Content
    Doi: http://doi.org/10.1007/978-3-031-64312-5_60
  • 2018

  • Caines, A., McCarthy, M. and Buttery, P., 2018. 'You still talking to me?': The zero auxiliary progressive in spoken British English twenty years on
  • 2017

  • Caines, A. and Buttery, P., 2017. The Effect of Task and Topic on Opportunity of Use in Learner Corpora
  • 2016

  • Caines, A., McCarthy, M. and O’Keeffe, A., 2016. Spoken language corpora and pedagogical applications
    Doi: http://doi.org/10.4324/9781315657899-39
  • 2012

  • Caines, A. and Buttery, P., 2012. Normalising frequency counts to account for ‘opportunity of use’ in learner corpora
  • Caines, A., 2012. ‘You talking to me?’ Testing corpus data with a shadowing experiment
  • Reports

    2024

  • Nicholls, D., Caines, A. and Buttery, P., 2024. The Write & Improve Corpus 2024: Error-annotated and CEFR-labelled essays by learners of English
    Doi: http://doi.org/10.17863/CAM.112997
  • 2017

  • Caines, AP., Nicholls, D. and Buttery, P., 2017. Annotating errors and disfluencies in transcriptions of speech

    Min Du

    Virtual learning; online learning; course design; EFL assessment; second language education


    What we do

    Cambridge Language Sciences is an Interdisciplinary Research Centre at the University of Cambridge. Our virtual network connects researchers from five schools across the university as well as other world-leading research institutions. Our aim is to strengthen research collaborations and knowledge transfer across disciplines in order to address large-scale multi-disciplinary research challenges relating to language research.