Language testing and assessment

Mr Martin Moore

Technology in language teaching

Haeng-A Kim

Diagnostic Assessment in English for Academic Purposes

- Validity by Design / Impact by Design

Dynamic Cognitive Assessment for Reading Comprehension Skills Needs Analysis

- Validation through the think-aloud method

Dr Mark Brenchley

Corpora/corpus linguistics; Lexico-grammar; First language acquisition; Language testing; Linguistic theory; Second language acquisition; Syntax; Writing Development

Dr Kevin Yet Fong Cheung

Psychometrics, Automated assessment of writing and speaking, Computer adaptive testing, Cognitive processes in writing, Identity development in writing, Plagiarism

Dr Marek Rei

Machine learning; neural network models; sequence labeling tasks; automated assessment

Conference proceedings

2019

Rei, M. and Søgaard, A., 2019. Jointly Learning to Label Sentences and Tokens Thirty-Third AAAI Conference on Artificial Intelligence / Thirty-First Innovative Applications of Artificial Intelligence Conference / Ninth AAAI Symposium on Educational Advances in Artificial Intelligence,

2018 (Accepted for publication)

Rei, M. and Søgaard, A., 2018 (Accepted for publication). Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens
Doi: http://doi.org/10.17863/CAM.35110

2018

Rei, M., Gerz, D. and Vulić, I., 2018. Scoring lexical entailment with a supervised directional similarity network ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 2
Doi: http://doi.org/10.18653/v1/p18-2101

Stathopoulos, YA., Baker, S., Rei, M. and Teufel, S., 2018. Variable typing: Assigning meaning to variables in mathematical text NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1

Barrett, M., Bingel, J., Hollenstein, N., Rei, M. and Søgaard, A., 2018. Sequence classification with human attention CoNLL 2018 - 22nd Conference on Computational Natural Language Learning, Proceedings,
Doi: http://doi.org/10.18653/v1/k18-1030

2017 (Accepted for publication)

Rei, M., Bulat, LT., Kiela, D. and Shutova, E., 2017 (Accepted for publication). Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection

2017

Rei, M., 2017. Detecting Off-topic Responses to Visual Prompts

Rei, M. and Giannakoudaki, E., 2017. Auxiliary Objectives for Neural Error Detection Models

Giannakoudaki, E., Rei, M., Andersen, OE. and Yuan, Z., 2017. Neural Sequence-Labelling Models for Grammatical Error Correction Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, v. D17-1
Doi: http://doi.org/10.18653/v1/D17-1297

Rei, M., Felice, M., Yuan, Z. and Briscoe, T., 2017. Artificial Error Generation with Machine Translation and Syntactic Patterns Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications,

Farag, Y., Rei, M. and Briscoe, T., 2017. An Error-Oriented Approach to Word Embedding Pre-Training Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications,

Rei, M., 2017. Semi-supervised multitask learning for sequence labeling ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 1
Doi: http://doi.org/10.18653/v1/P17-1194

2016 (Accepted for publication)

Rei, M. and Cao, K., 2016 (Accepted for publication). A Joint Model for Word Embedding and Word Morphology

2016

Alikaniotis, D., Yannakoudakis, H. and Rei, M., 2016. Automatic text scoring using neural networks 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, v. 2
Doi: http://doi.org/10.18653/v1/p16-1068

Rei, M. and Yannakoudakis, H., 2016. Compositional sequence labeling models for error detection in learner writing 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, v. 2

Rei, M. and Cummins, R., 2016. Sentence Similarity Measures for Fine-Grained Estimation of Topical Relevance in Learner Essays https://aclweb.org/anthology/volumes/proceedings-of-the-11th-workshop-on-innovative-use-of-nlp-for-building-educational-applications/,
Doi: http://doi.org/10.18653/v1/W16-05

Rei, M., Crichton, GKO. and Pyysalo, S., 2016. Attending to characters in neural sequence labeling models COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers,

2015

Rei, M., 2015. Online representation learning in recurrent neural language models Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing,

2014

Rei, M. and Briscoe, T., 2014. Parser lexicalisation through self-learning NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference,

Rei, M. and Briscoe, T., 2014. Looking for Hyponyms in Vector Space. CoNLL,

2011

Rei, M. and Briscoe, T., 2011. Unsupervised Entailment Detection between Dependency Graph Fragments
Doi: http://doi.org/10.17863/CAM.21358

2010

Rei, M. and Briscoe, T., 2010. Combining manual rules and supervised learning for hedge cue and scope detection Proceedings of the Fourteenth Conference on Computational Natural Language Learning: Shared Task,

Journal articles

2017

Farag, Y., Rei, M. and Briscoe, T., 2017. An Error-Oriented Approach to Word Embedding Pre-Training. CoRR, v. abs/1707.06841

Rei, M., Felice, M., Yuan, Z. and Briscoe, T., 2017. Artificial Error Generation with Machine Translation and Syntactic Patterns. CoRR, v. abs/1707.05236

2011

Briscoe, T., Harrison, K., Naish, A., Parker, A., Rei, M., Siddharthan, A., Sinclair, D., Slater, M. and Watson, R., 2011. Intelligent Information Access from Scientific Papers Current Challenges in Patent Information Retrieval,

Theses / dissertations

2013

Rei, M., 2013. Minimally supervised dependency-based methods for natural language processing

Jocelyn Wyburd

Learning environments; curriculum design; language learner autonomy; use of technology to support language teaching and learning

Mariano Felice

Grammatical error detection and correction in non-native English text

Dr Ardeshir Geranpayeh

Psychometrics; automated assessment of writing and speaking; multi-level testing; Computer Adaptive Testing (CAT); cheating detection; Learning Oriented Assessment (LOA)

Dr Andrew Caines

Second language learning; first language acquisition; speech; corpus linguistics; language evolution

Journal articles

2024

Ahrenberg, L., Ainiala, T., Aldrin, E., Holdt, ŠA., Caines, A., Dalianis, H., Dannélls, D., Dobnik, S., Grouin, C., Hämäläinen, L., Henriksson, A., Kokkinakis, D., Lassus, J., Tiedemann, TL., Lison, P., Lindén, K., Ljunglöf, P., Sánchez, RM., Nelson, B., Nordman, L., Pilán, I., Raheja, V., Scheffler, T., Torra, V., Vakili, T., Vydiswaran, VGV., Volodina, E. and Vu, XS., 2024. Introduction CALD-pseudo 2024 - Workshop on Computational Approaches to Language Data Pseudonymization, Proceedings of the Workshop,

Davis, C., Caines, A., Andersen, Ø., Taslimipoor, S., Yannakoudakis, H., Yuan, Z., Bryant, C., Rei, M. and Buttery, P., 2024. Prompting open-source and commercial language models for grammatical error correction of English learner text Proceedings of the Annual Meeting of the Association for Computational Linguistics,

2023

Benedetto, L., Cremonesi, P., Caines, A., Buttery, P., Cappelli, A., Giussani, A. and Turrin, R., 2023. A Survey on Recent Approaches to Question Difficulty Estimation from Text ACM Computing Surveys, v. 55
Doi: http://doi.org/10.1145/3556538

Zhou, L., Caines, A., Pete, I. and Hutchings, A., 2023. Automated hate speech detection and span extraction in underground hacking and extremist forums Natural Language Engineering, v. 29
Doi: http://doi.org/10.1017/S1351324922000262

Goodman, JR., Caines, A. and Foley, RA., 2023. Shibboleth: An agent-based model of signalling mimicry. PLoS One, v. 18
Doi: http://doi.org/10.1371/journal.pone.0289333

Goriely, Z., Caines, A. and Buttery, P., 2023. Word segmentation from transcriptions of child-directed speech using lexical and sub-lexical cues. J Child Lang,
Doi: http://doi.org/10.1017/S0305000923000491

2021

Katushemererwe, F., Caines, A. and Buttery, P., 2021. Building natural language processing tools for Runyakitara Applied Linguistics Review, v. 12
Doi: http://doi.org/10.1515/applirev-2020-2004

2019

Caines, A., Altmann-Richer, E. and Buttery, P., 2019. The cross-linguistic performance of word segmentation models over time. J Child Lang, v. 46
Doi: http://doi.org/10.1017/S0305000919000485

2018

Caines, A., Pastrana, S., Hutchings, A. and Buttery, PJ., 2018. Automatically identifying the function and intent of posts in underground forums Crime Science, v. 7
Doi: http://doi.org/10.1186/s40163-018-0094-4

Conference proceedings

2024

Velentzas, G., Caines, A., Borgo, R., Pacquetet, E., Hamilton, C., Arnold, T., Nicholls, D., Buttery, P., Gaillat, T., Yannakoudakis, H. and Ballier, N., 2024. Logging Keystrokes in Writing by English Learners 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings,

Chan, KWH., Bryant, C., Nguyen, L., Caines, A. and Yuan, Z., 2024. Grammatical Error Correction for Code-Switched Sentences by Learners of English 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings,

Moore, R., Caines, A. and Buttery, P., 2024. Recurrent Neural Collaborative Filtering for Knowledge Tracing Communications in Computer and Information Science, v. 2150 CCIS
Doi: http://doi.org/10.1007/978-3-031-64315-6_36

2023

Caines, A., Benedetto, L., Taslimipoor, S., Davis, C., Gao, Y., Andersen, Ø., Yuan, Z., Elliott, M., Moore, R., Bryant, C., Rei, M., Yannakoudakis, H., Mullooly, A., Nicholls, D. and Buttery, P., 2023. On the application of Large Language Models for language teaching and assessment technology CEUR Workshop Proceedings, v. 3487

Diehl Martinez, R., Goriely, Z., McGovern, H., Davis, C., Caines, A., Buttery, P. and Beinborn, L., 2023. CLIMB – Curriculum Learning for Infant-inspired Model Building CoNLL 2023 - BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, Proceedings,

2022

Wambsganss, T., Caines, A. and Buttery, P., 2022. ALEN App: Persuasive Writing Support To Foster English Language Learning BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,

Tyen, G., Brenchley, M., Caines, A. and Buttery, P., 2022. Towards an open-domain chatbot for language practice BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,
Doi: http://doi.org/10.18653/v1/2022.bea-1.28

Rietsche, R., Caines, A., Schramm, C., Pfütze, D. and Buttery, P., 2022. The Specificity and Helpfulness of Peer-to-Peer Feedback in Higher Education BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,

Pete, I., Hughes, J., Caines, A., Vu, AV., Gupta, H., Hutchings, A., Anderson, R. and Buttery, P., 2022. PostCog: A tool for interdisciplinary research into underground forums at scale Proceedings - 7th IEEE European Symposium on Security and Privacy Workshops, Euro S and PW 2022,
Doi: http://doi.org/10.1109/EuroSPW55150.2022.00016

Davis, C., Bryant, C., Caines, A., Rei, M. and Buttery, P., 2022. Probing for targeted syntactic knowledge through grammatical error detection CoNLL 2022 - 26th Conference on Computational Natural Language Learning, Proceedings of the Conference,

Chua, H., Caines, A. and Yannakoudakis, H., 2022. A unified framework for cross-domain and cross-task learning of mental health conditions NLP4PI 2022 - 2nd Workshop on NLP for Positive Impact, Proceedings of the Workshop,
Doi: http://doi.org/10.18653/v1/2022.nlp4pi-1.1

2020

Hughes, J., Aycock, S., Caines, A., Buttery, P. and Hutchings, A., 2020. Detecting Trending Terms in Cybersecurity Forum Discussions Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020),
Doi: http://doi.org/10.18653/v1/2020.wnut-1.15

Caines, A., Bentz, C., Knill, K., Rei, M. and Buttery, P., 2020. Grammatical error detection in transcriptions of spoken English COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference,

Caines, A. and Buttery, P., 2020. REPROLANG 2020: Automatic proficiency scoring of Czech, English, German, Italian, and Spanish learner essays LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings,

MacSween, R., Caines, A. and Buttery, P., 2020. An Expectation Maximisation Algorithm for Automated Cognate Detection CoNLL 2020 - 24th Conference on Computational Natural Language Learning, Proceedings of the Conference,

Zaidi, A., Caines, A., Moore, R., Buttery, P. and Rice, A., 2020. Adaptive Forgetting Curves for Spaced Repetition Language Learning Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 12164 LNAI
Doi: http://doi.org/10.1007/978-3-030-52240-7_65

Craighead, H., Caines, A., Buttery, P. and Yannakoudakis, H., 2020. Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions Proceedings of the Annual Meeting of the Association for Computational Linguistics,
Doi: http://doi.org/10.18653/v1/2020.acl-main.206

2019

Baur, C., Caines, A., Chua, C., Gerlach, J., Qian, M., Rayner, M., Russell, M., Strik, H. and Wei, X., 2019. Overview of the 2019 Spoken CALL Shared Task 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19,
Doi: http://doi.org/10.21437/SLaTE.2019-1

Knill, K., Gales, M., Manakul, P. and Caines, A., 2019. Automatic grammatical error detection of non-native spoken learner English ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
Doi: http://doi.org/10.1109/icassp.2019.8683755

Aglionby, G., Davis, C., Mishra, P., Caines, A., Yannakoudakis, H., Rei, M., Shutova, E. and Buttery, P., 2019. CAMsterdam at SemEval-2019 task 6: Neural and graph-based feature extraction for the identification of offensive tweets NAACL HLT 2019 - International Workshop on Semantic Evaluation, SemEval 2019, Proceedings of the 13th Workshop,

Moore, R., Caines, A., Rice, A. and Buttery, P., 2019. Behavioural cloning of teachers for automatic homework selection Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 11625 LNAI
Doi: http://doi.org/10.1007/978-3-030-23204-7_28

Knill, KM., Gales, MJF., Manakul, PP. and Caines, AP., 2019. Automatic Grammatical Error Detection of Non-native Spoken Learner English ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, v. 2019-May
Doi: http://doi.org/10.1109/ICASSP.2019.8683080

Moore, R., Caines, A., Elliott, M., Zaidi, A., Rice, A. and Buttery, P., 2019. Skills embeddings: A neural approach to multicomponent representations of students and tasks EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining,

Zaidi, AH., Caines, A., Davis, C., Moore, R., Buttery, P. and Rice, A., 2019. Accurate modelling of language learning tasks and students using representations of grammatical proficiency EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining,

2018

Caines, A., Pastrana, S., Hutchings, A. and Buttery, P., 2018. Aggressive language in an online hacking forum 2nd Workshop on Abusive Language Online - Proceedings of the Workshop, co-located with EMNLP 2018,

Pastrana, S., Hutchings, A., Caines, A. and Buttery, P., 2018. Characterizing eve: Analysing cybercrime actors in a large underground forum Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 11050 LNCS
Doi: http://doi.org/10.1007/978-3-030-00470-5_10

Knill, KM., Gales, MJF., Kyriakopoulos, K., Malinin, A., Ragni, A., Wang, Y. and Caines, AP., 2018. Impact of ASR performance on free speaking language assessment Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2018-September
Doi: http://doi.org/10.21437/Interspeech.2018-1312

Baur, C., Caines, A., Chua, C., Gerlach, J., Qian, M., Rayner, M., Russell, M., Strik, H. and Wei, X., 2018. Overview of the 2018 Spoken CALL Shared Task Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2018-September
Doi: http://doi.org/10.21437/Interspeech.2018-97

2017

Caines, A., 2017. Spoken CALL Shared Task system description 7th ISCA Workshop on Speech and Language Technology in Education, SLaTE 2017,
Doi: http://doi.org/10.21437/SLaTE.2017-14

Flint, E., Ford, E., Thomas, O., Caines, A. and Buttery, P., 2017. A Text Normalisation System for Non-Standard English Words 3rd Workshop on Noisy User-Generated Text, W-NUT 2017 - Proceedings of the Workshop,

Caines, A., Flint, E. and Buttery, P., 2017. Collecting fluency corrections for spoken learner English EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop,

Caines, A., McCarthy, M. and Buttery, P., 2017. Parsing transcripts of speech EMNLP 2017 - 1st Workshop on Speech-Centric Natural Language Processing, SCNLP 2017 - Proceedings of the Workshop,

2016

Moore, R., Caines, A., Graham, C. and Buttery, P., 2016. Automated speech-unit delimitation in spoken learner English COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers,

Caines, A., Bentz, C., Graham, C., Polzehl, T. and Buttery, P., 2016. Crowdsourcing a multilingual speech corpus: Recording, transcription and annotation of the CROWDED corpus Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016,

Zhang, W., Caines, A., Alikaniotis, D. and Buttery, P., 2016. Predicting author age from Weibo microblog posts Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016,

Caines, A., Bentz, C., Alikaniotis, D., Katushemererwe, F. and Buttery, P., 2016. The Glottolog Data Explorer: Mapping the world’s languages Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16). Workshop proceedings,

2015

Moore, R., Caines, A., Graham, C. and Buttery, P., 2015. Incremental dependency parsing and disfluency detection in spoken learner English Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 9302
Doi: http://doi.org/10.1007/978-3-319-24033-6_53

2014

Caines, A. and Buttery, P., 2014. The effect of disfluencies and learner errors on the parsing of spoken learner language Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages (SPMRL-SANCL 2014), Co-located with COLING,

2012

Buttery, P. and Caines, A., 2012. Reclassifying subcategorization frames for experimental analysis and stimulus generation Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012,

Caines, A. and Buttery, P., 2012. Annotating progressive aspect constructions in the spoken section of the British National Corpus Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012,

2010

Caines, A. and Buttery, P., 2010. ‘You talking to me?’ A predictive model for zero auxiliary constructions Proceedings of the Workshop on Natural Language Processing and Linguistics, Finding the Common Ground, Annual Meeting of the Association for Computational Linguistics,

Book chapters

2024

Benedetto, L., Taslimipoor, S., Caines, A., Galvan-Sosa, D., Dueñas, G., Loukina, A. and Zesch, T., 2024. Workshop on Automatic Evaluation of Learning and Assessment Content
Doi: http://doi.org/10.1007/978-3-031-64312-5_60

2018

Caines, A., McCarthy, M. and Buttery, P., 2018. 'You still talking to me?': The zero auxiliary progressive in spoken British English twenty years on

2017

Caines, A. and Buttery, P., 2017. The Effect of Task and Topic on Opportunity of Use in Learner Corpora

2016

Caines, A., McCarthy, M. and O’Keeffe, A., 2016. Spoken language corpora and pedagogical applications
Doi: http://doi.org/10.4324/9781315657899-39

2012

Caines, A. and Buttery, P., 2012. Normalising frequency counts to account for ‘opportunity of use’ in learner corpora

Caines, A., 2012. ‘You talking to me?’ Testing corpus data with a shadowing experiment

Reports

2024

Nicholls, D., Caines, A. and Buttery, P., 2024. The Write & Improve Corpus 2024: Error-annotated and CEFR-labelled essays by learners of English
Doi: http://doi.org/10.17863/CAM.112997

2017

Caines, AP., Nicholls, D. and Buttery, P., 2017. Annotating errors and disfluencies in transcriptions of speech

Min Du

Virtual learning; online learning; course design; EFL assessment; second language education
