Cambridge Language Sciences

Interdisciplinary Research Centre
 

Research

My current projects relate to second language learning and form part of the ALTA Institute research programme. We are working to better understand learner proficiency levels in spoken English and provide individualised, automated teaching feedback.

I have also worked on educational technology for low-resource languages, issues of literacy, online hacking forums, and adaptive tutoring.

 

Publications (from Symplectic)

Journal articles

2024

  • Ahrenberg, L., Ainiala, T., Aldrin, E., Holdt, ŠA., Caines, A., Dalianis, H., Dannélls, D., Dobnik, S., Grouin, C., Hämäläinen, L., Henriksson, A., Kokkinakis, D., Lassus, J., Tiedemann, TL., Lison, P., Lindén, K., Ljunglöf, P., Sánchez, RM., Nelson, B., Nordman, L., Pilán, I., Raheja, V., Scheffler, T., Torra, V., Vakili, T., Vydiswaran, VGV., Volodina, E. and Vu, XS., 2024. Introduction CALD-pseudo 2024 - Workshop on Computational Approaches to Language Data Pseudonymization, Proceedings of the Workshop,
2023

  • Benedetto, L., Cremonesi, P., Caines, A., Buttery, P., Cappelli, A., Giussani, A. and Turrin, R., 2023. A Survey on Recent Approaches to Question Difficulty Estimation from Text ACM Computing Surveys, v. 55
    Doi: 10.1145/3556538
  • Zhou, L., Caines, A., Pete, I. and Hutchings, A., 2023. Automated hate speech detection and span extraction in underground hacking and extremist forums Natural Language Engineering, v. 29
    Doi: 10.1017/S1351324922000262
  • Goodman, JR., Caines, A. and Foley, RA., 2023. Shibboleth: An agent-based model of signalling mimicry. PLoS One, v. 18
    Doi: http://doi.org/10.1371/journal.pone.0289333
  • Goriely, Z., Caines, A. and Buttery, P., 2023. Word segmentation from transcriptions of child-directed speech using lexical and sub-lexical cues. J Child Lang,
    Doi: http://doi.org/10.1017/S0305000923000491
2021

  • Katushemererwe, F., Caines, A. and Buttery, P., 2021. Building natural language processing tools for Runyakitara Applied Linguistics Review, v. 12
    Doi: http://doi.org/10.1515/applirev-2020-2004
2019

  • Caines, A., Altmann-Richer, E. and Buttery, P., 2019. The cross-linguistic performance of word segmentation models over time. J Child Lang, v. 46
    Doi: http://doi.org/10.1017/S0305000919000485
2018

  • Caines, A., Pastrana, S., Hutchings, A. and Buttery, PJ., 2018. Automatically identifying the function and intent of posts in underground forums Crime Science, v. 7
    Doi: http://doi.org/10.1186/s40163-018-0094-4
Conference proceedings

2024

  • Velentzas, G., Caines, A., Borgo, R., Pacquetet, E., Hamilton, C., Arnold, T., Nicholls, D., Buttery, P., Gaillat, T., Yannakoudakis, H. and Ballier, N., 2024. Logging Keystrokes in Writing by English Learners 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings,
  • Chan, KWH., Bryant, C., Nguyen, L., Caines, A. and Yuan, Z., 2024. Grammatical Error Correction for Code-Switched Sentences by Learners of English 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings,
2023

  • Caines, A., Benedetto, L., Taslimipoor, S., Davis, C., Gao, Y., Andersen, Ø., Yuan, Z., Elliott, M., Moore, R., Bryant, C., Rei, M., Yannakoudakis, H., Mullooly, A., Nicholls, D. and Buttery, P., 2023. On the application of Large Language Models for language teaching and assessment technology CEUR Workshop Proceedings, v. 3487
  • Diehl Martinez, R., Goriely, Z., McGovern, H., Davis, C., Caines, A., Buttery, P. and Beinborn, L., 2023. CLIMB – Curriculum Learning for Infant-inspired Model Building CoNLL 2023 - BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, Proceedings,
2022

  • Wambsganss, T., Caines, A. and Buttery, P., 2022. ALEN App: Persuasive Writing Support To Foster English Language Learning BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,
  • Tyen, G., Brenchley, M., Caines, A. and Buttery, P., 2022. Towards an open-domain chatbot for language practice BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,
  • Rietsche, R., Caines, A., Schramm, C., Pfütze, D. and Buttery, P., 2022. The Specificity and Helpfulness of Peer-to-Peer Feedback in Higher Education BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,
  • Pete, I., Hughes, J., Caines, A., Vu, AV., Gupta, H., Hutchings, A., Anderson, R. and Buttery, P., 2022. PostCog: A tool for interdisciplinary research into underground forums at scale Proceedings - 7th IEEE European Symposium on Security and Privacy Workshops, Euro S and PW 2022,
    Doi: http://doi.org/10.1109/EuroSPW55150.2022.00016
  • Davis, C., Bryant, C., Caines, A., Rei, M. and Buttery, P., 2022. Probing for targeted syntactic knowledge through grammatical error detection CoNLL 2022 - 26th Conference on Computational Natural Language Learning, Proceedings of the Conference,
  • Chua, H., Caines, A. and Yannakoudakis, H., 2022. A unified framework for cross-domain and cross-task learning of mental health conditions NLP4PI 2022 - 2nd Workshop on NLP for Positive Impact, Proceedings of the Workshop,
2020

  • MacSween, R., Caines, A. and Buttery, P., 2020. An Expectation Maximisation Algorithm for Automated Cognate Detection CoNLL 2020 - 24th Conference on Computational Natural Language Learning, Proceedings of the Conference,
  • Zaidi, A., Caines, A., Moore, R., Buttery, P. and Rice, A., 2020. Adaptive Forgetting Curves for Spaced Repetition Language Learning Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 12164 LNAI
    Doi: http://doi.org/10.1007/978-3-030-52240-7_65
  • Craighead, H., Caines, A., Buttery, P. and Yannakoudakis, H., 2020. Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Hughes, J., Aycock, S., Caines, A., Buttery, P. and Hutchings, A., 2020. Detecting Trending Terms in Cybersecurity Forum Discussions Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020),
    Doi: 10.18653/v1/2020.wnut-1.15
  • Caines, A., Bentz, C., Knill, K., Rei, M. and Buttery, P., 2020. Grammatical error detection in transcriptions of spoken English COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference,
  • Caines, A. and Buttery, P., 2020. REPROLANG 2020: Automatic proficiency scoring of Czech, English, German, Italian, and Spanish learner essays LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings,
2019

  • Knill, K., Gales, M., Manakul, P. and Caines, A., 2019. Automatic grammatical error detection of non-native spoken learner English ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
    Doi: 10.1109/icassp.2019.8683755
  • Aglionby, G., Davis, C., Mishra, P., Caines, A., Yannakoudakis, H., Rei, M., Shutova, E. and Buttery, P., 2019. CAMsterdam at SemEval-2019 task 6: Neural and graph-based feature extraction for the identification of offensive tweets NAACL HLT 2019 - International Workshop on Semantic Evaluation, SemEval 2019, Proceedings of the 13th Workshop,
  • Moore, R., Caines, A., Rice, A. and Buttery, P., 2019. Behavioural cloning of teachers for automatic homework selection Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 11625 LNAI
    Doi: http://doi.org/10.1007/978-3-030-23204-7_28
  • Knill, KM., Gales, MJF., Manakul, PP. and Caines, AP., 2019. Automatic Grammatical Error Detection of Non-native Spoken Learner English ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, v. 2019-May
    Doi: 10.1109/ICASSP.2019.8683080
  • Moore, R., Caines, A., Elliott, M., Zaidi, A., Rice, A. and Buttery, P., 2019. Skills embeddings: A neural approach to multicomponent representations of students and tasks EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining,
  • Zaidi, AH., Caines, A., Davis, C., Moore, R., Buttery, P. and Rice, A., 2019. Accurate modelling of language learning tasks and students using representations of grammatical proficiency EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining,
  • Baur, C., Caines, A., Chua, C., Gerlach, J., Qian, M., Rayner, M., Russell, M., Strik, H. and Wei, X., 2019. Overview of the 2019 Spoken CALL Shared Task 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19,
    Doi: http://doi.org/10.21437/SLaTE.2019-1
2018

  • Pastrana, S., Hutchings, A., Caines, A. and Buttery, P., 2018. Characterizing Eve: Analysing cybercrime actors in a large underground forum Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 11050 LNCS
    Doi: http://doi.org/10.1007/978-3-030-00470-5_10
  • Knill, KM., Gales, MJF., Kyriakopoulos, K., Malinin, A., Ragni, A., Wang, Y. and Caines, AP., 2018. Impact of ASR performance on free speaking language assessment Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2018-September
    Doi: http://doi.org/10.21437/Interspeech.2018-1312
  • Baur, C., Caines, A., Chua, C., Gerlach, J., Qian, M., Rayner, M., Russell, M., Strik, H. and Wei, X., 2018. Overview of the 2018 Spoken CALL Shared Task Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2018-September
    Doi: http://doi.org/10.21437/Interspeech.2018-97
  • Caines, A., Pastrana, S., Hutchings, A. and Buttery, P., 2018. Aggressive language in an online hacking forum 2nd Workshop on Abusive Language Online - Proceedings of the Workshop, co-located with EMNLP 2018,
2017

  • Caines, A., 2017. Spoken CALL Shared Task system description 7th ISCA Workshop on Speech and Language Technology in Education, SLaTE 2017,
    Doi: http://doi.org/10.21437/SLaTE.2017-14
  • Flint, E., Ford, E., Thomas, O., Caines, A. and Buttery, P., 2017. A Text Normalisation System for Non-Standard English Words 3rd Workshop on Noisy User-Generated Text, W-NUT 2017 - Proceedings of the Workshop,
  • Caines, A., Flint, E. and Buttery, P., 2017. Collecting fluency corrections for spoken learner English EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop,
  • Caines, A., McCarthy, M. and Buttery, P., 2017. Parsing transcripts of speech EMNLP 2017 - 1st Workshop on Speech-Centric Natural Language Processing, SCNLP 2017 - Proceedings of the Workshop,
2016

  • Moore, R., Caines, A., Graham, C. and Buttery, P., 2016. Automated speech-unit delimitation in spoken learner English COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers,
  • Caines, A., Bentz, C., Graham, C., Polzehl, T. and Buttery, P., 2016. Crowdsourcing a multilingual speech corpus: Recording, transcription and annotation of the CROWDED corpus Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016,
  • Zhang, W., Caines, A., Alikaniotis, D. and Buttery, P., 2016. Predicting author age from Weibo microblog posts Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016,
  • Caines, A., Bentz, C., Alikaniotis, D., Katushemererwe, F. and Buttery, P., 2016. The Glottolog Data Explorer: Mapping the world’s languages Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16). Workshop proceedings,
2015

  • Moore, R., Caines, A., Graham, C. and Buttery, P., 2015. Incremental dependency parsing and disfluency detection in spoken learner English Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 9302
    Doi: http://doi.org/10.1007/978-3-319-24033-6_53
2014

  • Caines, A. and Buttery, P., 2014. The effect of disfluencies and learner errors on the parsing of spoken learner language Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages (SPMRL-SANCL 2014), Co-located with COLING,
2012

  • Buttery, P. and Caines, A., 2012. Reclassifying subcategorization frames for experimental analysis and stimulus generation Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012,
  • Caines, A. and Buttery, P., 2012. Annotating progressive aspect constructions in the spoken section of the British National Corpus Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012,
2010

  • Caines, A. and Buttery, P., 2010. ‘You talking to me?’ A predictive model for zero auxiliary constructions Proceedings of the Workshop on Natural Language Processing and Linguistics, Finding the Common Ground, Annual Meeting of the Association for Computational Linguistics,
Book chapters

2024

  • Benedetto, L., Taslimipoor, S., Caines, A., Galvan-Sosa, D., Dueñas, G., Loukina, A. and Zesch, T., 2024. Workshop on Automatic Evaluation of Learning and Assessment Content
    Doi: http://doi.org/10.1007/978-3-031-64312-5_60
2018

  • Caines, A., McCarthy, M. and Buttery, P., 2018. 'You still talking to me?': The zero auxiliary progressive in spoken British English twenty years on
2017

  • Caines, A. and Buttery, P., 2017. The Effect of Task and Topic on Opportunity of Use in Learner Corpora
2016

  • Caines, A., McCarthy, M. and O’Keeffe, A., 2016. Spoken language corpora and pedagogical applications
    Doi: http://doi.org/10.4324/9781315657899-39
  • Caines, A., McCarthy, M. and O'Keeffe, A., 2016. Spoken language corpora and pedagogical applications
    Doi: http://doi.org/10.4324/9781315657899
2012

  • Caines, A. and Buttery, P., 2012. Normalising frequency counts to account for ‘opportunity of use’ in learner corpora
  • Caines, A., 2012. ‘You talking to me?’ Testing corpus data with a shadowing experiment
Reports

2017

  • Caines, AP., Nicholls, D. and Buttery, P., 2017. Annotating errors and disfluencies in transcriptions of speech
Senior Research Associate, CST & ALTA Institute
Dr Andrew Caines
