Cambridge Language Sciences

Interdisciplinary Research Centre
 
Songbo Hu

Dialogue Systems, Conversational AI


Dr Saussan Khalil

Arabic language, linguistics, sociolinguistics, computational linguistics, lexicography, teaching, second language acquisition, natural language processing, community languages


Chris Bryant

Grammatical error detection and correction, CALL, NLP

Theses / dissertations

2019

  • Bryant, CJ., 2019. Automatic annotation of error types for grammatical error correction
    Doi: http://doi.org/10.17863/CAM.40832
    Conference proceedings

    2017

  • Bryant, CJ., Felice, M. and Briscoe, E., 2017. Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, v. 1
    Journal articles

    2016

  • Felice, M., Bryant, C. and Briscoe, T., 2016. Automatic extraction of learner errors in ESL sentences using linguistically enhanced alignments COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers,

    Li Nguyen

    Code-switching; language contact; theoretical linguistics

    Journal articles

    2024

  • Nguyen, L., Mayeux, O. and Yuan, Z., 2024. Code-switching input for machine translation: a case study of Vietnamese–English data International Journal of Multilingualism, v. 21
    Doi: 10.1080/14790718.2023.2224013
  • Li, KK., Nguyen, L., Bryant, C. and Yoo, K., 2024. Lexical tonal effects in code-switching: A comparative study of Cantonese, Mandarin, and Vietnamese switching with English International Journal of Bilingualism, v. 28
    Doi: 10.1177/13670069231181508
    2022

  • Nguyen, L., Yuan, Z. and Seed, G., 2022. Building Educational Technologies for Code-Switching: Current Practices, Difficulties and Future Directions Languages, v. 7
    Doi: http://doi.org/10.3390/languages7030220
    2021

  • Nguyen, L., Bryant, C., Kidwai, S. and Biberauer, T., 2021. Automatic Language Identification in Code-Switched Hindi-English Social Media Text Journal of Open Humanities Data, v. 7
    Doi: 10.5334/johd.44
    2020

  • Nguyen, L. and Bryant, C., 2020. CanVEC - the Canberra Vietnamese-English Code-switching Natural Speech Corpus Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020),
    2019

  • Nguyen, L., 2019. Review: Torres Cacoullos and Travis. 2018. Bilingualism in the Community: Code-switching and Grammars in Contact. Cambridge: Cambridge University Press Corpora, v. 14
    Doi: http://doi.org/10.3366/cor.2019.0171
  • Nguyen, L., 2019. Borrowing or Code-switching? Traces of community norms in Vietnamese-English speech (vol 38, pg 443, 2018) Australian Journal of Linguistics, v. 39
    Doi: http://doi.org/10.1080/07268602.2019.1567451
    2018

  • Nguyen, L., 2018. Borrowing or Code-switching? Traces of community norms in Vietnamese-English speech Australian Journal of Linguistics, v. 38
    Doi: http://doi.org/10.1080/07268602.2018.1510727
    2016

  • Nguyen, L. and McCallum, K., 2016. Drowning in our own home: a metaphor-led discourse analysis of Australian news media reporting on maritime asylum seekers Communication Research and Practice, v. 2
    Doi: 10.1080/22041451.2016.1188229
    Theses / dissertations

    2021 (No publication date)

  • Nguyen, L., 2021 (No publication date). Cross-generational linguistic variation in the Canberra Vietnamese heritage language community: A corpus-centred investigation

    Olesya Razuvayevskaya

    Natural Language Processing; machine learning; argument mining

    Theses / dissertations

    2022 (No publication date)

  • Razuvayevskaya, O., 2022 (No publication date). Towards automatic interpretation of A Fortiori arguments
    Doi: http://doi.org/10.17863/CAM.86256
    Conference proceedings

    2017

  • Razuvayevskaya, O. and Teufel, SH., 2017. Recognising enthymemes in real-world texts: A feasibility study
    Doi: http://doi.org/10.17863/CAM.12376
    Journal articles

    2017

  • Razuvayevskaya, O. and Teufel, S., 2017. Finding enthymemes in real-world texts: A feasibility study Argument & Computation, v. 8
    Doi: http://doi.org/10.3233/AAC-170020

    Dr Marek Rei

    Machine learning; neural network models; sequence labeling tasks; automated assessment

    Conference proceedings

    2019

  • Rei, M. and Søgaard, A., 2019. Jointly Learning to Label Sentences and Tokens Thirty-Third AAAI Conference on Artificial Intelligence / Thirty-First Innovative Applications of Artificial Intelligence Conference / Ninth AAAI Symposium on Educational Advances in Artificial Intelligence,
    2018 (Accepted for publication)

  • Rei, M. and Søgaard, A., 2018 (Accepted for publication). Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens
    Doi: http://doi.org/10.17863/CAM.35110
    2018

  • Rei, M., Gerz, D. and Vulić, I., 2018. Scoring lexical entailment with a supervised directional similarity network ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 2
    Doi: http://doi.org/10.18653/v1/p18-2101
  • Stathopoulos, YA., Baker, S., Rei, M. and Teufel, S., 2018. Variable typing: Assigning meaning to variables in mathematical text NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1
  • Barrett, M., Bingel, J., Hollenstein, N., Rei, M. and Søgaard, A., 2018. Sequence classification with human attention CoNLL 2018 - 22nd Conference on Computational Natural Language Learning, Proceedings,
    Doi: http://doi.org/10.18653/v1/k18-1030
    2017 (Accepted for publication)

  • Rei, M., Bulat, LT., Kiela, D. and Shutova, E., 2017 (Accepted for publication). Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection
    2017

  • Rei, M., Felice, M., Yuan, Z. and Briscoe, T., 2017. Artificial Error Generation with Machine Translation and Syntactic Patterns Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications,
  • Farag, Y., Rei, M. and Briscoe, T., 2017. An Error-Oriented Approach to Word Embedding Pre-Training Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications,
  • Rei, M., 2017. Semi-supervised multitask learning for sequence labeling ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 1
    Doi: http://doi.org/10.18653/v1/P17-1194
  • Rei, M., 2017. Detecting Off-topic Responses to Visual Prompts
  • Rei, M. and Yannakoudakis, H., 2017. Auxiliary Objectives for Neural Error Detection Models
  • Yannakoudakis, H., Rei, M., Andersen, OE. and Yuan, Z., 2017. Neural Sequence-Labelling Models for Grammatical Error Correction Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, v. D17-1
    Doi: http://doi.org/10.18653/v1/D17-1297
    2016 (Accepted for publication)

  • Rei, M. and Cao, K., 2016 (Accepted for publication). A Joint Model for Word Embedding and Word Morphology
    2016

  • Alikaniotis, D., Yannakoudakis, H. and Rei, M., 2016. Automatic text scoring using neural networks 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, v. 2
    Doi: http://doi.org/10.18653/v1/p16-1068
  • Rei, M. and Yannakoudakis, H., 2016. Compositional sequence labeling models for error detection in learner writing 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, v. 2
  • Rei, M. and Cummins, R., 2016. Sentence Similarity Measures for Fine-Grained Estimation of Topical Relevance in Learner Essays https://aclweb.org/anthology/volumes/proceedings-of-the-11th-workshop-on-innovative-use-of-nlp-for-building-educational-applications/,
    Doi: http://doi.org/10.18653/v1/W16-05
  • Rei, M., Crichton, GKO. and Pyysalo, S., 2016. Attending to characters in neural sequence labeling models COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers,
    2015

  • Rei, M., 2015. Online representation learning in recurrent neural language models Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing,
    2014

  • Rei, M. and Briscoe, T., 2014. Parser lexicalisation through self-learning NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference,
  • Rei, M. and Briscoe, T., 2014. Looking for Hyponyms in Vector Space. CoNLL,
    2011

  • Rei, M. and Briscoe, T., 2011. Unsupervised Entailment Detection between Dependency Graph Fragments
    Doi: http://doi.org/10.17863/CAM.21358
    2010

  • Rei, M. and Briscoe, T., 2010. Combining manual rules and supervised learning for hedge cue and scope detection Proceedings of the Fourteenth Conference on Computational Natural Language Learning: Shared Task,
    Journal articles

    2017

  • Farag, Y., Rei, M. and Briscoe, T., 2017. An Error-Oriented Approach to Word Embedding Pre-Training. CoRR, v. abs/1707.06841
  • Rei, M., Felice, M., Yuan, Z. and Briscoe, T., 2017. Artificial Error Generation with Machine Translation and Syntactic Patterns. CoRR, v. abs/1707.05236
    2011

  • Briscoe, T., Harrison, K., Naish, A., Parker, A., Rei, M., Siddharthan, A., Sinclair, D., Slater, M. and Watson, R., 2011. Intelligent Information Access from Scientific Papers Current Challenges in Patent Information Retrieval,
    Theses / dissertations

    2013

  • Rei, M., 2013. Minimally supervised dependency-based methods for natural language processing

    Mariano Felice

    Grammatical error detection and correction in non-native English text


    Professor Nigel Collier

    Computational linguistics; machine learning; semantics; text/data mining; knowledge discovery; domain adaptation; question answering

    Conference proceedings

    2024

  • Liu, Y., Fang, Y., Vandyke, D. and Collier, N., 2024. TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Fu, Z., Zhang, M., Meng, Z., Shen, Y., Buckeridge, D. and Collier, N., 2024. BAND: Biomedical Alert News Dataset Proceedings of the AAAI Conference on Artificial Intelligence, v. 38
    Doi: http://doi.org/10.1609/aaai.v38i16.29757
  • Liu, Y., Su, Y., Shareghi, E. and Collier, N., 2024. Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024, v. 2
  • Hu, T. and Collier, N., 2024. Quantifying the Persona Effect in LLM Simulations Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Han, J., Collier, N., Buntine, W. and Shareghi, E., 2024. PiVe: Prompting with Iterative Verification Improving Graph-based Generative Capability of LLMs Proceedings of the Annual Meeting of the Association for Computational Linguistics,
    Doi: 10.18653/v1/2024.findings-acl.400
    2023

  • Vulić, I., Glavaš, G., Liu, F., Collier, N., Ponti, EM. and Korhonen, A., 2023. Probing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference,
  • Fu, Z., Yang, H., So, AMC., Lam, W., Bing, L. and Collier, N., 2023. On the Effectiveness of Parameter-Efficient Fine-Tuning Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023, v. 37
    Doi: 10.1609/aaai.v37i11.26505
  • Liu, F., Piccinno, F., Krichene, S., Pang, C., Lee, K., Joshi, M., Altun, Y., Collier, N. and Eisenschlos, JM., 2023. MATCHA: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Zhang, M., Su, Y., Meng, Z., Fu, Z. and Collier, N., 2023. COFFEE: A Contrastive Oracle-Free Framework for Event Extraction Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Liu, F., Eisenschlos, JM., Piccinno, F., Krichene, S., Pang, C., Lee, K., Joshi, M., Chen, W., Collier, N. and Altun, Y., 2023. DEPLOT: One-shot visual language reasoning by plot-to-table translation Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Shu, C., Han, J., Liu, F., Shareghi, E. and Collier, N., 2023. POSQA: Probe the World Models of LLMs with Size Comparisons Findings of the Association for Computational Linguistics: EMNLP 2023,
  • Fu, Z., Su, Y., Meng, Z. and Collier, N., 2023. Biomedical Named Entity Recognition via Dictionary-based Synonym Generalization EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Li, H., Lan, T., Fu, Z., Cai, D., Liu, L., Collier, N., Watanabe, T. and Su, Y., 2023. Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective Advances in Neural Information Processing Systems, v. 36
    2022

  • Okhmatovskaia, A., Shen, Y., Ganser, I., Collier, N., King, NB., Meng, Z. and Buckeridge, DL., 2022. A Conceptual Framework for Representing Events Under Public Health Surveillance. Stud Health Technol Inform, v. 294
    Doi: http://doi.org/10.3233/SHTI220480
  • Su, Y., Liu, F., Meng, Z., Lan, T., Shu, L., Shareghi, E. and Collier, N., 2022. TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning Findings of the Association for Computational Linguistics: NAACL 2022 - Findings,
  • Conforti, C., Berndt, J., Pilehvar, MT., Giannitsarou, C., Toxvaerd, F. and Collier, N., 2022. Incorporating Stock Market Signals for Twitter Stance Detection Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Meng, Z., Liu, F., Shareghi, E., Su, Y., Collins, C. and Collier, N., 2022. Rewire-then-Probe: A Contrastive Recipe for Probing Biomedical Knowledge of Pre-trained Language Models Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Li, Y., Liu, F., Collier, N., Korhonen, A. and Vulić, I., 2022. Improving Word Translation via Two-Stage Contrastive Learning Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Zhou, W., Liu, F., Vulić, I., Collier, N. and Chen, M., 2022. Prix-LM: Pretraining for Multilingual Knowledge Base Construction Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Liu, Y., Su, Y., Shareghi, E. and Collier, N., 2022. Plug-and-Play Recipe Generation with Content Planning GEM 2022 - 2nd Workshop on Natural Language Generation, Evaluation, and Metrics, Proceedings of the Workshop,
  • Su, Y., Lan, T., Wang, Y., Yogatama, D., Kong, L. and Collier, N., 2022. A Contrastive Framework for Neural Text Generation Advances in Neural Information Processing Systems, v. 35
    2021

  • Su, Y., Cai, D., Wang, Y., Vandyke, D., Baker, S., Li, P. and Collier, N., 2021. Non-Autoregressive Text Generation with Pre-trained Language Models Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics,
  • Meng, Z., Liu, F., Clark, T., Shareghi, E. and Collier, N., 2021. Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing,
    Doi: 10.18653/v1/2021.emnlp-main.383
  • Liu, F., Vulić, I., Korhonen, A. and Collier, N., 2021. Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021),
  • Clark, TH., Conforti, C., Liu, F., Meng, Z., Shareghi, E. and Collier, N., 2021. Integrating Transformers and Knowledge Graphs for Twitter Stance Detection W-NUT 2021 - 7th Workshop on Noisy User-Generated Text, Proceedings of the Conference,
  • Liu, F., Shareghi, E., Meng, Z., Basaldella, M. and Collier, N., 2021. Self-Alignment Pretraining for Biomedical Entity Representations NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference,
  • Prokhorov, V., Li, Y., Shareghi, E. and Collier, N., 2021. Learning Sparse Sentence Encoding without Supervision: An Exploration of Sparsity in Variational Autoencoders RepL4NLP 2021 - 6th Workshop on Representation Learning for NLP, Proceedings of the Workshop,
  • Su, Y., Cai, D., Zhou, Q., Lin, Z., Baker, S., Cao, Y., Shi, S., Collier, N. and Wang, Y., 2021. Dialogue response selection with hierarchical curriculum learning ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference,
  • Conforti, C., Berndt, J., Pilehvar, MT., Giannitsarou, C., Toxvaerd, F. and Collier, N., 2021. Synthetic Examples Improve Cross-Target Generalization: A Study on Stance Detection on a Twitter Corpus WASSA 2021 - Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Proceedings of the 11th Workshop,
  • Liu, F., Bugliarello, E., Ponti, EM., Reddy, S., Collier, N. and Elliott, D., 2021. Visually Grounded Reasoning across Languages and Cultures EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Liu, F., Vulić, I., Korhonen, A. and Collier, N., 2021. Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference, v. 2
  • Liu, F., Vulić, I., Korhonen, A. and Collier, N., 2021. Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Meng, Z., Liu, F., Clark, TH., Shareghi, E. and Collier, N., 2021. Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Liu, F., Chen, M., Roth, D. and Collier, N., 2021. Visual Pivoting for (Unsupervised) Entity Alignment 35th AAAI Conference on Artificial Intelligence, AAAI 2021, v. 5B
  • Conforti, C., Berndt, J., Pilehvar, MT., Basaldella, M., Giannitsarou, C., Toxvaerd, F. and Collier, N., 2021. Adversarial Training for News Stance Detection: Leveraging Signals from a Multi-Genre Corpus. EACL Hackashop on News Media Content Analysis and Automated Report Generation, Hackashop 2021 at 16th conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 - Proceedings,
  • Su, Y., Vandyke, D., Baker, S., Wang, Y. and Collier, N., 2021. Keep the Primary, Rewrite the Secondary: A Two-Stage Approach for Paraphrase Generation Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021,
  • Liu, Q., Liu, F., Collier, N., Korhonen, A. and Vulić, I., 2021. MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models CoNLL 2021 - 25th Conference on Computational Natural Language Learning, Proceedings,
  • Su, Y., Meng, Z., Baker, S. and Collier, N., 2021. Few-Shot Table-to-Text Generation with Prototype Memory Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021,
  • Su, Y., Vandyke, D., Wang, S., Fang, Y. and Collier, N., 2021. Plan-then-Generate: Controlled Data-to-Text Generation via Planning Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021,
  • Liu, Q., Liu, F., Collier, N., Korhonen, A. and Vulić, I., 2021. MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models Proceedings of the 25th Conference on Computational Natural Language Learning,
    Doi: 10.18653/v1/2021.conll-1.44
    2020

  • Conforti, C., Berndt, J., Pilehvar, MT., Giannitsarou, C., Toxvaerd, F. and Collier, N., 2020. STANDER: An expert-annotated dataset for news stance detection and evidence retrieval Findings of the Association for Computational Linguistics Findings of ACL: EMNLP 2020,
  • Basaldella, M., Liu, F., Shareghi, E. and Collier, N., 2020. COMETA: A corpus for medical entity linking in the social media EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference,
  • Conforti, C., Berndt, J., Pilehvar, MT., Giannitsarou, C., Toxvaerd, F. and Collier, N., 2020. Will-they-won't-they: A very large dataset for stance detection on twitter Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Liu, F., Shareghi, E., Meng, Z., Basaldella, M. and Collier, N., 2020. Self-Alignment Pretraining for Biomedical Entity Representations
  • Pilehvar, MT., Kartsaklis, D., Prokhorov, V. and Collier, N., 2020. CARD-660: Cambridge rare word dataset - A reliable benchmark for infrequent word representation models Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018,
    2019 (No publication date)

  • Conforti, C., Collier, N. and Pilehvar, M., 2019 (No publication date). Towards Automatic Fake News Detection: Cross-Level Stance Detection in News Articles
    Doi: http://doi.org/10.17863/CAM.37758
    2019

  • Conforti, C., Pilehvar, MT. and Collier, N., 2019. Modeling the fake news challenge as a cross-level stance detection task CEUR Workshop Proceedings, v. 2482
  • Prokhorov, V., Pilehvar, MT. and Collier, N., 2019. Generating knowledge graph paths from textual definitions using sequence-to-sequence models NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1
  • Prokhorov, V., Shareghi, E., Li, Y., Pilehvar, MT. and Collier, N., 2019. On the importance of the Kullback-Leibler divergence term in Variational Autoencoders for text generation EMNLP-IJCNLP 2019 - Proceedings of the 3rd Workshop on Neural Generation and Translation,
  • Basaldella, M. and Collier, N., 2019. BioReddit: Word embeddings for user-generated biomedical NLP LOUHI@EMNLP 2019 - 10th International Workshop on Health Text Mining and Information Analysis, Proceedings,
  • Prokhorov, V., Pilehvar, MT., Kartsaklis, D., Liò, P. and Collier, N., 2019. Unseen word representation by aligning heterogeneous lexical semantic spaces 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019,
  • Can, DC., Le, HQ., Ha, QT. and Collier, N., 2019. A richer-but-smarter shortest dependency path with attentive augmentation for relation extraction NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1
  • Prokhorov, V., Pilehvar, MT., Kartsaklis, D., Liò, P. and Collier, N., 2019. Unseen Word Representation by Aligning Heterogeneous Lexical Semantic Spaces. AAAI,
    2018

  • Pilehvar, MT., Kartsaklis, D., Prokhorov, V. and Collier, N., 2018. CARD-660: Cambridge rare word dataset - A reliable benchmark for infrequent word representation models Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018,
  • Gritta, M., Pilehvar, MT. and Collier, N., 2018. Which Melbourne? Augmenting geocoding with maps ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 1
    Doi: http://doi.org/10.18653/v1/p18-1119
  • Kartsaklis, D., Pilehvar, MT. and Collier, N., 2018. Mapping text to knowledge graph entities using multi-sense LSTMs Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018,
  • Le, HQ., Can, DC., Vu, ST., Dang, TH., Pilehvar, MT. and Collier, N., 2018. Large-scale exploration of neural relation classification architectures Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018,
    2017 (Accepted for publication)

  • Collier, NH., Pilehvar, MT., Limsopatham, N. and Gritta, M., 2017 (Accepted for publication). Vancouver Welcomes You! Minimalist Location Metonymy Resolution Proceedings of the Association of Computational Linguistics Annual Meeting (ACL 2017), Vancouver, Canada,
    Doi: http://doi.org/10.18653/v1/P17-1115
    2017

  • Collier, N., Limsopatham, N., Culotta, A., Conway, M., Cox, IJ. and Lampos, V., 2017. WSDM 2017 workshop on mining online health reports WSDM workshop summary WSDM 2017 - Proceedings of the 10th ACM International Conference on Web Search and Data Mining,
    Doi: http://doi.org/10.1145/3018661.3022761
  • Pilehvar, MT. and Collier, N., 2017. Inducing Embeddings for Rare and Unseen Words by Leveraging Lexical Resources Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, v. 2, Short Papers
  • Pilehvar, MT., Camacho-Collados, J., Navigli, R. and Collier, N., 2017. Towards a seamless integration of word senses into downstream NLP applications ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 1
    Doi: http://doi.org/10.18653/v1/P17-1170
  • Le, HQ., Tran, MV., Can, DC., Ha, QT., Dang, TH. and Collier, N., 2017. Improving chemical-induced disease relation extraction with learned features based on convolutional neural network Proceedings - 2017 9th International Conference on Knowledge and Systems Engineering, KSE 2017, v. 2017-January
    Doi: http://doi.org/10.1109/KSE.2017.8119474
    2016

  • Limsopatham, N. and Collier, N., 2016. Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016) at the 26th International Conference on Computational Linguistics (COLING 2016) Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining,
  • Limsopatham, N. and Collier, NH., 2016. Bidirectional LSTM for Named Entity Recognition in Twitter Messages Proceedings of the 2nd Workshop on Noisy User-generated Text,
  • Collier, NH. and Pilehvar, MT., 2016. De-Conflated Semantic Representations Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing,
    Doi: http://doi.org/10.18653/v1/D16-1174
  • Limsopatham, N. and Collier, N., 2016. Normalising Medical Concepts in Social Media Texts by Learning Semantic Representation Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL’16),
    2015

  • Limsopatham, N. and Collier, N., 2015. Adapting phrase-based machine translation to normalise medical terms in social media messages Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing,
    Doi: http://doi.org/10.18653/v1/d15-1194
  • Limsopatham, N. and Collier, N., 2015. Towards the semantic interpretation of personal health messages from social media UCUI 2015 - Proceedings of the ACM 1st International Workshop on Understanding the City with Urban Informatics, co-located with CIKM 2015,
    Doi: http://doi.org/10.1145/2811271.2811275
    2014

  • Lofi, C., Nieke, C. and Collier, N., 2014. Discriminating rhetorical analogies in social media 14th Conference of the European Chapter of the Association for Computational Linguistics 2014, EACL 2014,
    Doi: http://doi.org/10.3115/v1/e14-1059
  • Collier, N., Paster, F. and Tran, MV., 2014. The impact of near domain transfer on biomedical named entity recognition Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis, Louhi 2014 at the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014,
    2013

  • Bao, Y., Collier, N. and Datta, A., 2013. Improving text categorization by augmenting topic features with a small number of word features WITS 2013 - 23rd Workshop on Information Technology and Systems: Leveraging Big Data Analytics for Societal Benefits,
  • Groza, T., Oellrich, A. and Collier, N., 2013. Using silver and semi-gold standard corpora to compare open named entity recognisers Proceedings - 2013 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2013,
    Doi: http://doi.org/10.1109/BIBM.2013.6732541
  • Tran, MV., Le, HQ., Phi, VT., Pham, TB. and Collier, N., 2013. Exploring a Probabilistic Earley Parser for Event Composition in Biomedical Texts Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 2013-October
  • Bao, Y., Collier, N. and Datta, A., 2013. A partially supervised cross-collection topic model for cross-domain text classification International Conference on Information and Knowledge Management, Proceedings,
    Doi: http://doi.org/10.1145/2505515.2505556
    2012

  • Collier, N., Tran, MV., Le, HQ., Oellrich, A., Kawazoe, A., Hall-May, M. and Rebholz-Schuhmann, D., 2012. A hybrid approach to finding phenotype candidates in genetic texts 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers,
  • Lau, JH., Collier, N. and Baldwin, T., 2012. On-line trend analysis with topic models: Twitter trends detection topic model online 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers,
  • Ananiadou, S., Salakoski, T., Pyysalo, S., Rebholz-Schuhmann, D., Rinaldi, F., Schneider, G., Clematide, S., Grigonyte, G., Shepherd, A., Burgun-Parenthoine, A., McClosky, D., Demner-Fushman, D., Ginter, F., Leitner, F., Nenadic, G., Yi, GS., Liu, H., Su, J., Lee, H., Kim, JD., Park, JC., Kim, JJ., Verspoor, K., Cohen, K., Miwa, M., Krallinger, M., Romacker, M., Volk, M., Krauthammer, M., Conway, M., Okazaki, N., Collier, N., Ruch, P., Lambrix, P., Zweigenbaum, P., Ohta, T., Sætre, R., Hahn, U., Chapman, W., Tsuruoka, Y., Sasaki, Y., Mulkar-Mehta, R., Zhang, W. and Stenetorp, P., 2012. Introductory remarks SMBM 2012 - Proceedings of the 5th International Symposium on Semantic Mining in Biomedicine,
    Doi: http://doi.org/10.5167/uzh-64476
  • Doan, S., Ohno-Machado, L. and Collier, N., 2012. Enhancing twitter data analysis with simple semantic filtering: Example in tracking influenza-like illnesses Proceedings - 2012 IEEE 2nd Conference on Healthcare Informatics, Imaging and Systems Biology, HISB 2012,
    Doi: http://doi.org/10.1109/HISB.2012.21
  • Liu, S., Yamada, M., Collier, N. and Sugiyama, M., 2012. Change-point detection in time-series data by relative density-ratio estimation Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 7626 LNCS
    Doi: http://doi.org/10.1007/978-3-642-34166-3_40
  • Collier, N. and Doan, S., 2012. Syndromic classification of twitter messages Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, v. 91 LNICST
    Doi: http://doi.org/10.1007/978-3-642-29262-0_27
  • Doan, S., Vo, BKH. and Collier, N., 2012. An analysis of twitter messages in the 2011 Tohoku Earthquake Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, v. 91 LNICST
    Doi: http://doi.org/10.1007/978-3-642-29262-0_8
  • 2010

  • Collier, N., Goodwin, RM., Mccrae, J., Doan, S., Kawazoe, A., Conway, M., Kawtrakul, A., Takeuchi, K. and Dien, D., 2010. An ontology-driven system for detecting global health events Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference, v. 2
  • Collier, N., 2010. Towards cross-lingual alerting for bursty epidemic events CEUR Workshop Proceedings, v. 714
  • Collier, N., Son, NT. and Nguyen, NM., 2010. OMG U got flu? Analysis of shared health messages for bio-surveillance CEUR Workshop Proceedings, v. 714
  • Rebholz-Schuhmann, D., Yepes, AJ., Li, C., Kafkas, S., Lewin, I., Kang, N., Corbett, P., Milward, D., Buyko, E., Beisswanger, E., Hornbostel, K., Kouznetsov, A., Witte, R., Laurila, JB., Baker, CJO., Kuo, CJ., Clematide, S., Rinaldi, F., Farkas, R., Móra, G., Hara, K., Furlong, L., Rautschka, M., Neves, ML., Pascual-Montano, A., Wei, Q., Collier, N., Chowdhury, MFM., Lavelli, A., Berlanga, R., Morante, R., Van Asch, V., Daelemans, W., Marina, JL., Van Mulligen, E., Kors, J. and Hahn, U., 2010. Assessment of NER solutions against the first and second CALBC Silver Standard Corpus CEUR Workshop Proceedings, v. 714
  • 2009

  • Conway, M., Collier, N. and Doan, S., 2009. Using Hedges to Enhance a Disease Outbreak Report Text Mining System Proceedings ...,
  • Sinnou, T., Takeuchi, K. and Collier, N., 2009. Bio-medical term extraction on simple rule language 3rd International Symposium on Languages in Biology and Medicine, LBM 2009,
  • Rebholz-Schuhmann, D., Collier, N., Park, JC. and Wong, L., 2009. Preface 3rd International Symposium on Languages in Biology and Medicine, LBM 2009,
  • 2008

  • Doan, S., Hung-Ngo, Q., Kawazoe, A. and Collier, N., 2008. Global health monitor - A web-based system for detecting and mapping infectious diseases IJCNLP 2008 - 3rd International Joint Conference on Natural Language Processing, Proceedings of the Conference, v. 2
  • Conway, M., Doan, S., Kawazoe, A. and Collier, N., 2008. Classifying disease outbreak reports using n-grams and semantic features 3rd International Symposium on Semantic Mining in Biomedicine, SMBM 2008 - Proceedings,
  • 2007

  • Hoang, V., Nguyen, N., Dinh, D. and Collier, N., 2007. Topic-based Vietnamese news document filtering in the BioCaster Project Proceedings - ALPIT 2007 6th International Conference on Advanced Language Processing and Web Information Technology,
    Doi: http://doi.org/10.1109/ALPIT.2007.56
  • Wei, Q., Krymolowski, Y. and Collier, N., 2007. Towards a methodology for entity error analysis in annotated corpora CEUR Workshop Proceedings, v. 289
  • Doan, S., Kawazoe, A. and Collier, N., 2007. The role of roles in classifying annotated biomedical text ACL 2007 - Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing,
    Doi: http://doi.org/10.3115/1572392.1572396
  • 2006

  • Korhonen, A., Krymolowski, Y. and Collier, N., 2006. Automatic Classification of Verbs in Biomedical Texts COLING/ACL 2006, Vols 1 and 2, Proceedings of the Conference,
  • Kawazoe, A., Jin, L., Shigematsu, M., Barrero, R., Taniguchi, K. and Collier, N., 2006. The development of a schema for the annotation of terms in the BioCaster disease detecting/tracking system CEUR Workshop Proceedings, v. 222
  • 2005

  • Cohen, KB., Hirschman, L., Shatkay, H., Blaschke, C., Ananiadou, S., Aronson, L., Baldwin, B., Bodenreider, O., Bradshaw, S., Carpenter, B., Chang, J., Cohen, A., Collier, N., Fox, L., Futrelle, B., Harkema, H., Hearst, M., Hunter, L., Johnson, S., Light, M., Liu, H., Morgan, A., Pustejovsky, J., Rindflesch, T., Rzhetsky, A., Saric, J., Tanabe, L., Tsujii, JI., Valencia, A., Verspoor, K., Wilbur, J., Yu, H. and Blake, JA., 2005. Introduction ACL-ISMB 2005 - Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics, Proceedings of the Workshop,
  • Wattarujeekrit, T. and Collier, N., 2005. Exploring predicate-argument relations for named entity recognition in the molecular biology domain Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 3735 LNAI
    Doi: http://doi.org/10.1007/11563983_23
  • 2004

  • Mullen, T. and Collier, N., 2004. Sentiment analysis using support vector machines with diverse information sources Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, EMNLP 2004 - A meeting of SIGDAT, a Special Interest Group of the ACL held in conjunction with ACL 2004,
  • Mizuta, Y. and Collier, N., 2004. Zone identification in biology articles as a basis for information extraction Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications - JNLPBA '04,
    Doi: 10.3115/1567594.1567600
  • Kawazoe, A., Kitamoto, A. and Collier, N., 2004. Managing the semantics of coreference relations with Open Ontology Forge CEUR Workshop Proceedings, v. 184
  • Wattarujeekrit, T. and Collier, N., 2004. Integrating event frame annotation into the open ontology forge annotation tool CEUR Workshop Proceedings, v. 184
  • Mullen, T. and Collier, N., 2004. Incorporating topic information into sentiment analysis models Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 2004-July
  • Kim, J-D., Ohta, T., Tsuruoka, Y., Tateisi, Y. and Collier, N., 2004. Introduction to the bio-entity recognition task at JNLPBA Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications - JNLPBA '04,
    Doi: 10.3115/1567594.1567610
  • Kawazoe, A., Kitamoto, A. and Collier, N., 2004. Annotation of coreference relations among linguistic expressions and images in biological articles Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004,
  • Mizuta, Y. and Collier, N., 2004. An annotation scheme for a rhetorical analysis of biology articles Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004,
  • 2003

  • Collier, N., Takeuchi, K., Kawazoe, A., Mullen, T. and Wattarujeekrit, T., 2003. A framework for integrating deep and shallow semantic structures in text mining Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science), v. 2773 PART 1
    Doi: http://doi.org/10.1007/978-3-540-45224-9_110
  • 2000

  • Nobata, C., Collier, N. and Tsujii, J., 2000. Comparison between tagged corpora for the named entity task Proceedings of the workshop on Comparing corpora, v. 9
    Doi: 10.3115/1117729.1117733
  • 1999

  • Collier, N., Tsujii, J-I., Park, HS., Ogata, N., Tateishi, Y., Nobata, C., Ohta, T., Sekimizu, T., Imai, H. and Ibushi, K., 1999. The GENIA project Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics,
    Doi: 10.3115/977035.977081
  • Jones, G., Sakai, T., Collier, N., Kumano, A. and Sumita, K., 1999. A comparison of query translation methods for English-Japanese cross-language information retrieval Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 1999,
    Doi: http://doi.org/10.1145/312624.312690
  • Collier, N., Park, HS., Ogata, N., Tateishi, Y., Nobata, C., Ohta, T., Sekimizu, T., Imai, H., Ibushi, K. and Tsujii, JI., 1999. The GENIA project: Corpus-based knowledge acquisition and information extraction from genome research papers 9th Conference of the European Chapter of the Association for Computational Linguistics, EACL 1999,
  • 1998

  • Collier, N., Hirakawa, H. and Kumano, A., 1998. Machine translation vs. dictionary term translation - A comparison for English-Japanese news article alignment Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Collier, N., Ono, K. and Hirakawa, H., 1998. An experiment in hybrid dictionary and statistical sentence alignment Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Collier, N., Hirakawa, H. and Kumano, A., 1998. Machine translation vs. dictionary term translation Proceedings of the 36th annual meeting on Association for Computational Linguistics, v. 1
    Doi: 10.3115/980845.980888
  • Collier, N., Ono, K. and Hirakawa, H., 1998. An experiment in hybrid dictionary and statistical sentence alignment Proceedings of the 17th international conference on Computational linguistics, v. 1
    Doi: 10.3115/980451.980889
  • 1997

  • Collier, N., 1997. Convergence time characteristics of an associative memory for natural language processing IJCAI International Joint Conference on Artificial Intelligence, v. 2
  • Journal articles

    2023

  • Nettekoven, CR., Diederen, K., Giles, O., Duncan, H., Stenson, I., Olah, J., Gibbs-Dean, T., Collier, N., Vértes, PE., Spencer, TJ., Morgan, SE. and McGuire, P., 2023. Semantic Speech Networks Linked to Formal Thought Disorder in Early Psychosis. Schizophr Bull, v. 49
    Doi: 10.1093/schbul/sbac056
  • 2022

  • Pilehvar, MT., Bernard, A., Smedley, D. and Collier, N., 2022. PheneBank: a literature-based database of phenotypes. Bioinformatics, v. 38
    Doi: http://doi.org/10.1093/bioinformatics/btab740
  • Meng, Z., Okhmatovskaia, A., Polleri, M., Shen, Y., Powell, G., Fu, Z., Ganser, I., Zhang, M., King, NB., Buckeridge, D. and Collier, N., 2022. BioCaster in 2021: automatic disease outbreaks detection from global news media. Bioinformatics, v. 38
    Doi: http://doi.org/10.1093/bioinformatics/btac497
  • Wu, H., Wang, M., Wu, J., Francis, F., Chang, Y-H., Shavick, A., Dong, H., Poon, MTC., Fitzpatrick, N., Levine, AP., Slater, LT., Handy, A., Karwath, A., Gkoutos, GV., Chelala, C., Shah, AD., Stewart, R., Collier, N., Alex, B., Whiteley, W., Sudlow, C., Roberts, A. and Dobson, RJB., 2022. A survey on clinical natural language processing in the United Kingdom from 2007 to 2022. NPJ Digit Med, v. 5
    Doi: http://doi.org/10.1038/s41746-022-00730-6
  • Le, H-Q., Can, D-C. and Collier, N., 2022. Exploiting document graphs for inter sentence relation extraction. J Biomed Semantics, v. 13
    Doi: http://doi.org/10.1186/s13326-022-00267-3
  • 2021

  • Su, Y., Wang, Y., Cai, D., Baker, S., Korhonen, A. and Collier, N., 2021. PROTOTYPE-TO-STYLE: Dialogue Generation with Style-Aware Editing on Retrieval Memory IEEE/ACM Transactions on Audio Speech and Language Processing, v. 29
    Doi: http://doi.org/10.1109/TASLP.2021.3087948
  • 2020

  • Gritta, M., Pilehvar, MT. and Collier, N., 2020. A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics Language Resources and Evaluation, v. 54
    Doi: http://doi.org/10.1007/s10579-019-09475-3
  • Kim, J-D., Cohen, KB., Rinaldi, F., Lu, Z., Collier, N. and Park, H-S., 2020. Editor's introduction to the special issue of the 6th Biomedical Linked Annotation Hackathon (BLAH6). Genomics Inform, v. 18
    Doi: http://doi.org/10.5808/GI.2020.18.2.e12
  • 2019

  • Kim, J-D., Cohen, KB., Collier, N., Lu, Z. and Rinaldi, F., 2019. Introduction to BLAH5 special issue: recent progress on interoperability of biomedical text mining. Genomics Inform, v. 17
    Doi: http://doi.org/10.5808/GI.2019.17.2.e12
  • Gritta, M., Collier, N. and Pilehvar, M., 2019. A Pragmatic Guide to Geoparsing Evaluation Language Resources and Evaluation,
  • 2017 (Accepted for publication)

  • Gritta, M., Pilehvar, MT., Limsopatham, N. and Collier, N., 2017 (Accepted for publication). Vancouver Welcomes You! Minimalist Location Metonymy Resolution Association for Computational Linguistics,
    Doi: http://doi.org/10.18653/v1/P17-1115
  • 2017

  • Gritta, M., Pilehvar, MT., Limsopatham, N. and Collier, N., 2017. What’s missing in geographical parsing? Language Resources and Evaluation,
    Doi: http://doi.org/10.1007/s10579-017-9385-8
  • Alvaro, N., Miyao, Y. and Collier, N., 2017. TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations. JMIR Public Health Surveillance, v. 3
    Doi: http://doi.org/10.2196/publichealth.6396
  • 2016 (Accepted for publication)

  • Le, H-Q., Tran, M-V., Dang, TH., Ha, Q-T. and Collier, N., 2016 (Accepted for publication). Sieve-based coreference resolution enhances semi-supervised learning model for chemical-induced disease relation extraction. Database : the Journal of Biological Databases and Curation, v. 2016
    Doi: http://doi.org/10.1093/database/baw102
  • Verspoor, K., Oellrich, A., Collier, N., Groza, T., Rocca-Serra, P., Soldatova, L., Dumontier, M. and Shah, N., 2016 (Accepted for publication). Thematic issue of the Second combined Bio-ontologies and Phenotypes Workshop. Journal of Biomedical Semantics, v. 7
    Doi: http://doi.org/10.1186/s13326-016-0108-7
  • 2016

  • Oellrich, A., Collier, N., Groza, T., Rebholz-Schuhmann, D., Shah, N., Bodenreider, O., Boland, MR., Georgiev, I., Liu, H., Livingston, K., Luna, A., Mallon, A-M., Manda, P., Robinson, PN., Rustici, G., Simon, M., Wang, L., Winnenburg, R. and Dumontier, M., 2016. The digital revolution in phenotyping. Brief Bioinform, v. 17
    Doi: http://doi.org/10.1093/bib/bbv083
  • Pilehvar, MT. and Collier, N., 2016. Improved semantic representation for domain-specific entities BioNLP 2016 - Proceedings of the 15th Workshop on Biomedical Natural Language Processing,
  • Limsopatham, N. and Collier, N., 2016. Modelling the combination of generic and target domain embeddings in a convolutional neural network for sentence classification BioNLP 2016 - Proceedings of the 15th Workshop on Biomedical Natural Language Processing,
  • 2015

  • Collier, N., Groza, T., Smedley, D., Robinson, PN., Oellrich, A. and Rebholz-Schuhmann, D., 2015. PhenoMiner: from text to a database of phenotypes associated with OMIM diseases. Database (Oxford), v. 2015
    Doi: http://doi.org/10.1093/database/bav104
  • Soldatova, LN., Collier, N., Oellrich, A., Groza, T., Verspoor, K., Rocca-Serra, P., Dumontier, M. and Shah, NH., 2015. Special issue on bio-ontologies and phenotypes. J Biomed Semantics, v. 6
    Doi: http://doi.org/10.1186/s13326-015-0040-2
  • Alvaro, N., Conway, M., Doan, S., Lofi, C., Overington, J. and Collier, N., 2015. Crowdsourcing Twitter annotations to identify first-hand experiences of prescription drug use. J Biomed Inform, v. 58
    Doi: http://doi.org/10.1016/j.jbi.2015.11.004
  • Collier, N., Oellrich, A. and Groza, T., 2015. Concept selection for phenotypes and diseases using learn to rank. J Biomed Semantics, v. 6
    Doi: http://doi.org/10.1186/s13326-015-0019-z
  • Groza, T., Köhler, S., Doelken, S., Collier, N., Oellrich, A., Smedley, D., Couto, FM., Baynam, G., Zankl, A. and Robinson, PN., 2015. Automatic concept recognition using the human phenotype ontology reference and test suite corpora. Database (Oxford), v. 2015
    Doi: http://doi.org/10.1093/database/bav005
  • Kim, J-D., Cohen, KB., Collier, N., Lu, Z. and Stenetorp, P., 2015. Introduction to the Biomedical Linked Annotation Hackathon (BLAH) 2015 Symposium BMC proceedings, v. 9
    Doi: http://doi.org/10.1186/1753-6561-9-s5-a1
  • Oellrich, A., Collier, N., Smedley, D. and Groza, T., 2015. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes. PLoS One, v. 10
    Doi: http://doi.org/10.1371/journal.pone.0116040
  • 2014

  • Barboza, P., Vaillant, L., Le Strat, Y., Hartley, DM., Nelson, NP., Mawudeku, A., Madoff, LC., Linge, JP., Collier, N., Brownstein, JS. and Astagneau, P., 2014. Factors influencing performance of internet-based biosurveillance systems used in epidemic intelligence for early detection of infectious diseases outbreaks. PLoS One, v. 9
    Doi: http://doi.org/10.1371/journal.pone.0090536
  • 2013 (Published online)

  • Keffala, B., Conway, M., Doan, S. and Collier, N., 2013 (Published online). Content Analysis of Syndromic Twitter Data Online Journal of Public Health Informatics, v. 5
    Doi: 10.5210/ojphi.v5i1.4548
  • 2013

  • Barboza, P., Vaillant, L., Mawudeku, A., Nelson, NP., Hartley, DM., Madoff, LC., Linge, JP., Collier, N., Brownstein, JS., Yangarber, R., Astagneau, P. and Early Alerting Reporting Project of the Global Health Security Initiative, 2013. Evaluation of epidemic intelligence systems integrated in the early alerting and reporting project for the detection of A/H5N1 influenza events. PLoS One, v. 8
    Doi: http://doi.org/10.1371/journal.pone.0057252
  • Collier, N., Tran, M-V., Le, H-Q., Ha, Q-T., Oellrich, A. and Rebholz-Schuhmann, D., 2013. Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking. PLoS One, v. 8
    Doi: http://doi.org/10.1371/journal.pone.0072965
  • Hay, SI., Battle, KE., Pigott, DM., Smith, DL., Moyes, CL., Bhatt, S., Brownstein, JS., Collier, N., Myers, MF., George, DB. and Gething, PW., 2013. Global mapping of infectious disease. Philos Trans R Soc Lond B Biol Sci, v. 368
    Doi: http://doi.org/10.1098/rstb.2012.0250
  • Collier, N., Oellrich, A. and Groza, T., 2013. Toward knowledge support for analysis and interpretation of complex traits. Genome Biol, v. 14
    Doi: http://doi.org/10.1186/gb-2013-14-9-214
  • Hartley, DM., Nelson, NP., Arthur, RR., Barboza, P., Collier, N., Lightfoot, N., Linge, JP., van der Goot, E., Mawudeku, A., Madoff, LC., Vaillant, L., Walters, R., Yangarber, R., Mantero, J., Corley, CD. and Brownstein, JS., 2013. An overview of internet biosurveillance. Clin Microbiol Infect, v. 19
    Doi: http://doi.org/10.1111/1469-0691.12273
  • Liu, S., Yamada, M., Collier, N. and Sugiyama, M., 2013. Change-point detection in time-series data by relative density-ratio estimation. Neural Netw, v. 43
    Doi: http://doi.org/10.1016/j.neunet.2013.01.012
  • 2012

  • Collier, N., 2012. Uncovering text mining: a survey of current work on web-based epidemic intelligence. Glob Public Health, v. 7
    Doi: http://doi.org/10.1080/17441692.2012.699975
  • Doan, S., Collier, N., Xu, H., Pham, HD. and Tu, MP., 2012. Recognition of medication information from discharge summaries using ensembles of classifiers. BMC Med Inform Decis Mak, v. 12
    Doi: http://doi.org/10.1186/1472-6947-12-36
  • Collier, N. and Doan, S., 2012. GENI-DB: a database of global events for epidemic intelligence. Bioinformatics, v. 28
    Doi: http://doi.org/10.1093/bioinformatics/bts099
  • 2011

  • Rebholz-Schuhmann, D., Jimeno Yepes, A., Li, C., Kafkas, S., Lewin, I., Kang, N., Corbett, P., Milward, D., Buyko, E., Beisswanger, E., Hornbostel, K., Kouznetsov, A., Witte, R., Laurila, JB., Baker, CJ., Kuo, C-J., Clematide, S., Rinaldi, F., Farkas, R., Móra, G., Hara, K., Furlong, LI., Rautschka, M., Neves, ML., Pascual-Montano, A., Wei, Q., Collier, N., Chowdhury, MFM., Lavelli, A., Berlanga, R., Morante, R., Van Asch, V., Daelemans, W., Marina, JL., van Mulligen, E., Kors, J. and Hahn, U., 2011. Assessment of NER solutions against the first and second CALBC Silver Standard Corpus. J Biomed Semantics, v. 2 Suppl 5
    Doi: http://doi.org/10.1186/2041-1480-2-S5-S11
  • Rebholz-Schuhmann, D., Rinaldi, F., Pyysalo, S., Collier, N. and Hahn, U., 2011. Towards mature use of semantic resources for biomedical analyses. J Biomed Semantics, v. 2 Suppl 5
    Doi: http://doi.org/10.1186/2041-1480-2-S5-I1
  • Collier, N., 2011. Towards cross-lingual alerting for bursty epidemic events. J Biomed Semantics, v. 2 Suppl 5
    Doi: http://doi.org/10.1186/2041-1480-2-S5-S10
  • Collier, N., Son, NT. and Nguyen, NM., 2011. OMG U got flu? Analysis of shared health messages for bio-surveillance. J Biomed Semantics, v. 2 Suppl 5
    Doi: http://doi.org/10.1186/2041-1480-2-S5-S9
  • Wei, Q. and Collier, N., 2011. Towards classifying species in systems biology papers using text mining. BMC Res Notes, v. 4
    Doi: http://doi.org/10.1186/1756-0500-4-32
  • Coulet, A., Garten, Y., Dumontier, M., Altman, RB., Musen, MA. and Shah, NH., 2011. Integration and publication of heterogeneous text-mined relationships on the Semantic Web. J Biomed Semantics, v. 2 Suppl 2
    Doi: http://doi.org/10.1186/2041-1480-2-S2-S10
  • 2010

  • Rebholz-Schuhmann, D., Collier, N., Park, JC. and Wong, L., 2010. Wrestling with biomedical research results: Language resources and literature analysis Journal of Bioinformatics and Computational Biology, v. 8
    Doi: http://doi.org/10.1142/S0219720010004598
  • Hartley, D., Nelson, N., Walters, R., Arthur, R., Yangarber, R., Madoff, L., Linge, J., Mawudeku, A., Collier, N., Brownstein, J., Thinus, G. and Lightfoot, N., 2010. The landscape of international event-based biosurveillance Emerging Health Threats Journal, v. 3
    Doi: 10.3402/ehtj.v3i0.7096
  • Rebholz-Schuhmann, D., Collier, N., Park, JC. and Wong, L., 2010. Wrestling with biomedical research results: language resources and literature analysis. Introduction. J Bioinform Comput Biol, v. 8
    Doi: http://doi.org/10.1142/s0219720010004598
  • Conway, M., Kawazoe, A., Chanlekha, H. and Collier, N., 2010. Developing a disease outbreak event corpus. J Med Internet Res, v. 12
    Doi: http://doi.org/10.2196/jmir.1323
  • Chanlekha, H. and Collier, N., 2010. Analysis of syntactic and semantic features for fine-grained event-spatial understanding in outbreak news reports. J Biomed Semantics, v. 1
    Doi: http://doi.org/10.1186/2041-1480-1-3
  • Collier, N., 2010. What's unusual in online disease outbreak news? J Biomed Semantics, v. 1
    Doi: http://doi.org/10.1186/2041-1480-1-2
  • Chanlekha, H. and Collier, N., 2010. A methodology to enhance spatial understanding of disease outbreak events reported in news articles. Int J Med Inform, v. 79
    Doi: http://doi.org/10.1016/j.ijmedinf.2010.01.014
  • Chanlekha, H., Kawazoe, A. and Collier, N., 2010. A framework for enhancing spatial and temporal granularity in report-based health surveillance systems. BMC Med Inform Decis Mak, v. 10
    Doi: http://doi.org/10.1186/1472-6947-10-1
  • Hartley, D., Nelson, N., Walters, R., Arthur, R., Yangarber, R., Madoff, L., Linge, J., Mawudeku, A., Collier, N., Brownstein, J., Thinus, G. and Lightfoot, N., 2010. Landscape of international event-based biosurveillance. Emerg Health Threats J, v. 3
    Doi: http://doi.org/10.3134/ehtj.10.003
  • 2009

  • Kawazoe, A., Jin, L., Shigematsu, M., Bekki, D., Barrero, R., Taniguchi, K. and Collier, N., 2009. The development of a schema for semantic annotation: Gain brought by a formal ontological method Applied Ontology, v. 4
    Doi: http://doi.org/10.3233/AO-2009-0062
  • Conway, M., Doan, S., Kawazoe, A. and Collier, N., 2009. Classifying disease outbreak reports using n-grams and semantic features. Int J Med Inform, v. 78
    Doi: http://doi.org/10.1016/j.ijmedinf.2009.03.010
  • Doan, S., Kawazoe, A., Conway, M. and Collier, N., 2009. Towards role-based filtering of disease outbreak reports. J Biomed Inform, v. 42
    Doi: http://doi.org/10.1016/j.jbi.2008.12.009
  • 2008 (Published online)

  • Doan, S., Ngo, Q-H., Kawazoe, A. and Collier, N., 2008 (Published online). Building and Using Geospatial Ontology in the BioCaster Surveillance System Nature Precedings,
    Doi: 10.1038/npre.2008.2110
  • Doan, S., Ngo, Q-H., Kawazoe, A. and Collier, N., 2008 (Published online). Building and Using Geospatial Ontology in the BioCaster Surveillance System Nature Precedings,
    Doi: 10.1038/npre.2008.2110.1
  • 2008

  • Collier, N., Doan, S., Kawazoe, A., Shigematsu, M., Taniguchi, K., Takeuchi, K., Kawtrakul, A. and Dien, D., 2008. The Global Health Monitor: A Bio-Geographic View of World Outbreak News International Journal of Infectious Diseases, v. 12
    Doi: http://doi.org/10.1016/j.ijid.2008.05.482
  • Korhonen, A., Krymolowski, Y. and Collier, N., 2008. The choice of features for classification of verbs in biomedical texts Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference, v. 1
    Doi: http://doi.org/10.3115/1599081.1599138
  • Collier, N., Doan, S., Kawazoe, A., Goodwin, RM., Conway, M., Tateno, Y., Ngo, Q-H., Dien, D., Kawtrakul, A., Takeuchi, K., Shigematsu, M. and Taniguchi, K., 2008. BioCaster: detecting public health rumors with a Web-based text mining system. Bioinformatics, v. 24
    Doi: http://doi.org/10.1093/bioinformatics/btn534
  • Kawazoe, A., Chanlekha, H., Shigematsu, M. and Collier, N., 2008. Structuring an event ontology for disease outbreak detection. BMC Bioinformatics, v. 9 Suppl 3
    Doi: http://doi.org/10.1186/1471-2105-9-S3-S8
  • McCrae, J. and Collier, N., 2008. Synonym set extraction from the biomedical literature by lexical pattern discovery. BMC Bioinformatics, v. 9
    Doi: http://doi.org/10.1186/1471-2105-9-159
  • 2007

  • Thao, PTX., Tri, TQ., Dien, D. and Collier, N., 2007. Named entity recognition in Vietnamese using classifier voting ACM Transactions on Asian Language Information Processing, v. 6
    Doi: http://doi.org/10.1145/1316457.1316460
  • Tri Tran, Q., Thao Pham, TX., Hung Ngo, Q., Dinh, D. and Collier, N., 2007. Named entity recognition in Vietnamese documents Progress in Informatics,
    Doi: http://doi.org/10.2201/NiiPi.2007.4.2
  • 2006

  • Collier, N., Nazarenko, A., Baud, R. and Ruch, P., 2006. Recent advances in natural language processing for biomedical applications. Int J Med Inform, v. 75
    Doi: http://doi.org/10.1016/j.ijmedinf.2005.06.008
  • Collier, N., Kawazoe, A., Jin, L., Shigematsu, M., Dien, D., Barrero, RA., Takeuchi, K. and Kawtrakul, A., 2006. A multilingual ontology for infectious disease surveillance: rationale, design and challenges. Lang Resour Eval, v. 40
    Doi: http://doi.org/10.1007/s10579-007-9019-7
  • Mizuta, Y., Korhonen, A., Mullen, T. and Collier, N., 2006. Zone analysis in biology articles as a basis for information extraction. Int J Med Inform, v. 75
    Doi: http://doi.org/10.1016/j.ijmedinf.2005.06.013
  • 2005

  • Takeuchi, K. and Collier, N., 2005. Bio-medical entity extraction using support vector machines. Artif Intell Med, v. 33
    Doi: http://doi.org/10.1016/j.artmed.2004.07.019
  • Mullen, T., Mizuta, Y. and Collier, N., 2005. A baseline feature set for learning rhetorical zones using full articles in the biomedical domain ACM SIGKDD Explorations Newsletter, v. 7
    Doi: 10.1145/1089815.1089823
  • Kogan, Y., Collier, N., Pakhomov, S. and Krauthammer, M., 2005. Towards semantic role labeling & IE in the medical literature. AMIA Annu Symp Proc, v. 2005
  • 2004 (No publication date)

  • Ai Kawazoe, Tony Mullen, Koichi Takeuchi, Tuangthong Wattarujeekrit and Nigel Collier, 2004 (No publication date). Open Ontology Forge: A Tool for Ontology Creation Genome Informatics 14: 677-678 (2003)
  • Nigel Collier, Ai Kawazoe, Asanobu Kitamoto, Tuangthong Wattarujeekrit, Yoko Mizuta and Anthony Mullen, 2004 (No publication date). Integrating Deep and Shallow Semantic Structures in Open
  • 2004

  • Angelino, H. and Collier, N., 2004. Comparison of innovation policy and transfer of technology from public institutions in Japan, France, Germany and the United Kingdom NII Journal,
  • Collier, N. and Takeuchi, K., 2004. Comparison of character-level and part of speech features for name recognition in biomedical texts. J Biomed Inform, v. 37
    Doi: http://doi.org/10.1016/j.jbi.2004.08.008
  • Wattarujeekrit, T., Shah, PK. and Collier, N., 2004. PASBio: predicate-argument structures for event extraction in molecular biology. BMC Bioinformatics, v. 5
    Doi: http://doi.org/10.1186/1471-2105-5-155
  • 2003 (No publication date)

  • Koichi Takeuchi and Nigel Collier, 2003 (No publication date). Bio-Medical Entity Extraction using Support Vector Machines
  • 2003

  • Collier, N., Kumano, A. and Hirakawa, H., 2003. An application of local relevance feedback for building comparable corpora from news article matching NII Journal,
  • 2002 (No publication date)

  • Nigel Collier, Chikashi Nobata and Jun-ichi Tsujii, 2002 (No publication date). Extracting the Names of Genes and Gene Products with a Hidden Markov Model
  • Nigel Collier, 2002 (No publication date). Machine Learning for Information Extraction from XML marked-up text on the Semantic Web
  • Chikashi Nobata and Nigel Collier, 2002 (No publication date). Comparison between Tagged Corpora for the Named Entity Task
  • 2002

  • Takeuchi, K. and Collier, N., 2002. Use of Support Vector Machines in Extended Named Entity Recognition Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Collier, N., Takeuchi, K., Nobata, C., Fukumoto, J. and Ogata, N., 2002. Progress on multi-lingual named entity annotation guidelines using RDF(S) Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002,
  • Collier, N. and Takeuchi, K., 2002. PIA-Core: Semantic annotation through example-based learning Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002,
  • 2001 (No publication date)

  • Nigel Collier, Hideki Mima, Tomoko Ohta, Yuka Tateisi and Akane Yakushiji, 2001 (No publication date). The GENIA Project: Knowledge Acquisition from Biology Texts
  • Nigel Collier, Koichi Takeuchi and Keita Tsuji, 2001 (No publication date). The PIA Project: Learning to Semantically Annotate Texts from an Ontology and XML-Instance Data
  • 2001

  • Collier, N., Nobata, C. and Tsujii, J., 2001. Automatic acquisition and classification of terminology using a tagged corpus in the molecular biology domain Terminology, v. 7
    Doi: http://doi.org/10.1075/term.7.2.07col
  • Jones, G., Collier, N., Sakai, T., Sumita, K. and Hirakawa, H., 2001. A framework for cross-language information access: Application to English and Japanese Computers and the Humanities, v. 35
    Doi: http://doi.org/10.1023/A:1011851209975
  • Jones, G., Collier, N., Sakai, T., Sumita, K. and Hirakawa, H., 2001. A framework for cross-language information access: Application to English and Japanese Language Resources and Evaluation, v. 35
  • 2000 (No publication date)

  • Nigel Collier, Hideki Hirakawa and Akira Kumano, 2000 (No publication date). Cross Language Information Retrieval: an Experiment in Bilingual News Article Alignment from the Internet using MT
  • Hisao Imai and Nigel Collier, 2000 (No publication date). A Combined Query Expansion Approach for Information Retrieval
  • Tomoko Ohta, Yuka Tateisi, Nigel Collier, Chikashi Nobata and Katsutoshi Ibushi, 2000 (No publication date). A Semantically Annotated Corpus from MEDLINE Abstracts
  • Chikashi Nobata, Nigel Collier and Jun-ichi Tsujii, 2000 (No publication date). Automatic Term Identification and Classification in Biology Texts
  • 1998 (No publication date)

  • Nigel Collier, Hideki Hirakawa and Akira Kumano, 1998 (No publication date). Creating a Noisy Parallel Corpus from Newswire Articles Using Cross-Language Information Retrieval
  • Teruyoshi Hishiki, Nigel Collier, Chikashi Nobata, Tomoko Okazaki-ohta, Norihiro Ogata, Takeshi Sekimizu, Roland Steiner and Hyun S. Park, 1998 (No publication date). Developing NLP Tools for Genome Informatics: An Information Extraction Perspective
  • 1996 (No publication date)

  • Nigel Collier, 1996 (No publication date). Storage of Natural Language Sentences in a Hopfield Network
  • Nigel Collier, 1996 (No publication date). Contextual Meta-Knowledge Acquisition from Corpora
  • Theses / dissertations

    2021

  • Prokhorov, V., 2021. Injecting Inductive Biases into Distributed Representations of Text
    Doi: http://doi.org/10.17863/CAM.78416
  • Internet publications

    2020

  • Gritta, M., Pilehvar, MT. and Collier, N., 2020. A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics.
    Doi: http://doi.org/10.1007/s10579-019-09475-3
  • Datasets

    2018

  • Gritta, M., 2018. Research data supporting "Which Melbourne? Augmenting Geocoding with Maps"
    Doi: http://doi.org/10.17863/CAM.25015
  • 2017 (No publication date)

  • Gritta, M., Collier, N., Limsopatham, N. and Pilehvar, M., 2017 (No publication date). Research data supporting "Vancouver Welcomes You! Minimalist Location Metonymy Resolution"
  • Book chapters

    2017

  • Camacho-Collados, J., Pilehvar, MT., Collier, N. and Navigli, R., 2017. SemEval-2017 Task 2: Multilingual and Cross-lingual Semantic Word Similarity
  • 2016

  • Collier, NH., 2016. A review of web-based epidemic detection
    Doi: http://doi.org/10.4324/9781315554211-14
  • 2015

  • Collier, NH., 2015. A review of web-based epidemic detection
    Doi: http://doi.org/10.4324/9781315554211
  • 2010

  • Collier, N., Doan, S., Goodwin, R., McCrae, J., Conway, M., Shigematsu, M. and Kawazoe, A., 2010. Navigating the Information Storm
    Doi: 10.1201/b10315-16
  • Doan, S., Conway, M. and Collier, N., 2010. An Empirical Study of Sections in Classifying Disease Outbreak Reports
    Doi: http://doi.org/10.1007/978-1-4419-1274-9_4

  • Dr Andrew Caines

    Second language learning; first language acquisition; speech; corpus linguistics; language evolution

    Journal articles

    2024

  • Ahrenberg, L., Ainiala, T., Aldrin, E., Holdt, ŠA., Caines, A., Dalianis, H., Dannélls, D., Dobnik, S., Grouin, C., Hämäläinen, L., Henriksson, A., Kokkinakis, D., Lassus, J., Tiedemann, TL., Lison, P., Lindén, K., Ljunglöf, P., Sánchez, RM., Nelson, B., Nordman, L., Pilán, I., Raheja, V., Scheffler, T., Torra, V., Vakili, T., Vydiswaran, VGV., Volodina, E. and Vu, XS., 2024. Introduction CALD-pseudo 2024 - Workshop on Computational Approaches to Language Data Pseudonymization, Proceedings of the Workshop,
  • Davis, C., Caines, A., Andersen, Ø., Taslimipoor, S., Yannakoudakis, H., Yuan, Z., Bryant, C., Rei, M. and Buttery, P., 2024. Prompting open-source and commercial language models for grammatical error correction of English learner text Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • 2023

  • Benedetto, L., Cremonesi, P., Caines, A., Buttery, P., Cappelli, A., Giussani, A. and Turrin, R., 2023. A Survey on Recent Approaches to Question Difficulty Estimation from Text ACM Computing Surveys, v. 55
    Doi: 10.1145/3556538
  • Zhou, L., Caines, A., Pete, I. and Hutchings, A., 2023. Automated hate speech detection and span extraction in underground hacking and extremist forums Natural Language Engineering, v. 29
    Doi: 10.1017/S1351324922000262
  • Goodman, JR., Caines, A. and Foley, RA., 2023. Shibboleth: An agent-based model of signalling mimicry. PLoS One, v. 18
    Doi: http://doi.org/10.1371/journal.pone.0289333
  • Goriely, Z., Caines, A. and Buttery, P., 2023. Word segmentation from transcriptions of child-directed speech using lexical and sub-lexical cues. J Child Lang,
    Doi: http://doi.org/10.1017/S0305000923000491
  • 2021

  • Katushemererwe, F., Caines, A. and Buttery, P., 2021. Building natural language processing tools for Runyakitara Applied Linguistics Review, v. 12
    Doi: http://doi.org/10.1515/applirev-2020-2004
  • 2019

  • Caines, A., Altmann-Richer, E. and Buttery, P., 2019. The cross-linguistic performance of word segmentation models over time. J Child Lang, v. 46
    Doi: http://doi.org/10.1017/S0305000919000485
  • 2018

  • Caines, A., Pastrana, S., Hutchings, A. and Buttery, PJ., 2018. Automatically identifying the function and intent of posts in underground forums Crime Science, v. 7
    Doi: http://doi.org/10.1186/s40163-018-0094-4
  • Conference proceedings

    2024

  • Velentzas, G., Caines, A., Borgo, R., Pacquetet, E., Hamilton, C., Arnold, T., Nicholls, D., Buttery, P., Gaillat, T., Yannakoudakis, H. and Ballier, N., 2024. Logging Keystrokes in Writing by English Learners 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings,
  • Chan, KWH., Bryant, C., Nguyen, L., Caines, A. and Yuan, Z., 2024. Grammatical Error Correction for Code-Switched Sentences by Learners of English 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings,
  • Moore, R., Caines, A. and Buttery, P., 2024. Recurrent Neural Collaborative Filtering for Knowledge Tracing Communications in Computer and Information Science, v. 2150 CCIS
    Doi: http://doi.org/10.1007/978-3-031-64315-6_36
  • 2023

  • Caines, A., Benedetto, L., Taslimipoor, S., Davis, C., Gao, Y., Andersen, Ø., Yuan, Z., Elliott, M., Moore, R., Bryant, C., Rei, M., Yannakoudakis, H., Mullooly, A., Nicholls, D. and Buttery, P., 2023. On the application of Large Language Models for language teaching and assessment technology CEUR Workshop Proceedings, v. 3487
  • Diehl Martinez, R., Goriely, Z., McGovern, H., Davis, C., Caines, A., Buttery, P. and Beinborn, L., 2023. CLIMB – Curriculum Learning for Infant-inspired Model Building CoNLL 2023 - BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, Proceedings,
  • 2022

  • Wambsganss, T., Caines, A. and Buttery, P., 2022. ALEN App: Persuasive Writing Support To Foster English Language Learning BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,
  • Tyen, G., Brenchley, M., Caines, A. and Buttery, P., 2022. Towards an open-domain chatbot for language practice BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,
    Doi: 10.18653/v1/2022.bea-1.28
  • Rietsche, R., Caines, A., Schramm, C., Pfütze, D. and Buttery, P., 2022. The Specificity and Helpfulness of Peer-to-Peer Feedback in Higher Education BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings,
  • Pete, I., Hughes, J., Caines, A., Vu, AV., Gupta, H., Hutchings, A., Anderson, R. and Buttery, P., 2022. PostCog: A tool for interdisciplinary research into underground forums at scale Proceedings - 7th IEEE European Symposium on Security and Privacy Workshops, Euro S and PW 2022,
    Doi: http://doi.org/10.1109/EuroSPW55150.2022.00016
  • Davis, C., Bryant, C., Caines, A., Rei, M. and Buttery, P., 2022. Probing for targeted syntactic knowledge through grammatical error detection CoNLL 2022 - 26th Conference on Computational Natural Language Learning, Proceedings of the Conference,
  • Chua, H., Caines, A. and Yannakoudakis, H., 2022. A unified framework for cross-domain and cross-task learning of mental health conditions NLP4PI 2022 - 2nd Workshop on NLP for Positive Impact, Proceedings of the Workshop,
    Doi: 10.18653/v1/2022.nlp4pi-1.1
  • 2020

  • Hughes, J., Aycock, S., Caines, A., Buttery, P. and Hutchings, A., 2020. Detecting Trending Terms in Cybersecurity Forum Discussions Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020),
    Doi: 10.18653/v1/2020.wnut-1.15
  • Caines, A., Bentz, C., Knill, K., Rei, M. and Buttery, P., 2020. Grammatical error detection in transcriptions of spoken English COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference,
  • Caines, A. and Buttery, P., 2020. REPROLANG 2020: Automatic proficiency scoring of Czech, English, German, Italian, and Spanish learner essays LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings,
  • MacSween, R., Caines, A. and Buttery, P., 2020. An Expectation Maximisation Algorithm for Automated Cognate Detection CoNLL 2020 - 24th Conference on Computational Natural Language Learning, Proceedings of the Conference,
  • Zaidi, A., Caines, A., Moore, R., Buttery, P. and Rice, A., 2020. Adaptive Forgetting Curves for Spaced Repetition Language Learning Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 12164 LNAI
    Doi: http://doi.org/10.1007/978-3-030-52240-7_65
  • Craighead, H., Caines, A., Buttery, P. and Yannakoudakis, H., 2020. Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions Proceedings of the Annual Meeting of the Association for Computational Linguistics,
    Doi: 10.18653/v1/2020.acl-main.206
  • 2019

  • Baur, C., Caines, A., Chua, C., Gerlach, J., Qian, M., Rayner, M., Russell, M., Strik, H. and Wei, X., 2019. Overview of the 2019 Spoken CALL Shared Task 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19,
    Doi: http://doi.org/10.21437/SLaTE.2019-1
  • Knill, K., Gales, M., Manakul, P. and Caines, A., 2019. Automatic grammatical error detection of non-native spoken learner English ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
    Doi: 10.1109/icassp.2019.8683755
  • Aglionby, G., Davis, C., Mishra, P., Caines, A., Yannakoudakis, H., Rei, M., Shutova, E. and Buttery, P., 2019. CAMsterdam at SemEval-2019 task 6: Neural and graph-based feature extraction for the identification of offensive tweets NAACL HLT 2019 - International Workshop on Semantic Evaluation, SemEval 2019, Proceedings of the 13th Workshop,
  • Moore, R., Caines, A., Rice, A. and Buttery, P., 2019. Behavioural cloning of teachers for automatic homework selection Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 11625 LNAI
    Doi: http://doi.org/10.1007/978-3-030-23204-7_28
  • Knill, KM., Gales, MJF., Manakul, PP. and Caines, AP., 2019. Automatic Grammatical Error Detection of Non-native Spoken Learner English ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, v. 2019-May
    Doi: 10.1109/ICASSP.2019.8683080
  • Moore, R., Caines, A., Elliott, M., Zaidi, A., Rice, A. and Buttery, P., 2019. Skills embeddings: A neural approach to multicomponent representations of students and tasks EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining,
  • Zaidi, AH., Caines, A., Davis, C., Moore, R., Buttery, P. and Rice, A., 2019. Accurate modelling of language learning tasks and students using representations of grammatical proficiency EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining,
  • 2018

  • Caines, A., Pastrana, S., Hutchings, A. and Buttery, P., 2018. Aggressive language in an online hacking forum 2nd Workshop on Abusive Language Online - Proceedings of the Workshop, co-located with EMNLP 2018,
  • Pastrana, S., Hutchings, A., Caines, A. and Buttery, P., 2018. Characterizing Eve: Analysing cybercrime actors in a large underground forum Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 11050 LNCS
    Doi: http://doi.org/10.1007/978-3-030-00470-5_10
  • Knill, KM., Gales, MJF., Kyriakopoulos, K., Malinin, A., Ragni, A., Wang, Y. and Caines, AP., 2018. Impact of ASR performance on free speaking language assessment Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2018-September
    Doi: http://doi.org/10.21437/Interspeech.2018-1312
  • Baur, C., Caines, A., Chua, C., Gerlach, J., Qian, M., Rayner, M., Russell, M., Strik, H. and Wei, X., 2018. Overview of the 2018 Spoken CALL Shared Task Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2018-September
    Doi: http://doi.org/10.21437/Interspeech.2018-97
  • 2017

  • Caines, A., 2017. Spoken CALL Shared Task system description 7th ISCA Workshop on Speech and Language Technology in Education, SLaTE 2017,
    Doi: http://doi.org/10.21437/SLaTE.2017-14
  • Flint, E., Ford, E., Thomas, O., Caines, A. and Buttery, P., 2017. A Text Normalisation System for Non-Standard English Words 3rd Workshop on Noisy User-Generated Text, W-NUT 2017 - Proceedings of the Workshop,
  • Caines, A., Flint, E. and Buttery, P., 2017. Collecting fluency corrections for spoken learner English EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop,
  • Caines, A., McCarthy, M. and Buttery, P., 2017. Parsing transcripts of speech EMNLP 2017 - 1st Workshop on Speech-Centric Natural Language Processing, SCNLP 2017 - Proceedings of the Workshop,
  • 2016

  • Moore, R., Caines, A., Graham, C. and Buttery, P., 2016. Automated speech-unit delimitation in spoken learner English COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers,
  • Caines, A., Bentz, C., Graham, C., Polzehl, T. and Buttery, P., 2016. Crowdsourcing a multilingual speech corpus: Recording, transcription and annotation of the CROWDED corpus Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016,
  • Zhang, W., Caines, A., Alikaniotis, D. and Buttery, P., 2016. Predicting author age from Weibo microblog posts Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016,
  • Caines, A., Bentz, C., Alikaniotis, D., Katushemererwe, F. and Buttery, P., 2016. The Glottolog Data Explorer: Mapping the world’s languages Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16). Workshop proceedings,
  • 2015

  • Moore, R., Caines, A., Graham, C. and Buttery, P., 2015. Incremental dependency parsing and disfluency detection in spoken learner English Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 9302
    Doi: http://doi.org/10.1007/978-3-319-24033-6_53
  • 2014

  • Caines, A. and Buttery, P., 2014. The effect of disfluencies and learner errors on the parsing of spoken learner language Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages (SPMRL-SANCL 2014), Co-located with COLING,
  • 2012

  • Buttery, P. and Caines, A., 2012. Reclassifying subcategorization frames for experimental analysis and stimulus generation Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012,
  • Caines, A. and Buttery, P., 2012. Annotating progressive aspect constructions in the spoken section of the British National Corpus Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012,
  • 2010

  • Caines, A. and Buttery, P., 2010. ‘You talking to me?’ A predictive model for zero auxiliary constructions Proceedings of the Workshop on Natural Language Processing and Linguistics, Finding the Common Ground, Annual Meeting of the Association for Computational Linguistics,
  • Book chapters

    2024

  • Benedetto, L., Taslimipoor, S., Caines, A., Galvan-Sosa, D., Dueñas, G., Loukina, A. and Zesch, T., 2024. Workshop on Automatic Evaluation of Learning and Assessment Content
    Doi: http://doi.org/10.1007/978-3-031-64312-5_60
  • 2018

  • Caines, A., McCarthy, M. and Buttery, P., 2018. 'You still talking to me?': The zero auxiliary progressive in spoken British English twenty years on
  • 2017

  • Caines, A. and Buttery, P., 2017. The Effect of Task and Topic on Opportunity of Use in Learner Corpora
  • 2016

  • Caines, A., McCarthy, M. and O’Keeffe, A., 2016. Spoken language corpora and pedagogical applications
    Doi: http://doi.org/10.4324/9781315657899-39
  • Caines, A., McCarthy, M. and O'Keeffe, A., 2016. Spoken language corpora and pedagogical applications
    Doi: http://doi.org/10.4324/9781315657899
  • 2012

  • Caines, A. and Buttery, P., 2012. Normalising frequency counts to account for ‘opportunity of use’ in learner corpora
  • Caines, A., 2012. ‘You talking to me?’ Testing corpus data with a shadowing experiment
  • Reports

    2024

  • Nicholls, D., Caines, A. and Buttery, P., 2024. The Write & Improve Corpus 2024: Error-annotated and CEFR-labelled essays by learners of English
    Doi: http://doi.org/10.17863/CAM.112997
  • 2017

  • Caines, AP., Nicholls, D. and Buttery, P., 2017. Annotating errors and disfluencies in transcriptions of speech

  • Dr Helen Yannakoudakis

    Automated assessment

    Conference proceedings

    2019 (No publication date)

  • Mishra, P., Giannakoudaki, E. and Shutova, E., 2019 (No publication date). Neural Character-based Composition Models for Abuse Detection
  • Pushkar, M., Del Tredici, M., Giannakoudaki, E. and Shutova, E., 2019 (No publication date). Author Profiling for Abuse Detection
  • Flachs, S., Lacroix, O., Rei, M., Giannakoudaki, E. and Søgaard, A., 2019 (No publication date). A Simple and Robust Approach to Detecting Subject-Verb Agreement Errors.
  • Farag, Y., Yannakoudakis, H. and Briscoe, T., 2019 (No publication date). Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input Proceedings of NAACL-HLT 2018, New Orleans, Louisiana, pages 263–271, v. 1
    Doi: http://doi.org/10.18653/v1/N18-1024
  • 2019 (Accepted for publication)

  • Mu, J., Giannakoudaki, E. and Shutova, E., 2019 (Accepted for publication). Learning Outside the Box: Discourse-level Features Improve Metaphor Identification. In Proceedings of the 17th Annual Conference of the North American Chapter of the Association for Computational Linguistics
  • 2019

  • Aglionby, G., Davis, C., Mishra, P., Caines, A., Yannakoudakis, H., Rei, M., Shutova, E. and Buttery, P., 2019. CAMsterdam at SemEval-2019 task 6: Neural and graph-based feature extraction for the identification of offensive tweets NAACL HLT 2019 - International Workshop on Semantic Evaluation, SemEval 2019, Proceedings of the 13th Workshop,
  • Mishra, P., Del Tredici, M., Giannakoudaki, E. and Shutova, E., 2019. Abusive Language Detection with Graph Convolutional Networks Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, v. 1
    Doi: http://doi.org/10.18653/v1/N19-1221
  • Farag, Y. and Giannakoudaki, E., 2019. Multi-Task Learning for Coherence Modeling. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
  • 2017

  • Rei, M. and Giannakoudaki, E., 2017. Auxiliary Objectives for Neural Error Detection Models
  • Giannakoudaki, E., Rei, M., Andersen, OE. and Yuan, Z., 2017. Neural Sequence-Labelling Models for Grammatical Error Correction Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, v. D17-1
    Doi: http://doi.org/10.18653/v1/D17-1297
  • Shutova, E., Wundsam, A. and Yannakoudakis, H., 2017. Semantic frames and visual scenes: Learning semantic role inventories from image and video descriptions *SEM 2017 - 6th Joint Conference on Lexical and Computational Semantics, Proceedings,
    Doi: http://doi.org/10.18653/v1/s17-1018
  • 2016

  • Alikaniotis, D., Yannakoudakis, H. and Rei, M., 2016. Automatic text scoring using neural networks 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, v. 2
    Doi: http://doi.org/10.18653/v1/p16-1068
  • Rei, M. and Yannakoudakis, H., 2016. Compositional sequence labeling models for error detection in learner writing 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, v. 2
  • 2014

  • Felice, M., Yuan, Z., Andersen, ØE., Yannakoudakis, H. and Kochmar, E., 2014. Grammatical error correction using hybrid systems and type filtering CoNLL 2014 - 18th Conference on Computational Natural Language Learning, Proceedings of the Shared Task,
    Doi: http://doi.org/10.3115/v1/w14-1702
  • Journal articles

    2018

  • Yannakoudakis, H., Andersen, ØE., Geranpayeh, A., Briscoe, T. and Nicholls, D., 2018. Developing an automated writing placement system for ESL learners Applied Measurement in Education, v. 31
    Doi: http://doi.org/10.1080/08957347.2018.1464447
  • Other publications

    2018

  • Farag, Y., Yannakoudakis, H. and Briscoe, T., 2018. Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input. CoRR, v. abs/1804.06898

  • What we do

    Cambridge Language Sciences is an Interdisciplinary Research Centre at the University of Cambridge. Our virtual network connects researchers from five schools across the university as well as other world-leading research institutions. Our aim is to strengthen research collaborations and knowledge transfer across disciplines in order to address large-scale multi-disciplinary research challenges relating to language research.
