Cambridge Language Sciences

Interdisciplinary Research Centre
 

Research in the Language Technology Lab (LTL) centres on Natural Language Processing (NLP): how to develop, adapt and apply fundamental NLP techniques to meet the needs of important practical applications in different areas of society.


Professor Nigel Collier

Computational linguistics; machine learning; semantics; text/data mining; knowledge discovery; domain adaptation; question answering

Conference proceedings

2023

  • Liu, F., Eisenschlos, JM., Piccinno, F., Krichene, S., Pang, C., Lee, K., Joshi, M., Chen, W., Collier, N. and Altun, Y., 2023. DEPLOT: One-shot visual language reasoning by plot-to-table translation Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Shu, C., Han, J., Liu, F., Shareghi, E. and Collier, N., 2023. POSQA: Probe the World Models of LLMs with Size Comparisons Findings of the Association for Computational Linguistics: EMNLP 2023,
  • Fu, Z., Su, Y., Meng, Z. and Collier, N., 2023. Biomedical Named Entity Recognition via Dictionary-based Synonym Generalization EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Vulić, I., Glavaš, G., Liu, F., Collier, N., Ponti, EM. and Korhonen, A., 2023. Probing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference,
  • Fu, Z., Yang, H., So, AMC., Lam, W., Bing, L. and Collier, N., 2023. On the Effectiveness of Parameter-Efficient Fine-Tuning Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023, v. 37
  • Liu, F., Piccinno, F., Krichene, S., Pang, C., Lee, K., Joshi, M., Altun, Y., Collier, N. and Eisenschlos, JM., 2023. MATCHA: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Zhang, M., Su, Y., Meng, Z., Fu, Z. and Collier, N., 2023. COFFEE: A Contrastive Oracle-Free Framework for Event Extraction Proceedings of the Annual Meeting of the Association for Computational Linguistics,
2022

  • Okhmatovskaia, A., Shen, Y., Ganser, I., Collier, N., King, NB., Meng, Z. and Buckeridge, DL., 2022. A Conceptual Framework for Representing Events Under Public Health Surveillance. Stud Health Technol Inform, v. 294
    Doi: http://doi.org/10.3233/SHTI220480
  • Su, Y., Liu, F., Meng, Z., Lan, T., Shu, L., Shareghi, E. and Collier, N., 2022. TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning Findings of the Association for Computational Linguistics: NAACL 2022 - Findings,
  • Conforti, C., Berndt, J., Pilehvar, MT., Giannitsarou, C., Toxvaerd, F. and Collier, N., 2022. Incorporating Stock Market Signals for Twitter Stance Detection Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Meng, Z., Liu, F., Shareghi, E., Su, Y., Collins, C. and Collier, N., 2022. Rewire-then-Probe: A Contrastive Recipe for Probing Biomedical Knowledge of Pre-trained Language Models Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Li, Y., Liu, F., Collier, N., Korhonen, A. and Vulić, I., 2022. Improving Word Translation via Two-Stage Contrastive Learning Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Zhou, W., Liu, F., Vulić, I., Collier, N. and Chen, M., 2022. Prix-LM: Pretraining for Multilingual Knowledge Base Construction Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Liu, Y., Su, Y., Shareghi, E. and Collier, N., 2022. Plug-and-Play Recipe Generation with Content Planning GEM 2022 - 2nd Workshop on Natural Language Generation, Evaluation, and Metrics, Proceedings of the Workshop,
  • Su, Y., Lan, T., Wang, Y., Yogatama, D., Kong, L. and Collier, N., 2022. A Contrastive Framework for Neural Text Generation Advances in Neural Information Processing Systems, v. 35
2021

  • Su, Y., Vandyke, D., Baker, S., Wang, Y. and Collier, N., 2021. Keep the Primary, Rewrite the Secondary: A Two-Stage Approach for Paraphrase Generation Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021,
  • Su, Y., Meng, Z., Baker, S. and Collier, N., 2021. Few-Shot Table-to-Text Generation with Prototype Memory Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021,
  • Su, Y., Vandyke, D., Wang, S., Fang, Y. and Collier, N., 2021. Plan-then-Generate: Controlled Data-to-Text Generation via Planning Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021,
  • Liu, Q., Liu, F., Collier, N., Korhonen, A. and Vulić, I., 2021. MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models Proceedings of the 25th Conference on Computational Natural Language Learning,
    Doi: 10.18653/v1/2021.conll-1.44
  • Su, Y., Cai, D., Wang, Y., Vandyke, D., Baker, S., Li, P. and Collier, N., 2021. Non-Autoregressive Text Generation with Pre-trained Language Models Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics,
  • Meng, Z., Liu, F., Clark, T., Shareghi, E. and Collier, N., 2021. Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing,
    Doi: 10.18653/v1/2021.emnlp-main.383
  • Clark, TH., Conforti, C., Liu, F., Meng, Z., Shareghi, E. and Collier, N., 2021. Integrating Transformers and Knowledge Graphs for Twitter Stance Detection W-NUT 2021 - 7th Workshop on Noisy User-Generated Text, Proceedings of the Conference,
  • Liu, F., Shareghi, E., Meng, Z., Basaldella, M. and Collier, N., 2021. Self-Alignment Pretraining for Biomedical Entity Representations NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference,
  • Su, Y., Cai, D., Zhou, Q., Lin, Z., Baker, S., Cao, Y., Shi, S., Collier, N. and Wang, Y., 2021. Dialogue response selection with hierarchical curriculum learning ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference,
  • Prokhorov, V., Li, Y., Shareghi, E. and Collier, N., 2021. Learning Sparse Sentence Encoding without Supervision: An Exploration of Sparsity in Variational Autoencoders RepL4NLP 2021 - 6th Workshop on Representation Learning for NLP, Proceedings of the Workshop,
  • Conforti, C., Berndt, J., Pilehvar, MT., Giannitsarou, C., Toxvaerd, F. and Collier, N., 2021. Synthetic Examples Improve Cross-Target Generalization: A Study on Stance Detection on a Twitter Corpus WASSA 2021 - Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Proceedings of the 11th Workshop,
  • Liu, F., Bugliarello, E., Ponti, EM., Reddy, S., Collier, N. and Elliott, D., 2021. Visually Grounded Reasoning across Languages and Cultures EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Liu, F., Vulić, I., Korhonen, A. and Collier, N., 2021. Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference, v. 2
  • Liu, F., Vulić, I., Korhonen, A. and Collier, N., 2021. Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Liu, F., Chen, M., Roth, D. and Collier, N., 2021. Visual Pivoting for (Unsupervised) Entity Alignment 35th AAAI Conference on Artificial Intelligence, AAAI 2021, v. 5B
  • Conforti, C., Berndt, J., Pilehvar, MT., Basaldella, M., Giannitsarou, C., Toxvaerd, F. and Collier, N., 2021. Adversarial Training for News Stance Detection: Leveraging Signals from a Multi-Genre Corpus. EACL Hackashop on News Media Content Analysis and Automated Report Generation, Hackashop 2021 at 16th conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 - Proceedings,
2020

  • Conforti, C., Berndt, J., Pilehvar, MT., Giannitsarou, C., Toxvaerd, F. and Collier, N., 2020. STANDER: An expert-annotated dataset for news stance detection and evidence retrieval Findings of the Association for Computational Linguistics Findings of ACL: EMNLP 2020,
  • Basaldella, M., Liu, F., Shareghi, E. and Collier, N., 2020. COMETA: A corpus for medical entity linking in the social media EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference,
  • Conforti, C., Berndt, J., Pilehvar, MT., Giannitsarou, C., Toxvaerd, F. and Collier, N., 2020. Will-they-won't-they: A very large dataset for stance detection on twitter Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Liu, F., Shareghi, E., Meng, Z., Basaldella, M. and Collier, N., 2020. Self-Alignment Pretraining for Biomedical Entity Representations
2019 (No publication date)

  • Conforti, C., Collier, N. and Pilehvar, M., 2019 (No publication date). Towards Automatic Fake News Detection: Cross-Level Stance Detection in News Articles
    Doi: http://doi.org/10.17863/CAM.37758
2019

  • Prokhorov, V., Pilehvar, MT., Kartsaklis, D., Liò, P. and Collier, N., 2019. Unseen word representation by aligning heterogeneous lexical semantic spaces 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019,
  • Can, DC., Le, HQ., Ha, QT. and Collier, N., 2019. A richer-but-smarter shortest dependency path with attentive augmentation for relation extraction NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1
  • Conforti, C., Pilehvar, MT. and Collier, N., 2019. Modeling the fake news challenge as a cross-level stance detection task CEUR Workshop Proceedings, v. 2482
  • Prokhorov, V., Pilehvar, MT. and Collier, N., 2019. Generating knowledge graph paths from textual definitions using sequence-to-sequence models NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1
  • Prokhorov, V., Shareghi, E., Li, Y., Pilehvar, MT. and Collier, N., 2019. On the importance of the Kullback-Leibler divergence term in Variational Autoencoders for text generation EMNLP-IJCNLP 2019 - Proceedings of the 3rd Workshop on Neural Generation and Translation,
  • Basaldella, M. and Collier, N., 2019. BioReddit: Word embeddings for user-generated biomedical NLP LOUHI@EMNLP 2019 - 10th International Workshop on Health Text Mining and Information Analysis, Proceedings,
2018

  • Kartsaklis, D., Pilehvar, MT. and Collier, N., 2018. Mapping text to knowledge graph entities using multi-sense LSTMs Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018,
  • Le, HQ., Can, DC., Vu, ST., Dang, TH., Pilehvar, MT. and Collier, N., 2018. Large-scale exploration of neural relation classification architectures Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018,
  • Pilehvar, MT., Kartsaklis, D., Prokhorov, V. and Collier, N., 2018. CARD-660: Cambridge rare word dataset - A reliable benchmark for infrequent word representation models Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018,
  • Gritta, M., Pilehvar, MT. and Collier, N., 2018. Which Melbourne? Augmenting geocoding with maps ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 1
    Doi: http://doi.org/10.18653/v1/p18-1119
2017 (Accepted for publication)

  • Collier, NH., Pilehvar, MT., Limsopatham, N. and Gritta, M., 2017 (Accepted for publication). Vancouver Welcomes You! Minimalist Location Metonymy Resolution Proceedings of the Association of Computational Linguistics Annual Meeting (ACL 2017), Vancouver, Canada,
    Doi: http://doi.org/10.18653/v1/P17-1115
2017

  • Collier, N., Limsopatham, N., Culotta, A., Conway, M., Cox, IJ. and Lampos, V., 2017. WSDM 2017 workshop on mining online health reports WSDM workshop summary WSDM 2017 - Proceedings of the 10th ACM International Conference on Web Search and Data Mining,
    Doi: http://doi.org/10.1145/3018661.3022761
  • Pilehvar, MT. and Collier, N., 2017. Inducing Embeddings for Rare and Unseen Words by Leveraging Lexical Resources Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, v. 2, Short Papers
  • Pilehvar, MT., Camacho-Collados, J., Navigli, R. and Collier, N., 2017. Towards a seamless integration of word senses into downstream NLP applications ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 1
    Doi: http://doi.org/10.18653/v1/P17-1170
  • Le, HQ., Tran, MV., Can, DC., Ha, QT., Dang, TH. and Collier, N., 2017. Improving chemical-induced disease relation extraction with learned features based on convolutional neural network Proceedings - 2017 9th International Conference on Knowledge and Systems Engineering, KSE 2017, v. 2017-January
    Doi: http://doi.org/10.1109/KSE.2017.8119474
2016

  • Limsopatham, N. and Collier, N., 2016. Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016) at the 26th International Conference on Computational Linguistics (COLING 2016) Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining,
  • Limsopatham, N. and Collier, NH., 2016. Bidirectional LSTM for Named Entity Recognition in Twitter Messages Proceedings of the 2nd Workshop on Noisy User-generated Text,
  • Collier, NH. and Pilehvar, MT., 2016. De-Conflated Semantic Representations Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing,
    Doi: http://doi.org/10.18653/v1/D16-1174
  • Limsopatham, N. and Collier, N., 2016. Normalising Medical Concepts in Social Media Texts by Learning Semantic Representation Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL’16),
2015

  • Limsopatham, N. and Collier, N., 2015. Adapting phrase-based machine translation to normalise medical terms in social media messages Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing,
    Doi: http://doi.org/10.18653/v1/d15-1194
  • Limsopatham, N. and Collier, N., 2015. Towards the semantic interpretation of personal health messages from social media UCUI 2015 - Proceedings of the ACM 1st International Workshop on Understanding the City with Urban Informatics, co-located with CIKM 2015,
    Doi: http://doi.org/10.1145/2811271.2811275
2014

  • Lofi, C., Nieke, C. and Collier, N., 2014. Discriminating rhetorical analogies in social media 14th Conference of the European Chapter of the Association for Computational Linguistics 2014, EACL 2014,
    Doi: http://doi.org/10.3115/v1/e14-1059
  • Collier, N., Paster, F. and Tran, MV., 2014. The impact of near domain transfer on biomedical named entity recognition Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis, Louhi 2014 at the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014,
2013

  • Bao, Y., Collier, N. and Datta, A., 2013. A partially supervised cross-collection topic model for cross-domain text classification International Conference on Information and Knowledge Management, Proceedings,
    Doi: http://doi.org/10.1145/2505515.2505556
  • Bao, Y., Collier, N. and Datta, A., 2013. Improving text categorization by augmenting topic features with a small number of word features WITS 2013 - 23rd Workshop on Information Technology and Systems: Leveraging Big Data Analytics for Societal Benefits,
  • Groza, T., Oellrich, A. and Collier, N., 2013. Using silver and semi-gold standard corpora to compare open named entity recognisers Proceedings - 2013 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2013,
    Doi: http://doi.org/10.1109/BIBM.2013.6732541
  • Tran, MV., Le, HQ., Phi, VT., Pham, TB. and Collier, N., 2013. Exploring a Probabilistic Earley Parser for Event Composition in Biomedical Texts Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 2013-October
2012

  • Doan, S., Vo, BKH. and Collier, N., 2012. An analysis of twitter messages in the 2011 Tohoku Earthquake Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, v. 91 LNICST
    Doi: http://doi.org/10.1007/978-3-642-29262-0_8
  • Collier, N., Tran, MV., Le, HQ., Oellrich, A., Kawazoe, A., Hall-May, M. and Rebholz-Schuhmann, D., 2012. A hybrid approach to finding phenotype candidates in genetic texts 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers,
  • Lau, JH., Collier, N. and Baldwin, T., 2012. On-line trend analysis with topic models: Twitter trends detection topic model online 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers,
  • Ananiadou, S., Salakoski, T., Pyysalo, S., Rebholz-Schuhmann, D., Rinaldi, F., Schneider, G., Clematide, S., Grigonyte, G., Shepherd, A., Burgun-Parenthoine, A., McClosky, D., Demner-Fushman, D., Ginter, F., Leitner, F., Nenadic, G., Yi, GS., Liu, H., Su, J., Lee, H., Kim, JD., Park, JC., Kim, JJ., Verspoor, K., Cohen, K., Miwa, M., Krallinger, M., Romacker, M., Volk, M., Krauthammer, M., Conway, M., Okazaki, N., Collier, N., Ruch, P., Lambrix, P., Zweigenbaum, P., Ohta, T., Sætre, R., Hahn, U., Chapman, W., Tsuruoka, Y., Sasaki, Y., Mulkar-Mehta, R., Zhang, W. and Stenetorp, P., 2012. Introductory remarks SMBM 2012 - Proceedings of the 5th International Symposium on Semantic Mining in Biomedicine,
    Doi: http://doi.org/10.5167/uzh-64476
  • Doan, S., Ohno-Machado, L. and Collier, N., 2012. Enhancing twitter data analysis with simple semantic filtering: Example in tracking influenza-like illnesses Proceedings - 2012 IEEE 2nd Conference on Healthcare Informatics, Imaging and Systems Biology, HISB 2012,
    Doi: http://doi.org/10.1109/HISB.2012.21
  • Liu, S., Yamada, M., Collier, N. and Sugiyama, M., 2012. Change-point detection in time-series data by relative density-ratio estimation Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 7626 LNCS
    Doi: http://doi.org/10.1007/978-3-642-34166-3_40
  • Collier, N. and Doan, S., 2012. Syndromic classification of twitter messages Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, v. 91 LNICST
    Doi: http://doi.org/10.1007/978-3-642-29262-0_27
2010

  • Collier, N., Goodwin, RM., McCrae, J., Doan, S., Kawazoe, A., Conway, M., Kawtrakul, A., Takeuchi, K. and Dien, D., 2010. An ontology-driven system for detecting global health events Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference, v. 2
  • Collier, N., 2010. Towards cross-lingual alerting for bursty epidemic events CEUR Workshop Proceedings, v. 714
  • Collier, N., Son, NT. and Nguyen, NM., 2010. OMG U got flu? Analysis of shared health messages for bio-surveillance CEUR Workshop Proceedings, v. 714
  • Rebholz-Schuhmann, D., Yepes, AJ., Li, C., Kafkas, S., Lewin, I., Kang, N., Corbett, P., Milward, D., Buyko, E., Beisswanger, E., Hornbostel, K., Kouznetsov, A., Witte, R., Laurila, JB., Baker, CJO., Kuo, CJ., Clematide, S., Rinaldi, F., Farkas, R., Móra, G., Hara, K., Furlong, L., Rautschka, M., Neves, ML., Pascual-Montano, A., Wei, Q., Collier, N., Chowdhury, MFM., Lavelli, A., Berlanga, R., Morante, R., Van Asch, V., Daelemans, W., Marina, JL., Van Mulligen, E., Kors, J. and Hahn, U., 2010. Assessment of NER solutions against the first and second CALBC Silver Standard Corpus CEUR Workshop Proceedings, v. 714
2009

  • Sinnou, T., Takeuchi, K. and Collier, N., 2009. Bio-medical term extraction on simple rule language 3rd International Symposium on Languages in Biology and Medicine, LBM 2009,
  • Rebholz-Schuhmann, D., Collier, N., Park, JC. and Wong, L., 2009. Preface 3rd International Symposium on Languages in Biology and Medicine, LBM 2009,
  • Conway, M., Collier, N. and Doan, S., 2009. Using Hedges to Enhance a Disease Outbreak Report Text Mining System Proceedings ...,
2008

  • Conway, M., Doan, S., Kawazoe, A. and Collier, N., 2008. Classifying disease outbreak reports using n-grams and semantic features 3rd International Symposium on Semantic Mining in Biomedicine, SMBM 2008 - Proceedings,
  • Doan, S., Hung-Ngo, Q., Kawazoe, A. and Collier, N., 2008. Global health monitor - A web-based system for detecting and mapping infectious diseases IJCNLP 2008 - 3rd International Joint Conference on Natural Language Processing, Proceedings of the Conference, v. 2
2007

  • Hoang, V., Nguyen, N., Dinh, D. and Collier, N., 2007. Topic-based Vietnamese news document filtering in the BioCaster Project Proceedings - ALPIT 2007 6th International Conference on Advanced Language Processing and Web Information Technology,
    Doi: http://doi.org/10.1109/ALPIT.2007.56
  • Wei, Q., Krymolowski, Y. and Collier, N., 2007. Towards a methodology for entity error analysis in annotated corpora CEUR Workshop Proceedings, v. 289
  • Doan, S., Kawazoe, A. and Collier, N., 2007. The role of roles in classifying annotated biomedical text ACL 2007 - Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing,
    Doi: http://doi.org/10.3115/1572392.1572396
2006

  • Korhonen, A., Krymolowski, Y. and Collier, N., 2006. Automatic Classification of Verbs in Biomedical Texts COLING/ACL 2006, VOLS 1 AND 2, PROCEEDINGS OF THE CONFERENCE,
  • Kawazoe, A., Jin, L., Shigematsu, M., Barrero, R., Taniguchi, K. and Collier, N., 2006. The development of a schema for the annotation of terms in the BioCaster disease detecting/tracking system CEUR Workshop Proceedings, v. 222
2005

  • Cohen, KB., Hirschman, L., Shatkay, H., Blaschke, C., Ananiadou, S., Aronson, L., Baldwin, B., Bodenreider, O., Bradshaw, S., Carpenter, B., Chang, J., Cohen, A., Collier, N., Fox, L., Futrelle, B., Harkema, H., Hearst, M., Hunter, L., Johnson, S., Light, M., Liu, H., Morgan, A., Pustejovsky, J., Rindflesch, T., Rzhetsky, A., Saric, J., Tanabe, L., Tsujii, JI., Valencia, A., Verspoor, K., Wilbur, J., Yu, H. and Blake, JA., 2005. Introduction ACL-ISMB 2005 - Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics, Proceedings of the Workshop,
  • Wattarujeekrit, T. and Collier, N., 2005. Exploring predicate-argument relations for named entity recognition in the molecular biology domain Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 3735 LNAI
    Doi: http://doi.org/10.1007/11563983_23
2004

  • Mullen, T. and Collier, N., 2004. Sentiment analysis using support vector machines with diverse information sources Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, EMNLP 2004 - A meeting of SIGDAT, a Special Interest Group of the ACL held in conjunction with ACL 2004,
  • Mizuta, Y. and Collier, N., 2004. Zone identification in biology articles as a basis for information extraction Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications - JNLPBA '04,
    Doi: 10.3115/1567594.1567600
  • Kawazoe, A., Kitamoto, A. and Collier, N., 2004. Managing the semantics of coreference relations with Open Ontology Forge CEUR Workshop Proceedings, v. 184
  • Wattarujeekrit, T. and Collier, N., 2004. Integrating event frame annotation into the open ontology forge annotation tool CEUR Workshop Proceedings, v. 184
  • Mullen, T. and Collier, N., 2004. Incorporating topic information into sentiment analysis models Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 2004-July
  • Kim, J-D., Ohta, T., Tsuruoka, Y., Tateisi, Y. and Collier, N., 2004. Introduction to the bio-entity recognition task at JNLPBA Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications - JNLPBA '04,
    Doi: 10.3115/1567594.1567610
  • Kawazoe, A., Kitamoto, A. and Collier, N., 2004. Annotation of coreference relations among linguistic expressions and images in biological articles Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004,
  • Mizuta, Y. and Collier, N., 2004. An annotation scheme for a rhetorical analysis of biology articles Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004,
2003

  • Collier, N., Takeuchi, K., Kawazoe, A., Mullen, T. and Wattarujeekrit, T., 2003. A framework for integrating deep and shallow semantic structures in text mining Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science), v. 2773 PART 1
    Doi: http://doi.org/10.1007/978-3-540-45224-9_110
2000

  • Nobata, C., Collier, N. and Tsujii, J., 2000. Comparison between tagged corpora for the named entity task Proceedings of the workshop on Comparing corpora,
    Doi: 10.3115/1117729.1117733
1999

  • Collier, N., Tsujii, J-I., Park, HS., Ogata, N., Tateishi, Y., Nobata, C., Ohta, T., Sekimizu, T., Imai, H. and Ibushi, K., 1999. The GENIA project Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics,
    Doi: 10.3115/977035.977081
  • Jones, G., Sakai, T., Collier, N., Kumano, A. and Sumita, K., 1999. A comparison of query translation methods for English-Japanese cross-language information retrieval Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 1999,
    Doi: http://doi.org/10.1145/312624.312690
  • Collier, N., Park, HS., Ogata, N., Tateishi, Y., Nobata, C., Ohta, T., Sekimizu, T., Imai, H., Ibushi, K. and Tsujii, JI., 1999. The GENIA project: Corpus-based knowledge acquisition and information extraction from genome research papers 9th Conference of the European Chapter of the Association for Computational Linguistics, EACL 1999,
1998

  • Collier, N., Hirakawa, H. and Kumano, A., 1998. Machine translation vs. dictionary term translation - A comparison for English-Japanese news article alignment Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Collier, N., Ono, K. and Hirakawa, H., 1998. An experiment in hybrid dictionary and statistical sentence alignment Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Collier, N., Hirakawa, H. and Kumano, A., 1998. Machine translation vs. dictionary term translation Proceedings of the 36th annual meeting on Association for Computational Linguistics,
    Doi: 10.3115/980845.980888
  • Collier, N., Ono, K. and Hirakawa, H., 1998. An experiment in hybrid dictionary and statistical sentence alignment Proceedings of the 17th international conference on Computational linguistics,
    Doi: 10.3115/980451.980889
1997

  • Collier, N., 1997. Convergence time characteristics of an associative memory for natural language processing IJCAI International Joint Conference on Artificial Intelligence, v. 2
Journal articles

2023

  • Nettekoven, CR., Diederen, K., Giles, O., Duncan, H., Stenson, I., Olah, J., Gibbs-Dean, T., Collier, N., Vértes, PE., Spencer, TJ., Morgan, SE. and McGuire, P., 2023. Semantic Speech Networks Linked to Formal Thought Disorder in Early Psychosis. Schizophr Bull, v. 49
    Doi: 10.1093/schbul/sbac056
2022

  • Le, H-Q., Can, D-C. and Collier, N., 2022. Exploiting document graphs for inter sentence relation extraction. J Biomed Semantics, v. 13
    Doi: http://doi.org/10.1186/s13326-022-00267-3
  • Pilehvar, MT., Bernard, A., Smedley, D. and Collier, N., 2022. PheneBank: a literature-based database of phenotypes. Bioinformatics, v. 38
    Doi: http://doi.org/10.1093/bioinformatics/btab740
  • Meng, Z., Okhmatovskaia, A., Polleri, M., Shen, Y., Powell, G., Fu, Z., Ganser, I., Zhang, M., King, NB., Buckeridge, D. and Collier, N., 2022. BioCaster in 2021: automatic disease outbreaks detection from global news media. Bioinformatics, v. 38
    Doi: http://doi.org/10.1093/bioinformatics/btac497
  • Wu, H., Wang, M., Wu, J., Francis, F., Chang, Y-H., Shavick, A., Dong, H., Poon, MTC., Fitzpatrick, N., Levine, AP., Slater, LT., Handy, A., Karwath, A., Gkoutos, GV., Chelala, C., Shah, AD., Stewart, R., Collier, N., Alex, B., Whiteley, W., Sudlow, C., Roberts, A. and Dobson, RJB., 2022. A survey on clinical natural language processing in the United Kingdom from 2007 to 2022. NPJ Digit Med, v. 5
    Doi: http://doi.org/10.1038/s41746-022-00730-6
2021

  • Su, Y., Wang, Y., Cai, D., Baker, S., Korhonen, A. and Collier, N., 2021. PROTOTYPE-TO-STYLE: Dialogue Generation with Style-Aware Editing on Retrieval Memory IEEE/ACM Transactions on Audio Speech and Language Processing, v. 29
    Doi: http://doi.org/10.1109/TASLP.2021.3087948
2020

  • Kim, J-D., Cohen, KB., Rinaldi, F., Lu, Z., Collier, N. and Park, H-S., 2020. Editor's introduction to the special issue of the 6th Biomedical Linked Annotation Hackathon (BLAH6). Genomics Inform, v. 18
    Doi: http://doi.org/10.5808/GI.2020.18.2.e12
  • Gritta, M., Pilehvar, MT. and Collier, N., 2020. A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics Language Resources and Evaluation, v. 54
    Doi: http://doi.org/10.1007/s10579-019-09475-3
  • 2019

  • Kim, J-D., Cohen, KB., Collier, N., Lu, Z. and Rinaldi, F., 2019. Introduction to BLAH5 special issue: recent progress on interoperability of biomedical text mining. Genomics Inform, v. 17
    Doi: http://doi.org/10.5808/GI.2019.17.2.e12
  • Gritta, M., Collier, N. and Pilehvar, M., 2019. A Pragmatic Guide to Geoparsing Evaluation Language Resources and Evaluation,
  • 2017 (Accepted for publication)

  • Gritta, M., Pilehvar, MT., Limsopatham, N. and Collier, N., 2017 (Accepted for publication). Vancouver Welcomes You! Minimalist Location Metonymy Resolution Association for Computational Linguistics,
    Doi: http://doi.org/10.18653/v1/P17-1115
  • 2017

  • Gritta, M., Pilehvar, MT., Limsopatham, N. and Collier, N., 2017. What’s missing in geographical parsing? Language Resources and Evaluation,
    Doi: http://doi.org/10.1007/s10579-017-9385-8
  • Alvaro, N., Miyao, Y. and Collier, N., 2017. TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations. JMIR Public Health Surveillance, v. 3
    Doi: http://doi.org/10.2196/publichealth.6396
  • 2016 (Accepted for publication)

  • Le, H-Q., Tran, M-V., Dang, TH., Ha, Q-T. and Collier, N., 2016 (Accepted for publication). Sieve-based coreference resolution enhances semi-supervised learning model for chemical-induced disease relation extraction. Database: the Journal of Biological Databases and Curation, v. 2016
    Doi: http://doi.org/10.1093/database/baw102
  • Verspoor, K., Oellrich, A., Collier, N., Groza, T., Rocca-Serra, P., Soldatova, L., Dumontier, M. and Shah, N., 2016 (Accepted for publication). Thematic issue of the Second combined Bio-ontologies and Phenotypes Workshop. Journal of Biomedical Semantics, v. 7
    Doi: http://doi.org/10.1186/s13326-016-0108-7
  • 2016

  • Oellrich, A., Collier, N., Groza, T., Rebholz-Schuhmann, D., Shah, N., Bodenreider, O., Boland, MR., Georgiev, I., Liu, H., Livingston, K., Luna, A., Mallon, A-M., Manda, P., Robinson, PN., Rustici, G., Simon, M., Wang, L., Winnenburg, R. and Dumontier, M., 2016. The digital revolution in phenotyping. Brief Bioinform, v. 17
    Doi: http://doi.org/10.1093/bib/bbv083
  • Pilehvar, MT. and Collier, N., 2016. Improved semantic representation for domain-specific entities BioNLP 2016 - Proceedings of the 15th Workshop on Biomedical Natural Language Processing,
  • Limsopatham, N. and Collier, N., 2016. Modelling the combination of generic and target domain embeddings in a convolutional neural network for sentence classification BioNLP 2016 - Proceedings of the 15th Workshop on Biomedical Natural Language Processing,
  • 2015

  • Collier, N., Groza, T., Smedley, D., Robinson, PN., Oellrich, A. and Rebholz-Schuhmann, D., 2015. PhenoMiner: from text to a database of phenotypes associated with OMIM diseases. Database (Oxford), v. 2015
    Doi: http://doi.org/10.1093/database/bav104
  • Soldatova, LN., Collier, N., Oellrich, A., Groza, T., Verspoor, K., Rocca-Serra, P., Dumontier, M. and Shah, NH., 2015. Special issue on bio-ontologies and phenotypes. J Biomed Semantics, v. 6
    Doi: http://doi.org/10.1186/s13326-015-0040-2
  • Alvaro, N., Conway, M., Doan, S., Lofi, C., Overington, J. and Collier, N., 2015. Crowdsourcing Twitter annotations to identify first-hand experiences of prescription drug use. J Biomed Inform, v. 58
    Doi: http://doi.org/10.1016/j.jbi.2015.11.004
  • Collier, N., Oellrich, A. and Groza, T., 2015. Concept selection for phenotypes and diseases using learn to rank. J Biomed Semantics, v. 6
    Doi: http://doi.org/10.1186/s13326-015-0019-z
  • Groza, T., Köhler, S., Doelken, S., Collier, N., Oellrich, A., Smedley, D., Couto, FM., Baynam, G., Zankl, A. and Robinson, PN., 2015. Automatic concept recognition using the human phenotype ontology reference and test suite corpora. Database (Oxford), v. 2015
    Doi: http://doi.org/10.1093/database/bav005
  • Kim, J-D., Cohen, KB., Collier, N., Lu, Z. and Stenetorp, P., 2015. Introduction to the Biomedical Linked Annotation Hackathon (BLAH) 2015 Symposium BMC proceedings, v. 9
    Doi: http://doi.org/10.1186/1753-6561-9-s5-a1
  • Oellrich, A., Collier, N., Smedley, D. and Groza, T., 2015. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes. PLoS One, v. 10
    Doi: http://doi.org/10.1371/journal.pone.0116040
  • 2014

  • Barboza, P., Vaillant, L., Le Strat, Y., Hartley, DM., Nelson, NP., Mawudeku, A., Madoff, LC., Linge, JP., Collier, N., Brownstein, JS. and Astagneau, P., 2014. Factors influencing performance of internet-based biosurveillance systems used in epidemic intelligence for early detection of infectious diseases outbreaks. PLoS One, v. 9
    Doi: http://doi.org/10.1371/journal.pone.0090536
  • 2013 (Published online)

  • Keffala, B., Conway, M., Doan, S. and Collier, N., 2013 (Published online). Content Analysis of Syndromic Twitter Data Online Journal of Public Health Informatics, v. 5
    Doi: 10.5210/ojphi.v5i1.4548
  • 2013

  • Barboza, P., Vaillant, L., Mawudeku, A., Nelson, NP., Hartley, DM., Madoff, LC., Linge, JP., Collier, N., Brownstein, JS., Yangarber, R., Astagneau, P. and Early Alerting Reporting Project of the Global Health Security Initiative, 2013. Evaluation of epidemic intelligence systems integrated in the early alerting and reporting project for the detection of A/H5N1 influenza events. PLoS One, v. 8
    Doi: http://doi.org/10.1371/journal.pone.0057252
  • Collier, N., Tran, M-V., Le, H-Q., Ha, Q-T., Oellrich, A. and Rebholz-Schuhmann, D., 2013. Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking. PLoS One, v. 8
    Doi: http://doi.org/10.1371/journal.pone.0072965
  • Hay, SI., Battle, KE., Pigott, DM., Smith, DL., Moyes, CL., Bhatt, S., Brownstein, JS., Collier, N., Myers, MF., George, DB. and Gething, PW., 2013. Global mapping of infectious disease. Philos Trans R Soc Lond B Biol Sci, v. 368
    Doi: http://doi.org/10.1098/rstb.2012.0250
  • Collier, N., Oellrich, A. and Groza, T., 2013. Toward knowledge support for analysis and interpretation of complex traits. Genome Biol, v. 14
    Doi: http://doi.org/10.1186/gb-2013-14-9-214
  • Hartley, DM., Nelson, NP., Arthur, RR., Barboza, P., Collier, N., Lightfoot, N., Linge, JP., van der Goot, E., Mawudeku, A., Madoff, LC., Vaillant, L., Walters, R., Yangarber, R., Mantero, J., Corley, CD. and Brownstein, JS., 2013. An overview of internet biosurveillance. Clin Microbiol Infect, v. 19
    Doi: http://doi.org/10.1111/1469-0691.12273
  • Liu, S., Yamada, M., Collier, N. and Sugiyama, M., 2013. Change-point detection in time-series data by relative density-ratio estimation. Neural Netw, v. 43
    Doi: http://doi.org/10.1016/j.neunet.2013.01.012
  • 2012

  • Collier, N., 2012. Uncovering text mining: a survey of current work on web-based epidemic intelligence. Glob Public Health, v. 7
    Doi: http://doi.org/10.1080/17441692.2012.699975
  • Doan, S., Collier, N., Xu, H., Pham, HD. and Tu, MP., 2012. Recognition of medication information from discharge summaries using ensembles of classifiers. BMC Med Inform Decis Mak, v. 12
    Doi: http://doi.org/10.1186/1472-6947-12-36
  • Collier, N. and Doan, S., 2012. GENI-DB: a database of global events for epidemic intelligence. Bioinformatics, v. 28
    Doi: http://doi.org/10.1093/bioinformatics/bts099
  • 2011

  • Rebholz-Schuhmann, D., Jimeno Yepes, A., Li, C., Kafkas, S., Lewin, I., Kang, N., Corbett, P., Milward, D., Buyko, E., Beisswanger, E., Hornbostel, K., Kouznetsov, A., Witte, R., Laurila, JB., Baker, CJ., Kuo, C-J., Clematide, S., Rinaldi, F., Farkas, R., Móra, G., Hara, K., Furlong, LI., Rautschka, M., Neves, ML., Pascual-Montano, A., Wei, Q., Collier, N., Chowdhury, MFM., Lavelli, A., Berlanga, R., Morante, R., Van Asch, V., Daelemans, W., Marina, JL., van Mulligen, E., Kors, J. and Hahn, U., 2011. Assessment of NER solutions against the first and second CALBC Silver Standard Corpus. J Biomed Semantics, v. 2 Suppl 5
    Doi: http://doi.org/10.1186/2041-1480-2-S5-S11
  • Rebholz-Schuhmann, D., Rinaldi, F., Pyysalo, S., Collier, N. and Hahn, U., 2011. Towards mature use of semantic resources for biomedical analyses. J Biomed Semantics, v. 2 Suppl 5
    Doi: http://doi.org/10.1186/2041-1480-2-S5-I1
  • Collier, N., 2011. Towards cross-lingual alerting for bursty epidemic events. J Biomed Semantics, v. 2 Suppl 5
    Doi: http://doi.org/10.1186/2041-1480-2-S5-S10
  • Collier, N., Son, NT. and Nguyen, NM., 2011. OMG U got flu? Analysis of shared health messages for bio-surveillance. J Biomed Semantics, v. 2 Suppl 5
    Doi: http://doi.org/10.1186/2041-1480-2-S5-S9
  • Wei, Q. and Collier, N., 2011. Towards classifying species in systems biology papers using text mining. BMC Res Notes, v. 4
    Doi: http://doi.org/10.1186/1756-0500-4-32
  • Coulet, A., Garten, Y., Dumontier, M., Altman, RB., Musen, MA. and Shah, NH., 2011. Integration and publication of heterogeneous text-mined relationships on the Semantic Web. J Biomed Semantics, v. 2 Suppl 2
    Doi: http://doi.org/10.1186/2041-1480-2-S2-S10
  • 2010

  • Chanlekha, H. and Collier, N., 2010. A methodology to enhance spatial understanding of disease outbreak events reported in news articles. Int J Med Inform, v. 79
    Doi: http://doi.org/10.1016/j.ijmedinf.2010.01.014
  • Chanlekha, H., Kawazoe, A. and Collier, N., 2010. A framework for enhancing spatial and temporal granularity in report-based health surveillance systems. BMC Med Inform Decis Mak, v. 10
    Doi: http://doi.org/10.1186/1472-6947-10-1
  • Hartley, D., Nelson, N., Walters, R., Arthur, R., Yangarber, R., Madoff, L., Linge, J., Mawudeku, A., Collier, N., Brownstein, J., Thinus, G. and Lightfoot, N., 2010. Landscape of international event-based biosurveillance. Emerg Health Threats J, v. 3
    Doi: http://doi.org/10.3134/ehtj.10.003
  • Rebholz-Schuhmann, D., Collier, N., Park, JC. and Wong, L., 2010. Wrestling with biomedical research results: Language resources and literature analysis Journal of Bioinformatics and Computational Biology, v. 8
    Doi: http://doi.org/10.1142/S0219720010004598
  • Hartley, D., Nelson, N., Walters, R., Arthur, R., Yangarber, R., Madoff, L., Linge, J., Mawudeku, A., Collier, N., Brownstein, J., Thinus, G. and Lightfoot, N., 2010. The landscape of international event-based biosurveillance Emerging Health Threats Journal, v. 3
    Doi: 10.3402/ehtj.v3i0.7096
  • Rebholz-Schuhmann, D., Collier, N., Park, JC. and Wong, L., 2010. Wrestling with biomedical research results: language resources and literature analysis. Introduction. J Bioinform Comput Biol, v. 8
    Doi: http://doi.org/10.1142/s0219720010004598
  • Conway, M., Kawazoe, A., Chanlekha, H. and Collier, N., 2010. Developing a disease outbreak event corpus. J Med Internet Res, v. 12
    Doi: http://doi.org/10.2196/jmir.1323
  • Chanlekha, H. and Collier, N., 2010. Analysis of syntactic and semantic features for fine-grained event-spatial understanding in outbreak news reports. J Biomed Semantics, v. 1
    Doi: http://doi.org/10.1186/2041-1480-1-3
  • Collier, N., 2010. What's unusual in online disease outbreak news? J Biomed Semantics, v. 1
    Doi: http://doi.org/10.1186/2041-1480-1-2
  • 2009

  • Conway, M., Doan, S., Kawazoe, A. and Collier, N., 2009. Classifying disease outbreak reports using n-grams and semantic features. Int J Med Inform, v. 78
    Doi: http://doi.org/10.1016/j.ijmedinf.2009.03.010
  • Doan, S., Kawazoe, A., Conway, M. and Collier, N., 2009. Towards role-based filtering of disease outbreak reports. J Biomed Inform, v. 42
    Doi: http://doi.org/10.1016/j.jbi.2008.12.009
  • Kawazoe, A., Jin, L., Shigematsu, M., Bekki, D., Barrero, R., Taniguchi, K. and Collier, N., 2009. The development of a schema for semantic annotation: Gain brought by a formal ontological method Applied Ontology, v. 4
    Doi: http://doi.org/10.3233/AO-2009-0062
  • 2008 (Published online)

  • Doan, S., Ngo, Q-H., Kawazoe, A. and Collier, N., 2008 (Published online). Building and Using Geospatial Ontology in the BioCaster Surveillance System Nature Precedings,
    Doi: 10.1038/npre.2008.2110
  • Doan, S., Ngo, Q-H., Kawazoe, A. and Collier, N., 2008 (Published online). Building and Using Geospatial Ontology in the BioCaster Surveillance System Nature Precedings,
    Doi: 10.1038/npre.2008.2110.1
  • 2008

  • Collier, N., Doan, S., Kawazoe, A., Goodwin, RM., Conway, M., Tateno, Y., Ngo, Q-H., Dien, D., Kawtrakul, A., Takeuchi, K., Shigematsu, M. and Taniguchi, K., 2008. BioCaster: detecting public health rumors with a Web-based text mining system. Bioinformatics, v. 24
    Doi: http://doi.org/10.1093/bioinformatics/btn534
  • Kawazoe, A., Chanlekha, H., Shigematsu, M. and Collier, N., 2008. Structuring an event ontology for disease outbreak detection. BMC Bioinformatics, v. 9 Suppl 3
    Doi: http://doi.org/10.1186/1471-2105-9-S3-S8
  • McCrae, J. and Collier, N., 2008. Synonym set extraction from the biomedical literature by lexical pattern discovery. BMC Bioinformatics, v. 9
    Doi: http://doi.org/10.1186/1471-2105-9-159
  • Collier, N., Doan, S., Kawazoe, A., Shigematsu, M., Taniguchi, K., Takeuchi, K., Kawtrakul, A. and Dien, D., 2008. The Global Health Monitor: A Bio-Geographic View of World Outbreak News International Journal of Infectious Diseases, v. 12
    Doi: http://doi.org/10.1016/j.ijid.2008.05.482
  • Korhonen, A., Krymolowski, Y. and Collier, N., 2008. The choice of features for classification of verbs in biomedical texts Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference, v. 1
    Doi: http://doi.org/10.3115/1599081.1599138
  • 2007

  • Thao, PTX., Tri, TQ., Dien, D. and Collier, N., 2007. Named entity recognition in Vietnamese using classifier voting ACM Transactions on Asian Language Information Processing, v. 6
    Doi: http://doi.org/10.1145/1316457.1316460
  • Tri Tran, Q., Thao Pham, TX., Hung Ngo, Q., Dinh, D. and Collier, N., 2007. Named entity recognition in Vietnamese documents Progress in Informatics,
    Doi: http://doi.org/10.2201/NiiPi.2007.4.2
  • 2006

  • Collier, N., Kawazoe, A., Jin, L., Shigematsu, M., Dien, D., Barrero, RA., Takeuchi, K. and Kawtrakul, A., 2006. A multilingual ontology for infectious disease surveillance: rationale, design and challenges. Lang Resour Eval, v. 40
    Doi: http://doi.org/10.1007/s10579-007-9019-7
  • Mizuta, Y., Korhonen, A., Mullen, T. and Collier, N., 2006. Zone analysis in biology articles as a basis for information extraction. Int J Med Inform, v. 75
    Doi: http://doi.org/10.1016/j.ijmedinf.2005.06.013
  • Collier, N., Nazarenko, A., Baud, R. and Ruch, P., 2006. Recent advances in natural language processing for biomedical applications. Int J Med Inform, v. 75
    Doi: http://doi.org/10.1016/j.ijmedinf.2005.06.008
  • 2005

  • Takeuchi, K. and Collier, N., 2005. Bio-medical entity extraction using support vector machines. Artif Intell Med, v. 33
    Doi: http://doi.org/10.1016/j.artmed.2004.07.019
  • Mullen, T., Mizuta, Y. and Collier, N., 2005. A baseline feature set for learning rhetorical zones using full articles in the biomedical domain ACM SIGKDD Explorations Newsletter, v. 7
    Doi: 10.1145/1089815.1089823
  • Kogan, Y., Collier, N., Pakhomov, S. and Krauthammer, M., 2005. Towards semantic role labeling & IE in the medical literature. AMIA Annu Symp Proc, v. 2005
  • 2004 (No publication date)

  • Nigel Collier, Ai Kawazoe, Asanobu Kitamoto, Tuangthong Wattarujeekrit, Yoko Mizuta and Anthony Mullen, 2004 (No publication date). Integrating Deep and Shallow Semantic Structures in Open
  • Ai Kawazoe, Tony Mullen, Koichi Takeuchi, Tuangthong Wattarujeekrit and Nigel Collier, 2004 (No publication date). Open Ontology Forge: A Tool for Ontology Creation Genome Informatics, v. 14, pp. 677-678 (2003)
  • 2004

  • Angelino, H. and Collier, N., 2004. Comparison of innovation policy and transfer of technology from public institutions in Japan, France, Germany and the United Kingdom NII Journal,
  • Collier, N. and Takeuchi, K., 2004. Comparison of character-level and part of speech features for name recognition in biomedical texts. J Biomed Inform, v. 37
    Doi: http://doi.org/10.1016/j.jbi.2004.08.008
  • Wattarujeekrit, T., Shah, PK. and Collier, N., 2004. PASBio: predicate-argument structures for event extraction in molecular biology. BMC Bioinformatics, v. 5
    Doi: http://doi.org/10.1186/1471-2105-5-155
  • 2003 (No publication date)

  • Koichi Takeuchi and Nigel Collier, 2003 (No publication date). Bio-Medical Entity Extraction using Support Vector Machines
  • 2003

  • Collier, N., Kumano, A. and Hirakawa, H., 2003. An application of local relevance feedback for building comparable corpora from news article matching NII Journal,
  • 2002 (No publication date)

  • Nigel Collier, 2002 (No publication date). Machine Learning for Information Extraction from XML marked-up text on the Semantic Web
  • Chikashi Nobata and Nigel Collier, 2002 (No publication date). Comparison between Tagged Corpora for the Named Entity Task
  • Nigel Collier, Chikashi Nobata and Jun-ichi Tsujii, 2002 (No publication date). Extracting the Names of Genes and Gene Products with a Hidden Markov Model
  • 2002

  • Takeuchi, K. and Collier, N., 2002. Use of Support Vector Machines in Extended Named Entity Recognition Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Collier, N., Takeuchi, K., Nobata, C., Fukumoto, J. and Ogata, N., 2002. Progress on multi-lingual named entity annotation guidelines using RDF(S) Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002,
  • Collier, N. and Takeuchi, K., 2002. PIA-Core: Semantic annotation through example-based learning Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002,
  • 2001 (No publication date)

  • Nigel Collier, Koichi Takeuchi and Keita Tsuji, 2001 (No publication date). The PIA Project: Learning to Semantically Annotate Texts from an Ontology and XML-Instance Data
  • Nigel Collier, Hideki Mima, Tomoko Ohta, Yuka Tateisi and Akane Yakushiji, 2001 (No publication date). The GENIA Project: Knowledge Acquisition from Biology Texts
  • 2001

  • Jones, G., Collier, N., Sakai, T., Sumita, K. and Hirakawa, H., 2001. A framework for cross-language information access: Application to English and Japanese Computers and the Humanities, v. 35
    Doi: http://doi.org/10.1023/A:1011851209975
  • Jones, G., Collier, N., Sakai, T., Sumita, K. and Hirakawa, H., 2001. A framework for cross-language information access: Application to English and Japanese Language Resources and Evaluation, v. 35
  • Collier, N., Nobata, C. and Tsujii, J., 2001. Automatic acquisition and classification of terminology using a tagged corpus in the molecular biology domain Terminology, v. 7
    Doi: http://doi.org/10.1075/term.7.2.07col
  • 2000 (No publication date)

  • Nigel Collier, Hideki Hirakawa and Akira Kumano, 2000 (No publication date). Cross Language Information Retrieval: an Experiment in Bilingual News Article Alignment from the Internet using MT
  • Hisao Imai and Nigel Collier, 2000 (No publication date). A Combined Query Expansion Approach for Information Retrieval
  • Tomoko Ohta, Yuka Tateisi, Nigel Collier, Chikashi Nobata and Katsutoshi Ibushi, 2000 (No publication date). A Semantically Annotated Corpus from MEDLINE Abstracts
  • Chikashi Nobata, Nigel Collier and Jun-ichi Tsujii, 2000 (No publication date). Automatic Term Identification and Classification in Biology Texts
  • 1998 (No publication date)

  • Teruyoshi Hishiki, Nigel Collier, Chikashi Nobata, Tomoko Okazaki-ohta, Norihiro Ogata, Takeshi Sekimizu, Roland Steiner and Hyun S. Park, 1998 (No publication date). Developing NLP Tools for Genome Informatics: An Information Extraction Perspective
  • Nigel Collier, Hideki Hirakawa and Akira Kumano, 1998 (No publication date). Creating a Noisy Parallel Corpus from Newswire Articles Using Cross-Language Information Retrieval
  • 1996 (No publication date)

  • Nigel Collier, 1996 (No publication date). Storage of Natural Language Sentences in a Hopfield Network
  • Nigel Collier, 1996 (No publication date). Contextual Meta-Knowledge Acquisition from Corpora
  • Theses / dissertations

    2021

  • Prokhorov, V., 2021. Injecting Inductive Biases into Distributed Representations of Text
    Doi: http://doi.org/10.17863/CAM.78416
  • Internet publications

    2020

  • Gritta, M., Pilehvar, MT. and Collier, N., 2020. A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics.
    Doi: http://doi.org/10.1007/s10579-019-09475-3
  • Datasets

    2018

  • Gritta, M., 2018. Research data supporting "Which Melbourne? Augmenting Geocoding with Maps"
    Doi: http://doi.org/10.17863/CAM.25015
  • 2017 (No publication date)

  • Gritta, M., Collier, N., Limsopatham, N. and Pilehvar, M., 2017 (No publication date). Research data supporting "Vancouver Welcomes You! Minimalist Location Metonymy Resolution"
  • Book chapters

    2017

  • Camacho-Collados, J., Pilehvar, MT., Collier, N. and Navigli, R., 2017. SemEval-2017 Task 2: Multilingual and Cross-lingual Semantic Word Similarity
  • 2016

  • Collier, NH., 2016. A review of web-based epidemic detection
    Doi: http://doi.org/10.4324/9781315554211-14
  • 2015

  • Collier, NH., 2015. A review of web-based epidemic detection
    Doi: http://doi.org/10.4324/9781315554211
  • 2010

  • Collier, N., Doan, S., Goodwin, R., McCrae, J., Conway, M., Shigematsu, M. and Kawazoe, A., 2010. Navigating the Information Storm
    Doi: 10.1201/b10315-16
  • Doan, S., Conway, M. and Collier, N., 2010. An Empirical Study of Sections in Classifying Disease Outbreak Reports
    Doi: http://doi.org/10.1007/978-1-4419-1274-9_4

    Bert Vaux

    Phonology; historical linguistics; dialectology; Armenian; Abkhaz

    Book chapters

    2023

  • Vaux, B. and Miller, B., 2023. On the atoms of phonological representation
    Doi: http://doi.org/10.1093/oso/9780198791126.003.0002
  • 2020

  • Ahmed, SK., Andersson, S. and Vaux, B., 2020. English phonology and morphology
    Doi: http://doi.org/10.1002/9781119540618.ch18
  • 2018 (No publication date)

  • Vaux, B., Sayeed, O. and Andersson, S., 2018 (No publication date). Regularity and lexical diffusion in phonological change
  • Vaux, B. and Miller, B., 2018 (No publication date). The atoms of phonological representation
    Doi: http://doi.org/10.17863/CAM.70755
  • 2018

  • Vaux, B. and Samuels, B., 2018. Abstract underlying representations in phonological theory
  • Vaux, B. and Perry, JJ., 2018. Vedic Sanskrit accentuation and readjustment rules
    Doi: http://doi.org/10.1515/9781501506734-009
  • Perry, JJ. and Vaux, B., 2018. Vedic Sanskrit accentuation and readjustment rules
    Doi: http://doi.org/10.1515/9781501506734-009
  • 2017 (Published online)

  • Andersson, S., Sayeed, O. and Vaux, B., 2017 (Published online). The Phonology of Language Contact
    Doi: 10.1093/oxfordhb/9780199935345.013.55
  • 2017

  • Vaux, B. and Myler, N., 2017. Issues and prospects in rule-based phonology
    Doi: http://doi.org/10.4324/9781315675428
  • Vaux, B. and Samuels, BD., 2017. Consonant Epenthesis and Markedness
    Doi: http://doi.org/10.1075/la.241.04vau
  • Vaux, B. and Sayeed, O., 2017. The evolution of Armenian
    Doi: http://doi.org/10.1515/9783110523874-021
  • 2016

  • Vaux, B., Myler, N. and Miller, B., 2016. Phonology in Universal Grammar
    Doi: http://doi.org/10.1093/oxfordhb/9780199573776.013.8
  • 2013

  • Vaux, B., 2013. The Armenian dialect of Khodorchur
  • 2012

  • Vaux, B., 2012. The Armenian dialect of Smyrna/Izmir
  • 2011

  • Vaux, B., 2011. Language games and speech disguises
  • Vaux, B. and Miller, B., 2011. The representation of fricatives
    Doi: http://doi.org/10.1111/b.9781405184236.2011.x
  • Vaux, B. and Myler, N., 2011. Meter is music
    Doi: http://doi.org/10.1093/acprof:oso/9780199553426.003.0005
  • Vaux, B., 2011. Language Games
    Doi: 10.1002/9781444343069.ch22
  • 2009

  • Vaux, B., 2009. The role of features in a symbolic theory of phonology
  • Vaux, B., 2009. Feature analysis
  • 2008

  • Vaux, B., 2008. Armenian
  • Vaux, B., 2008. Zok: The Armenian dialect of Agulis
  • Vaux, B., 2008. Why the phonological component must be serial and rule-based
    Doi: http://doi.org/10.1093/acprof:oso/9780199226511.001.0001
  • Vaux, B., 2008. Why the Phonological Component must be Serial and Rule-Based
    Doi: http://doi.org/10.1093/acprof:oso/9780199226511.003.0002
  • 2007

  • Vaux, B. and Nevins, A., 2007. Underlying representations that do not minimize grammatical violations
  • 2006

  • Vaux, B., 2006. Armenian
    Doi: http://doi.org/10.1016/B0-08-044854-2/02158-1
  • Vaux, B., 2006. Homshetsma: The language of the Armenians of Hamshen
    Doi: http://doi.org/10.4324/9780203641682
  • 2002

  • Vaux, B., 2002. Szemerényi's Law and Stang's Law in Non-Linear Phonology
  • 1995

  • Vaux, B., 1995. A Problem in Diachronic Armenian Verbal Morphology
  • Journal articles

    2023

  • Andersson, S. and Vaux, B., 2023. Cwyzhy Abkhaz Journal of the International Phonetic Association, v. 53
    Doi: 10.1017/S0025100320000390
  • Burridge, J. and Vaux, B., 2023. Low dimensional measurement of vowels using machine perception. J Acoust Soc Am, v. 153
    Doi: http://doi.org/10.1121/10.0016845
  • 2022

  • Samuels, BD., Andersson, S., Sayeed, O. and Vaux, B., 2022. Getting ready for primetime: Paths to acquiring substance-free phonology Canadian Journal of Linguistics, v. 67
    Doi: http://doi.org/10.1017/cnj.2022.9
  • 2020

  • Burridge, J. and Vaux, B., 2020. Brownian dynamics for the vowel sounds of human language Physical Review Research, v. 2
    Doi: http://doi.org/10.1103/PhysRevResearch.2.013274
  • 2019 (Accepted for publication)

  • Burridge, J., Vaux, B., Gnacik, M. and Grudeva, Y., 2019 (Accepted for publication). Statistical physics of language maps in the USA Physical Review E, v. 99
    Doi: http://doi.org/10.1103/physreve.99.032305
  • Burridge, J., Blaxter, T. and Vaux, B., 2019 (Accepted for publication). Evolutionary paths of language EPL (Europhysics Letters), v. 128
    Doi: http://doi.org/10.1209/0295-5075/128/28003
  • 2015

  • Vaux, B. and Samuels, B., 2015. Explaining vowel systems: Dispersion theory vs natural selection Linguistic Review, v. 32
    Doi: http://doi.org/10.1515/tlr-2014-0028
  • 2012

  • Johannessen, JB. and Vaux, B., 2012. Retroflex variation and methodological issues: A reply to Simonsen, Moen, and Cowen (2008) Journal of Phonetics,
  • 2006

  • Vaux, B. and Nevins, A., 2006. The Role of Contrast in Locality: Transparent Palatal Glides in Kyrghyz MIT Working Papers in Linguistics, v. 52
  • 2005

  • Vaux, B. and Samuels, B., 2005. Laryngeal markedness and aspiration Phonology, v. 22
    Doi: http://doi.org/10.1017/S0952675705000667
  • 2003

  • Vaux, B., 2003. Syllabification in Armenian, universal grammar, and the lexicon Linguistic Inquiry, v. 34
  • 2001

  • Vaux, B., 2001. The Armenian Dialect of Aslanbeg Annual of Armenian Linguistics, v. 21
  • 2000

  • Halle, M., Vaux, B. and Wolfe, A., 2000. On feature spreading and the representation of place of articulation Linguistic Inquiry, v. 31
  • 1999

  • Vaux, B., 1999. Notes on the Armenian Dialect of Ayntab Annual of Armenian Linguistics, v. 20
  • 1998

  • Vaux, B., 1998. The laryngeal specifications of fricatives Linguistic Inquiry, v. 29
  • 1996

  • Vaux, B., La Porta, S. and Tucker, E., 1996. Ethnographic Materials from the Muslim Hemshinli with Linguistic Notes Annual of Armenian Linguistics, v. 17
  • Vaux, B., 1996. The status of ATR in feature geometry Linguistic Inquiry, v. 27
  • 1993

  • Vaux, B., 1993. Coronal Fronting in the Akn Dialect of Armenian Annual of Armenian Linguistics, v. 14
  • 1992

  • Vaux, B., 1992. Gemination and syllabic integrity in Sanskrit Journal of Indo-European Studies, v. 20
  • Books

    2018 (No publication date)

  • Vaux, B., 2018 (No publication date). The Armenian dialect of New Julfa, Isfahan
  • 2016

  • Gordon, M., 2016. Phonological Typology
  • 2015

  • Saltmarsh, J., 2015. King’s College Chapel: A History and Commentary
  • 2011

  • Vaux, B., 2011. Language games
    Doi: http://doi.org/10.1002/9781444343069
  • 2008

  • Nevins, A. and Vaux, B., 2008. Introduction: The Division of Labor between Rules, Representations, and Constraints in Phonological Theory
    Doi: http://doi.org/10.1093/acprof:oso/9780199226511.003.0001
  • Vaux, B. and Nevins, A., 2008. Rules, constraints, and phonological phenomena
    Doi: http://doi.org/10.1093/acprof:oso/9780199226511.001.0001
  • 2007

  • Vaux, B., 2007. Linguistic Field Methods

    Dr Dora Alexopoulou

    Second language learning; learner corpora; theoretical and experimental syntax; linguistic theory

    Journal articles

    2021

  • Chen, X., Alexopoulou, T. and Tsimpli, I., 2021. Automatic extraction of subordinate clauses and its application in second language acquisition research. Behav Res Methods, v. 53
    Doi: http://doi.org/10.3758/s13428-020-01456-7
  • 2020

  • Ballier, N., Canu, S., Petitjean, C., Gasso, G., Balhana, C., Alexopoulou, T. and Gaillat, T., 2020. Machine learning for learner English: A plea for creating learner data challenges International Journal of Learner Corpus Research, v. 6
    Doi: 10.1075/ijlcr.18012.bal
  • 2018 (Accepted for publication)

  • Huang, Y., Murakami, A., Alexopoulou, T. and Korhonen, A., 2018 (Accepted for publication). Dependency parsing of learner English International Journal of Corpus Linguistics, v. 23
    Doi: http://doi.org/10.1075/ijcl.16080.hua
  • 2017 (Accepted for publication)

  • Alexopoulou, T. and Folli, R., 2017 (Accepted for publication). Topic strategies and the internal structure of nominal arguments in Italian and Greek. Linguistic Inquiry, v. 50
    Doi: http://doi.org/10.1162/ling_a_00315
  • 2017

  • Alexopoulou, T., Michel, M., Murakami, A. and Meurers, D., 2017. Task Effects on Linguistic Complexity and Accuracy: A Large-Scale Learner Corpus Analysis Employing Natural Language Processing Techniques Language Learning, v. 67
    Doi: http://doi.org/10.1111/lang.12232
  • Matras, Y., 2017. Can global cities have a language policy? Languages, Society & Policy,
  • 2016

  • Murakami, A. and Alexopoulou, T., 2016. L1 Influence on the Acquisition Order of English Grammatical Morphemes Studies in Second Language Acquisition, v. 38
    Doi: http://doi.org/10.1017/S0272263115000352
  • 2015

  • Alexopoulou, T., Geertzen, J., Korhonen, A. and Meurers, D., 2015. Exploring big educational learner corpora for SLA research: Perspectives on relative clauses International Journal of Learner Corpus Research, v. 1
    Doi: http://doi.org/10.1075/ijlcr.1.1.04ale
  • 2010

  • Alexopoulou, T., 2010. Truly intrusive: Resumptive pronominals in questions and relative clauses Lingua, v. 120
    Doi: http://doi.org/10.1016/j.lingua.2008.10.009
  • 2007

  • Alexopoulou, T. and Keller, F., 2007. Locality, cyclicity, and resumption: At the interface between the grammar and the human sentence processor Language, v. 83
  • 2006

  • Alexopoulou, T., 2006. Resumption in relative clauses Natural Language & Linguistic Theory, v. 24
    Doi: http://doi.org/10.1007/s11049-005-0898-2
  • 2002

  • Alexopoulou, T. and Kolliakou, D., 2002. On linkhood, topicalization and the clitic left dislocation Journal of Linguistics, v. 38
  • 2001

  • Keller, F. and Alexopoulou, T., 2001. Phonology competes with syntax: experimental evidence for the interaction of word order and accent placement in the realization of Information Structure Cognition, v. 79
  • Book chapters

    2018

  • Jiang, X., Huang, Y., Guo, Y., Geertzen, J., Alexopoulou, T., Sun, L. and Korhonen, A., 2018. Native language identification on EFCAMDAT
    Doi: http://doi.org/10.1017/9781316676974.007
  • 2016

  • Alexopoulou, T. and Murakami, A., 2016. Longitudinal L2 development of the English article in individual learners
  • 2013 (No publication date)

  • Alexopoulou, T., Folli, R. and Tsoulas, G., 2013 (No publication date). Bare Number
  • 2012 (No publication date)

  • Alexopoulou, T. and Baltazani, M., 2012 (No publication date). Focus in Greek Wh-questions
  • Alexopoulou, T. and Keller, F., 2012 (No publication date). What vs. who and which: Kind denoting fillers and the complexity of whether-islands
  • 2010

  • 2010. Truly intrusive: resumptive pronominals in questions and relative clauses
  • 2004

  • Alexopoulou, T., Doron, E. and Heycock, C., 2004. Broad Subjects and Clitic Left Dislocation
    Doi: http://doi.org/10.1007/1-4020-1910-6_14
  • 2003

  • Alexopoulou, T. and Keller, F., 2003. Linguistic Complexity, Locality and Resumption
  • Conference proceedings

    2015

  • Alexopoulou, T., Kane, F., Romoli, J., Tsoulas, G. and Folli, R., 2015. A scalar implicature-based account of the inference of pluralised mass (and count) nouns Proceedings of the 51st meeting of the Chicago Linguistic Society,
  • 2013 (No publication date)

  • Geertzen, J., Alexopoulou, T. and Korhonen, A., 2013 (No publication date). Automatic linguistic annotation of large scale L2 databases: the EF-Cambridge Open Language Database Selected papers from the Second Language Research Forum,
  • 2012 (No publication date)

  • Alexopoulou, T. and Folli, R., 2012 (No publication date). Indefinite Topics and the syntax of nominals in Italian and Greek https://sites.google.com/site/wccfl28pro/home,
  • Yannakoudakis, H., Briscoe, E. and Alexopoulou, T., 2012 (No publication date). Automating Second Language Acquisition Research: Integrating Information Visualisation and Machine Learning EACL,
  • Alexopoulou, T., Yannakoudakis, H. and Salamoura, A., 2012 (No publication date). Classifying intermediate Learner English: a data driven approach to learner corpora
  • 2008

  • Alexopoulou, T., 2008. Binding Illusions and Resumption in Greek Proceedings of the 2007 Workshop in Greek Syntax and Semantics at MIT, v. 57
  • Alexopoulou, D., Parodi, T. and Vilar-Beltrán, E., 2008. Variables and Resumption in Child Spanish Proceedings of Child Language Seminar,
  • 2006

  • Alexopoulou, T. and Keller, F., 2006. Gradience and parametric variation Proceedings of Workshop in Experimental Linguistics,
  • 2005

  • Alexopoulou, D., 2005. Free and Restrictive Relative Clauses in Greek 17th International Symposium of Theoretical and Applied Linguistics, v. 1
  • Alexopoulou, T. and Keller, F., 2005. A crosslinguistic experimental investigation of resumptive pronouns and "that-trace" effects Proceedings of the 27th Annual Conference of the Cognitive Science Society,
  • 2002

  • Alexopoulou, T. and Keller, F., 2002. Resumption and Locality: a crosslinguistic experimental study Proceedings of the 38th meeting of the Chicago Linguistic Society,
  • 2001

  • 2001. Relative Clauses with Quantifiers and Definiteness Proceedings of the Workshop "Choice Functions and Natural Language Semantics", v. 110
  • 1998

  • 1998. Detaching Discourse Functions from Functional Projections Penn Working Papers in Linguistics (PWPL), v. 5:1
  • 1997

  • Alexopoulou, T., 1997. A discourse-based account of Weak Crossover Effect Proceedings of the Second ESSLLI Student Session,
  • Other publications

    2012 (No publication date)

  • Geertzen, J., Alexopoulou, T. and Korhonen, A., 2012 (No publication date). EF-Cambridge Open Language Database: written corpus

    Professor Anna Korhonen

    Computational approaches to lexicon, syntax, semantics and discourse;
    scientific text processing and text mining;
    NLP for biomedicine;
    NLP for real-world applications;
    computational models of human language learning

    Theses / dissertations

    2023 (No publication date)

  • Liu, Q., 2023 (No publication date). On the Evaluation and Modelling of Context-sensitive Lexical Semantics
    Doi: http://doi.org/10.17863/CAM.95289
  • Journal articles

    2023 (Accepted for publication)

  • Collins, C., Baker, S., Brown, J., Zeng, H., Chan, A., Stenius, U., Narita, M. and Korhonen, A., 2023 (Accepted for publication). Text Mining for Contexts and Relationships in Cancer Genomics Literature Bioinformatics,
    Doi: http://doi.org/10.1093/bioinformatics/btae021
  • 2023

  • Hu, S., Zhou, H., Hergul, M., Gritta, M., Zhang, G., Iacobacci, I., Vulić, I. and Korhonen, A., 2023. MULTI³WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems Transactions of the Association for Computational Linguistics, v. 11
    Doi: http://doi.org/10.1162/tacl_a_00609
  • Breger, A., Selby, I., Roberts, M., Babar, J., Gkrania-Klotsas, E., Preller, J., Escudero Sánchez, L., AIX-COVNET Collaboration, Rudd, JHF., Aston, JAD., Weir-McCall, JR., Sala, E. and Schönlieb, C-B., 2023. A pipeline to further enhance quality, integrity and reusability of the NCCID clinical data. Sci Data, v. 10
    Doi: http://doi.org/10.1038/s41597-023-02340-7
  • Petti, U., Baker, S., Korhonen, A. and Robin, J., 2023. How Much Speech Data Is Needed for Tracking Language Change in Alzheimer's Disease? A Comparison of Random Length, 5-Min, and 1-Min Spontaneous Speech Samples. Digit Biomark, v. 7
    Doi: http://doi.org/10.1159/000533423
  • Petti, U., Baker, S., Korhonen, A. and Robin, J., 2023. The Generalizability of Longitudinal Changes in Speech Before Alzheimer's Disease Diagnosis. J Alzheimers Dis, v. 92
    Doi: http://doi.org/10.3233/JAD-220847
  • Majewska, O., Razumovskaia, E., Ponti, EM., Vulić, I. and Korhonen, A., 2023. Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation Transactions of the Association for Computational Linguistics, v. 11
    Doi: http://doi.org/10.1162/tacl_a_00539
  • Majewska, O. and Korhonen, A., 2023. Verb Classification Across Languages Annual Review of Linguistics, v. 9
    Doi: http://doi.org/10.1146/annurev-linguistics-030521-043632
  • Dittmer, S., Roberts, M., Gilbey, J., Biguri, A., Selby, I., Breger, A., Thorpe, M., Weir-McCall, JR., Gkrania-Klotsas, E., Korhonen, A., Jefferson, E., Langs, G., Yang, G., Prosch, H., Stanczuk, J., Tang, J., Babar, J., Escudero Sánchez, L., Teare, P., Patel, M., Wassin, M., Holzer, M., Walton, N., Lió, P., Shadbahr, T., Sala, E., Preller, J., Rudd, JHF., Aston, JAD. and Schönlieb, CB., 2023. Navigating the development challenges in creating complex data systems Nature Machine Intelligence, v. 5
    Doi: http://doi.org/10.1038/s42256-023-00665-x
  • Schellaert, W., Martínez-Plumed, F., Vold, K., Burden, J., Casares, PAM., Loe, BS., Reichart, R., Héigeartaigh, S., Korhonen, A. and Hernández-Orallo, J., 2023. Your Prompt is My Command: On Assessing the Human-Centred Generality of Multimodal Models Journal of Artificial Intelligence Research, v. 77
    Doi: http://doi.org/10.1613/jair.1.14157
  • 2022

  • Majewska, O., Razumovskaia, E., Ponti, EM., Vulić, I. and Korhonen, A., 2022. Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation
  • Razumovskaia, E., Glavaš, G., Majewska, O., Ponti, EM., Korhonen, A. and Vulić, I., 2022. Crossing the Conversational Chasm: A Primer on Natural Language Processing for Multilingual Task-Oriented Dialogue Systems Journal of Artificial Intelligence Research, v. 74
    Doi: http://doi.org/10.1613/JAIR.1.13083
  • 2021

  • Majewska, O., McCarthy, D., van den Bosch, JJF., Kriegeskorte, N., Vulić, I. and Korhonen, A., 2021. Semantic data set construction from human clustering and spatial arrangement Computational Linguistics, v. 47
    Doi: http://doi.org/10.1162/COLI_a_00396
  • Roberts, M., Driggs, D., Thorpe, M., Gilbey, J., Yeung, M., Ursprung, S., Aviles-Rivero, AI., Etmann, C., McCague, C., Beer, L., Weir-McCall, JR., Teng, Z., Gkrania-Klotsas, E., Ruggiero, A., Korhonen, A., Jefferson, E., Ako, E., Langs, G., Gozaliasl, G., Yang, G., Prosch, H., Preller, J., Stanczuk, J., Tang, J., Hofmanninger, J., Babar, J., Sánchez, LE., Thillai, M., Gonzalez, PM., Teare, P., Zhu, X., Patel, M., Cafolla, C., Azadbakht, H., Jacob, J., Lowe, J., Zhang, K., Bradley, K., Wassin, M., Holzer, M., Ji, K., Ortet, MD., Ai, T., Walton, N., Lio, P., Stranks, S., Shadbahr, T., Lin, W., Zha, Y., Niu, Z., Rudd, JHF., Sala, E. and Schönlieb, CB., 2021. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans Nature Machine Intelligence, v. 3
    Doi: http://doi.org/10.1038/s42256-021-00307-0
  • Ali, I., Dreij, K., Baker, S., Högberg, J., Korhonen, A. and Stenius, U., 2021. Application of Text Mining in Risk Assessment of Chemical Mixtures: A Case Study of Polycyclic Aromatic Hydrocarbons (PAHs). Environ Health Perspect, v. 129
    Doi: http://doi.org/10.1289/EHP6702
  • Majewska, O., Collins, C., Baker, S., Björne, J., Brown, SW., Korhonen, A. and Palmer, M., 2021. BioVerbNet: a large semantic-syntactic classification of verbs in biomedicine. J Biomed Semantics, v. 12
    Doi: http://doi.org/10.1186/s13326-021-00247-z
  • Huang, Y., Murakami, A., Alexopoulou, T. and Korhonen, A., 2021. Subcategorization frame identification for learner English International Journal of Corpus Linguistics, v. 26
    Doi: http://doi.org/10.1075/ijcl.18097.hua
  • Su, Y., Wang, Y., Cai, D., Baker, S., Korhonen, A. and Collier, N., 2021. PROTOTYPE-TO-STYLE: Dialogue Generation with Style-Aware Editing on Retrieval Memory IEEE/ACM Transactions on Audio Speech and Language Processing, v. 29
    Doi: http://doi.org/10.1109/TASLP.2021.3087948
  • Ponti, EM., Vulić, I., Cotterell, R., Parović, M., Reichart, R. and Korhonen, A., 2021. Parameter space factorization for zero-shot learning across tasks and languages Transactions of the Association for Computational Linguistics, v. 9
    Doi: http://doi.org/10.1162/tacl_a_00374
  • 2020 (Accepted for publication)

  • Vulić, I., Baker, S., Ponti, E., Petti, U., Leviant, I., Wing, K., Majewska, O., Bar, E., Malone, M., Poibeau, T., Reichart, R. and Korhonen, A., 2020 (Accepted for publication). Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity Computational Linguistics,
    Doi: http://doi.org/10.1162/coli_a_00391
  • 2020

  • Petti, U., Baker, S. and Korhonen, A., 2020. A systematic literature review of automatic Alzheimer's disease detection from speech and language. J Am Med Inform Assoc, v. 27
    Doi: http://doi.org/10.1093/jamia/ocaa174
  • Crichton, G., Baker, S., Guo, Y. and Korhonen, A., 2020. Neural networks for open and closed Literature-based Discovery. PLoS One, v. 15
    Doi: http://doi.org/10.1371/journal.pone.0232891
  • 2019

  • Pyysalo, S., Baker, S., Ali, I., Haselwimmer, S., Shah, T., Young, A., Guo, Y., Högberg, J., Stenius, U., Narita, M. and Korhonen, A., 2019. LION LBD: a literature-based discovery system for cancer biology. Bioinformatics, v. 35
    Doi: http://doi.org/10.1093/bioinformatics/bty845
  • Ponti, EM., O'Horan, H., Berzak, Y., Vulić, I., Reichart, R., Poibeau, T., Shutova, E. and Korhonen, A., 2019. Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing Computational Linguistics, v. 45
    Doi: http://doi.org/10.1162/coli_a_00357
  • 2018 (Accepted for publication)

  • Huang, Y., Murakami, A., Alexopoulou, T. and Korhonen, A., 2018 (Accepted for publication). Dependency parsing of learner English International Journal of Corpus Linguistics, v. 23
    Doi: http://doi.org/10.1075/ijcl.16080.hua
  • Chiu, HW., Majewska, O., Pyysalo, S., Wey, L., Stenius, U., Korhonen, AL. and Palmer, M., 2018 (Accepted for publication). A Neural Classification Method for Supporting the Creation of BioVerbNet Journal of Biomedical Semantics, v. 10
    Doi: http://doi.org/10.1186/s13326-018-0193-x
  • 2018

  • Majewska, O., Vulić, I., McCarthy, D., Huang, Y., Murakami, A., Laippala, V. and Korhonen, A., 2018. Investigating the cross-lingual translatability of VerbNet-style classification. Lang Resour Eval, v. 52
    Doi: http://doi.org/10.1007/s10579-017-9403-x
  • Chiu, B., Pyysalo, S., Vulić, I. and Korhonen, A., 2018. Bio-SimVerb and Bio-SimLex: wide-coverage evaluation sets of word similarity in biomedicine. BMC Bioinformatics, v. 19
    Doi: http://doi.org/10.1186/s12859-018-2039-z
  • Crichton, G., Guo, Y., Pyysalo, S. and Korhonen, A., 2018. Neural networks for link prediction in realistic biomedical graphs: a multi-dimensional evaluation of graph embedding-based approaches. BMC Bioinformatics, v. 19
    Doi: http://doi.org/10.1186/s12859-018-2163-9
  • Gerz, D., Vulić, I., Ponti, E., Naradowsky, J., Reichart, R. and Korhonen, A., 2018. Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction Transactions of the Association for Computational Linguistics, v. 6
    Doi: http://doi.org/10.1162/tacl_a_00032
  • 2017 (Accepted for publication)

  • Baker, S., Ali, I., Silins, I., Pyysalo, S., Guo, Y., Högberg, J., Stenius, U. and Korhonen, A., 2017 (Accepted for publication). Cancer Hallmarks Analytics Tool (CHAT): A text mining approach to organise and evaluate scientific literature on cancer Bioinformatics, v. 33
    Doi: http://doi.org/10.1093/bioinformatics/btx454
  • 2017

  • Crichton, G., Pyysalo, S., Chiu, B. and Korhonen, A., 2017. A neural network multi-task learning approach to biomedical named entity recognition BMC Bioinformatics, v. 18
    Doi: http://doi.org/10.1186/s12859-017-1776-8
  • Lu, Y., Guo, Y. and Korhonen, A., 2017. Link prediction in drug-target interactions network using similarity indices. BMC Bioinformatics, v. 18
    Doi: http://doi.org/10.1186/s12859-017-1460-z
  • Larsson, K., Baker, S., Silins, I., Guo, Y., Stenius, U., Korhonen, A. and Berglund, M., 2017. Text mining for improved exposure assessment PLOS One, v. 12
    Doi: http://doi.org/10.1371/journal.pone.0173132
  • Vulić, I., Gerz, D., Kiela, D., Hill, F. and Korhonen, A., 2017. HyperLex: A Large-Scale Evaluation of Graded Lexical Entailment Computational Linguistics, v. 43
    Doi: http://doi.org/10.1162/COLI_a_00301
  • 2016

  • Hill, F., Cho, K., Korhonen, A. and Bengio, Y., 2016. Learning to Understand Phrases by Embedding the Dictionary Transactions of the Association for Computational Linguistics, v. 4
    Doi: http://doi.org/10.1162/tacl_a_00080
  • Baker, S., Silins, I., Guo, Y., Ali, I., Högberg, J., Stenius, U. and Korhonen, A., 2016. Automatic semantic classification of scientific literature according to the hallmarks of cancer. Bioinformatics, v. 32
    Doi: http://doi.org/10.1093/bioinformatics/btv585
  • Ali, I., Guo, Y., Silins, I., Högberg, J., Stenius, U. and Korhonen, A., 2016. Grouping chemicals for health risk assessment: A text mining-based case study of polychlorinated biphenyls (PCBs). Toxicol Lett, v. 241
    Doi: http://doi.org/10.1016/j.toxlet.2015.11.003
  • Ali, I., Högberg, J., Hsieh, J-H., Auerbach, S., Korhonen, A., Stenius, U. and Silins, I., 2016. Gender differences in cancer susceptibility: role of oxidative stress. Carcinogenesis, v. 37
    Doi: http://doi.org/10.1093/carcin/bgw076
  • 2015

  • Alexopoulou, T., Geertzen, J., Korhonen, A. and Meurers, D., 2015. Exploring big educational learner corpora for SLA research: Perspectives on relative clauses International Journal of Learner Corpus Research, v. 1
    Doi: http://doi.org/10.1075/ijlcr.1.1.04ale
  • Guo, Y., Reichart, R. and Korhonen, A., 2015. Unsupervised Declarative Knowledge Induction for Constraint-Based Learning of Information Structure in Scientific Documents Transactions of the Association for Computational Linguistics, v. 3
  • Kiela, D., Guo, Y., Stenius, U. and Korhonen, A., 2015. Unsupervised discovery of information structure in biomedical documents. Bioinformatics, v. 31
    Doi: http://doi.org/10.1093/bioinformatics/btu758
  • Hill, F., Reichart, R. and Korhonen, A., 2015. Simlex-999: Evaluating semantic models with (Genuine) similarity estimation Computational Linguistics, v. 41
    Doi: http://doi.org/10.1162/COLI_a_00237
  • Korhonen, A., Baker, S., Silins, I., Guo, Y., Ali, I., Hogberg, J. and Stenius, U., 2015. Automatic Semantic Classification of Scientific Literature According to the Hallmarks of Cancer Bioinformatics,
  • 2014

  • Hill, F., Korhonen, A. and Bentz, C., 2014. A quantitative empirical analysis of the abstract/concrete distinction Cognitive Science, v. 38
    Doi: http://doi.org/10.1111/cogs.12076
  • Kelly, C., Devereux, B. and Korhonen, A., 2014. Automatic extraction of property norm-like data from large text corpora Cognitive Science, v. 38
    Doi: http://doi.org/10.1111/cogs.12091
  • Silins, I., Korhonen, A. and Stenius, U., 2014. Evaluation of carcinogenic modes of action for pesticides in fruit on the Swedish market using a text-mining tool. Front Pharmacol, v. 5
    Doi: http://doi.org/10.3389/fphar.2014.00145
  • Séaghdha, D. and Korhonen, A., 2014. Probabilistic distributional semantics with latent variable models Computational Linguistics, v. 40
    Doi: http://doi.org/10.1162/COLI_a_00194
  • Hill, F., Korhonen, A. and Reichart, R., 2014. Multi-Modal Models for Concrete and Abstract Concept Meaning Transactions of ACL (TACL), v. 2
  • Kiela, D., Hill, F., Korhonen, A. and Clark, S., 2014. Improving multi-modal representations using image dispersion: Why less is sometimes more 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014 - Proceedings of the Conference, v. 2
    Doi: http://doi.org/10.3115/v1/p14-2135
  • 2013

  • Kelly, C., Devereux, B. and Korhonen, A., 2013. Automatic Extraction of Property Norm-Like Data From Large Text Corpora Cognitive Science,
  • Shutova, E., Devereux, BJ. and Korhonen, A., 2013. Conceptual metaphor theory meets the data: A corpus-based human annotation study Language Resources and Evaluation, v. 47
    Doi: http://doi.org/10.1007/s10579-013-9238-z
  • Lippincott, T., Rimell, L., Verspoor, K. and Korhonen, A., 2013. Approaches to verb subcategorization for biomedicine. J Biomed Inform, v. 46
    Doi: http://doi.org/10.1016/j.jbi.2012.12.001
  • Poibeau, T., Villavicencio, A., Alishahi, A. and Korhonen, A., 2013. Computational Modeling as a Methodology for Studying Human Language Learning Cognitive Aspects of Computational Language Acquisition,
  • Rimell, L., Lippincott, T., Verspoor, K., Johnson, HL. and Korhonen, A., 2013. Acquisition and evaluation of verb subcategorization resources for biomedicine. J Biomed Inform, v. 46
    Doi: http://doi.org/10.1016/j.jbi.2013.01.001
  • Guo, Y., Silins, I., Stenius, U. and Korhonen, A., 2013. Active learning-based information structure analysis of full scientific articles and two applications for biomedical literature review. Bioinformatics, v. 29
    Doi: http://doi.org/10.1093/bioinformatics/btt163
  • Kelly, C., Korhonen, A. and Devereux, B., 2013. Automatic extraction of property norm-like features from large text corpora with gold standard, human and semantic-similarity evaluations Cognitive Science,
  • Shutova, E., Kaplan, J., Teufel, S. and Korhonen, A., 2013. A computational model of logical metonymy ACM Transactions on Speech and Language Processing, v. 10
    Doi: http://doi.org/10.1145/2483969.2483973
  • 2012

  • Korhonen, A., Séaghdha, DO., Silins, I., Sun, L., Högberg, J. and Stenius, U., 2012. Text mining for literature review and knowledge discovery in cancer risk assessment and research. PLoS One, v. 7
    Doi: http://doi.org/10.1371/journal.pone.0033427
  • Kadekar, S., Silins, I., Korhonen, A., Dreij, K., Al-Anati, L., Hogberg, J. and Stenius, U., 2012. Exocrine Pancreatic Carcinogenesis and Autotaxin Expression PLOS ONE, v. 7
    Doi: http://doi.org/10.1371/journal.pone.0043209
  • Silins, I., Korhonen, A., Högberg, J. and Stenius, U., 2012. Data and literature gathering in chemical cancer risk assessment Integrated Environmental Assessment and Management, v. 8
    Doi: http://doi.org/10.1002/ieam.1278
  • Shutova, E., Teufel, SH. and Korhonen, A., 2012. Statistical Metaphor Processing Computational Linguistics, v. 39
  • Van De Cruys, T., Rimell, L., Poibeau, T. and Korhonen, A., 2012. Multi-way tensor factorization for unsupervised lexical acquisition 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers,
  • Contractor, D., Guo, Y. and Korhonen, A., 2012. Using argumentative zones for extractive summarization of scientific articles 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers,
  • Lippincott, T., Séaghdha, D. and Korhonen, A., 2012. Learning syntactic verb frames using graphical models 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference, v. 1
  • 2011

  • Séaghdha, DO. and Korhonen, A., 2011. Probabilistic models of similarity in syntactic context EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference,
  • Guo, Y., Korhonen, A., Liakata, M., Silins, I., Hogberg, J. and Stenius, U., 2011. A comparison and user-based evaluation of models of textual information structure in the context of cancer risk assessment. BMC Bioinformatics, v. 12
    Doi: http://doi.org/10.1186/1471-2105-12-69
  • Lippincott, T., Séaghdha, DÓ. and Korhonen, A., 2011. Exploring subdomain variation in biomedical language. BMC Bioinformatics, v. 12
    Doi: http://doi.org/10.1186/1471-2105-12-212
  • Guo, Y., Korhonen, A., Silins, I. and Stenius, U., 2011. Weakly supervised learning of information structure of scientific abstracts--is it accurate enough to benefit real-world tasks in biomedicine? Bioinformatics, v. 27
    Doi: http://doi.org/10.1093/bioinformatics/btr536
  • Van De Cruys, T., Poibeau, T. and Korhonen, A., 2011. Latent vector weighting for word meaning in context EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference,
  • Guo, Y., Korhonen, A. and Poibeau, T., 2011. A weakly-supervised approach to Argumentative Zoning of scientific documents EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference,
  • Sun, L. and Korhonen, A., 2011. Hierarchical verb clustering using graph factorization EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference,
  • 2010

  • Korhonen, A., 2010. Automatic lexical classification: bridging research and practice. Philos Trans A Math Phys Eng Sci, v. 368
    Doi: http://doi.org/10.1098/rsta.2010.0039
  • Shutova, E., Sun, L. and Korhonen, A., 2010. Metaphor identification using verb and noun clustering Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference, v. 2
  • Sun, L., Korhonen, A., Poibeau, T. and Messiant, C., 2010. Investigating the cross-linguistic potential of VerbNet-style classification Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference, v. 2
  • Lippincott, T., Séaghdha, DO., Sun, L. and Korhonen, A., 2010. Exploring variation across biomedical subdomains Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference, v. 2
  • Korhonen, A., 2010. Automatic Lexical Classification - Bridging Research and Practice. Philosophical Transactions A of the Royal Society, v. 368
  • 2009

  • Korhonen, A., 2009. Automatic lexical classification - Balancing between machine learning and linguistics PACLIC 23 - Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, v. 1
  • Korhonen, A., Silins, I., Sun, L. and Stenius, U., 2009. The first step in the development of Text Mining technology for Cancer Risk Assessment: identifying and organizing scientific evidence in risk assessment literature. BMC Bioinformatics, v. 10
    Doi: http://doi.org/10.1186/1471-2105-10-303
  • Silins, I., Korhonen, A., Hogberg, J., Sun, L. and Stenius, U., 2009. Improved cancer risk assessment using text mining CANCER RESEARCH, v. 69
  • Devereux, B., Pilkington, N., Poibeau, T. and Korhonen, A., 2009. Towards Unrestricted, Large-Scale Acquisition of Feature-Based Conceptual Representations from Corpus Data Research on Language and Computation, v. 7
    Doi: http://doi.org/10.1007/s11168-010-9068-8
  • Sun, L. and Korhonen, A., 2009. Improving verb clustering with automatically acquired selectional preferences EMNLP 2009 - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: A Meeting of SIGDAT, a Special Interest Group of ACL, Held in Conjunction with ACL-IJCNLP 2009,
    Doi: http://doi.org/10.3115/1699571.1699596
  • 2008

  • Kipper, K., Korhonen, A., Ryant, N. and Palmer, M., 2008. A large-scale classification of English verbs LANG RESOUR EVAL, v. 42
    Doi: http://doi.org/10.1007/s10579-007-9048-2
  • Korhonen, A., Krymolowski, Y. and Collier, N., 2008. The choice of features for classification of verbs in biomedical texts Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference, v. 1
    Doi: http://doi.org/10.3115/1599081.1599138
  • 2007

  • 2007. Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition
  • 2006

  • Mizuta, Y., Korhonen, A., Mullen, T. and Collier, N., 2006. Zone analysis in biology articles as a basis for information extraction. Int J Med Inform, v. 75
    Doi: http://doi.org/10.1016/j.ijmedinf.2005.06.013
  • 2005

  • Yallop, J., Korhonen, A. and Briscoe, T., 2005. Automatic acquisition of adjectival subcategorization from corpora ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference,
    Doi: http://doi.org/10.3115/1219840.1219916
  • Buttery, P. and Korhonen, A., 2005. Large-scale analysis of verb subcategorization differences between child directed speech and adult speech Proceedings of the Interdisciplinary Workshop on the Identification and Representation of Verb Features and Verb Classes,
  • Villavicencio, A., Bond, F., Korhonen, A. and McCarthy, D., 2005. Introduction to the special issue on multiword expressions: Having a crack at a hard nut COMPUT SPEECH LANG, v. 19
    Doi: http://doi.org/10.1016/j.csl.2005.05.001
  • 1999

  • Baljko, M. and Korhonen, A., 1999. Preface to the student session papers Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1999-June
  • Conference proceedings

    2023

  • Köksal, A., Yalcin, OF., Akbiyik, A., Kılavuz, MT., Korhonen, A. and Schütze, H., 2023. Language-Agnostic Bias Detection in Language Models with Bias Probing Findings of the Association for Computational Linguistics: EMNLP 2023,
  • Kantharuban, A., Vulić, I. and Korhonen, A., 2023. Quantifying the Dialect Gap and its Correlates Across Languages Findings of the Association for Computational Linguistics: EMNLP 2023,
  • Zhou, H., Wan, X., Vulić, I. and Korhonen, A., 2023. Survival of the Most Influential Prompts: Efficient Black-Box Prompt Search via Clustering and Pruning Findings of the Association for Computational Linguistics: EMNLP 2023,
  • Li, Y., Chang, CY., Rawls, S., Vulić, I. and Korhonen, A., 2023. Translation-Enhanced Multilingual Text-to-Image Generation Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Qiu, Y., Ziser, Y., Korhonen, A., Ponti, EM. and Cohen, SB., 2023. Detecting and Mitigating Hallucinations in Multilingual Summarisation EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Li, Y., Korhonen, A. and Vulić, I., 2023. On Bilingual Lexicon Induction with Large Language Models EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Razumovskaia, E., Vulić, I. and Korhonen, A., 2023. Transfer-Free Data-Efficient Multilingual Slot Labeling EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Ansell, A., Parović, M., Vulić, I., Korhonen, A. and Ponti, EM., 2023. Unifying Cross-Lingual Transfer across Scenarios of Resource Scarcity EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Hu, S., Zhou, H., Yuan, Z., Gritta, M., Zhang, G., Iacobacci, I., Korhonen, A. and Vulić, I., 2023. A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Vulić, I., Glavaš, G., Liu, F., Collier, N., Ponti, EM. and Korhonen, A., 2023. Probing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference,
  • Yuan, Z., Hu, S., Vulić, I., Korhonen, A. and Meng, Z., 2023. Can Pretrained Language Models (Yet) Reason Deductively? EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference,
  • Liu, CC., Pfeiffer, J., Korhonen, A., Vulić, I. and Gurevych, I., 2023. Delving Deeper into Cross-lingual Visual Question Answering EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2023,
  • Parović, M., Ansell, A., Vulić, I. and Korhonen, A., 2023. Cross-Lingual Transfer with Target Language-Ready Task Adapters Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Ansell, A., Ponti, EM., Korhonen, A. and Vulić, I., 2023. Distilling Efficient Language-Specific Models for Cross-Lingual Transfer Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Moghe, N., Razumovskaia, E., Guillou, L., Vulić, I., Korhonen, A. and Birch, A., 2023. MULTI³NLU⁺⁺: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Petti, U., Nyrup, R., Skopek, JM. and Korhonen, A., 2023. Ethical considerations in the early detection of Alzheimer's disease using speech and AI ACM International Conference Proceeding Series,
    Doi: http://doi.org/10.1145/3593013.3594063
2022

  • Parović, M., Glavaš, G., Vulić, I. and Korhonen, A., 2022. BAD-X: Bilingual Adapters Improve Zero-Shot Cross-Lingual Transfer NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference,
  • Li, Y., Liu, F., Collier, N., Korhonen, A. and Vulić, I., 2022. Improving Word Translation via Two-Stage Contrastive Learning Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Ansell, A., Ponti, EM., Korhonen, A. and Vulić, I., 2022. Composable Sparse Fine-Tuning for Cross-Lingual Transfer Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 1
  • Razumovskaia, E., Vulić, I. and Korhonen, A., 2022. Data Augmentation and Learned Layer Aggregation for Improved Multilingual Language Understanding in Dialogue Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Li, Y., Liu, F., Vulić, I. and Korhonen, A., 2022. Improving Bilingual Lexicon Induction with Cross-Encoder Reranking Findings of the Association for Computational Linguistics: EMNLP 2022,
  • Liu, Q., McCarthy, D. and Korhonen, A., 2022. Measuring Context-Word Biases in Lexical Semantic Datasets Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022,
2021

  • Hangya, V., Liu, Q., Stojanovski, D., Fraser, A. and Korhonen, A., 2021. Improving Machine Translation of Rare and Unseen Word Senses WMT 2021 - 6th Conference on Machine Translation, Proceedings,
  • Liu, F., Vulić, I., Korhonen, A. and Collier, N., 2021. Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Liu, Q., Ponti, EM., McCarthy, D., Vulić, I. and Korhonen, A., 2021. AM²ICO: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings,
  • Liu, Q., Liu, F., Collier, N., Korhonen, A. and Vulić, I., 2021. MIRRORWIC: On Eliciting Word-in-Context Representations from Pretrained Language Models CoNLL 2021 - 25th Conference on Computational Natural Language Learning, Proceedings,
  • Ansell, A., Ponti, EM., Pfeiffer, J., Ruder, S., Glavaš, G., Vulić, I. and Korhonen, A., 2021. MAD-G: Multilingual Adapter Generation for Efficient Cross-Lingual Transfer Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021,
  • Zhu, Y., Shareghi, E., Li, Y., Reichart, R. and Korhonen, A., 2021. Combining deep generative models and multi-lingual pretraining for semi-supervised document classification EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference,
  • Majewska, O., Vulić, I., Glavaš, G., Ponti, EM. and Korhonen, A., 2021. Verb knowledge injection for multilingual event processing ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference,
  • Vulić, I., Ponti, EM., Korhonen, A. and Glavaš, G., 2021. LEXFIT: Lexical fine-tuning of pretrained language models ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference,
  • Zhao, M., Zhu, Y., Shareghi, E., Vulić, I., Reichart, R., Korhonen, A. and Schütze, H., 2021. A closer look at few-shot crosslingual transfer: The choice of shots matters ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference,
  • Liu, F., Vulić, I., Korhonen, A. and Collier, N., 2021. Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference, v. 2
2020 (Accepted for publication)

  • Vulić, I., Ponti, E., Litschko, R., Glavaš, G. and Korhonen, A., 2020 (Accepted for publication). Probing Pretrained Language Models for Lexical Semantics Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020),
    Doi: http://doi.org/10.18653/v1/2020.emnlp-main.586
2020

  • Korhonen, A. and Traum, D., 2020. Message from the program chairs ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference,
  • Majewska, O., McCarthy, D., van den Bosch, J., Kriegeskorte, N., Vulić, I. and Korhonen, A., 2020. Spatial multi-arrangement for clustering and multi-way similarity dataset construction LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings,
  • Gerz, D., Vulić, I., Rei, M., Reichart, R. and Korhonen, A., 2020. Multidirectional associative optimization of function-specific word representations Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Dubossarsky, H., Vulić, I., Reichart, R. and Korhonen, A., 2020. The Secret is in the Spectra: Predicting Cross-Lingual Task Performance with Spectral Similarity Measures Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020),
    Doi: http://doi.org/10.18653/v1/2020.emnlp-main.186
  • Sasano, R. and Korhonen, A., 2020. Investigating word-class distributions in word vector spaces Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Ponti, E., Glavaš, G., Majewska, O., Liu, Q., Vulić, I. and Korhonen, A., 2020. XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020),
    Doi: http://doi.org/10.18653/v1/2020.emnlp-main.185
  • Vulić, I., Korhonen, A. and Glavaš, G., 2020. Improving bilingual lexicon induction with unsupervised post-processing of monolingual word vector spaces Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Karan, M., Vulić, I., Korhonen, A. and Glavaš, G., 2020. Classification-based self-learning for weakly supervised bilingual lexicon induction Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Majewska, O., Vulić, I., McCarthy, D. and Korhonen, A., 2020. Manual Clustering and Spatial Arrangement of Verbs for Multilingual Evaluation and Typology Analysis Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020),
  • Li, Y., Ponti, E., Vulić, I. and Korhonen, A., 2020. Emergent Communication Pretraining for Few-Shot Machine Translation Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020),
  • Liu, Q., McCarthy, D. and Korhonen, A., 2020. Towards better context-aware lexical semantics: Adjusting contextualized representations through static anchors EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference,
  • Lauscher, A., Vulić, I., Ponti, E., Korhonen, A. and Glavaš, G., 2020. Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020),
  • Glavaš, G., Vulić, I., Korhonen, A. and Ponzetto, SP., 2020. SemEval-2020 Task 2: Predicting Multilingual and Cross-Lingual (Graded) Lexical Entailment Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval 2020),
2019 (Accepted for publication)

  • Shareghi, E., Gerz, D., Vulić, I. and Korhonen, A., 2019 (Accepted for publication). Show Some Love to Your n-grams: A Bit of Progress and Stronger n-gram Language Modeling Baselines
    Doi: http://doi.org/10.17863/CAM.39778
2019

  • Liu, Q., McCarthy, D. and Korhonen, A., 2019. Second-order contexts from lexical substitutes for few-shot learning of word representations *SEM@NAACL-HLT 2019 - 8th Joint Conference on Lexical and Computational Semantics,
  • Shareghi, E., Li, Y., Zhu, Y., Reichart, R. and Korhonen, A., 2019. Bayesian learning for neural dependency parsing NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1
  • Chiu, B., Baker, S., Palmer, M. and Korhonen, A., 2019. Enhancing biomedical word embeddings by retrofitting to verb clusters BioNLP 2019 - SIGBioMed Workshop on Biomedical Natural Language Processing, Proceedings of the 18th BioNLP Workshop and Shared Task,
  • Zhu, Y., Vulić, I. and Korhonen, A., 2019. A systematic study of leveraging subword information for learning word representations NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1
  • Shareghi, E., Gerz, D., Vulić, I. and Korhonen, A., 2019. Show some love to your n-grams: A bit of progress and stronger n-gram language modeling baselines NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1
  • Majewska, O., McCarthy, D., Vulić, I. and Korhonen, A., 2019. Acquiring verb classes through bottom-up semantic verb clustering LREC 2018 - 11th International Conference on Language Resources and Evaluation,
  • Liu, Q., McCarthy, D., Vulić, I. and Korhonen, A., 2019. Investigating cross-lingual alignment methods for contextualized embeddings with Token-level evaluation CoNLL 2019 - 23rd Conference on Computational Natural Language Learning, Proceedings of the Conference,
  • Ponti, EM., Vulić, I., Cotterell, R., Reichart, R. and Korhonen, A., 2019. Towards zero-shot language modeling EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference,
  • Ponti, EM., Vulić, I., Glavaš, G., Reichart, R. and Korhonen, A., 2019. Cross-lingual semantic specialization via lexical relation induction EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference,
  • Vulić, I., Glavaš, G., Reichart, R. and Korhonen, A., 2019. Do we really need fully unsupervised cross-lingual embeddings? EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference,
  • Zhu, Y., Heinzerling, B., Vulić, I., Strube, M., Reichart, R. and Korhonen, A., 2019. On the importance of subword information for morphological tasks in truly low-resource languages CoNLL 2019 - 23rd Conference on Computational Natural Language Learning, Proceedings of the Conference,
  • Tseng, BH., Rei, M., Budzianowski, P., Turner, RE., Byrne, B. and Korhonen, A., 2019. Semi-supervised bootstrapping of dialogue state trackers for task-oriented modelling EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference,
2018

  • Vulić, I., Glavaš, G., Mrkšić, N. and Korhonen, A., 2018. Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources
  • Ponti, E., Reichart, R., Korhonen, A. and Vulić, I., 2018. Isomorphic Transfer of Syntactic Structures in Cross-Lingual NLP Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018),
    Doi: http://doi.org/10.18653/v1/P18-1142
  • Vulić, I. and Korhonen, A., 2018. Injecting Lexical Contrast into Word Vectors by Guiding Vector Space Specialisation Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Ponti, EM., Vulić, I., Glavaš, G., Mrkšić, N. and Korhonen, A., 2018. Adversarial propagation and zero-shot cross-lingual transfer of word vector specialization Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018,
  • Gerz, D., Vulić, I., Ponti, EM., Reichart, R. and Korhonen, A., 2018. On the relation between linguistic typology and (limitations of) multilingual language modeling Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018,
2017 (No publication date)

  • Baker, S., Korhonen, A. and Pyysalo, S., 2017 (No publication date). Cancer Hallmark Text Classification Using Convolutional Neural Networks
    Doi: http://doi.org/10.17863/CAM.12420
2017

  • Baker, S. and Korhonen, A., 2017. Initializing neural networks for hierarchical multi-label text classification BioNLP 2017 - SIGBioMed Workshop on Biomedical Natural Language Processing, Proceedings of the 16th BioNLP Workshop,
  • Vulić, I., Mrkšić, N. and Korhonen, A., 2017. Cross-lingual induction and transfer of verb classes based on word vector space specialisation EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings,
    Doi: http://doi.org/10.18653/v1/d17-1270
  • Vulić, I., Schwartz, R., Rappoport, A., Reichart, R. and Korhonen, A., 2017. Automatic Selection of Context Configurations for Improved Class-Specific Word Representations Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017),
    Doi: http://doi.org/10.18653/v1/K17-1013
  • Vulić, I., Kiela, D. and Korhonen, A., 2017. Evaluation by association: A systematic study of quantitative word association evaluation 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Proceedings of Conference, v. 1
    Doi: http://doi.org/10.18653/v1/e17-1016
  • Ponti, EM. and Korhonen, A., 2017. Event-Related Features in Feedforward Neural Networks Contribute to Identifying Causal Relations in Discourse LSDSem 2017 - 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-Level Semantics, Proceedings of the Workshop,
  • Vulić, I., Mrkšić, N., Reichart, R., Séaghdha, D., Young, S. and Korhonen, A., 2017. Morph-fitting: Fine-tuning word vector spaces with simple language-specific rules ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), v. 1
    Doi: http://doi.org/10.18653/v1/P17-1006
  • Ponti, EM., Vulić, I. and Korhonen, A., 2017. Decoding sentiment from distributed representations of sentences *SEM 2017 - 6th Joint Conference on Lexical and Computational Semantics, Proceedings,
    Doi: http://doi.org/10.18653/v1/s17-1003
2016

  • Vulić, I. and Korhonen, A., 2016. Is "universal syntax" universally useful for learning distributed word representations? 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Short Papers,
    Doi: http://doi.org/10.18653/v1/p16-2084
  • Vulić, I. and Korhonen, A., 2016. On the role of seed lexicons in learning bilingual word embeddings 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, v. 1
    Doi: http://doi.org/10.18653/v1/p16-1024
  • Chiu, B., Korhonen, A. and Pyysalo, S., 2016. Intrinsic evaluation of word vectors fails to predict extrinsic performance Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Chiu, B., Crichton, G., Korhonen, A. and Pyysalo, S., 2016. How to train good word embeddings for biomedical NLP BioNLP 2016 - Proceedings of the 15th Workshop on Biomedical Natural Language Processing,
  • Hill, F., Cho, K. and Korhonen, A., 2016. Learning distributed representations of sentences from unlabelled data 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference,
    Doi: http://doi.org/10.18653/v1/n16-1162
  • O'Horan, H., Berzak, Y., Vulić, I., Reichart, R. and Korhonen, A., 2016. Survey on the use of typological information in natural language processing COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers,
  • Baker, S., Kiela, D. and Korhonen, A., 2016. Robust text classification for sparsely labelled data using multi-level embeddings COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers,
  • Gerz, D., Vulić, I., Hill, F., Reichart, R. and Korhonen, A., 2016. Simverb-3500: A large-scale evaluation set of verb similarity EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings,
    Doi: http://doi.org/10.18653/v1/d16-1235
  • Berzak, Y., Huang, Y., Barbu, A., Korhonen, A. and Katz, B., 2016. Anchoring and agreement in syntactic annotations EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings,
    Doi: http://doi.org/10.18653/v1/d16-1239
2015

  • Geertzen, J., Alexopoulou, T., Post, B. and Korhonen, A., 2015. Native language effects on pronunciation accuracy L2 English International Symposium on Monolingual and Bilingual Speech (ISMBS),
  • Alexopoulou, T., Geertzen, J., Meurers, D. and Korhonen, A., 2015. Relativisors and animacy in L2 English Second Language Research Forum (SLRF) 2015,
  • Karlgren, J., Callin, J., Collins-Thompson, K., Gyllensten, AC., Ekgren, A., Jurgens, D., Korhonen, A., Olsson, F., Sahlgren, M. and Schütze, H., 2015. Evaluating learning language representations Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 9283
    Doi: http://doi.org/10.1007/978-3-319-24027-5_25
  • Korhonen, A., Guo, Y., Baker, S., Yetisgen-Yildiz, M., Stenius, U., Narita, M. and Liò, P., 2015. Improving literature-based discovery with advanced text mining Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 8623
    Doi: http://doi.org/10.1007/978-3-319-24462-4_8
2014

  • Silins, I., Korhonen, A., Guo, Y. and Stenius, U., 2014. A text-mining approach for chemical risk assessment and cancer research TOXICOLOGY LETTERS, v. 229
    Doi: http://doi.org/10.1016/j.toxlet.2014.06.565
  • Larsson, K., Silins, I., Guo, Y., Korhonen, A., Stenius, U. and Berglund, M., 2014. Text mining for improved human exposure assessment TOXICOLOGY LETTERS, v. 229
    Doi: http://doi.org/10.1016/j.toxlet.2014.06.427
  • Guo, Y., Séaghdha, D., Silins, I., Sun, L., Högberg, J., Stenius, U. and Korhonen, A., 2014. CRAB 2.0: A text mining tool for supporting literature review in chemical cancer risk assessment COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of the Conference System Demonstrations,
  • Jiang, X., Guo, Y., Geertzen, J., Alexopoulou, D., Sun, L. and Korhonen, A., 2014. Native Language Identification using large, longitudinal data Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014,
  • Scarton, C., Sun, L., Kipper-Schuler, K., Duran, MS., Palmer, M. and Korhonen, A., 2014. Verb clustering for Brazilian Portuguese Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 8403 LNCS
    Doi: http://doi.org/10.1007/978-3-642-54906-9_3
  • Hill, F. and Korhonen, A., 2014. Concreteness and subjectivity as dimensions of lexical meaning 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014 - Proceedings of the Conference, v. 2
    Doi: http://doi.org/10.3115/v1/p14-2118
  • Hill, F. and Korhonen, A., 2014. Learning abstract concept embeddings from multi-modal data: Since you probably can't see what I mean EMNLP 2014 - 2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference,
    Doi: http://doi.org/10.3115/v1/d14-1032
  • Baker, S., Reichart, R. and Korhonen, A., 2014. An unsupervised model for instance level subcategorization acquisition EMNLP 2014 - 2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference,
2013 (No publication date)

  • Geertzen, J., Alexopoulou, T. and Korhonen, A., 2013 (No publication date). Automatic linguistic annotation of large scale L2 databases: the EF-Cambridge Open Language Database Selected papers from the Second Language Research Forum,
  • Korhonen, A., Guo, Y. and Reichart, R., 2013 (No publication date). Improved Information Structure Analysis of Scientific Documents Through Discourse and Lexical Constraints
  • Korhonen, A. and O'Seaghdha, D., 2013 (No publication date). Probabilistic models of similarity in syntactic context EMNLP 2011,
2013

  • Sun, L., McCarthy, D. and Korhonen, A., 2013. Diathesis alternation approximation for verb clustering ACL 2013 - 51st Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, v. 2
  • Hill, F., Korhonen, A. and Bentz, C., 2013. Large-Scale Empirical Analyses of the Abstract/Concrete Distinction Cooperative Minds: Social Interaction and Group Dynamics - Proceedings of the 35th Annual Meeting of the Cognitive Science Society, CogSci 2013,
  • Hill, F., Kiela, D. and Korhonen, A., 2013. Concreteness and Corpora: A Theoretical and Practical Analysis CMCL 2013 - Cognitive Modeling and Computational Linguistics, Proceedings of the Workshop,
  • Hill, F., Korhonen, A. and Bentz, C., 2013. Large-scale empirical analyses of concreteness Proceedings of the Annual Meeting of the Cognitive Science Society,
  • Kelly, C., Korhonen, A. and Devereux, B., 2013. Minimally Supervised Learning for Unconstrained Conceptual Property Extraction Cooperative Minds: Social Interaction and Group Dynamics - Proceedings of the 35th Annual Meeting of the Cognitive Science Society, CogSci 2013,
  • Baldwin, T. and Korhonen, A., 2013. Preface EMNLP 2013 - 2013 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference,
  • Guo, Y., Reichart, R. and Korhonen, A., 2013. Improved Information Structure Analysis of Scientific Documents through Discourse and Lexical Constraints Proceedings of the 2nd Workshop on Computational Linguistics for Literature, CLfL 2013 at the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2013,
  • van de Cruys, T., Poibeau, T. and Korhonen, A., 2013. A Tensor-based Factorization Model of Semantic Compositionality Proceedings of the 2nd Workshop on Computational Linguistics for Literature, CLfL 2013 at the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2013,
  • Korhonen, A. and Reichart, R., 2013. Improved Lexical Acquisition through DPP-based Verb Clustering Association for Computational Linguistics,
  • Korhonen, A., Van de Cruys, T. and Poibeau, T., 2013. A Tensor-based Factorization Model of Semantic Compositionality http://aclweb.org/anthology/N/N13/N13-1.pdf,
2012 (No publication date)

  • Rimell, L., Poibeau, T. and Korhonen, A., 2012 (No publication date). Merging Lexicons for Higher Precision Subcategorization Frame Acquisition Proceedings of the LREC 2012 Workshop on Language Resource Merging,
2012

  • Kadekar, S., Silins, I., Korhonen, A., Dreij, K., Al-Anati, L., Hogberg, J. and Stenius, U., 2012. Exocrine pancreatic tumorigenesis and autotaxin expression TOXICOLOGY LETTERS, v. 211
    Doi: http://doi.org/10.1016/j.toxlet.2012.03.216
  • Silins, I., Korhonen, A., Sun, L., Hogberg, J. and Stenius, U., 2012. A text mining approach for chemical cancer research and risk assessment TOXICOLOGY LETTERS, v. 211
    Doi: http://doi.org/10.1016/j.toxlet.2012.03.458
  • Korhonen, A. and Reichart, R., 2012. Document and Corpus Level Inference For Unsupervised Learning of Information Structure of Scientific Documents Proceedings of the 24th International Conference on Computational Linguistics (COLING),
  • Shutova, E., van de Cruys, T. and Korhonen, A., 2012. Unsupervised Metaphor Paraphrasing Using a Vector Space Model Proceedings of the 24th International Conference on Computational Linguistics (COLING),
  • Guo, Y., Silins, I., Korhonen, A. and Reichart, R., 2012. CRAB Reader: A Tool for Analysis and Visualization of Argumentative Zones in Scientific Literature Proceedings of the 24th International Conference on Computational Linguistics (COLING),
  • Séaghdha, D. and Korhonen, A., 2012. Modelling selectional preferences in a lexical hierarchy *SEM 2012 - 1st Joint Conference on Lexical and Computational Semantics, v. 1
  • Kelly, C., Devereux, B. and Korhonen, A., 2012. Semi-supervised learning for automatic conceptual property extraction Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 2012-June
  • Abend, O., Biemann, C., Korhonen, A., Rappoport, A., Sogaard, A. and Reichart, R., 2012. Proceedings of the EACL Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP
  • Berwick, R., Korhonen, A., Villavicencio, A. and Poibeau, T., 2012. Proceedings of the EACL Workshop on Computational Models of Language Acquisition and Loss
  • Alexopoulou, T., Geertzen, J., Meurers, D. and Korhonen, A., 2012. L1 effects in L2 English relative clauses: evidence from corpus production Abstracts of the 22nd Annual Conference of the European Second Language Association (EUROSLA-22),
2011

  • Abend, O., Korhonen, A., Rappoport, A. and Reichart, R., 2011. Introduction Workshop on Unsupervised Learning in NLP at the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011 - Proceedings,
  • Abend, O., Korhonen, A., Reichart, R. and Rappoport, A., 2011. Proceedings of the EMNLP Workshop on Unsupervised Learning in NLP
  • Devereux, B., Tyler, L. and Korhonen, A., 2011. Parsing sentences are unlikely: corpus-based analyses of the neural processing of verbs International Conference on Cognitive Neuroscience (ICON),
  • Zhuang, J., Devereux, B., Tyler, L. and Korhonen, A., 2011. Lexical and syntactic competition effects in verb processing: evidence from corpus-based statistics International Conference on Cognitive Neuroscience (ICON),
2010

  • Kadekar, S., Silins, I., Korhonen, A., Hogberg, J., Dreij, K. and Stenius, U., 2010. Carcinogen-induced inflammation and pancreatic cancer Proceedings of the 101st Annual Meeting of the American Association for Cancer Research,
  • Kelly, C., Korhonen, A. and Devereux, B., 2010. Acquiring Human-like Feature-Based Conceptual Representations from Corpora Proceedings of the NAACL-HLT Workshop on Computational Neurolinguistics,
  • Devereux, B., Korhonen, A. and Kelly, C., 2010. Using fMRI Activation to Conceptual Stimuli to Evaluate Methods for Extracting Conceptual Representations from Corpora Proceedings of the NAACL-HLT Workshop on Computational Neurolinguistics,
  • Murphy, B., Korhonen, A. and Chang, K-MK., 2010. Proceedings of the NAACL-HLT Workshop on Computational Neurolinguistics
  • Devereux, B., Kelly, C., Pilkington, N., Korhonen, A. and Poibeau, T., 2010. The Acquisition of Unconstrained Feature-Based Conceptual Representations from Corpora The Rovereto Workshop on Concepts, Actions, and Objects: Functional and Neural Perspectives,
  • Moore, S., Buchholz, S. and Korhonen, A., 2010. Annotating the enron email corpus with number senses Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010,
  • Devereux, B., Pilkington, N., Poibeau, T. and Korhonen, A., 2010. Large-Scale Acquisition of Feature-Based Conceptual Representations from Textual Corpora COGNITION IN FLUX,
  • Guo, Y., Silins, I., Korhonen, A., Sun, L., Liakata, M. and Stenius, U., 2010. Identifying the information structure of scientific abstracts: An investigation of three different schemes Proceedings of the Annual Meeting of the Association for Computational Linguistics,
2009

  • Moore, S., Buchholz, S. and Korhonen, A., 2009. Number Sense Disambiguation Proceedings of the 12th Conference of the Pacific Association for Computational Linguistics,
  • Sun, L., Korhonen, A., Silins, I. and Stenius, U., 2009. User-Driven Development of Text Mining Resources for Cancer Risk Assessment Proceedings ...,
  • Kipper-Schuler, K., Korhonen, A. and Brown, S., 2009. Proceedings of the NAACL 2009 Tutorial on VerbNet and Its Applications North American Chapter of the Association for Computational Linguistics - Human Language Technologies NAACL HLT,
  • Vlachos, A., Korhonen, A. and Ghahramani, Z., 2009. Unsupervised and constrained dirichlet process mixture models for verb clustering.
  • Schuler, KK., Korhonen, A. and Brown, S., 2009. VerbNet overview, extensions, mappings and applications NAACL-HLT 2009 - Human Language Technologies: 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Tutorial Abstracts,
2008

  • Sun, L., Korhonen, A. and Krymolowski, Y., 2008. Verb class discovery from rich syntactic data COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, v. 4919
  • Lewin, I., Silins, I., Korhonen, A., Hogberg, J. and Stenius, U., 2008. A New Challenge for Text Mining: Cancer Risk Assessment Proceedings of the ISMB BioLINK Special Interest Group on Text Data Mining,
  • Korhonen, A., Lewin, I., Silins, I., Hogberg, J. and Stenius, U., 2008. CRAB - Cancer Risk Assessment and Biomedical Text Mining Proceedings of the European Conference on Computational Biology,
  • Messiant, C., Korhonen, A. and Poibeau, T., 2008. LexSchem: A large subcategorization lexicon for French verbs Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008,
  • Sun, L., Korhonen, A. and Krymolowski, Y., 2008. Automatic classification of English verbs using rich syntactic features IJCNLP 2008 - 3rd International Joint Conference on Natural Language Processing, Proceedings of the Conference, v. 2
  • Vlachos, A., Ghahramani, Z. and Korhonen, A., 2008. Dirichlet process mixture models for verb clustering
2007

  • Preiss, J., Briscoe, T. and Korhonen, A., 2007. A System for Large-Scale Acquisition of Verbal, Nominal and Adjectival Subcategorization Frames from Corpora. ACL,
  • Buttery, P. and Korhonen, A., 2007. I will shoot your shopping down and you can shoot all my tins: automatic lexical acquisition from the CHILDES database Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition,
  • Buttery, P., Villavicencio, A. and Korhonen, A., 2007. The proceedings of the ACL 2007 Workshop on Cognitive Aspects of Computational Language Acquisition
  • 2006

  • Korhonen, A., Krymolowski, Y. and Collier, N., 2006. Automatic Classification of Verbs in Biomedical Texts COLING/ACL 2006, Vols 1 and 2, Proceedings of the Conference,
  • Kipper, K., Korhonen, A., Ryant, N. and Palmer, M., 2006. A Large-Scale Extension of VerbNet with Novel Verb Classes Proceedings of EURALEX,
  • Korhonen, A., Krymolowski, Y. and Briscoe, T., 2006. A large subcategorization lexicon for natural language processing applications Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006,
  • Kipper, K., Korhonen, A., Ryant, N. and Palmer, M., 2006. Extending VerbNet with novel verb classes Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006,
  • 2005

  • Yallop, J., Korhonen, A. and Briscoe, T., 2005. Automatic Acquisition of Adjectival Subcategorization from Corpora. ACL,
  • Baldwin, T., Villavicencio, A. and Korhonen, A., 2005. Proceedings of the ACL-SIGLEX 2005 Workshop on Deep Lexical Acquisition
  • 2004

  • Preiss, J. and Korhonen, A., 2004. WSD for subcategorization acquisition task description Proceedings of the SENSEVAL@ACL 2004: 3rd International Workshop on the Evaluation of Systems for the Semantic Analysis of Text - Held in cooperation with ACL 2004,
  • Tanaka, T., Villavicencio, A., Korhonen, A. and Bond, F., 2004. Proceedings of the ACL-SIGLEX 2004 Workshop on Multiword Expressions: Integrating Processing
  • Briscoe, T. and Korhonen, A., 2004. Extended Lexical-Semantic Classification of English Verbs Proceedings of the HLT/NAACL Workshop on Computational Lexical Semantics,
  • 2003

  • Korhonen, A., Krymolowski, Y. and Marx, Z., 2003. Clustering polysemic subcategorization frame distributions semantically 41st Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference,
  • Korhonen, A. and Preiss, J., 2003. Improving subcategorization acquisition using word sense disambiguation 41st Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference,
  • 2002

  • Korhonen, A. and Krymolowski, Y., 2002. On the Robustness of Entropy-Based Similarity Measures in Evaluation of Subcategorization Acquisition Systems Proceedings of the Annual Meeting of the Association for Computational Linguistics,
  • Preiss, J., Korhonen, A. and Briscoe, T., 2002. Subcategorization Acquisition as an Evaluation Method for WSD. LREC,
  • 2000

  • Korhonen, A., Gorrell, G. and McCarthy, D., 2000. Statistical filtering and subcategorization frame acquisition Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora,
  • Korhonen, A., 2000. Using semantically motivated estimates to help subcategorization acquisition Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora,
  • 1998

  • McCarthy, D. and Korhonen, A., 1998. Detecting verbal participation in diathesis alternations Proceedings of the Annual Meeting of the Association for Computational Linguistics, v. 2
  • Datasets

    2018

  • Chiu, HW., Pyysalo, S., Vulic, I. and Korhonen, A., 2018. Bio-SimVerb
    Doi: http://doi.org/10.17863/CAM.18370
  • 2017

  • Gerz, DS., Vulic, I., Hill, F., Reichart, R. and Korhonen, A., 2017. Research data supporting "SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity"
  • Book chapters

    2018

  • Jiang, X., Huang, Y., Guo, Y., Geertzen, J., Alexopoulou, T., Sun, L. and Korhonen, A., 2018. Native language identification on EFCAMDAT
    Doi: http://doi.org/10.1017/9781316676974.007
  • 2013

  • Korhonen, A., 2013. Tools and Procedures for the Acquisition of Morphological and Syntactical Information from Corpora
  • Books

    2013

  • Villavicencio, A., Poibeau, T., Alishahi, A. and Korhonen, A., 2013. Cognitive Aspects of Computational Language Acquisition

  • What we do

    Cambridge Language Sciences is an Interdisciplinary Research Centre at the University of Cambridge. Our virtual network connects researchers from five schools across the university as well as other world-leading research institutions. Our aim is to strengthen research collaborations and knowledge transfer across disciplines in order to address large-scale multi-disciplinary research challenges relating to language research.
