
Dr Nigel Collier

Computational linguistics; machine learning; semantics; text/data mining; knowledge discovery; domain adaptation; question answering

Principal Research Associate (Dept. of Theoretical and Applied Linguistics)

Visiting Scientist, European Bioinformatics Institute (EMBL-EBI)

Office Phone: +44 (0)1223 335 000 (Main Faculty number)

Research Interests

Nigel's research is in the broad area of Natural Language Processing and Computational Linguistics. It brings together computational techniques such as machine learning, syntactic parsing and concept understanding, with the aim of providing a machine-understandable semantic representation of text. This representation is used to support real-world tasks, e.g. question answering and knowledge discovery from very large-scale data sources such as the World Wide Web.

Nigel works in collaboration with colleagues from computer science, the life sciences and linguistics.

Recent research projects:

2015 – 2020, SIPHS (EPSRC funded), Semantic interpretation of personal health messages

2012 – 2014, PhenoMiner (EC FP7 funded), Semantic mining of phenotype associations from the scientific literature

2006 – 2012, BioCaster (JST funded), Detecting public health rumors with a Web-based text mining system




Key Publications

Lofi, C., Nieke, C. and Collier, N. (2014), Discriminating rhetorical analogies in social media, Conference of the European Chapter of the Association for Computational Linguistics (EACL), Gothenburg, Sweden, April 26-30, pp. 560-568.

Collier, N., Tran, M., Le, H., Ha, Q., Oellrich, A. and Rebholz-Schuhmann, D. (2013), Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking, PLoS One 8(10): e72965.

Bao, Y., Collier, N. and Datta, A. (2013), A partially supervised cross-collection topic model for cross-domain text classification, ACM Conference of Information and Knowledge Management, San Francisco, USA, October 27-November 1, pp. 239-248.

Collier, N., Son, N. T., & Nguyen, N. M. (2011), OMG U got flu? Analysis of shared health messages for bio-surveillance. Journal of Biomedical Semantics 2(S-5), S9.

Hay, S. I., Battle, K. E., Pigott, D. M., Smith, D. L., Moyes, C. L., Bhatt, S., Brownstein, J. S., Collier, N., Myers, M. F., George, D. B. & Gething, P. W. (2013), Global mapping of infectious disease. Philosophical Transactions of the Royal Society B: Biological Sciences, 368(1614), 20120250.

Lau, J. H., Collier, N., & Baldwin, T. (2012), On-line Trend Analysis with Topic Models: #twitter Trends Detection Topic Model Online. 24th International Conference on Computational Linguistics (COLING), Bombay, India, December 8-15, pp. 1519-1534.

Chanlekha, H., Kawazoe, A. & Collier, N. (2010), A framework for enhancing spatial and temporal granularity in report-based health surveillance systems. BMC Medical Informatics and Decision Making 10(1), 1.

Collier, N. (2010), What’s unusual in online disease outbreak news? Journal of Biomedical Semantics, 1:2.