Cambridge Language Sciences

Interdisciplinary Research Centre
 

Biography

I am a Research Associate (Postdoc) at the Language Technology Lab (LTL).

Publications (from Symplectic)

Journal articles

2023 (Accepted for publication)

  • Collins, C., Baker, S., Brown, J., Zeng, H., Chan, A., Stenius, U., Narita, M. and Korhonen, A., 2023 (Accepted for publication). Text Mining for Contexts and Relationships in Cancer Genomics Literature. Bioinformatics.
    Doi: http://doi.org/10.1093/bioinformatics/btae021

2021

  • Ali, I., Dreij, K., Baker, S., Högberg, J., Korhonen, A. and Stenius, U., 2021. Application of Text Mining in Risk Assessment of Chemical Mixtures: A Case Study of Polycyclic Aromatic Hydrocarbons (PAHs). Environ Health Perspect, v. 129
    Doi: http://doi.org/10.1289/EHP6702
  • Su, Y., Wang, Y., Cai, D., Baker, S., Korhonen, A. and Collier, N., 2021. PROTOTYPE-TO-STYLE: Dialogue Generation with Style-Aware Editing on Retrieval Memory. IEEE/ACM Transactions on Audio Speech and Language Processing, v. 29
    Doi: http://doi.org/10.1109/TASLP.2021.3087948
  • Majewska, O., Collins, C., Baker, S., Björne, J., Brown, SW., Korhonen, A. and Palmer, M., 2021. BioVerbNet: a large semantic-syntactic classification of verbs in biomedicine. J Biomed Semantics, v. 12
    Doi: http://doi.org/10.1186/s13326-021-00247-z

2020 (Accepted for publication)

  • Vulic, I., Baker, S., Ponti, E., Petti, U., Leviant, I., Wing, K., Majewska, O., Bar, E., Malone, M., Poibeau, T., Reichart, R. and Korhonen, A., 2020 (Accepted for publication). Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity. Computational Linguistics.
    Doi: http://doi.org/10.1162/coli_a_00391

2020

  • Wichmann, P., Brintrup, A., Baker, S., Woodall, P. and McFarlane, D., 2020. Extracting supply chain maps from news articles using deep neural networks. International Journal of Production Research, v. 58
    Doi: http://doi.org/10.1080/00207543.2020.1720925
  • Crichton, G., Baker, S., Guo, Y. and Korhonen, A., 2020. Neural networks for open and closed Literature-based Discovery. PLoS One, v. 15
    Doi: http://doi.org/10.1371/journal.pone.0232891
  • Petti, U., Baker, S. and Korhonen, A., 2020. A systematic literature review of automatic Alzheimer's disease detection from speech and language. J Am Med Inform Assoc, v. 27
    Doi: http://doi.org/10.1093/jamia/ocaa174
  • Chiu, B. and Baker, S., 2020. Word embeddings for biomedical natural language processing: A survey. Language and Linguistics Compass, v. 14
    Doi: http://doi.org/10.1111/lnc3.12402

2019

  • Pyysalo, S., Baker, S., Ali, I., Haselwimmer, S., Shah, T., Young, A., Guo, Y., Högberg, J., Stenius, U., Narita, M. and Korhonen, A., 2019. LION LBD: a literature-based discovery system for cancer biology. Bioinformatics, v. 35
    Doi: http://doi.org/10.1093/bioinformatics/bty845

2018

  • Wichmann, P., Brintrup, A., Baker, S., Woodall, P. and McFarlane, D., 2018. Towards automatically generating supply chain maps from natural language text.
    Doi: http://doi.org/10.1016/j.ifacol.2018.08.207

2017 (Accepted for publication)

  • Baker, S., Ali, I., Silins, I., Pyysalo, S., Guo, Y., Högberg, J., Stenius, U. and Korhonen, A., 2017 (Accepted for publication). Cancer Hallmarks Analytics Tool (CHAT): A text mining approach to organise and evaluate scientific literature on cancer. Bioinformatics, v. 33
    Doi: http://doi.org/10.1093/bioinformatics/btx454

2017

  • Larsson, K., Baker, S., Silins, I., Guo, Y., Stenius, U., Korhonen, A. and Berglund, M., 2017. Text mining for improved exposure assessment. PLOS One, v. 12
    Doi: http://doi.org/10.1371/journal.pone.0173132

2016

  • Baker, S., Silins, I., Guo, Y., Ali, I., Högberg, J., Stenius, U. and Korhonen, A., 2016. Automatic semantic classification of scientific literature according to the hallmarks of cancer. Bioinformatics, v. 32
    Doi: http://doi.org/10.1093/bioinformatics/btv585

2015

  • Korhonen, A., Baker, S., Silins, I., Guo, Y., Ali, I., Högberg, J. and Stenius, U., 2015. Automatic Semantic Classification of Scientific Literature According to the Hallmarks of Cancer. Bioinformatics.

Conference proceedings

2021

  • Su, Y., Cai, D., Wang, Y., Vandyke, D., Baker, S., Li, P. and Collier, N., 2021. Non-Autoregressive Text Generation with Pre-trained Language Models. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics.

2019

  • Chiu, B., Baker, S., Palmer, M. and Korhonen, A., 2019. Enhancing biomedical word embeddings by retrofitting to verb clusters. BioNLP 2019 - SIGBioMed Workshop on Biomedical Natural Language Processing, Proceedings of the 18th BioNLP Workshop and Shared Task.

2018

  • Stathopoulos, YA., Baker, S., Rei, M. and Teufel, S., 2018. Variable typing: Assigning meaning to variables in mathematical text. NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, v. 1
  • Mendes, E., Rodriguez, P., Freitas, V., Baker, S. and Atoui, MA., 2018. Towards improving decision making and estimating the value of decisions in value-based software engineering: the VALUE framework. Software Quality Journal, v. 26
    Doi: http://doi.org/10.1007/s11219-017-9360-z

2017 (No publication date)

  • Baker, S., Korhonen, A. and Pyysalo, S., 2017 (No publication date). Cancer Hallmark Text Classification Using Convolutional Neural Networks.
    Doi: http://doi.org/10.17863/CAM.12420

2017

  • Baker, S. and Korhonen, A., 2017. Initializing neural networks for hierarchical multi-label text classification. BioNLP 2017 - SIGBioMed Workshop on Biomedical Natural Language Processing, Proceedings of the 16th BioNLP Workshop.

2016

  • Baker, S., Kiela, D. and Korhonen, A., 2016. Robust text classification for sparsely labelled data using multi-level embeddings. COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers.

2015

  • Korhonen, A., Guo, Y., Baker, S., Yetisgen-Yildiz, M., Stenius, U., Narita, M. and Liò, P., 2015. Improving literature-based discovery with advanced text mining. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 8623
    Doi: http://doi.org/10.1007/978-3-319-24462-4_8

2014

  • Baker, S., Reichart, R. and Korhonen, A., 2014. An unsupervised model for instance level subcategorization acquisition. EMNLP 2014 - 2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference.

2010

  • Baker, S. and Mendes, E., 2010. Aggregating Expert-Driven Causal Maps for Web Effort Estimation. Advances in Software Engineering, v. 117
  • Baker, S. and Mendes, E., 2010. Evaluating the Weighted Sum Algorithm for Estimating Conditional Probabilities in Bayesian Networks. 22nd International Conference on Software Engineering & Knowledge Engineering (SEKE 2010).

2008

  • Baker, S., Au, F., Dobbie, G. and Warren, I., 2008. Automated usability testing using HUI Analyzer. ASWEC 2008: 19th Australian Software Engineering Conference, Proceedings.
    Doi: http://doi.org/10.1109/ASWEC.2008.40
  • Baker, S., Au, F., Dobbie, G. and Warren, I., 2008. Automated usability testing using HUI analyzer. Proceedings of the Australian Software Engineering Conference, ASWEC.
    Doi: http://doi.org/10.1109/ASWEC.2008.4483248

Research Associate
Dr Simon Baker