Publications – Language Technology Group

2021

Arlene Casey, Emma Davidson, Michael Poon, Hang Dong, Daniel Duma, Andreas Grivas, Claire Grover, Víctor Suárez-Paniagua, Richard Tobin, William Whiteley, Honghan Wu and Beatrice Alex (2021). A Systematic Review of Natural Language Processing Applied to Radiology Reports. BMC Medical Informatics and Decision Making, 21, 179. [arXiv, pdf, DOI]

Beatrice Alex, Clare Llewellyn, Pawel Michal Orzechowski, and Maria Boutchkova (2021). The Online Pivot: Lessons Learned from Teaching a Text and Data Mining Course in Lockdown, Enhancing online Teaching with Pair Programming and Digital Badges. In Proceedings of the NLP Teaching Workshop at NAACL-HLT 2021. [arXiv, pdf]

Lauren Hall-Lew, Claire Cowie, Stephen Joseph McNulty, Nina Markl, Shan-Jan Sarah Liu, Catherine Lai, Clare Llewellyn, Beatrice Alex, Nini Fang, Zuzana Elliott, and Anita Klingler (2021). The Lothian Diary Project: Investigating the Impact of the COVID-19 Pandemic on Edinburgh and Lothian Residents. Journal of Open Humanities Data, 7: 4, pp. 1-5. [pdf] [DOI]

Dominic Sykes, Andreas Grivas, Claire Grover, Richard Tobin, Cathie Sudlow, William Whiteley, Andrew McIntosh, Heather Whalley, Beatrice Alex (2021). Comparison of Rule-based and Neural Network Models for Negation Detection in Radiology Reports. Journal of Natural Language Engineering, 27(2), March 2021, pp. 203-224. [DOI, accepted manuscript]

Arlene Casey, Mike Bennett, Richard Tobin, Claire Grover, Iona Walker, Lukas Engelmann and Beatrice Alex (2021). Plague Dot Text: Text Mining and Annotation of Outbreak Reports of the Third Plague Pandemic (1894-1952). Journal of Data Mining and Digital Humanities, January 2021. [arXiv, url, pdf]

2020

Andreas Grivas, Beatrice Alex, Claire Grover, Richard Tobin, William Whiteley (2020). Not a cute stroke: Analysis of Rule- and Neural Network-Based Information Extraction Systems for Brain Radiology Reports. In Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis (LOUHI 2020) at EMNLP 2020, November 2020. [pdf]

Lucy Havens, Melissa Terras, Benjamin Bach, Beatrice Alex (2020). Situated Data, Situated Systems: A Methodology to Engage with Power Relations in Natural Language Processing Bias Research. In Proceedings of the 2nd Workshop on Gender Bias in Natural Language Processing at COLING 2020. [pdf]

Vebjørn Espeland, Benjamin Bach, Beatrice Alex (2020). Enhanced Labelling in Active Learning for Coreference Resolution. In Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2020) at COLING 2020. [pdf]

Barbara McGillivray, Beatrice Alex, Sarah Ames, Guyda Armstrong, David Beavan, Arianna Ciula, Giovanni Colavizza, James Cummings, David De Roure, Adam Farquhar, Simon Hengchen, Anouk Lang, James Loxley, Eirini Goudarouli, Federico Nanni, Andrea Nini, Julianne Nyhan, Nicola Osborne, Thierry Poibeau, Mia Ridge, Sonia Ranade, James Smithies, Melissa Terras, Andreas Vlachidis and Pip Willcox (2020). The challenges and prospects of the intersection of humanities and data science: A White Paper from The Alan Turing Institute. White paper, The Alan Turing Institute, August 2020. [URL, DOI, pdf]

Clare Llewellyn, Pawel Orzechowski and Beatrice Alex (2020). Teaching a Text Mining Bootcamp in Lockdown. University of Edinburgh, June 2020, Edinburgh, pp. 1-7. [html, pdf]

Rosa Filgueira, Claire Grover, Melissa Terras and Beatrice Alex (2020). Geoparsing the historical Gazetteers of Scotland: accurately computing location in mass digitised texts. In Proceedings of the 8th Workshop on the Challenges in the Management of Large Corpora (CMLC-8 2020), LREC 2020, 16th of May 2020. [pdf, workshop]

2019

Richard Tobin, Elaine Farrow, Claire Grover, Beatrice Alex (2019). Automatic coding of occupation and cause-of-death record, presented at ADR 2019, Cardiff, UK, December 2019. [html]

Beatrice Alex, Claire Grover, Richard Tobin, Cathie Sudlow, Grant Mair and William Whiteley (2019). Text Mining Brain Imaging Reports. Journal of Biomedical Semantics, 10, 23, 2019, doi:10.1186/s13326-019-0211-7. [html, pdf]

Beatrice Alex, Claire Grover, Richard Tobin and Jon Oberlander (2019). Geoparsing Historical and Contemporary Literary Text set in the City of Edinburgh. Language Resources and Evaluation, 53(4): 651-675. [html, pdf]

Emily Wheater, Grant Mair, Cathie Sudlow, Beatrice Alex, Claire Grover and William Whiteley (2019). A validated natural language processing algorithm for brain imaging phenotypes from radiology reports in UK electronic health records. BMC Medical Informatics and Decision Making, 19, 184, 2019, doi:10.1186/s12911-019-0908-7. [html, pdf]

Arlene Casey, Mike Bennett, Richard Tobin, Claire Grover, Lukas Engelmann and Beatrice Alex (2019). Plague Dot Text: Text mining and annotation of outbreak reports of the Third Plague Pandemic (1894-1952). In Proceedings of HistoInformatics 2019 at the 23rd International Conference on Theory and Practice of Digital Libraries (TPDL 2019), CEUR Vol-2461, Oslo, Norway, 2019. [pdf]

Catherine Lai, Beatrice Alex, Johanna Moore, Leimin Tian, Tatsuro Hori and Gianpiero Francesca (2019). Detecting Topic-Oriented Speaker Stance in Conversational Speech. In Proceedings of Interspeech 2019, September 2019. [html, pdf]

Philip John Gorinski, Honghan Wu, Claire Grover, Richard Tobin, Conn Talbot, Heather Whalley, Cathie Sudlow, William Whiteley and Beatrice Alex (2019). Named Entity Recognition for Electronic Health Records: A Comparison of Rule-based and Machine Learning Approaches. HealTAC 2019 Conference, 24-25th of April 2019. [arXiv.org]

Aurora Constantin, Catherine Lai, Elaine Farrow, Beatrice Alex, Ruth Pel-Littel, Henk Herman Nap and Johan Jeuring (2019). “Why is the Doctor a Man?” Reactions of Older Adults to a Virtual Training Doctor. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, May 2019. [html, pdf, video]

2018

Claire Grover, Richard Tobin, Beatrice Alex, Catherine Sudlow, Grant Mair and William Whiteley (2018). Text Mining Brain Imaging Reports. In Proceedings of HealTAC-2018, April 2018.

James Loxley, Beatrice Alex, Miranda Anderson, Uta Hinrichs, Claire Grover, Tara Thomson, David Harris-Birtill, Aaron Quigley and Jon Oberlander (2018). ‘Multiplicity embarrasses the eye’: The digital mapping of literary Edinburgh. In: Ian Gregory, Don Debats, Don Lafreniere (eds.), Routledge Handbook of Spatial History. [html, accepted manuscript]

2017

Beatrice Alex (2017). Geoparsing English Text with the Edinburgh Geoparser, The Programming Historian lesson. October 2017. [html]

Themistoklis Diamantopoulos, Michael Roth, Andreas Symeonidis and Ewan Klein (2017). Software requirements as an application domain for natural language processing. Language Resources and Evaluation, 51(2), pp. 495-524. [html]

2016

Beatrice Alex, Clare Llewellyn, Claire Grover, Jon Oberlander and Richard Tobin (2016). Homing in on Twitter users: Evaluating an Enhanced Geoparser for User Profile Locations. In Proceedings of the 10th Language Resources and Evaluation Conference (LREC), 23-28 May 2016, Portorož, Slovenia. [pdf]

Beatrice Alex, Claire Grover, Jon Oberlander, Tara Thomson, Miranda Anderson, James Loxley, Uta Hinrichs and Ke Zhou (2016). Palimpsest: Improving assisted curation of loco-specific literature. Digital Scholarship in the Humanities 2016, 07/11/2016. [html]

Beatrice Alex, Claire Grover, Ewan Klein, Clare Llewellyn and Richard Tobin (2016). User-driven Text Mining of Historical Text. Emma Tonkin, Gregory Tourte (eds.). Working with text: Tools, techniques and approaches for text mining. Chandos Publishing. [html]

Clare Llewellyn, Claire Grover and Jon Oberlander (2016). Improving Topic Model Clustering of Newspaper Comments for Summarisation. In Proceedings of the ACL 2016 Student Research Workshop, pp. 43-50, Berlin, Germany. [pdf]

Daniel Duma, Maria Liakata, Amanda Clare, James Ravenscroft and Ewan Klein (2016). Applying core scientific concepts to context-based citation recommendation. In Proceedings of the 10th Language Resources and Evaluation Conference (LREC), 23-28 May 2016, Portorož, Slovenia. [pdf]

Daniel Duma, Maria Liakata, Amanda Clare, James Ravenscroft and Ewan Klein (2016). Rhetorical Classification of Anchor Text for Citation Recommendation. D-Lib Magazine, 22(9/10). [html]

Daniel Duma, Charles Sutton and Ewan Klein (2016). Context matters: Towards extracting a citation’s context using linguistic features. In Proceedings of the 2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL), pp. 201-202. [html]

William Whiteley, Claire Grover, Beatrice Alex, Cathie Sudlow and Grant Mair (2016). A natural language processing algorithm to identify stroke in brain imaging reports on a large scale. Poster presented at the 2nd European Stroke Organisation Conference (ESOC 2016), Barcelona, Spain. [pdf]

Jim Clifford, Beatrice Alex, Colin Coates, Andrew Watson and Ewan Klein (2016). Geoparsing History: Locating Commodities in Ten Million Pages of Nineteenth-Century Sources. Historical Methods, 49(3), pp. 115-131. [html]

Shawn M. Jones, Herbert Van de Sompel, Harihar Shankar, Martin Klein, Richard Tobin and Claire Grover (2016). Scholarly Context Adrift: Three out of Four URI References Lead to Changed Content. PLoS ONE 11(12): e0167475. [pdf] doi:10.1371/journal.pone.0167475

2015

Beatrice Alex, Kate Byrne, Claire Grover and Richard Tobin (2015). Adapting the Edinburgh Geoparser for Historical Georeferencing. International Journal for Humanities and Arts Computing, 9(1), pp. 15-35, March 2015. [pdf, html]

Clare Llewellyn, Claire Grover, Beatrice Alex, Jon Oberlander and Richard Tobin (2015). Extracting a Topic Specific Dataset from a Twitter Archive. In Proceedings of TPDL 2015, September 2015, Poznań, Poland, pp. 364-367. ***Winner of the best poster/demo award.*** [pdf, poster]

Uta Hinrichs, Beatrice Alex, Jim Clifford, Andrew Watson, Aaron Quigley, Ewan Klein and Colin M. Coates (2015). Trading Consequences: A Case Study of Combining Text Mining and Visualization to Facilitate Document Exploration. Digital Scholarship in the Humanities (DSH), DH2014 Special Issue. [html, pdf]

Beatrice Alex, Claire Grover, Jon Oberlander, Ke Zhou and Uta Hinrichs (2015). Palimpsest: Improving assisted curation of loco-specific literature. In Proceedings of DH2015, Sydney, Australia. [pdf]

Ke Zhou, Claire Grover, Martin Klein and Richard Tobin (2015). No More 404s: Predicting Referenced Link Rot in Scholarly Articles for Pro-Active Archiving. In Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL ’15). New York, NY, USA: ACM, pp. 233-236. [pdf]

Hai H. Nguyen, Stuart Taylor, Gemma Webster, Nophadol Jekjantuk, Chris Mellish, Jeff Z. Pan, Tristan ap Rheinallt and Kate Byrne (2015). A Lightweight Treatment of Inexact Dates. Semantic Technology, Lecture Notes in Computer Science, Volume 8943, February 2015, pp. 187-193. [Springer Link]

Michael Roth and Ewan Klein (2015). Parsing software requirements with an ontology-based semantic role labeler. In Proceedings of the 1st Workshop on Language and Ontologies. [pdf]

2014

Beatrice Alex, Kate Byrne, Claire Grover and Richard Tobin (2014). A Web-based Geo-resolution Annotation and Evaluation Tool. In Proceedings of the 8th Linguistic Annotation Workshop (LAW VIII), COLING 2014, Dublin, Ireland. [pdf]

Beatrice Alex and John Burns (2014). Estimating and Rating the Quality of Optically Character Recognised Text. In Proceedings of DATeCH 2014, Madrid, Spain. [pdf]

Kate Byrne (2014). Can Documents be Linked Data? Cataloguing and Index: Periodical of CILIP Cataloguing and Indexing Group, no. 174, pages 19-24, March 2014.

Kate Byrne (2014). Event Mining in Our Rural Past. Working Papers of the Communities & Culture Network+ Vol. 3 (April 2014). [pdf]

Ewan Klein, Beatrice Alex and Jim Clifford (2014). Bootstrapping a historical commodities lexicon with SKOS and DBpedia. In Proceedings of LaTeCH 2014 at EACL 2014. Gothenburg, Sweden. [paper]

Ewan Klein, Beatrice Alex, Claire Grover, Richard Tobin, Colin Coates, Jim Clifford, Aaron Quigley, Uta Hinrichs, James Reid, Nicola Osborne and Ian Fieldhouse (2014). Digging Into Data White Paper: Trading Consequences. March 2014. [paper]

Uta Hinrichs, Beatrice Alex, Jim Clifford and Aaron Quigley (2014). Trading Consequences: A Case Study of Combining Text Mining & Visualisation to Facilitate Document Exploration. In Proceedings of DH2014. [abstract]

Daniel Duma and Ewan Klein (2014). Citation Resolution: A method for evaluating context-based citation recommendation systems. In Proceedings of the Association for Computational Linguistics (ACL’14). [pdf]

Clare Llewellyn, Claire Grover and Jon Oberlander (2014). Summarizing Newspaper Comments. In Proceedings of Eighth International AAAI Conference on Weblogs and Social Media. [pdf]

2013

Daniel Duma and Ewan Klein (2013). Generating natural language from linked data: Unsupervised template extraction. In Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013), pp. 83-94, Potsdam, Germany. Association for Computational Linguistics. [pdf]

2012

Bea Alex, Claire Grover, Ewan Klein and Richard Tobin (2012). Digitised Historical Text: Does it have to be mediOCRe? In Proceedings of KONVENS 2012 (LThist 2012 workshop), Vienna, Austria. [paper]

Elton Barker, Kate Byrne, Leif Isaksen, Eric Kansa and Nick Rabinowitz (2012). The Geographic Annotation Platform. NeDiMAH workshop on Space and Time, at Digital Humanities Conference (DH2012), July 2012, Hamburg. [abstract]

2011

Leif Isaksen, Elton Barker, Eric C. Kansa and Kate Byrne (2011). GAP: A NeoGeo Approach to Classical Resources. Leonardo Transactions, May 2011. [paper]

Leif Isaksen, Elton Barker, Eric C. Kansa and Kate Byrne (2011). Googling Ancient Places. In Proceedings of Digital Humanities 2011 (DH2011), Stanford, CA, June 2011. [paper]

Ewan Klein and Michael Rovatsos (2011). Temporal vagueness, coordination and communication. In Nouwen, R., Schmitz, H.-C., van Rooij, R., and Sauerland, U., editors, Vagueness in Communication, LNCS. Springer. [paper]

Nikos Sarris, Gerasimos Potamianos, Jean-Michel Renders, Claire Grover, Eric Karstens, Leonidas Kallipolitis, Vasilis Tountopoulos, Georgios Petasis, Anastasia Krithara, Matthias Gallé, Guillaume Jacquet, Beatrice Alex, Richard Tobin, and Liliana Bounegru (2011). A system for synergistically structuring news content from traditional media and the blogosphere. In eChallenges 2011, Florence, Italy. [paper]

2010

Beatrice Alex and Alexander Onysko (2010). Zum Erkennen von Anglizismen im Deutschen: der Vergleich einer automatisierten und einer manuellen Erhebung. Carmen Scherer and Anke Holler (eds). Strategien der Isolation und Integration nicht-nativer Einheiten und Strukturen. de Gruyter, Berlin. [paper]

Bea Alex and Claire Grover (2010). Labelling and spatio-temporal grounding of news events. In Proceedings of the workshop on Computational Linguistics in a World of Social Media at NAACL 2010, Los Angeles, USA. [paper]

Bea Alex, Claire Grover, Rongzhou Shen and Mijail Kabadjov (2010). Agile corpus annotation in practice: An overview of manual and automatic annotation of CVs. In Proceedings of the 4th Linguistic Annotation Workshop (LAW IV), Uppsala, Sweden. [paper]

Claire Grover, Richard Tobin, Beatrice Alex, and Kate Byrne (2010). Edinburgh-LTG: TempEval-2 system description. In Proceedings of SemEval-2010, Uppsala, Sweden. [paper]

Claire Grover, Richard Tobin, Kate Byrne, Matthew Woollard, James Reid, Stuart Dunn, and Julian Ball (2010). Use of the Edinburgh Geoparser for georeferencing digitised historical collections. Philosophical Transactions of the Royal Society A, 368(1925):3875-3889. [paper]

Richard Tobin, Claire Grover, Kate Byrne, James Reid, and Jo Walsh (2010). Evaluation of georeferencing. In Proceedings of the 6th Workshop on Geographic Information Retrieval (GIR’10), Zurich, Switzerland. [paper]

2009

Steven Bird, Ewan Klein and Edward Loper (2009). Natural Language Processing with Python. O’Reilly Media, Sebastopol, CA. [book]

Kate Byrne and Ewan Klein (2009). Automatic extraction of archaeological events from text. In Proceedings of Computer Applications in Archaeology, CAA 2009, Williamsburg, Virginia. [pdf]

Kate Byrne (2009). Putting Hybrid Cultural Data on the Semantic Web. Journal of Digital Information (JoDI), Vol. 10, no. 6. Special issue on Information Access to Cultural Heritage. Eds. Martha Larson, Kate Fernie, John Oomen. ISSN: 1368-7506. [pdf, html]

Joshua Ritterman, Miles Osborne and Ewan Klein (2009). Using prediction markets and Twitter to predict a Swine Flu pandemic. In 1st International Workshop on Mining Social Media, November 2009, Seville, Spain. [paper]

2008

Beatrice Alex (2008). Comparing corpus-based to web-based lookup techniques for automatic English inclusion detection. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco. [paper]

Beatrice Alex, Claire Grover, Barry Haddow, Mijail Kabadjov, Ewan Klein, Michael Matthews, Stuart Roebuck, Richard Tobin, and Xinglong Wang (2008). Assisted curation: Does text mining really help? In Russ B. Altman, A. Keith Dunker, Lawrence Hunter, Tiffany Murray, and Teri E. Klein, editors, BIOCOMPUTING 2008. Proceedings of the Pacific Symposium on Biocomputing, Kohala Coast, Hawaii, USA. [paper]

Beatrice Alex, Claire Grover, Barry Haddow, Mijail Kabadjov, Ewan Klein, Michael Matthews, Richard Tobin, and Xinglong Wang (2008). Automating curation using a natural language processing pipeline. Genome Biology, 9(Suppl 2):S10. [paper]

Beatrice Alex, Claire Grover, Barry Haddow, Mijail Kabadjov, Ewan Klein, Michael Matthews, Richard Tobin, and Xinglong Wang (2008). The ITI TXM corpora: Tissue expressions and protein-protein interactions. In Proceedings of the Workshop on Building and Evaluating Resources for Biomedical Text Mining at the 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco. [paper]

Kate Byrne (2008). Having Triplets – Holding Cultural Data as RDF. In Proceedings of IACH2008 Workshop on Information Access to Cultural Heritage at ECDL 2008, Aarhus, Denmark. [paper]

Claire Grover, Sharon Givon, Richard Tobin, and Julian Ball (2008). Named entity recognition for digitised historical texts. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco. [paper]

Barry Haddow (2008). Using automated feature optimisation to create an adaptable relation extraction system. In Proceedings of BioNLP 2008, Columbus, Ohio. [paper]

Barry Haddow and Beatrice Alex (2008). Exploiting multiply annotated corpora in biomedical information extraction tasks. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco. [paper]

Florian Leitner, Martin Krallinger, Carlos Rodriguez-Penagos, Jörg Hakenberg, Conrad Plake, Cheng-Ju Kuo, Chun-Nan Hsu, Richard Tzong-Han Tsai, Hsi-Chuan Hung, William W. Lau, Calvin A. Johnson, Rune Sætre, Kazuhiro Yoshida, Yan Hua Chen, Sun Kim, Soo-Yong Shin, Byoung-Tak Zhang, William A. Baumgartner Jr., Lawrence Hunter, Barry Haddow, Michael Matthews, Xinglong Wang, Patrick Ruch, Frédéric Ehrler, Arzucan Özgür, Günes Erkan, Dragomir R. Radev, Michael Krauthammer, ThaiBinh Luong, Robert Hoffmann, Chris Sander, and Alfonso Valencia (2008). Introducing meta-services for biomedical information extraction. Genome Biology, 9(Suppl 2):S6. [paper]

Alexander A. Morgan, Zhiyong Lu, Xinglong Wang, Aaron M. Cohen, Juliane Fluck, Patrick Ruch, Anna Divoli, Katrin Fundel, Robert Leaman, Jörg Hakenberg, Chengjie Sun, Heng-hui Liu, Rafael Torres, Michael Krauthammer, William W. Lau, Hongfang Liu, Chun-Nan Hsu, Martijn Schuemie, and Lynette Hirschman (2008). Overview of BioCreative II gene normalization. Genome Biology, 9(Suppl 2):S3. [paper]

Larry Smith, Lorraine K. Tanabe, Rie Johnson (nee Ando), Cheng-Ju Kuo, I-Fang Chung, Yu-Shi Lin, Roman Klinger, Christoph M. Friedrich, Kuzman Ganchev, Manabu Torii, Hongfang Liu, Barry Haddow, Craig A. Struble, Richard J. Povinelli, Andreas Vlachos, William A. Baumgartner Jr., Lawrence Hunter, Bob Carpenter, Richard Tzong-Han Tsai, Hong-Jie Dai, Feng Liu, Yifei Chen, Chengjie Sun, Sophia Katrenko, Pieter Adriaans, Christian Blaschke, Rafael Torres, Mariana Neves, Preslav Nakov, Anna Divoli, Manuel Maña López, Jacinto Mata-Vázquez, and W. John Wilbur (2008). Overview of BioCreative II gene mention recognition. Genome Biology, 9(Suppl 2):S2. [paper]

Steve Bird, Ewan Klein, Edward Loper and Jason Baldridge (2008). Multidisciplinary instruction with the Natural Language Toolkit. In Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics, pages 62-70, Columbus, Ohio, USA. [paper]

Xinglong Wang and Claire Grover (2008). Learning the species of biomedical named entities from annotated corpora. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco. [paper]

Xinglong Wang and Michael Matthews (2008). Comparing usability of matching techniques for normalising biomedical named entities. In Russ B. Altman, A. Keith Dunker, Lawrence Hunter, Tiffany Murray, and Teri E. Klein, editors, BIOCOMPUTING 2008. Proceedings of the Pacific Symposium on Biocomputing, Kohala Coast, Hawaii, USA. [paper]

Xinglong Wang and Michael Matthews (2008). Species disambiguation for biomedical term identification. In Proceedings of BioNLP 2008, Columbus, Ohio. [paper]

2007

Beatrice Alex, Amit Dubey, and Frank Keller (2007). Using foreign inclusion detection to improve parsing performance. In Proceedings of EMNLP-CoNLL 2007, Prague, Czech Republic. [paper]

Beatrice Alex, Barry Haddow, and Claire Grover (2007). Recognising nested named entities in biomedical text. In Proceedings of BioNLP 2007, Prague, Czech Republic. [paper]

Tom Betts, Maria Milosavljevic, and Jon Oberlander (2007). The utility of information extraction in the classification of books. In Proceedings of the 29th European Conference on Information Retrieval (ECIR 2007), Rome, Italy. [paper]

Kate Byrne (2007). Nested Named Entity Recognition in historical archive text. In Proceedings of the first IEEE International Conference on Semantic Computing (ICSC 2007), Irvine, California. [paper]

Sharon Givon and Maria Milosavljevic (2007). Extracting useful information from the full text of fiction. In Proceedings of RIAO 2007, Pittsburgh, PA, USA. [paper]

Claire Grover, Barry Haddow, Ewan Klein, Michael Matthews, Leif Arda Nielsen, Richard Tobin, and Xinglong Wang (2007). Adapting a relation extraction pipeline for the BioCreAtIvE II task. In Proceedings of the BioCreAtIvE II Workshop 2007, Madrid, Spain. [paper]

Barry Haddow and Michael Matthews (2007). The extraction of enriched protein-protein interactions from biomedical text. In Proceedings of BioNLP 2007, Prague, Czech Republic. [paper]

Maria Milosavljevic, Claire Grover, and Louise Corti (2007). Smart qualitative data (SQUAD): Information extraction in a large document archive. In Proceedings of RIAO 2007, Pittsburgh, PA, USA. [paper]

Xinglong Wang (2007). Rule-based protein term identification with help from automatic species tagging. In Proceedings of the Conference on Intelligent Text Processing and Computational Linguistics (CICLING 2007), Lecture Notes in Computer Science, pages 288-298, Mexico City, Mexico. [paper]

2006

Beatrice Alex (2006). Integrating language knowledge resources to extend the English inclusion classifier to a new language. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy. [paper]

Beatrice Alex, Malvina Nissim, and Claire Grover (2006). The impact of annotation on the performance of protein tagging in biomedical text. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy. [paper]

Kate Byrne (2006). Tethering cultural data with RDF. In Proceedings of Jena Users Conference (JUC2006), Bristol, UK. [paper]

Claire Grover, Michael Matthews, and Richard Tobin (2006). Tools to Address the Interdependence between Tokenisation and Standoff Annotation. In Proceedings of NLPXML-2006 (Multi-dimensional Markup in Natural Language Processing), pages 19-26. [paper]

Claire Grover and Richard Tobin (2006). Rule-based chunking and reusability. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy. [paper]

Ben Hachey (2006). Comparison of similarity models for the relation discovery task. In Proceedings of the ACL 2006 Linguistic Distances Workshop, Sydney, NSW, Australia. [paper]

Ben Hachey and Claire Grover (2006). Extractive summarisation of legal texts. Artificial Intelligence and Law, 14(4):305-345. [paper]

Ben Hachey, Gabriel Murray, and David Reitter (2006). Dimensionality reduction aids term co-occurrence based multi-document summarisation. In Proceedings of the ACL 2006 Task-Focused Summarization and Question Answering Workshop, Sydney, NSW, Australia. [paper]

Ewan Klein (2006). Computational semantics in the Natural Language Toolkit. In Lawrence Cavedon and Ingrid Zukerman, editors, Proceedings of the 2006 Australasian Language Technology Workshop (ALTW 2006), pages 26-41, Sydney. [paper]

Michael Matthews (2006). Improving biomedical text categorisation with NLP. In Proceedings of the SIGs, The Joint BioLINK-Bio-Ontologies Meeting, ISMB 2006, pages 93-96, Fortaleza, Brazil. [paper]

Fiona McNeill, Harry Halpin, Ewan Klein, and Alan Bundy (2006). Merging stories with shallow semantics. In Farah Benamara and Patrick Saint-Dizier, editors, Proceedings of the Workshop on Knowledge and Reasoning for Language Processing (KRAQ’06): 11th Conference of the European Chapter of the Association for Computational Linguistics, pages 37-42, Trento, Italy. Association for Computational Linguistics. [paper]

Leif Arda Nielsen (2006). Extracting protein-protein interactions using simple contextual features. In Proceedings of the BioNLP workshop, HLT/NAACL 2006 – poster session, pages 120-121, New York City, USA. [paper]

2005

Kisuh Ahn, Beatrice Alex, Johan Bos, Tiphaine Dalmas, Jochen L. Leidner, and Matthew B. Smillie (2005). Cross-lingual question answering using off-the-shelf machine translation. Multilingual Information Access for Text, Speech and Images. 5th Workshop of the Cross-Language Evaluation Forum, CLEF 2004, Bath, UK, September 15-17, 2004, Revised Selected Papers. Lecture Notes in Computer Science, 3491. [paper]

Beatrice Alex (2005). An unsupervised system for identifying English inclusions in German text. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005), Student Research Workshop, pages 133-138, Ann Arbor, Michigan, USA. [paper]

Markus Becker, Ben Hachey, Beatrice Alex, and Claire Grover (2005). Optimising selective sampling for bootstrapping named entity recognition. In Proceedings of the International Conference on Machine Learning (ICML-2005) Workshop on Learning with Multiple Views, Bonn, Germany. [paper]

John Carroll, Roger Evans, and Ewan Klein (2005). Supporting text mining for e-Science: the challenges for Grid-enabled natural language processing. In Proceedings of the Fourth UK e-Science Programme All Hands Meeting (AHM 2005), Nottingham. [paper]

Jenny Finkel, Shipra Dingare, Christopher Manning, Malvina Nissim, and Beatrice Alex (2005). Exploring the boundaries: Gene and protein identification in biomedical text. BMC Bioinformatics, 6:S5. [paper]

Claire Grover, Mirella Lapata, and Alex Lascarides (2005). A comparison of parsing technology for the biomedical domain. Journal of Natural Language Engineering, 11(1):27-65. [paper]

Ben Hachey, Beatrice Alex, and Markus Becker (2005). Investigating the effects of selective sampling on the annotation task. In Proceedings of the 9th Conference on Computational Natural Language Learning (CoNLL-2005), Ann Arbor, Michigan, USA. [paper]

Ben Hachey, Markus Becker, Claire Grover, and Ewan Klein (2005). Selective sampling for information extraction with a committee of classifiers. In First PASCAL Challenges Workshop, Southampton, UK. [presentation]

Ben Hachey and Claire Grover (2005). Automatic legal text summarisation: Experiments with summary structuring. In Proceedings of the 10th International Conference on Artificial Intelligence and Law (ICAIL 2005), Bologna, Italy. [paper]

Ben Hachey and Claire Grover (2005). Sequence modelling for sentence classification in a legal summarisation system. In Proceedings of the 2005 ACM Symposium on Applied Computing (SAC 2005), Santa Fe, New Mexico, USA. [paper]

Ben Hachey and Claire Grover (2005). Sentence extraction for legal text summarisation. In IJCAI-05 Technical Poster Session, Edinburgh, UK. [paper]

Ben Hachey, Gabriel Murray, and David Reitter (2005). The Embra System at DUC 2005: Query-oriented multi-document summarization with a very large latent semantic space. In Proceedings of the 2005 Document Understanding Conference, Vancouver, British Columbia, Canada. [paper]

Sebastian Riedel and Ewan Klein (2005). Genic interaction extraction with semantic and syntactic chains. In Proceedings of the Learning Language in Logic Workshop, ICML 2005, Bonn, Germany. [paper]

2004

Kisuh Ahn, Beatrice Alex, Johan Bos, Tiphaine Dalmas, Jochen L. Leidner, and Matthew B. Smillie (2004). Cross-lingual question answering with QED. In Workshop of the Cross-Language Evaluation Forum (CLEF-2004) held at the European Conference for Digital Libraries (ECDL-2004), Bath, UK. [paper]

Beatrice Alex and Claire Grover (2004). An XML-based tool for tracking English inclusions in German text. In PAPILLON 2004 – Workshop on Multilingual Lexical Databases, Grenoble, France. [paper]

Shipra Dingare, Jenny Finkel, Malvina Nissim, Chris Manning, and Beatrice Alex (2004). Exploring the boundaries: Gene and protein identification in biomedical text. In Proceedings of the BioCreative (Critical Assessment of Information Extraction Systems in Biology) Workshop 2004, Granada, Spain. [paper]

Shipra Dingare, Jenny Finkel, Malvina Nissim, Christopher Manning, and Claire Grover (2004). A system for identifying named entities in biomedical text: How results from two evaluations reflect on both the system and the evaluations. In The 2004 BioLink meeting: Linking Literature, Information and Knowledge for Biology at ISMB 2004, Glasgow, UK. [paper]

Jenny Finkel, Shipra Dingare, Huy Nguyen, Malvina Nissim, Chris Manning, and Gail Sinclair (2004). Exploiting context for biomedical entity recognition: From syntax to the web. In Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications, COLING 2004, Geneva, Switzerland. [paper]

Claire Grover, Ben Hachey, and Ian Hughson (2004). The HOLJ corpus: Supporting summarisation of legal texts. In Proceedings of the 5th International Workshop on Linguistically Interpreted Corpora (LINC-04), Geneva, Switzerland. [paper]

Claire Grover, Harry Halpin, Ewan Klein, Jochen L. Leidner, Stephen Potter, Sebastian Riedel, Sally Scrutchin, and Richard Tobin (2004). A framework for text mining services. In Proceedings of the UK e-Science Programme All Hands Meeting 2004 (AHM 2004), Nottingham, UK. [paper]

Ben Hachey and Claire Grover (2004). Sentence classification experiments for legal text summarisation. In Proceedings of the 17th Annual Conference on Legal Knowledge and Information Systems (Jurix 2004), Berlin, Germany. [paper]

Ben Hachey and Claire Grover (2004). A rhetorical status classifier for legal text summarisation. In Proceedings of the ACL-2004 Text Summarization Branches Out Workshop, Barcelona, Spain. [paper]

Ben Hachey, Huy Nguyen, Malvina Nissim, Beatrice Alex, and Claire Grover (2004). Grounding gene mentions with respect to gene database identifiers. In Proceedings of the BioCreAtIvE (Critical Assessment of Information Extraction Systems in Biology) Workshop 2004, Granada, Spain. [paper]

Vangelis Karkaletsis, Constantine D. Spyropoulos, Claire Grover, Maria-Teresa Pazienza, Jose Coch, and Dimitris Souflis (2004). A platform for cross-lingual, domain and user adaptive web information extraction. ECAI, 16:725-729. [paper]

Ewan Klein and Stephen Potter (2004). An ontology for NLP services. In Thierry Declerck, editor, Proceedings of Workshop on a Registry of Linguistic Data Categories within an Integrated Language Resource Repository Area, LREC 2004. [paper]

Yuval Krymolowski, Beatrice Alex, and Jochen L. Leidner (2004). Biocreative task 2.1. The Edinburgh-Stanford system. In Proceedings of the BioCreAtIvE (Critical Assessment of Information Extraction Systems in Biology) Workshop 2004, Granada, Spain. [paper]

Malvina Nissim, Colin Matheson, and James Reid (2004). Recognising geographical entities in Scottish historical documents. In Proceedings of the Workshop on Geographic Information Retrieval, SIGIR 2004, Sheffield, UK. [paper]

Georgios Petasis, Vangelis Karkaletsis, Claire Grover, Benjamin Hachey, Maria Teresa Pazienza, Michele Vindigni, and Jose Coch (2004). Adaptive, multilingual named entity recognition in web pages. In Proceedings of the 16th European Conference on Artificial Intelligence (ECAI 2004), Valencia, Spain. [paper]

2003

Johan Bos, Ewan Klein, Oliver Lemon, and Tetsushi Oka (2003). DIPPER: Description and formalisation of an information-state update dialogue system architecture. In 4th SIGdial Workshop on Discourse and Dialogue, ACL, Sapporo. [paper]

Johan Bos, Ewan Klein, and Tetsushi Oka (2003). Meaningful conversation with a mobile robot. In Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics (EACL10), pages 71-74, Budapest. [paper]

Kate Byrne and Ewan Klein (2003). Image retrieval using natural language and content-based techniques. In Arjen P. de Vries, editor, Proceedings of the 4th Dutch-Belgian Information Retrieval Workshop (DIR 2003), pages 57-62. Institute for Logic, Language and Computation. [paper]

Nicola Cathcart, Jean Carletta, and Ewan Klein (2003). A shallow model of backchannel continuers in spoken dialogue. In Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics (EACL10), pages 51-58, Budapest. [paper]

Claire Grover, Ben Hachey, Ian Hughson, and Chris Korycinski (2003). Automatic summarisation of legal documents. In Proceedings of the 9th International Conference on Artificial Intelligence and Law (ICAIL 2003), Edinburgh, UK. [paper]

Claire Grover, Ben Hachey, and Chris Korycinski (2003). Summarising legal texts: Sentential tense and argumentative roles. In Proceedings of HLT-NAACL 2003 Workshop on Text Summarization, Edmonton, Alberta, Canada. [paper]

Ben Hachey, Claire Grover, Vangelis Karkaletsis, Alexandros Valarakos, Maria Teresa Pazienza, Michele Vindigni, Emmanuel Cartier, and Jose Coch (2003). Use of ontologies for cross-lingual information management in the web. In Proceedings of the International Workshop on Ontologies and Information Extraction, Bucharest, Romania. [paper]

Vangelis Karkaletsis, Constantine D. Spyropoulos, Dimitris Souflis, Claire Grover, Ben Hachey, Maria Teresa Pazienza, Michele Vindigni, Emmanuel Cartier, and Jose Coch (2003). Demonstration of the CROSSMARC system. In HLT/NAACL-2003 Demonstration Session, Edmonton, Canada. [paper]

Konstantinos Stamatakis, Vangelis Karkaletsis, Georgios Paliouras, James Horlock, Claire Grover, James R. Curran, and Shipra Dingare (2003). Domain-specific web site identification: The CROSSMARC focused web crawler. In Proceedings of the Second International Workshop on Web Document Analysis (WDA 2003), Edinburgh, UK. [paper]

2002

Claire Grover, Ewan Klein, Maria Lapata, and Alex Lascarides (2002). XML-based NLP tools for analysing and annotating medical language. In Proceedings of the Second International Workshop on NLP and XML (NLPXML-2002), Taipei, Taiwan. [paper]

Claire Grover, Scott McDonald, Donnla Nic Gearailt, Vangelis Karkaletsis, Dimitra Farmakiotou, Georgios Samaritakis, Georgios Petasis, Maria Teresa Pazienza, Michele Vindigni, Frantz Vichot, and Francis Wolinski (2002). Multilingual XML-based named entity recognition for E-retail domains. In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002), Las Palmas, Spain. [paper]

2001

Claire Grover and Alex Lascarides (2001). XML-based data preparation for robust deep parsing. In Proceedings of the Joint EACL-ACL Meeting (ACL-EACL 2001), Toulouse, France. [paper]

2000

Claire Grover, Colin Matheson, Andrei Mikheev, and Marc Moens (2000). LT TTT – A flexible tokenisation tool. In Proceedings of the 2nd International Conference on Language Resources and Evaluation (LREC 2000), Athens, Greece. [paper]

Alexander Holt, Ewan Klein, and Claire Grover (2000). Natural language specifications for hardware verification. Journal of Language and Computation, 1(2):275-282.

1999

Alexander Holt and Ewan Klein (1999). A semantically-derived subset of English for hardware verification. In 37th Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference: 20-26 June 1999, University of Maryland, College Park, Maryland, USA, pages 451-456. Association for Computational Linguistics. [paper]

Alexander Holt, Ewan Klein, and Claire Grover (1999). Natural language for hardware verification: Semantic interpretation and model checking. In Proceedings of ICoS-1: Inference in Computational Semantics, August 15, 1999, pages 133-137, Amsterdam, Netherlands. Institute for Logic, Language and Computation, University of Amsterdam. [paper]

Andrei Mikheev (1999). A knowledge-free method for capitalized word disambiguation. In Proceedings of the 37th Annual Meeting of the ACL, pages 159-166. [paper]

Andrei Mikheev, Claire Grover, and Marc Moens (1999). XML tools and architecture for named entity recognition. Journal of Markup Languages: Theory and Practice, 1(3):89-113. [paper]

Andrei Mikheev, Marc Moens, and Claire Grover (1999). Named entity recognition without gazetteers. In Proceedings of the Ninth Conference of the European Chapter of the Association for Computational Linguistics (EACL’99), pages 1-8, Bergen, Norway. [paper]

1998

Andrei Mikheev, Claire Grover, and Marc Moens (1998). Description of the LTG system used for MUC-7. In Nancy A. Chinchor, editor, Proceedings of Seventh Message Understanding Conference (MUC-7), Fairfax, Virginia. [paper]