A place name in a text may have more than one potential referent (e.g. Peru the country vs. Peru the city in Indiana). The Edinburgh Geoparser is a system that automatically recognises place names in text and disambiguates them with respect to a gazetteer. The geoparser can be used with several gazetteers, including Unlock and GeoNames, and it can process a variety of input texts. The demonstration version displays its output with a Google Maps visualisation. Edina’s Unlock text/places service provides another demonstration version. An open source release is available for download below.
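To make the ambiguity concrete, the following is a minimal Python sketch of gazetteer-based place name resolution. It is an illustration only and does not reflect how the Edinburgh Geoparser itself is implemented: the `TOY_GAZETTEER` data, the `Candidate` class, the `recognise` and `disambiguate` functions, and the "prefer the most populous candidate" heuristic are all assumptions introduced here. The coordinates and population figures are rough approximations, not values taken from Unlock or GeoNames.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One possible referent for a place name (hypothetical structure)."""
    name: str
    feature: str
    lat: float
    lon: float
    population: int


# Toy gazetteer with approximate, illustrative values only.
TOY_GAZETTEER = {
    "Peru": [
        Candidate("Peru", "country", -9.19, -75.02, 33_000_000),
        Candidate("Peru", "city in Indiana, USA", 40.75, -86.07, 11_000),
    ],
}


def recognise(text, gazetteer):
    """Return the gazetteer names that occur as whole words in the text."""
    return [
        name for name in gazetteer
        if re.search(r"\b" + re.escape(name) + r"\b", text)
    ]


def disambiguate(name, gazetteer):
    """Pick the candidate with the largest population (a simple heuristic)."""
    return max(gazetteer[name], key=lambda c: c.population)


text = "The expedition left Lima and travelled across Peru."
resolved = {
    name: disambiguate(name, TOY_GAZETTEER)
    for name in recognise(text, TOY_GAZETTEER)
}
for name, place in resolved.items():
    print(name, "->", place.feature, (place.lat, place.lon))
```

A population-only heuristic ignores context, so it would resolve "Peru" to the country even in a text about Indiana. It is shown here only to illustrate why disambiguation against a gazetteer is needed.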
The Edinburgh Geoparser has been developed as part of a number of projects and applied to a range of data sets.
- html – multiple pages, including links to data examples
- pdf – single document for downloading
- epub – for e-readers
- Beatrice Alex (2017). Geoparsing English Text with the Edinburgh Geoparser, The Programming Historian lesson, October 2017. [html]
- Beatrice Alex (2016). The Edinburgh Geoparser: a hands-on workshop, taught at the Digital Day of Ideas 2016, Edinburgh, May 2016. [pdf]
Available from the School of Informatics Software Download Database under the University of Edinburgh GPL license.
| Filename | Package Size (compressed) | Version | Date | Notes | Checksum |
|---|---|---|---|---|---|
| The Edinburgh Geoparser | 153 MB | 1.1 | 16/03/2016 | – | – |
- Beatrice Alex, Clare Llewellyn, Claire Grover, Jon Oberlander and Richard Tobin. 2016. Homing in on Twitter users: Evaluating an Enhanced Geoparser for User Profile Locations. In Proceedings of the 10th Language Resources and Evaluation Conference (LREC), 23-28 May 2016, Portorož, Slovenia. [pdf]
- Beatrice Alex, Kate Byrne, Claire Grover and Richard Tobin. 2015. Adapting the Edinburgh Geoparser for Historical Georeferencing. International Journal for Humanities and Arts Computing, 9(1), pp. 15-35, March 2015. [pdf]
- Beatrice Alex, Kate Byrne, Claire Grover and Richard Tobin. 2014. A Web-based Geo-resolution Annotation and Evaluation Tool. In Proceedings of the 8th Linguistic Annotation Workshop (LAW VIII), COLING 2014, Dublin, Ireland. [pdf]
- Presentation at Pelagios workshop, March 2011.
- Bea Alex and Claire Grover. 2010. Labelling and spatio-temporal grounding of news events. In Proceedings of the workshop on Computational Linguistics in a World of Social Media at NAACL 2010, Los Angeles, USA. [paper]
- Claire Grover, Richard Tobin, Kate Byrne, Matthew Woollard, James Reid, Stuart Dunn, and Julian Ball. 2010. Use of the Edinburgh Geoparser for georeferencing digitised historical collections. Philosophical Transactions of the Royal Society A, 368(1925):3875-3889. [paper]