Geoparsing is the process of converting free-text descriptions of places (such as "twenty miles northeast of Jalalabad") into unambiguous geographic identifiers, such as latitude and longitude coordinates. Location references can also be geoparsed from other forms of media, for example audio content in which a speaker mentions a place. Once places have geographic coordinates, they can be mapped and entered into Geographic Information Systems. The two primary uses of coordinates derived from unstructured content are plotting portions of the content on maps and searching the content using a map as a filter.
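As an illustration of how a relative description such as "twenty miles northeast of Jalalabad" might be turned into coordinates once the anchor place is resolved, the following is a minimal sketch using the standard great-circle destination-point formula on a spherical Earth. The function name `destination_point`, the Jalalabad coordinates (approximate), and the reading of "northeast" as a bearing of exactly 45 degrees are assumptions made for this example, not part of any particular geoparser.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius; a spherical approximation
KM_PER_MILE = 1.609344


def destination_point(lat_deg, lon_deg, bearing_deg, distance_km):
    """Return (lat, lon) in degrees reached by travelling distance_km
    from the start point along the initial bearing (clockwise from north)."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians

    lat2 = math.asin(
        math.sin(lat1) * math.cos(delta)
        + math.cos(lat1) * math.sin(delta) * math.cos(theta)
    )
    lon2 = lon1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * math.sin(lat2),
    )
    # Normalize longitude to the range [-180, 180).
    lon2_deg = (math.degrees(lon2) + 540.0) % 360.0 - 180.0
    return math.degrees(lat2), lon2_deg


# Approximate coordinates for Jalalabad, Afghanistan (illustrative values).
JALALABAD = (34.43, 70.45)

# "twenty miles northeast" read as 20 miles at a 45-degree bearing.
lat, lon = destination_point(*JALALABAD, bearing_deg=45.0,
                             distance_km=20 * KM_PER_MILE)
print(f"{lat:.3f}, {lon:.3f}")
```

In practice such a phrase describes a rough area, not a point, so a real system would likely attach an uncertainty radius to the result.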
Geoparsing goes beyond geocoding. Geocoding analyzes unambiguous structured location references, such as postal addresses and rigorously formatted numerical coordinates. Geoparsing handles ambiguous references in unstructured discourse, such as "Al Hamra," which is the name of several places, including towns in both Syria and Yemen.
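The disambiguation step can be sketched with a toy gazetteer. The example below is a simplified illustration, not the method of any specific geoparser: the `GAZETTEER` contents, the `Candidate` type, and the `resolve` function are hypothetical, and the scoring (counting context words that hint at a country) is a deliberately naive stand-in for the context-based resolution that real systems perform.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    country: str
    context_terms: frozenset  # words that hint at this candidate


# Hypothetical toy gazetteer: one ambiguous name, two candidate places.
GAZETTEER = {
    "al hamra": [
        Candidate("Al Hamra", "Syria", frozenset({"syria", "syrian", "damascus"})),
        Candidate("Al Hamra", "Yemen", frozenset({"yemen", "yemeni", "sanaa"})),
    ],
}


def resolve(toponym: str, text: str) -> Optional[Candidate]:
    """Pick the candidate whose context terms overlap most with the text.

    Returns None when the name is unknown, when no context term matches,
    or when the top two candidates tie (the reference stays ambiguous).
    """
    candidates = GAZETTEER.get(toponym.lower(), [])
    words = set(re.findall(r"[a-z]+", text.lower()))
    scored = sorted(
        ((len(c.context_terms & words), c) for c in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    if not scored or scored[0][0] == 0:
        return None
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None
    return scored[0][1]


match = resolve("Al Hamra", "Fighting was reported near Al Hamra, north of Damascus.")
print(match.country if match else "ambiguous")
```

Returning `None` on a tie keeps the sketch honest about ambiguity; a production system would more likely return ranked candidates with confidence scores.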
A geoparser is software or a web service that performs this process. The following geoparsers, georeferencing tools, and gazetteers are relevant:
- GEOLocate - Automated georeferencing
- BioGeomancer - Semi-automatic georeferencing
- GEOnet Names Server - Freely available GIS information for areas outside of the U.S.A. and Antarctica, updated monthly by the National Geospatial-Intelligence Agency (NGA) and the U.S. Board on Geographic Names (US BGN)
- Geographic Names Information System (GNIS) - Freely available database containing information on almost 2 million physical features, places, and landmarks in the U.S.A.
- CLAVIN (Cartographic Location And Vicinity INdexer) - Open source software package for document geotagging and geoparsing that employs context-based geographic entity resolution.
- Geoparser.io - Web service that identifies places mentioned in text, disambiguates those places, and returns GeoJSON with detailed metadata about the places found in the text.