Geographic information retrieval
Geographic information retrieval (GIR) or geographical information retrieval is the augmentation of information retrieval with geographic metadata.
Information retrieval generally views a document as a collection, or "bag", of words. In contrast, geographic information retrieval requires a small amount of semantic data to be present, namely a location or geographic feature associated with each document. For this reason, GIR systems commonly separate text indexing and analysis from geographic indexing.
A GIR system can commonly be broken down into the following stages: geotagging, text and geographic indexing, data storage, geographic relevance ranking (with respect to a geographic query), and browsing of results (commonly through a map interface).
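The geographic relevance ranking stage can be illustrated with a minimal sketch. It assumes that each candidate document already carries a text relevance score between 0 and 1 and a single point location, and it combines that score with an exponentially decaying distance score. The function names, the linear weighting `alpha` and the decay scale `scale_km` are illustrative assumptions, not the method of any particular GIR system.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def rank(candidates, query_point, alpha=0.5, scale_km=100.0):
    """Order documents by a weighted mix of text score and geographic proximity.

    candidates: list of (doc_id, text_score, (lat, lon)) tuples.
    alpha and scale_km are assumed tuning parameters.
    """
    scored = []
    for doc_id, text_score, point in candidates:
        # Geographic score is 1.0 at the query point and decays with distance.
        geo_score = math.exp(-haversine_km(query_point, point) / scale_km)
        scored.append((alpha * text_score + (1 - alpha) * geo_score, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]
```

With these assumptions, a document with a moderate text score located at the query point can outrank a textually stronger document several hundred kilometres away.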
GIR systems
GIR involves extracting location references from unstructured text and resolving their meaning, a process known as geoparsing. A few tools offer this capability, including GeoLocator and MetaCarta's GeoTagger.
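As a rough illustration of geoparsing, the sketch below looks up each word of a text in a small gazetteer and resolves ambiguous names by taking the first candidate. The gazetteer contents, the `geoparse` function and the "first candidate wins" rule are assumptions made for this example. They do not describe how GeoLocator or GeoTagger work, and practical systems use much larger gazetteers and context-based disambiguation.

```python
import re

# Hypothetical miniature gazetteer: lower-cased name -> candidate places,
# each given as (label, latitude, longitude).
GAZETTEER = {
    "paris": [("Paris, France", 48.8566, 2.3522),
              ("Paris, Texas, USA", 33.6609, -95.5555)],
    "london": [("London, UK", 51.5074, -0.1278)],
}

def geoparse(text, gazetteer=GAZETTEER):
    """Return (surface form, resolved label, lat, lon) for each known place name."""
    results = []
    for match in re.finditer(r"[A-Za-z]+", text):
        candidates = gazetteer.get(match.group().lower())
        if candidates:
            # Crude disambiguation: always pick the first listed candidate.
            label, lat, lon = candidates[0]
            results.append((match.group(), label, lat, lon))
    return results
```

Calling `geoparse("Flights from London to Paris")` would yield one entry for London and one for Paris, with "Paris" resolved to the French city only because that candidate is listed first.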
After identifying location references in text, a GIR system must index this information for search and retrieval. Only a few such systems exist: Google Maps, Tumba, MetaCarta's Geographic Text Search (GTS) system, and the EU-funded SPIRIT (Spatially Aware Information Retrieval on the Internet) project.
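The separation of text indexing from geographic indexing can be sketched as two independent structures that are intersected at query time. The toy corpus, the one-degree grid used as a spatial key and all function names below are assumptions for illustration. The systems listed above are not known to work this way, and practical implementations typically use dedicated spatial index structures.

```python
from collections import defaultdict

# Hypothetical toy corpus: doc id -> (text, (lat, lon) footprint).
DOCS = {
    1: ("flood warning issued for the river", (51.75, -1.26)),
    2: ("river cruise timetable", (48.86, 2.35)),
    3: ("museum opening hours", (51.75, -1.25)),
}

def build_text_index(docs):
    """Inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, (text, _) in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def cell_of(lat, lon, size=1.0):
    """Map a coordinate to a grid cell of `size` degrees (a crude spatial key)."""
    return (int(lat // size), int(lon // size))

def build_geo_index(docs, size=1.0):
    """Geographic index: grid cell -> set of document ids."""
    index = defaultdict(set)
    for doc_id, (_, (lat, lon)) in docs.items():
        index[cell_of(lat, lon, size)].add(doc_id)
    return index

def search(term, lat, lon, text_index, geo_index, size=1.0):
    """Documents that contain `term` and fall in the same grid cell as (lat, lon)."""
    by_text = text_index.get(term, set())
    by_place = geo_index.get(cell_of(lat, lon, size), set())
    return sorted(by_text & by_place)
```

Keeping the two indexes separate means either one can be rebuilt or replaced without touching the other, which is one practical reason for the separation.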
Evaluation
In 2005 the Cross-Language Evaluation Forum (CLEF) added a geographic track, GeoCLEF. GeoCLEF was the first TREC-style evaluation forum for GIR systems and gave participants a chance to compare their systems.