Database and web services
The GeoNames database contains over 25,000,000 geographical names corresponding to over 11,800,000 unique features. All features are categorized into one of nine feature classes and further subcategorized into one of 645 feature codes. Beyond names of places in various languages, data stored include latitude, longitude, elevation, population, administrative subdivision and postal codes. All coordinates use the World Geodetic System 1984 (WGS84).
The data are available free of charge through several web services and a daily database export.
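As an illustration of working with the database export, the following minimal sketch parses one tab-separated line into a record. The 19-column order is an assumption taken from the export's readme and should be checked against the current file; the `Feature` class and `parse_feature` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed column order of the main export table (19 tab-separated fields).
COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude",
    "longitude", "feature_class", "feature_code", "country_code", "cc2",
    "admin1_code", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date",
]


@dataclass
class Feature:
    """Hypothetical record holding a subset of the export columns."""
    geonameid: int
    name: str
    latitude: float   # WGS84 decimal degrees
    longitude: float  # WGS84 decimal degrees
    feature_class: str
    feature_code: str
    country_code: str
    population: int
    elevation: Optional[int]  # metres; None when the field is empty


def parse_feature(line: str) -> Feature:
    """Parse one tab-separated export line into a Feature."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    return Feature(
        geonameid=int(row["geonameid"]),
        name=row["name"],
        latitude=float(row["latitude"]),
        longitude=float(row["longitude"]),
        feature_class=row["feature_class"],
        feature_code=row["feature_code"],
        country_code=row["country_code"],
        population=int(row["population"] or 0),
        elevation=int(row["elevation"]) if row["elevation"] else None,
    )
```

Empty numeric fields are common in the export, so the sketch maps an empty elevation to `None` rather than zero, which would otherwise be indistinguishable from sea level.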
The core of the GeoNames database is provided by official public sources, the quality of which may vary. Through a wiki interface, users are invited to manually edit and improve the database by adding or correcting names, moving existing features, adding new features, and so on.
Semantic Web integration
Each GeoNames feature is represented as a web resource identified by a stable URI. This URI provides access, through content negotiation, either to the HTML wiki page or to an RDF description of the feature, using elements of the GeoNames ontology. This ontology describes the properties of GeoNames features using the Web Ontology Language, with the feature classes and codes described in the SKOS language. Through Wikipedia article URLs linked in the RDF descriptions, GeoNames data are linked to DBpedia data and other RDF Linked Data.
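Content negotiation of this kind is driven by the HTTP `Accept` header. The following sketch builds, without sending, two requests for the same feature URI that differ only in the representation they ask for. The host `sws.geonames.org`, the path layout, and the function names are assumptions for illustration, and any numeric ID passed in is a placeholder.

```python
from urllib.request import Request

# Assumed base of the feature URIs; not stated in the text above.
BASE = "https://sws.geonames.org"


def feature_uri(geoname_id: int) -> str:
    """Return the (assumed) stable URI for a feature ID."""
    return f"{BASE}/{geoname_id}/"


def rdf_request(geoname_id: int) -> Request:
    """Build a request asking for the RDF/XML description of a feature."""
    return Request(feature_uri(geoname_id),
                   headers={"Accept": "application/rdf+xml"})


def html_request(geoname_id: int) -> Request:
    """Build a request asking for the HTML page of the same feature."""
    return Request(feature_uri(geoname_id),
                   headers={"Accept": "text/html"})
```

Passing either object to `urllib.request.urlopen` would perform the actual request; which representation comes back depends on how the server honours the header.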
Accuracy and improvements
As in other crowdsourcing schemes, the GeoNames edit interface allows anyone to sign in and edit the database, so false information can be entered and can remain undetected, especially for places that are rarely accessed. Ahlers (2013) studies these inaccuracies and classifies them as loss of coordinate granularity (e.g., due to truncation and low-resolution geocoding in some cases), wrong feature codes, near-identical places, and the placement of places outside their designated countries. Manually correcting these inaccuracies is both tedious and error-prone (due to the database size) and may require experts.
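Two of these error classes lend themselves to simple heuristic checks. The sketch below is not Ahlers' procedure; it is a rough illustration that counts the significant decimal places of a coordinate string (a proxy for lost granularity) and pairs up same-named places lying within a small great-circle distance (a proxy for near-identical entries). The function names and the 0.5 km threshold are assumptions.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def decimal_places(value: str) -> int:
    """Count digits after the decimal point, ignoring trailing zeros."""
    _, _, frac = value.partition(".")
    return len(frac.rstrip("0"))


def haversine_km(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in kilometres between two WGS84 points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def near_identical(places, max_km=0.5):
    """Yield pairs of (name, lat, lon) tuples that share a case-folded
    name and lie within max_km of each other.

    Compares every pair, so it is only suitable for small samples.
    """
    for a, b in combinations(places, 2):
        same_name = a[0].casefold() == b[0].casefold()
        if same_name and haversine_km(a[1], a[2], b[1], b[2]) <= max_km:
            yield a, b
```

A coordinate with only one or two significant decimal places is precise to roughly a kilometre or worse, so a low count can flag entries worth reviewing; it does not prove that an entry is wrong.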
Few works in the literature address resolving these inaccuracies automatically. Singh & Rafiei (2018) study the problem of automatically detecting the scope of locations in a geographical database and its applications in identifying inconsistencies and improving the quality of the database. Computing the boundary information can help detect inconsistencies such as near-identical places and the placement of locations such as cities under wrong parents such as provinces or countries. Singh & Rafiei show that the boundary information derived in their work can move more than 20% of locations in GeoNames to better positions in the spatial hierarchy, and that the accuracy of those moves is over 90%.
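To illustrate how boundary information can expose a misplaced location, the following sketch uses a plain bounding box as a crude stand-in for a boundary. This is not the method of Singh & Rafiei; it is a much simpler technique, it ignores regions that cross the antimeridian, and all names in it are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in WGS84 degrees


class BBox(NamedTuple):
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        """True if the point lies inside or on the edge of the box."""
        return (self.south <= lat <= self.north
                and self.west <= lon <= self.east)


def bbox_of(points: Iterable[Point]) -> BBox:
    """Smallest axis-aligned box enclosing the given points."""
    pts = list(points)
    if not pts:
        raise ValueError("at least one point is required")
    lats = [p[0] for p in pts]
    lons = [p[1] for p in pts]
    return BBox(min(lats), min(lons), max(lats), max(lons))


def misplaced(children: Dict[str, Point], box: BBox) -> List[str]:
    """Names of child places whose coordinates fall outside the box."""
    return sorted(name for name, (lat, lon) in children.items()
                  if not box.contains(lat, lon))
```

A box built from trusted reference points for a province or country can then be used to list the children recorded under that parent that lie outside it; a real boundary polygon would give fewer false negatives near irregular borders.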
- Ahlers, Dirk (2013), "Assessment of the accuracy of GeoNames gazetteer data", Proceedings of the GIR Workshop, pp. 74–81
- Singh, Sanket Kumar; Rafiei, Davood (2018), "Strategies for Geographical Scoping and Improving a Gazetteer", Proceedings of the Web Conference, pp. 1663–1672