A data hub (data management system, or DMS) is software for collaborating on gathering, sharing and using data.
The term usually refers to the newer, web-based generation of such products. A data hub can be either a general platform that handles many different kinds of data, or a vertical product that specialises in one particular field.
At its core, a DMS is a list of datasets with diverse schemas.
Beyond that core, users expect the following features, or tight integration with tools that provide them:
- Load and update data from any source (ETL)
- Store datasets and index them for querying
- View, analyze and update data in a tabular interface (spreadsheet)
- Visualise data, for example with charts or maps
- Analyze data, for example with statistics and machine learning
- Organize many people to enter or correct data (crowd-sourcing)
- Measure and ensure the quality of data, and its provenance
- Set permissions, so that data can be open, private or shared
- Find datasets, and organize them to help others find them
- Sell data, sharing processing costs between users
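As an illustration only, the core idea (a list of datasets with differing schemas) and three of the features above (loading, finding and querying data under permissions) can be sketched in a few lines of Python. This is a minimal in-memory sketch, not the design of any product listed in this article; the names `Dataset`, `DataHub`, `load`, `find` and `query`, and the sample records, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """One entry in the hub; each dataset carries its own schema."""
    name: str
    rows: list                        # records as plain dicts, no fixed schema
    visibility: str = "private"       # "open", "private" or "shared"
    owner: str = ""
    shared_with: set = field(default_factory=set)
    tags: set = field(default_factory=set)

    def schema(self) -> set:
        """Column names observed across all rows."""
        return {key for row in self.rows for key in row}

    def readable_by(self, user: str) -> bool:
        """Open data is readable by anyone; otherwise owner or sharee only."""
        if self.visibility == "open" or user == self.owner:
            return True
        return self.visibility == "shared" and user in self.shared_with


class DataHub:
    """Hypothetical hub: a list of datasets plus lookup helpers."""

    def __init__(self) -> None:
        self.datasets = []

    def load(self, dataset: Dataset) -> None:
        """Add a dataset, replacing any existing one with the same name."""
        self.datasets = [d for d in self.datasets if d.name != dataset.name]
        self.datasets.append(dataset)

    def find(self, user: str, tag: str) -> list:
        """Names of datasets with the given tag that the user may read."""
        return [d.name for d in self.datasets
                if tag in d.tags and d.readable_by(user)]

    def query(self, user: str, name: str, where: dict) -> list:
        """Rows of one readable dataset whose fields equal the given values."""
        for d in self.datasets:
            if d.name == name and d.readable_by(user):
                return [row for row in d.rows
                        if all(row.get(k) == v for k, v in where.items())]
        raise KeyError(name)


# Two datasets with unrelated schemas living side by side (made-up records).
hub = DataHub()
hub.load(Dataset("cities",
                 [{"city": "A", "population": 100},
                  {"city": "B", "population": 250}],
                 visibility="open", tags={"geography"}))
hub.load(Dataset("budget", [{"year": 2011, "spend": 12.5}],
                 owner="alice", tags={"finance"}))

print(hub.find("bob", "geography"))              # ['cities']
print(hub.find("bob", "finance"))                # []
print(hub.query("bob", "cities", {"city": "B"}))
```

A real data hub would add persistent storage, indexing, provenance tracking and the other features listed above; the sketch only shows how datasets of different shape can sit in one list behind a common permission check.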
List of data hubs
- Windows Azure MarketPlace
- Avoiding Mass Extinctions Engine
- PANDA project
A desktop operating system (e.g. Unix, OS X, Windows) can be regarded as the legacy DMS: it is what people currently use for tasks that a good DMS would handle better.