Metadata registry

From Wikipedia, the free encyclopedia

A metadata registry is a central location in an organization where metadata definitions are stored and maintained in a controlled manner.

A metadata repository is the database where metadata is stored. The registry also adds relationships with related metadata types. A metadata engine collects, stores and analyzes information about data and metadata (data about data) in use within a domain.[1]

Use of metadata registries

Metadata registries are used whenever data must be used consistently within an organization or group of organizations. Examples of these situations include:

  • Organizations that transmit data using structures such as XML, Web Services or EDI
  • Organizations that need consistent definitions of data across time, between databases, between organizations or between processes, for example when an organization builds a data warehouse
  • Organizations that are attempting to break down "silos" of information captured within applications or proprietary file formats

Central to the charter of any metadata management programme is the process of building trust with stakeholders and ensuring that definitions and structures have been reviewed and approved by the appropriate parties.

Common characteristics of a metadata registry

A metadata registry typically has the following characteristics:

  • Protected environment where only authorized individuals may make changes
  • Stores data elements that include both semantics and representations
  • Semantic areas of a metadata registry contain the meaning of a data element with precise definitions
  • Representational areas of a metadata registry define how the data is represented in a specific format, such as in a database or a structured file format (e.g., XML)

Clear separation of semantics and system-specific constraints

Because metadata registries store both semantics (the meaning of a data element) and system-specific constraints (for example, the maximum length of a string), it is important to identify which systems impose these constraints and to document them. For example, the maximum length of a string should not change the meaning of a data element.
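The separation described above can be sketched in Python. This is a minimal illustration only, not a model taken from ISO/IEC 11179: the class names `DataElement` and `Representation`, their fields, and the example systems `billing_db` and `partner_xml` are all assumptions made for this example. The definition is held once, while each constraint is recorded together with the system that imposes it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Representation:
    """How one system stores a data element (hypothetical structure)."""
    system: str       # the system that imposes the constraint
    datatype: str     # e.g. "VARCHAR" or "xs:string"
    max_length: int   # system-specific limit, not part of the meaning


@dataclass
class DataElement:
    """A data element whose meaning is kept apart from its representations."""
    name: str
    definition: str   # the semantics: what the element means
    representations: list[Representation] = field(default_factory=list)

    def constraint_sources(self) -> dict[str, int]:
        """Map each system to the maximum length it imposes."""
        return {r.system: r.max_length for r in self.representations}


# Example values are invented for illustration.
family_name = DataElement(
    name="PersonFamilyName",
    definition="The name a person shares with other members of their family.",
)
family_name.representations.append(Representation("billing_db", "VARCHAR", 40))
family_name.representations.append(Representation("partner_xml", "xs:string", 100))

print(family_name.constraint_sources())
```

In this sketch, adding or changing a representation never touches `definition`, which mirrors the point that a length limit should not alter the meaning of a data element.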

The International Organization for Standardization (ISO) has published standards for metadata registries: ISO/IEC 11179, and ISO 15000-3 and ISO 15000-4 for the ebXML registry and repository (RegRep).

International standards

Two international standards are commonly referred to as metadata registry standards: ISO/IEC 11179 and ISO 15000-3. Some believe that ISO/IEC 11179 and ISO 15000-3 are interchangeable, or at least similar in some way, for example:

"Of interest is that the ISO 11179 model was one of the inputs to the ebXML RIM (registry information model) and so has much functional equivalence to the "registry" region of the ISO 11179 conceptual model." [1]

This is, however, incorrect. Although the specification ebRIM v2.0 (5 December 2001) states at the beginning of its Design Objectives: "Leverage as much as possible the work done in the OASIS [OAS] and the ISO 11179 [ISO] Registry models" [2], by the time of ebRIM v3.0 (2 May 2005) all reference to ISO/IEC 11179 had been reduced to a mention under informative references on page 76 of 78. [3] Some team members recognised that the ebXML RIM data model had no place to store "fine grained artifacts" [4], i.e. the data elements which are at the heart of ISO/IEC 11179, but no explicit and definitive statement from the team can be found before 2009. [5]

ISO/IEC 11179

ISO/IEC 11179 says that it is concerned with "traditional" metadata: "We limit the scope of the term as it is used here in ISO/IEC 11179 to descriptions of data - the more traditional use of the term." Originally the standard named itself a "data element" registry. It describes data elements: "data elements are the fundamental units of data" and "data elements themselves contain various kinds of data that include characters, images, sound, etc." It also describes a registry with an analogy: "This is analogous to the registries maintained by governments to keep track of motor vehicles. A description of each motor vehicle is entered in the registry, but not the vehicle itself."

ebXML

The ebXML RIM says about its Repository and Registry that it is

  • "... capable of storing any type of electronic content such as XML documents, text documents, images, sound and video … RepositorytItems (sic) are stored in a content repository".

It also says that it is

  • "... capable of storing standardized metadata that MAY be used to further describe RepositoryItems" which metadata "… are stored in the registry".

It also describes itself with "...this familiar metaphor. An ebXML Registry is like your local library. The repository is like the bookshelves in the library. The repository items in the repository are like book (sic) on the bookshelves." It goes on to say "The registry is like the card catalog … A RegistryObject is like a card in the card catalog."

What should be immediately apparent is that something which holds catalogue cards is not "like" a catalogue, it IS a catalogue.

Unfortunately for a number of organisations that have implemented ebXML RIM to satisfy a requirement for an ISO/IEC 11179 registry, ebXML RIM is not a registry, nor does it store metadata.

It is

Metadata registry roles

A metadata registry is frequently set up and administered by an organization's data architect or data modeling team.

Data elements are frequently assigned to data stewards or data stewardship teams that are responsible for the maintenance of individual data elements through a secure system.

Metadata element workflow

Metadata registries frequently have a formal process for data element submission, approval and publishing. Each data element should be reviewed and accepted by a data stewardship team before it is published. After publication, change control processes should be used.
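The workflow above can be sketched as a small state machine. This is an illustrative sketch under stated assumptions, not the process of any particular registry product or standard: the status names, the `ElementRecord` class, the `ALLOWED` transition table and the steward name `alice` are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"
    APPROVED = "approved"
    PUBLISHED = "published"


# Allowed transitions (an assumed, simplified workflow).
ALLOWED = {
    Status.SUBMITTED: {Status.APPROVED},
    Status.APPROVED: {Status.PUBLISHED},
    Status.PUBLISHED: set(),  # later changes would go through change control
}


class ElementRecord:
    """Tracks the workflow status of one data element (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.status = Status.SUBMITTED
        self.history = [Status.SUBMITTED]

    def advance(self, new_status: Status, reviewer: str, stewards: set) -> None:
        """Move to new_status if the reviewer is a steward and the step is allowed."""
        if reviewer not in stewards:
            raise PermissionError(f"{reviewer} is not a data steward")
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {new_status.value}"
            )
        self.status = new_status
        self.history.append(new_status)


stewards = {"alice"}  # invented steward for illustration
record = ElementRecord("PersonFamilyName")
record.advance(Status.APPROVED, "alice", stewards)
record.advance(Status.PUBLISHED, "alice", stewards)
print([s.value for s in record.history])
```

Keeping the transitions in a single table makes it easy to see that a published element cannot be altered through the normal path; in this sketch any further change would need a separate change control mechanism, which is left out.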

Metadata navigation, search and publishing

Metadata registries are frequently large and complex structures that require navigation, visualization and searching tools. The use of hierarchical viewing tools is frequently an essential part of a metadata registry system. Metadata publishing consists of making data element definitions and structures available to both people and other systems.
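As a rough illustration of hierarchical viewing and searching, the sketch below builds a small tree of data elements, renders it as indented text and searches names and definitions. The `Node` class, the `render` and `search` functions and the example elements are assumptions made for this example and do not correspond to any specific registry tool.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in a hypothetical registry hierarchy."""
    name: str
    definition: str = ""
    children: list["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> list[str]:
    """Return the hierarchy as indented lines, one per node."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def search(node: Node, term: str) -> list[str]:
    """Return names of nodes whose name or definition contains term (case-insensitive)."""
    term = term.lower()
    hits = []
    if term in node.name.lower() or term in node.definition.lower():
        hits.append(node.name)
    for child in node.children:
        hits.extend(search(child, term))
    return hits


# Example elements are invented for illustration.
root = Node("Person", "A human being.", [
    Node("PersonName", "The name of a person.", [
        Node("PersonFamilyName", "The name shared with family members."),
        Node("PersonGivenName", "The name given at birth."),
    ]),
    Node("PersonBirthDate", "The date on which a person was born."),
])

print("\n".join(render(root)))
print(search(root, "birth"))
```

Searching definitions as well as names matters here: the term "birth" matches `PersonGivenName` only through its definition.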

Examples of public metadata registries

  • Agency for Healthcare Research and Quality - United States Health Information Knowledgebase (USHIK) [6]
  • Apelon Medical Registry [7]
  • Australian Institute of Health and Welfare [8]
  • Dublin Core Metadata Registry [9]
  • Knowledge Network for Biocomplexity [10]
  • Cancer Data Standards Repository [11]
  • Global Justice XML Data Model (GJXDM) [12]
  • Minnesota Department of Education Metadata Registry (K-12 Data) [13]
  • National Information Exchange Model [14]
  • NIST ebXML Registry for HL7 / HIMSS / IHE [15]
  • Open Metadata Registry (formerly the National Science Digital Library (NSDL) Metadata Registry) [16]
  • Portal of Medical Data Models
  • US Department of Defense Metadata Registry (requires sponsored registration) [17]
  • US Environmental Protection Agency - Environmental Data Registry [18]

References

  1. Kendall, Aaron. "Metadata-Driven Design: Designing a Flexible Engine for API Data Retrieval". InfoQ. Retrieved 25 April 2017.
