Implicit data structure

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In computer science, an implicit data structure or space-efficient data structure is a data structure that stores very little information other than the main or required data: a data structure that requires low overhead. They are called "implicit" because the position of the elements carries meaning and relationship between elements; this is contrasted with the use of pointers to give an explicit relationship between elements. Definitions of "low overhead" vary, but generally means constant overhead; in big O notation, O(1) overhead. A less restrictive definition is a succinct data structure, which allows greater overhead.

Definition[edit]

Formally, an implicit data structure is one with constant (O(1)) space overhead (above the information-theoretic lower bound).

Historically, Munro, & Suwanda (1980) defined an implicit data structure (and algorithms acting on one) as one "in which structural information is implicit in the way data are stored, rather than explicit in pointers." They are somewhat vague in the definition, defining it most strictly as a single array, with only the size retained (a single number of overhead),[1] or more loosely as a data structure with constant overhead (O(1)).[2] This latter definition is today more standard, and the still-looser notion of a data structure with non-constant but small (o(N)) overhead is today known as a succinct data structure, as defined by Jacobson (1988); it was referred to as semi-implicit by Munro, & Suwanda (1980).[3]

A fundamental distinction is between static data structures (read-only) and dynamic data structures (which can be modified). Simple implicit data structures, such as representing a sorted list as an array, may be very efficient as a static data structure, but inefficient as a dynamic data structure, due to modification operations (such as insertion in the case of a sorted list) being inefficient.

Examples[edit]

A trivial example of an implicit data structure is an array data structure, which is an implicit data structure for a list, and requires only the constant overhead of the length; unlike a linked list, which has a pointer associated with each data element, which explicitly gives the relationship from one element to the next. Similarly, a null-terminated string is an implicit data structure for a string (list of characters). These are considered very simple because they are static data structures (read-only), and only admit the simple operation of iteration over the elements.

Similarly simple is representing a multi-dimensional array as a single 1-dimensional array, together with its dimensions. For example, representing an m × n array as a single list of length m·n, together with the numbers m and n (instead of as a 1-dimensional array of pointers to each 1-dimensional subarray). The elements need not be of the same type, and a table of data (a list of records) may similarly be represented implicitly as a flat (1-dimensional) list, together with the length of each field, so long as each field has uniform size (so a single size can be used per field, not per record).

A less trivial example is a representing a sorted list by a sorted array, which allows search in logarithmic time by binary search. Contrast with a search tree, specifically a binary search tree, which also allows logarithmic-time search, but requires pointers. A sorted array is only efficient as a static data structure, as modifying the list is slow – unlike a binary search tree – but does not require the space overhead of a tree.

An important example of an implicit data structure is representing a perfect binary tree as a list, in increasing order of depth, so root, first left child, first right child, first left child of first left child, etc. Such a tree occurs notably for an ancestry chart to a give depth, and the implicit representation is known as an Ahnentafel (ancestor table).

This can be generalized to a complete binary tree (where the last level may be incomplete), which yields the best-known example of an implicit data structure, namely the binary heap, which is an implicit data structure for a priority queue. This is more sophisticated than earlier examples because it allows multiple operations, and is an efficient dynamic data structure (it allows efficient modification of the data): not only top, but also insert and pop.

More sophisticated implicit data structures include the beap (bi-parental heap).

History[edit]

The trivial examples of lists or tables of values date to prehistory, while historically non-trivial implicit data structures date at least to the Ahnentafel, which was introduced by Michaël Eytzinger in 1590 for use in genealogy. In formal computer science, the first implicit data structure is generally considered to be the sorted list, used for binary search, which was introduced by John Mauchly in 1946, in the Moore School Lectures, the first ever set of lectures regarding any computer-related topic.[4][5] The binary heap was introduced in Williams (1964) to implement the heapsort.[5] The notion of an implicit data structure was formalized in Munro & Suwanda (1980), as part of introducing and analyzing the beap.[5]

Implementation[edit]

Efficiency concerns[edit]

In an implicit data structure, everything is accessed in-place, by reading bits at various positions in the data. To achieve memory-optimal coding, appropriate data items use bits instead of bytes. Implicit data structures are designed to improve main memory utilization, concomitantly reducing access to slower storage. A greater fraction of data in an implicit data structure can fit in main memory, reducing administrative processing. Implicit data structures can improve time efficiency due to improving cache efficiency thanks to locality of reference, due to avoiding the indirection introduced by pointers.

Weighted element[edit]

For presentation of elements with different weights, several data structures are required. The structure[clarification needed] uses one more location besides those required for element values.[clarification needed] The first structure supports worst case search time in terms of rank of weight of elements with respect to set of weights.[clarification needed] If the elements are drawn from uniform distribution, then variation[clarification needed] of this structure will take average time.[clarification needed] The same result obtains[clarification needed] for the data structures in which the intervals between consecutive values[clarification needed] have access probabilities.[clarification needed]

See also[edit]

References[edit]

  1. ^ "Thus, only a simple array is needed for the data.", p. 236; "We will draw no formal distinction between a pointer and an integer (index) in the range . A data structure is then implicit, if the only such integer which need be retained is N itself.", p. 238
  2. ^ "... one might prefer to permit a constant number of pointers to be retained and still designate the structure as implicit.", p. 238
  3. ^ "We will also suggest two structures which might be described as “semi-implicit,” in that a variable, but o(N), number of pointers (indices) is kept.", p. 238
  4. ^ Knuth 1998, §6.2.1 ("Searching an ordered table"), subsection "History and bibliography".
  5. ^ a b c Franceschini, Gianni; Munro, J. Ian (2006). Implicit dictionaries with O(1) modifications per update and fast search. Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms. Miami, FL, United States. pp. 404–413. doi:10.1145/1109557.1109603. 

Further reading[edit]

See publications of Hervé Brönnimann, J. Ian Munro, and Greg Frederickson.