Stack (abstract data type)

From Wikipedia, the free encyclopedia

Revision as of 04:12, 7 January 2011

Simple representation of a stack

In computer science, a stack is a last in, first out (LIFO) abstract data type and data structure. A stack can have any abstract data type as an element, but is characterized by only two fundamental operations: push and pop. The push operation adds an item to the top of the stack, hiding any items already on the stack, or initializing the stack if it is empty. The pop operation removes an item from the top of the stack, and returns this value to the caller. A pop either reveals previously concealed items, or results in an empty stack.

A stack is a restricted data structure, because only a small number of operations are performed on it. The nature of the pop and push operations also means that stack elements have a natural order. Elements are removed from the stack in the reverse order to the order of their addition: therefore, the lower elements are those that have been on the stack the longest.[1]

History

The stack was first proposed in 1955, and then patented in 1957, by the German Friedrich L. Bauer.[2] The same concept was developed independently, at around the same time, by the Australian Charles Leonard Hamblin.[citation needed]

Abstract definition

A stack is a fundamental computer science data structure and can be defined in an abstract, implementation-free manner.

The following is a VDM (Vienna Development Method) description of a stack.

Function signatures:

  init: -> Stack 
  push: N x Stack -> Stack 
  top: Stack -> (N U ERROR) 
  remove: Stack -> Stack 
  isempty: Stack -> Boolean 

(where N indicates an element (natural numbers in this case), and U indicates set union)


Semantics

  top(init()) = ERROR 
  top(push(i,s)) = i 
  remove(init()) = init() 
  remove(push(i, s)) = s 
  isempty(init()) = true 
  isempty(push(i, s)) = false 

Reference [3]
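
The axioms above can be exercised directly in C. The following is a minimal sketch and not part of the VDM description: it assumes a fixed capacity (CAPACITY), passes stacks by value to mirror the functional signatures, and stands in for ERROR with a sentinel value (STACK_ERROR). The name remove_top is used instead of remove only to avoid a clash with the C library function remove(); equal() is a hypothetical helper for comparing two stacks.

```c
#include <stdbool.h>

#define CAPACITY 16        /* assumed fixed capacity, for illustration only */
#define STACK_ERROR (-1)   /* stands in for ERROR; elements are natural numbers */

typedef struct {
    int size;
    int items[CAPACITY];
} Stack;

/* init: -> Stack */
Stack init(void)
{
    Stack s = { 0, { 0 } };
    return s;
}

/* push: N x Stack -> Stack. The argument is copied, so the caller's stack
   is unchanged. A full stack is returned as is, because the abstract
   definition has no notion of capacity. */
Stack push(int i, Stack s)
{
    if (s.size < CAPACITY)
        s.items[s.size++] = i;
    return s;
}

/* top: Stack -> (N U ERROR) */
int top(Stack s)
{
    return s.size == 0 ? STACK_ERROR : s.items[s.size - 1];
}

/* remove: Stack -> Stack, with remove(init()) = init() */
Stack remove_top(Stack s)
{
    if (s.size > 0)
        s.size--;
    return s;
}

/* isempty: Stack -> Boolean */
bool isempty(Stack s)
{
    return s.size == 0;
}

/* Hypothetical helper: two stacks are equal if they hold the same elements. */
bool equal(Stack a, Stack b)
{
    int k;

    if (a.size != b.size)
        return false;
    for (k = 0; k < a.size; k++)
        if (a.items[k] != b.items[k])
            return false;
    return true;
}
```

With these definitions, each line of the semantics becomes a testable expression, for example equal(remove_top(push(i, s)), s) for any stack s that is not full.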

Inessential operations

In modern computer languages, the stack is usually implemented with more operations than just "push" and "pop". Some implementations have a function which returns the current number of items on the stack. Another typical helper operation, top[4] (also known as peek), returns the current top element of the stack without removing it.

Software stacks

Implementation

In most high-level languages, a stack can be easily implemented either through an array or a linked list. What identifies the data structure as a stack in either case is not the implementation but the interface: the user is only allowed to push items onto, or pop items from, the array or linked list, with few other helper operations. The following will demonstrate both implementations, using C.

Array

The array implementation aims to create an array where the first element (usually at the zero-offset) is the bottom. That is, array[0] is the first element pushed onto the stack and the last element popped off. The program must keep track of the size, or the length of the stack. The stack itself can therefore be effectively implemented as a two-element structure in C:

    typedef struct {
        int size;
        int items[STACKSIZE];
    } STACK;

The push() operation stores values on the stack. A stack whose size member is 0 is empty, so the only initialization required before the first push is setting ps->size to 0. push() is responsible for inserting (copying) the value into the ps->items[] array and for incrementing the element counter (ps->size). In a responsible C implementation, it is also necessary to check whether the array is already full to prevent an overrun.

    void push(STACK *ps, int x)
    {
        if (ps->size == STACKSIZE) {
            fputs("Error: stack overflow\n", stderr);
            abort();
        } else 
            ps->items[ps->size++] = x;
    }

The pop() operation is responsible for removing a value from the stack, and decrementing the value of ps->size. A responsible C implementation will also need to check that the array is not already empty.

    int pop(STACK *ps)
    {
        if (ps->size == 0){
            fputs("Error: stack underflow\n", stderr);
            abort();
        } else 
            return ps->items[--ps->size];
    }
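
As a usage sketch, the structure and the two operations can be combined into one self-contained unit. The block below repeats the definitions with an assumed STACKSIZE of 100, adds a hypothetical stack_init() helper (the text does not define one), and uses the stack to reverse an array, which makes the LIFO order visible.

```c
#include <stdio.h>
#include <stdlib.h>

#define STACKSIZE 100   /* assumed capacity; the text leaves it unspecified */

typedef struct {
    int size;
    int items[STACKSIZE];
} STACK;

/* Hypothetical helper: an empty stack is simply one whose size is 0. */
void stack_init(STACK *ps)
{
    ps->size = 0;
}

void push(STACK *ps, int x)
{
    if (ps->size == STACKSIZE) {
        fputs("Error: stack overflow\n", stderr);
        abort();
    }
    ps->items[ps->size++] = x;
}

int pop(STACK *ps)
{
    if (ps->size == 0) {
        fputs("Error: stack underflow\n", stderr);
        abort();
    }
    return ps->items[--ps->size];
}

/* Reverse a[0..n-1] by pushing every element and popping them back.
   The caller must ensure n <= STACKSIZE. */
void reverse(int *a, int n)
{
    STACK s;
    int i;

    stack_init(&s);
    for (i = 0; i < n; i++)
        push(&s, a[i]);
    for (i = 0; i < n; i++)
        a[i] = pop(&s);   /* the last element pushed comes out first */
}
```

Calling reverse() on an array holding 1, 2, 3, 4 leaves it holding 4, 3, 2, 1.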

Linked list

The linked-list implementation is equally simple and straightforward. In fact, a stack linked-list is much simpler than most linked-list implementations: it requires that we implement a linked-list where only the head node or element can be removed, or popped, and a node can only be inserted by becoming the new head node.

Unlike the array implementation, our structure typedef corresponds not to the entire stack structure, but to a single node:

    typedef struct stack {
        int data;
        struct stack *next;
    } STACK;

Such a node is identical to a typical linked-list node, at least to those that are implemented in C.

The push() operation both initializes an empty stack, and adds a new node to a non-empty one. It works by receiving a data value to push onto the stack, along with a target stack, creating a new node by allocating memory for it, and then inserting it into a linked list as the new head:

    void push(STACK **head, int value)
    {
        STACK *node = malloc(sizeof(STACK));  /* create a new node */

        if (node == NULL){
            fputs("Error: no space available for node\n", stderr);
            abort();
        } else {                                      /* initialize node */
            node->data = value;
            node->next = *head;                       /* old head, or NULL if the stack was empty */
            *head = node;
        }
    }

A pop() operation removes the head from the linked list, and makes the former second node the new head. It checks whether the list is empty before popping from it:

    int pop(STACK **head)
    {
        if (*head == NULL) {                         /* stack is empty */
           fputs("Error: stack underflow\n", stderr);
           abort();
        } else {                                     /* pop a node */
            STACK *top = *head;
            int value = top->data;
            *head = top->next;
            free(top);
            return value;
        }
    }
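
A self-contained sketch of the linked-list stack follows. It repeats the node type, represents the empty stack by a NULL head pointer (tested through a small empty() function), and adds a hypothetical clear() helper that frees any nodes still on the stack; clear() is an illustration and not part of the interface described above.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct stack {
    int data;
    struct stack *next;
} STACK;

/* An empty stack is represented by a NULL head pointer. */
int empty(const STACK *head)
{
    return head == NULL;
}

void push(STACK **head, int value)
{
    STACK *node = malloc(sizeof *node);

    if (node == NULL) {
        fputs("Error: no space available for node\n", stderr);
        abort();
    }
    node->data = value;
    node->next = *head;   /* NULL when the stack was empty */
    *head = node;         /* the new node becomes the head */
}

int pop(STACK **head)
{
    STACK *top;
    int value;

    if (empty(*head)) {
        fputs("Error: stack underflow\n", stderr);
        abort();
    }
    top = *head;
    value = top->data;
    *head = top->next;    /* the former second node becomes the head */
    free(top);
    return value;
}

/* Hypothetical helper: pop until empty so that no nodes are leaked. */
void clear(STACK **head)
{
    while (!empty(*head))
        (void)pop(head);
}
```

A caller starts with STACK *head = NULL; and passes &head to push(), pop() and clear().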

Stacks and programming languages

Some languages, like LISP and Python, do not call for stack implementations, since push and pop functions are available for any list (in Python, the list methods append() and pop()). All Forth-like languages (such as Adobe PostScript) are also designed around language-defined stacks that are directly visible to and manipulated by the programmer.

C++'s Standard Template Library provides a "stack" class template whose interface is restricted to stack operations (push and pop, together with helpers such as top, empty and size). Java's library contains a Stack class that is a specialization of Vector; this could be considered a design flaw, since the get() method inherited from Vector ignores the LIFO constraint of the Stack.

Related data structures

The functionality of a queue and a stack can be combined in a deque data structure. Briefly put, a queue is a First In First Out (FIFO) data structure.
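
To illustrate how a deque covers both behaviours, here is a minimal sketch of a fixed-capacity deque stored in a circular array. The type and function names (Deque, push_back, pop_back, and so on) and the capacity are assumptions made for this example. Using push_back with pop_back gives stack (LIFO) behaviour, while push_back with pop_front gives queue (FIFO) behaviour.

```c
#include <stdbool.h>

#define DEQUE_CAP 8   /* assumed capacity, for illustration only */

typedef struct {
    int items[DEQUE_CAP];
    int head;    /* index of the front element */
    int count;   /* number of stored elements */
} Deque;

void deque_init(Deque *d)
{
    d->head = 0;
    d->count = 0;
}

/* Each operation returns false, leaving the deque unchanged,
   when it cannot proceed (deque full or empty). */
bool push_back(Deque *d, int x)
{
    if (d->count == DEQUE_CAP)
        return false;
    d->items[(d->head + d->count) % DEQUE_CAP] = x;
    d->count++;
    return true;
}

bool push_front(Deque *d, int x)
{
    if (d->count == DEQUE_CAP)
        return false;
    d->head = (d->head + DEQUE_CAP - 1) % DEQUE_CAP;   /* step back, wrapping around */
    d->items[d->head] = x;
    d->count++;
    return true;
}

bool pop_back(Deque *d, int *out)
{
    if (d->count == 0)
        return false;
    d->count--;
    *out = d->items[(d->head + d->count) % DEQUE_CAP];
    return true;
}

bool pop_front(Deque *d, int *out)
{
    if (d->count == 0)
        return false;
    *out = d->items[d->head];
    d->head = (d->head + 1) % DEQUE_CAP;
    d->count--;
    return true;
}
```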

Hardware stacks

A common use of stacks at the architecture level is as a means of allocating and accessing memory.

Basic architecture of a stack

A typical stack, storing local data and call information for nested procedure calls (not necessarily nested procedures!). This stack grows downward from its origin. The stack pointer points to the current topmost datum on the stack. A push operation decrements the pointer and copies the data to the stack; a pop operation copies data from the stack and then increments the pointer. Each procedure called in the program stores procedure return information (in yellow) and local data (in other colors) by pushing them onto the stack. This type of stack implementation is extremely common, but it is vulnerable to buffer overflow attacks (see the text).

A typical stack is an area of computer memory with a fixed origin and a variable size. Initially the size of the stack is zero. A stack pointer, usually in the form of a hardware register, points to the most recently referenced location on the stack; when the stack has a size of zero, the stack pointer points to the origin of the stack.

The two operations applicable to all stacks are:

  • a push operation, in which a data item is placed at the location pointed to by the stack pointer, and the address in the stack pointer is adjusted by the size of the data item;
  • a pop or pull operation: a data item at the current location pointed to by the stack pointer is removed, and the stack pointer is adjusted by the size of the data item.

There are many variations on the basic principle of stack operations. Every stack has a fixed location in memory at which it begins. As data items are added to the stack, the stack pointer is displaced to indicate the current extent of the stack, which expands away from the origin.

Stack pointers may point to the origin of a stack or to a limited range of addresses either above or below the origin (depending on the direction in which the stack grows); however, the stack pointer cannot cross the origin of the stack. In other words, if the origin of the stack is at address 1000 and the stack grows downwards (towards addresses 999, 998, and so on), the stack pointer must never be incremented beyond 1000 (to 1001, 1002, etc.). If a pop operation on the stack causes the stack pointer to move past the origin of the stack, a stack underflow occurs. If a push operation causes the stack pointer to increment or decrement beyond the maximum extent of the stack, a stack overflow occurs.
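
The behaviour described above can be modelled in a few lines of C. This is only a simulation sketch: memory is an ordinary array of words, the origin is address 1000 as in the example, and STACK_LIMIT (the lowest usable address) as well as the names Machine, hw_push and hw_pop are assumptions made for illustration. The stack grows downward, and the stack pointer holds the address of the topmost item, or the origin when the stack is empty.

```c
#include <stdbool.h>
#include <stddef.h>

#define MEM_WORDS    1024
#define STACK_ORIGIN 1000   /* origin used in the example in the text */
#define STACK_LIMIT   990   /* assumed maximum extent of the stack */

typedef struct {
    int memory[MEM_WORDS];
    size_t sp;              /* stack pointer: address of the topmost item */
} Machine;

void machine_reset(Machine *m)
{
    m->sp = STACK_ORIGIN;   /* size zero: the pointer is at the origin */
}

/* Push: decrement the pointer, then store. Fails on stack overflow. */
bool hw_push(Machine *m, int value)
{
    if (m->sp == STACK_LIMIT)
        return false;       /* would move past the maximum extent */
    m->sp--;
    m->memory[m->sp] = value;
    return true;
}

/* Pop: load, then increment the pointer. Fails on stack underflow. */
bool hw_pop(Machine *m, int *out)
{
    if (m->sp == STACK_ORIGIN)
        return false;       /* would move past the origin */
    *out = m->memory[m->sp];
    m->sp++;
    return true;
}
```

With these limits the stack occupies addresses 999 down to 990, so the eleventh consecutive push is reported as an overflow.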

Some environments that rely heavily on stacks may provide additional operations, for example:

  • Dup(licate): the top item is popped, and then pushed again (twice), so that an additional copy of the former top item is now on top, with the original below it.
  • Peek: the topmost item is inspected (or returned), but the stack pointer is not changed, and the stack size does not change (meaning that the item remains on the stack). This is also called the top operation in many articles.
  • Swap or exchange: the two topmost items on the stack exchange places.
  • Rotate (or Roll): the n topmost items are moved on the stack in a rotating fashion. For example, if n=3, items 1, 2, and 3 on the stack are moved to positions 2, 3, and 1 on the stack, respectively. Many variants of this operation are possible, with the most common being called left rotate and right rotate.

Stacks are visualized in several ways: growing from the bottom up (like real-world stacks); with the top of the stack in a fixed position, like a coin holder or a Pez dispenser (see image; note that in the image the top (28) is the stack 'bottom', since the stack 'top' is where items are pushed or popped from); or growing from left to right, so that "topmost" becomes "rightmost". This visualization may be independent of the actual structure of the stack in memory. With the top drawn first, a right rotate moves the first element to the third position, the second to the first and the third to the second. Here are two equivalent visualizations of this process:

apple                         banana
banana    ===right rotate==>  cucumber
cucumber                      apple

cucumber                      apple
banana    ===left rotate==>   cucumber
apple                         banana
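
The additional operations can be sketched on a small array-based stack. Everything below is an illustrative assumption rather than a standard API: the Stack type, the capacity, and the function names. Items are single characters so that 'a', 'b' and 'c' can stand for apple, banana and cucumber, and "right" and "left" follow the first visualization, in which the top of the stack is drawn first.

```c
#include <stdbool.h>

#define CAP 16   /* assumed capacity, for illustration only */

typedef struct {
    char items[CAP];   /* items[size - 1] is the top of the stack */
    int size;
} Stack;

/* Dup: push a second copy of the top item. */
bool stack_dup(Stack *s)
{
    if (s->size == 0 || s->size == CAP)
        return false;
    s->items[s->size] = s->items[s->size - 1];
    s->size++;
    return true;
}

/* Swap: exchange the two topmost items. */
bool stack_swap(Stack *s)
{
    char t;

    if (s->size < 2)
        return false;
    t = s->items[s->size - 1];
    s->items[s->size - 1] = s->items[s->size - 2];
    s->items[s->size - 2] = t;
    return true;
}

/* Right rotate: the top item moves to position n, the others move up by one. */
bool rotate_right(Stack *s, int n)
{
    char top;
    int k;

    if (n < 1 || n > s->size)
        return false;
    top = s->items[s->size - 1];
    for (k = s->size - 1; k > s->size - n; k--)
        s->items[k] = s->items[k - 1];
    s->items[s->size - n] = top;
    return true;
}

/* Left rotate: the item at position n moves to the top (inverse of right rotate). */
bool rotate_left(Stack *s, int n)
{
    char deep;
    int k;

    if (n < 1 || n > s->size)
        return false;
    deep = s->items[s->size - n];
    for (k = s->size - n; k < s->size - 1; k++)
        s->items[k] = s->items[k + 1];
    s->items[s->size - 1] = deep;
    return true;
}
```

Starting from a stack with 'a' on top of 'b' on top of 'c', rotate_right(&s, 3) leaves 'b' on top, then 'c', then 'a', matching the first diagram.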

A stack is usually represented in computers by a block of memory cells, with the "bottom" at a fixed location, and the stack pointer holding the address of the current "top" cell in the stack. The top and bottom terminology is used irrespective of whether the stack actually grows towards lower memory addresses or towards higher memory addresses.

Pushing an item on to the stack adjusts the stack pointer by the size of the item (either decrementing or incrementing, depending on the direction in which the stack grows in memory), pointing it to the next cell, and copies the new top item to the stack area. Depending again on the exact implementation, at the end of a push operation, the stack pointer may point to the next unused location in the stack, or it may point to the topmost item in the stack. If the stack pointer points to the current topmost item, it will be updated before a new item is pushed onto the stack; if it points to the next available location in the stack, it will be updated after the new item is pushed onto the stack.

Popping the stack is simply the inverse of pushing. The topmost item in the stack is removed and the stack pointer is updated, in the opposite order of that used in the push operation.

Hardware support

Stack in main memory

Most CPUs have registers that can be used as stack pointers. Processor families like the x86, Z80, 6502, and many others have special instructions that implicitly use a dedicated (hardware) stack pointer to conserve opcode space. Some processors, like the PDP-11 and the 68000, also have special addressing modes for implementation of stacks, typically with a semi-dedicated stack pointer as well (such as A7 in the 68000). However, in most processors, several different registers may be used as additional stack pointers as needed (whether updated via addressing modes or via add/sub instructions).

Stack in registers or dedicated memory

The x87 floating point architecture is an example of a set of registers organised as a stack where direct access to individual registers (relative to the current top) is also possible. As with stack-based machines in general, having the top-of-stack as an implicit argument allows for a small machine code footprint with a good usage of bus bandwidth and code caches, but it also prevents some types of optimizations possible on processors permitting random access to the register file for all (two or three) operands. A stack structure also makes superscalar implementations with register renaming (for speculative execution) somewhat more complex to implement, although it is still feasible, as exemplified by modern x87 implementations.

Sun SPARC, AMD Am29000, and Intel i960 are all examples of architectures using register windows within a register-stack as another strategy to avoid the use of slow main memory for function arguments and return values.

There are also a number of small microprocessors that implement a stack directly in hardware, and some microcontrollers have a fixed-depth stack that is not directly accessible. Examples are the PIC microcontrollers, the Computer Cowboys MuP21, the Harris RTX line, and the Novix NC4016. Many stack-based microprocessors were used to implement the programming language Forth at the microcode level. Stacks were also used as a basis of a number of mainframes and minicomputers. Such machines were called stack machines, the most famous being the Burroughs B5000.

Applications

Stacks are ubiquitous in the computing world.

Expression evaluation and syntax parsing

Calculators employing reverse Polish notation use a stack structure to hold values. Expressions can be represented in prefix, postfix or infix notations. Conversion from one form of the expression to another form may be accomplished using a stack. Many compilers use a stack for parsing the syntax of expressions, program blocks etc. before translating into low-level code. Most programming languages are context-free languages, which allows them to be parsed with stack-based machines.

Example (general)

The calculation 1 + 2 * 4 + 3 can be written in postfix notation as follows, with the advantage that neither precedence rules nor parentheses are needed:

1 2 4 * + 3 +

The expression is evaluated from the left to right using a stack:

  1. when encountering an operand: push it
  2. when encountering an operator: pop two operands, evaluate the result and push it.

The evaluation proceeds as follows (the stack is shown after each operation has taken place, with the top of the stack on the left):

Input   Operation      Stack (after op)
1       Push operand   1
2       Push operand   2, 1
4       Push operand   4, 2, 1
*       Multiply       8, 1
+       Add            9
3       Push operand   3, 9
+       Add            12

The final result, 12, lies on the top of the stack at the end of the calculation.
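
The two rules above translate almost directly into code. The following is a minimal sketch of a postfix evaluator; the function name eval_postfix, the depth limit and the restriction to non-negative integers and the operators + - * / separated by spaces are assumptions made for this example.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_DEPTH 64   /* assumed limit on the operand stack */

/* Evaluate a postfix expression such as "1 2 4 * + 3 +".
   Returns true and stores the value in *result on success; returns false
   on malformed input, operand stack overflow, or division by zero. */
bool eval_postfix(const char *expr, long *result)
{
    long stack[MAX_DEPTH];
    int size = 0;
    const char *p = expr;

    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {
            p++;
        } else if (isdigit((unsigned char)*p)) {
            char *end;

            if (size == MAX_DEPTH)
                return false;
            stack[size++] = strtol(p, &end, 10);   /* operand: push it */
            p = end;
        } else {
            long a, b;

            if (size < 2)
                return false;
            b = stack[--size];   /* the right operand is on top */
            a = stack[--size];
            switch (*p) {
            case '+': stack[size++] = a + b; break;
            case '-': stack[size++] = a - b; break;
            case '*': stack[size++] = a * b; break;
            case '/':
                if (b == 0)
                    return false;
                stack[size++] = a / b;
                break;
            default:
                return false;    /* unknown character */
            }
            p++;
        }
    }
    if (size != 1)
        return false;            /* too few operators, or empty input */
    *result = stack[0];
    return true;
}
```

For the expression in the table, eval_postfix("1 2 4 * + 3 +", &r) stores 12 in r.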

Example in C

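#include <stdio.h>

#define CAPACITY 100

int main(void)
{
    int a[CAPACITY], i = 0, x;

    printf("To pop enter -1\n");
    for (;;) {
        printf("Push ");
        if (scanf("%d", &x) != 1)        /* stop at end of input or on invalid input */
            break;
        if (x == -1) {                   /* -1 is the request to pop */
            if (i == 0)
                printf("Underflow\n");
            else
                printf("pop = %d\n", a[--i]);
        } else if (i == CAPACITY) {      /* no room left in the array */
            printf("Overflow\n");
        } else {
            a[i++] = x;                  /* push */
        }
    }
    return 0;
}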

Example (Pascal)

This is an implementation in Pascal, using a marked sequential file as the data archive.

{
programmer : clx321
file  : stack.pas
unit  : Pstack.tpu
}
program TestStack;
{this program uses the Stack ADT; it is assumed that the unit implementing the Stack ADT already exists}

uses
   PStack;   {ADT of STACK}

{dictionary}
const
   mark = '.';

var
   data : stack;
   f : text;
   cc : char;
   ccInt, cc1, cc2 : integer;
  
  {functions}
  function IsOperand (cc : char) : boolean;    {prototype only}
    {return TRUE if cc is operand}
  function ChrToInt (cc : char) : integer;     {prototype only}
    {change char to integer}
  function Operator (cc1, cc2 : integer) : integer;     {prototype only}
    {operate two operands}

{algorithms}
begin
  assign (f, 'expr.txt');  {hypothetical file name; an assumption, as cc holds no file name here}
  reset (f);
  read (f, cc);  {first elmt}
  if (cc = mark) then
     begin
        writeln ('empty archives !');
     end
  else   
     begin
        repeat
          if (IsOperand (cc)) then
             begin
               ccInt := ChrToInt (cc);
               push (ccInt, data);               
             end
          else
             begin
               pop (cc1, data);
               pop (cc2, data);
               push (Operator (cc2, cc1), data);
             end;
           read (f, cc);   {next elmt}
        until (cc = mark);
     end;
  close (f);
end.

Runtime memory management

A number of programming languages are stack-oriented, meaning they define most basic operations (adding two numbers, printing a character) as taking their arguments from the stack, and placing any return values back on the stack. For example, PostScript has a return stack and an operand stack, and also has a graphics state stack and a dictionary stack.

Forth uses two stacks, one for argument passing and one for subroutine return addresses. The use of a return stack is extremely commonplace, but the somewhat unusual use of an argument stack for a human-readable programming language is the reason Forth is referred to as a stack-based language.

Many virtual machines are also stack-oriented, including the p-code machine and the Java Virtual Machine.

Almost all computer runtime memory environments use a special stack (the "call stack") to hold information about procedure/function calling and nesting, in order to switch to the context of the called function and to restore the context of the caller when the call finishes. Caller and callee follow a runtime protocol for saving arguments and return values on the stack. Stacks are an important way of supporting nested or recursive function calls. This type of stack is used implicitly by the compiler to support CALL and RETURN statements (or their equivalents) and is not manipulated directly by the programmer.
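
The role of the call stack in recursion can be illustrated by writing the same computation twice: once recursively, and once with the "frames" kept on an explicit stack. This is only a conceptual sketch. A real call frame also holds return addresses and saved registers, and the names fact_recursive, fact_explicit and MAX_FRAMES are assumptions made for this example.

```c
#define MAX_FRAMES 32   /* assumed depth limit of the explicit stack */

/* Recursive version: every nested call gets its own frame on the call stack. */
unsigned long long fact_recursive(unsigned n)
{
    if (n <= 1)
        return 1;
    return n * fact_recursive(n - 1);
}

/* The same computation with the frames made explicit.
   Returns 0 when n would need more than MAX_FRAMES frames. */
unsigned long long fact_explicit(unsigned n)
{
    unsigned frames[MAX_FRAMES];
    int depth = 0;
    unsigned long long result = 1;

    /* "Call" phase: push one frame (here just the argument) per nested call. */
    while (n > 1) {
        if (depth == MAX_FRAMES)
            return 0;
        frames[depth++] = n;
        n--;
    }
    /* "Return" phase: pop the frames in reverse order, as the returns would occur. */
    while (depth > 0)
        result *= frames[--depth];
    return result;
}
```

Both functions return 120 for an argument of 5; results that do not fit in unsigned long long wrap around in either version.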

Some programming languages use the stack to store data that is local to a procedure. Space for local data items is allocated from the stack when the procedure is entered, and is deallocated when the procedure exits. The C programming language is typically implemented in this way. Using the same stack for both data and procedure calls has important security implications (see below) of which a programmer must be aware in order to avoid introducing serious security bugs into a program.

Security

Some computing environments use stacks in ways that may make them vulnerable to security breaches and attacks. Programmers working in such environments must take special care to avoid the pitfalls of these implementations.

For example, some programming languages use a common stack to store both data local to a called procedure and the linking information that allows the procedure to return to its caller. This means that the program moves data into and out of the same stack that contains critical return addresses for the procedure calls. If data is moved to the wrong location on the stack, or an oversized data item is moved to a stack location that is not large enough to contain it, return information for procedure calls may be corrupted, causing the program to fail.

Malicious parties may attempt a stack smashing attack that takes advantage of this type of implementation by providing oversized data input to a program that does not check the length of input. Such a program may copy the data in its entirety to a location on the stack, and in so doing it may change the return addresses for procedures that have called it. An attacker can experiment to find a specific type of data that can be provided to such a program such that the return address of the current procedure is reset to point to an area within the stack itself (and within the data provided by the attacker), which in turn contains instructions that carry out unauthorized operations.

This type of attack is a variation on the buffer overflow attack and is an extremely frequent source of security breaches in software, mainly because some of the most popular programming languages (such as C) use a shared stack for both data and procedure calls, and do not verify the length of data items. Frequently programmers do not write code to verify the size of data items, either, and when an oversized or undersized data item is copied to the stack, a security breach may occur.
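
The missing length check is easy to show in C. In the sketch below, the unsafe pattern appears only inside a comment, and copy_checked() is a hypothetical helper that refuses input that does not fit; BUF_SIZE and handle_input() are likewise assumptions made for illustration.

```c
#include <stdbool.h>
#include <string.h>

#define BUF_SIZE 16   /* assumed size of the local buffer */

/* UNSAFE pattern, shown for contrast only:
 *
 *     char buf[BUF_SIZE];
 *     strcpy(buf, input);
 *
 * strcpy() writes past the end of buf whenever strlen(input) >= BUF_SIZE,
 * which can overwrite other data on the stack, including return information.
 */

/* Copy src into dst only if it fits, including the terminating '\0'. */
bool copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (len >= dst_size)
        return false;             /* oversized input is rejected, not truncated */
    memcpy(dst, src, len + 1);
    return true;
}

/* Returns the number of characters accepted, or -1 if the input was too long. */
int handle_input(const char *input)
{
    char buf[BUF_SIZE];           /* local data, typically kept on the call stack */

    if (!copy_checked(buf, sizeof buf, input))
        return -1;
    return (int)strlen(buf);
}
```

Rejecting oversized input is one possible policy; truncating it to the buffer size is another, as long as the length is checked before the copy.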

References

  1. ^ cprogramming.com: http://www.cprogramming.com/tutorial/computersciencetheory/stack.html
  2. ^ "Verfahren zur automatischen Verarbeitung von kodierten Daten und Rechenmaschine zur Ausübung des Verfahrens" (in German). Deutsches Patentamt. 30 March 1957. Retrieved 2010-10-01.
  3. ^ Jones: "Systematic Software Development Using VDM"
  4. ^ Horowitz, Ellis: "Fundamentals of Data Structures in Pascal", page 67. Computer Science Press, 1984
