Network socket

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by EncMstr (talk | contribs) at 06:09, 8 September 2011 (Reverted edits by 218.248.12.97 (talk) to last version by Woohookitty). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computer networking, an Internet socket or network socket is an endpoint of a bidirectional inter-process communication flow across an Internet Protocol-based computer network, such as the Internet.

The term Internet sockets is also used as a name for an application programming interface (API) for the TCP/IP protocol stack, usually provided by the operating system. Internet sockets constitute a mechanism for delivering incoming data packets to the appropriate application process or thread, based on a combination of local and remote IP addresses and port numbers. Each socket is mapped by the operating system to a communicating application process or thread.

A socket address is the combination of an IP address (the location of the computer) and a port (which is mapped to the application program process) into a single identity, much like one end of a telephone connection is the combination of a phone number and a particular extension.
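As a minimal sketch of this idea, the following Python fragment binds a socket to the loopback interface and reads back its socket address as an (IP address, port) pair. The function name `local_socket_address` is hypothetical, and the loopback address 127.0.0.1 is an assumption chosen so that the example stays on the local machine.

```python
import socket


def local_socket_address():
    """Bind a TCP socket to the loopback interface and return its socket address."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Port 0 asks the operating system to choose any free port.
        s.bind(("127.0.0.1", 0))
        # For IPv4 sockets the address is an (ip_address, port) tuple.
        return s.getsockname()


if __name__ == "__main__":
    ip, port = local_socket_address()
    print(f"socket address: {ip}:{port}")
```

The port number differs from run to run because the operating system selects it.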

Overview

An Internet socket is characterized by a unique combination of the following:

  • Local socket address: Local IP address and port number
  • Remote socket address: Only for established TCP sockets. As discussed in the Client-Server section below, this is necessary since a TCP server may serve several clients concurrently. The server creates one socket for each client, and these sockets share the same local socket address.
  • Protocol: A transport protocol (e.g., TCP, UDP), raw IP, or others. TCP port 53 and UDP port 53 are consequently different, distinct sockets.
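The role of the protocol in this combination can be illustrated with a short Python sketch, which binds a TCP socket and a UDP socket to the same local IP address and port number. The helper name `bind_tcp_and_udp_on_same_port` and the retry count are hypothetical; the retry loop only covers the case where another program already holds the UDP port that the operating system handed out for TCP.

```python
import socket


def bind_tcp_and_udp_on_same_port(attempts=10):
    """Return the local addresses of a TCP and a UDP socket bound to one port number."""
    for _ in range(attempts):
        tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            # Let the operating system pick a free TCP port.
            tcp.bind(("127.0.0.1", 0))
            port = tcp.getsockname()[1]
            # Reuse the same number for UDP; the two port spaces are separate.
            udp.bind(("127.0.0.1", port))
            return tcp.getsockname(), udp.getsockname()
        except OSError:
            # The UDP port happened to be taken by another program; try again.
            continue
        finally:
            tcp.close()
            udp.close()
    raise RuntimeError("no port number was free for both TCP and UDP")


if __name__ == "__main__":
    tcp_addr, udp_addr = bind_tcp_and_udp_on_same_port()
    print("TCP:", tcp_addr, "UDP:", udp_addr)
```

Both binds succeed even though the addresses are identical, because the two sockets differ in protocol.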

Within the operating system and the application that created a socket, the socket is referred to by a unique integer number called the socket identifier or socket number. The operating system forwards the payload of incoming IP packets to the corresponding application by extracting the socket address information from the IP and transport protocol headers and stripping the headers from the application data.
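In Python, the integer that identifies a socket can be inspected with the `fileno()` method of a socket object. The sketch below assumes a typical implementation in which this value is the file descriptor on Unix-like systems or the socket handle on Windows; the function name `socket_numbers` is hypothetical.

```python
import socket


def socket_numbers(count=3):
    """Create several sockets and return the integer identifier of each."""
    socks = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(count)]
    try:
        # fileno() returns the integer the operating system uses for the socket.
        return [s.fileno() for s in socks]
    finally:
        for s in socks:
            s.close()


if __name__ == "__main__":
    print(socket_numbers())
```

While the sockets are open at the same time, each one has a different number.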

In IETF Request for Comments, Internet Standards, in many textbooks, as well as in this article, the term socket refers to an entity that is uniquely identified by the socket number. In other textbooks[1], the term socket refers to a local socket address, i.e. a "combination of an IP address and a port number". In the original definition of socket given in RFC 147, as it was related to the ARPA network in 1971, "the socket is specified as a 32 bit number with even sockets identifying receiving sockets and odd sockets identifying sending sockets." Today, however, socket communications are bidirectional.

On Unix-like and Microsoft Windows based operating systems the netstat command line tool may be used to list all currently established sockets and related information.

Socket types

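There are several Internet socket types available:

  • Datagram sockets, also known as connectionless sockets, which use the User Datagram Protocol (UDP).
  • Stream sockets, also known as connection-oriented sockets, which use the Transmission Control Protocol (TCP) or the Stream Control Transmission Protocol (SCTP).
  • Raw sockets (or raw IP sockets), in which the transport layer is bypassed and the packet headers are made accessible to the application.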

There are also non-Internet sockets, implemented over other transport protocols, such as Systems Network Architecture (SNA).[3] See also Unix domain sockets (UDS), for internal inter-process communication.
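The following Python sketch creates one connection-oriented (stream) and one connectionless (datagram) Internet socket, and then exchanges a message over a connected pair of local sockets of the kind used for internal inter-process communication. The function names are hypothetical. Raw sockets are left out because creating them normally requires administrator privileges. `socket.socketpair()` uses Unix domain sockets where the platform provides them.

```python
import socket


def describe_socket_types():
    """Create a stream and a datagram socket and report whether each has the requested type."""
    results = {}
    for name, kind in (("stream", socket.SOCK_STREAM), ("datagram", socket.SOCK_DGRAM)):
        with socket.socket(socket.AF_INET, kind) as s:
            results[name] = (s.type == kind)
    return results


def local_ipc_roundtrip(message=b"ping"):
    """Send a short message across a connected pair of local sockets."""
    a, b = socket.socketpair()
    with a, b:
        a.sendall(message)
        return b.recv(len(message))


if __name__ == "__main__":
    print(describe_socket_types())
    print(local_ipc_roundtrip())
```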

Socket states and the client-server model

Computer processes that provide application services are called servers, and create sockets on start-up that are in the listening state. These sockets wait for connection requests from client programs. For a listening TCP socket, the remote address presented by netstat may be denoted 0.0.0.0 and the remote port number 0.

A TCP server may serve several clients concurrently, by creating a child process for each client and establishing a TCP connection between the child process and the client. A unique dedicated socket is created for each connection. Such a socket is in the established state once a socket-to-socket virtual connection or virtual circuit (VC), also known as a TCP session, has been set up with the remote socket, providing a duplex byte stream.

Other possible TCP socket states presented by the netstat command are Syn-sent, Syn-Recv, Fin-wait1, Fin-wait2, Time-wait, Close-wait and Closed, which relate to various start-up and shutdown steps.[4]

A server may create several concurrently established TCP sockets with the same local port number and local IP address, each mapped to its own server-child process, serving its own client process. They are treated as different sockets by the operating system, since the remote socket address (the client IP address and/or port number) is different; i.e. since they have different socket pair tuples (see below).
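A minimal Python sketch of this behaviour follows. For brevity it accepts two clients within a single process rather than creating a child process for each, which is a simplification of the model described above. The function name `accept_two_clients` is hypothetical, and all traffic stays on the loopback interface.

```python
import socket


def accept_two_clients():
    """Accept two connections on one listening socket and compare their addresses."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(2)
        server_addr = listener.getsockname()

        # Two clients connect to the same server address.
        clients = [socket.create_connection(server_addr) for _ in range(2)]
        # accept() returns a new, dedicated socket for each connection.
        accepted = [listener.accept()[0] for _ in range(2)]
        try:
            local = [conn.getsockname() for conn in accepted]
            remote = [conn.getpeername() for conn in accepted]
            return server_addr, local, remote
        finally:
            for s in clients + accepted:
                s.close()


if __name__ == "__main__":
    server_addr, local, remote = accept_two_clients()
    print("listening on:", server_addr)
    print("local addresses of accepted sockets:", local)
    print("remote addresses of accepted sockets:", remote)
```

Both accepted sockets report the same local socket address as the listening socket, and they differ only in the remote socket address.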

A UDP socket cannot be in an established state, since UDP is connectionless. Therefore, netstat does not show the state of a UDP socket. A UDP server does not create new child processes for every concurrently served client, but the same process handles incoming data packets from all remote clients sequentially through the same socket. This implies that UDP sockets are not identified by the remote address, but only by the local address, although each message has an associated remote address.
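The single-socket behaviour of a UDP server can be sketched in Python as follows. One server socket receives datagrams from two client sockets, and `recvfrom()` reports the remote address that accompanies each message. The function name `udp_server_sees_two_clients`, the message contents, and the two-second timeout are assumptions made for illustration.

```python
import socket


def udp_server_sees_two_clients():
    """Receive datagrams from two clients through a single UDP server socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("127.0.0.1", 0))
        server.settimeout(2.0)  # Avoid blocking forever if a datagram is lost.
        server_addr = server.getsockname()

        clients = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(2)]
        try:
            for i, client in enumerate(clients):
                # Bind explicitly so each client has a known local address.
                client.bind(("127.0.0.1", 0))
                client.sendto(f"hello {i}".encode(), server_addr)

            received = []
            for _ in clients:
                # The same socket serves every client; the sender is reported per message.
                data, remote = server.recvfrom(1024)
                received.append((data, remote))
            client_addrs = [client.getsockname() for client in clients]
            return received, client_addrs
        finally:
            for client in clients:
                client.close()


if __name__ == "__main__":
    received, client_addrs = udp_server_sees_two_clients()
    for data, remote in received:
        print(remote, data)
```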

Socket pairs

Communicating local and remote sockets are called socket pairs. Each socket pair is described by a unique 4-tuple consisting of source and destination IP addresses and port numbers, i.e. of local and remote socket addresses.[5][6] As seen in the discussion above, in the TCP case, each unique socket pair 4-tuple is assigned a socket number, while in the UDP case, each unique local socket address is assigned a socket number.
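The 4-tuple of a TCP socket pair can be read from both ends of a connection, as in the following Python sketch. The function name `socket_pair_tuples` is hypothetical; the tuple layout (local IP, local port, remote IP, remote port) is a choice made for this example.

```python
import socket


def socket_pair_tuples():
    """Return the 4-tuple of one TCP connection as seen from each end."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        with socket.create_connection(listener.getsockname()) as client:
            conn, _ = listener.accept()
            with conn:
                # (local IP, local port) + (remote IP, remote port)
                client_view = client.getsockname() + client.getpeername()
                server_view = conn.getsockname() + conn.getpeername()
                return client_view, server_view


if __name__ == "__main__":
    client_view, server_view = socket_pair_tuples()
    print("client end:", client_view)
    print("server end:", server_view)
```

The two views contain the same addresses with the local and remote halves swapped.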

Implementation issues

TCP Socket flow diagram.

Sockets are usually implemented by an API library such as Berkeley sockets, first introduced in 1983. Most implementations are based on Berkeley sockets, for example Winsock introduced in 1991. Other socket API implementations exist, such as the STREAMS-based Transport Layer Interface (TLI).

Development of application programs that utilize this API is called socket programming or network programming.

These are examples of functions or methods typically provided by the API library[7]:

  • socket() creates a new socket of a certain socket type, identified by an integer number, and allocates system resources to it.
  • bind() is typically used on the server side, and associates a socket with a socket address structure, i.e. a specified local port number and IP address.
  • listen() is used on the server side, and causes a bound TCP socket to enter listening state.
  • connect() is used on the client side, and assigns a free local port number to a socket. In case of a TCP socket, it causes an attempt to establish a new TCP connection.
  • accept() is used on the server side. It accepts a received incoming attempt to create a new TCP connection from the remote client, and creates a new socket associated with the socket address pair of this connection.
  • send() and recv(), or write() and read(), or sendto() and recvfrom(), are used for sending and receiving data to/from a remote socket.
  • close() causes the system to release resources allocated to a socket. In case of TCP, the connection is terminated.
  • gethostbyname() and gethostbyaddr() are used to resolve host names and addresses.
  • select() is used to prune a provided list of sockets down to those that are ready to read, ready to write, or have errors.
  • poll() is used to check on the state of a socket. The socket can be tested to see if it can be written to, read from or has errors.
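The typical order of these calls can be sketched with Python's `socket` module, whose methods mirror the functions listed above. The example runs a one-shot echo server in a background thread and a client in the main thread, both on the loopback interface. The names `echo_once` and `run_echo_exchange`, the 1024-byte buffer, and the two-second `select()` timeout are assumptions made for illustration.

```python
import select
import socket
import threading


def echo_once(listener):
    """Server side: accept one connection and echo back what it sends."""
    conn, _ = listener.accept()          # accept(): new socket for this connection
    with conn:                           # close() on exit
        data = conn.recv(1024)           # recv()
        conn.sendall(data)               # send()


def run_echo_exchange(message=b"hello"):
    """Send a short message to a local echo server and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:  # socket()
        listener.bind(("127.0.0.1", 0))  # bind(): port 0 lets the system choose
        listener.listen(1)               # listen(): enter the listening state
        server = threading.Thread(target=echo_once, args=(listener,))
        server.start()

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(listener.getsockname())  # connect()
            client.sendall(message)
            # select(): wait up to two seconds until the socket is readable.
            readable, _, _ = select.select([client], [], [], 2.0)
            reply = client.recv(1024) if readable else b""
        server.join()
    return reply


if __name__ == "__main__":
    print(run_echo_exchange())
```

A single `recv()` call is enough here only because the message is short; a stream socket does not preserve message boundaries, so longer transfers need a loop.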

Sockets in network equipment

The socket is primarily a concept used in the Transport Layer of the Internet model. Networking equipment such as routers and switches does not require implementations of the Transport Layer, as it operates at the Link Layer (switches) or at the Internet Layer (routers). However, stateful network firewalls, network address translators, and proxy servers keep track of active socket pairs. Also in fair queuing, layer 3 switching and quality of service (QoS) support in routers, packet flows may be identified by extracting information about the socket pairs.

Raw sockets are typically available in network equipment, and are used for protocols such as IGMP and OSPF, as well as for the Internet Control Message Protocol (ICMP).

Early implementations

1983: Berkeley sockets (also known as the BSD socket API) originated with the 4.2BSD Unix operating system (released in 1983) as an API. Only in 1989, however, could UC Berkeley release versions of its operating system and networking library free from the licensing constraints of AT&T's copyright-protected Unix.[8]

1987: Transport Layer Interface (TLI) was the networking API provided by AT&T UNIX System V Release 3 (SVR3) in 1987[9] and continued into Release 4 (SVR4).[10][11]

Other early implementations were written for TOPS-20[12], MVS[12], VM[12], and IBM-DOS (PCIP)[12][13].

References

  1. ^ Cisco Networking Academy Program, CCNA 1 and 2 Companion Guide Revised Third Edition, P.480, ISBN 1-58713-150-1
  2. ^ Raw IP Networking FAQ
  3. ^ www-306.ibm.com - AnyNet Guide to Sockets over SNA
  4. ^ colorado.edu - Linux - netstat(8)
  5. ^ books.google.com - UNIX Network Programming: The sockets networking API
  6. ^ books.google.com - Designing BSD Rootkits: An Introduction to Kernel Hacking
  7. ^ Stevens, Richard. UNIX Network Programming, Volume 1, Second Edition: Networking APIs: Sockets. ISBN 0-13-490012-X.
  8. ^ Wikipedia: Berkeley sockets 2011-02-18
  9. ^ (Goodheart 1994, p. 11)
  10. ^ (Goodheart 1994, p. 17)
  11. ^ Wikipedia: Transport Layer Interface 2011-02-18
  12. ^ a b c d historyofcomputercommunications.info - Book: 9.8 TCP/IP and XNS 1981 - 1983
  13. ^ mit.edu - The Desktop Computer as a Network Participant.pdf 1985