Primitive data type: Difference between revisions

Content deleted Content added

Inline

Revision as of 06:17, 19 February 2015

In computer science, primitive data type is either of the following:^{[citation needed]}

a basic type is a data type ^[1] provided by a programming language as a basic building block. Most languages allow more complicated composite types to be recursively constructed starting from basic types.
a built-in type is a data type for which the programming language provides built-in support.

In most programming languages, all basic data types are built-in. In addition, many languages also provide a set of composite data types. Opinions vary as to whether a built-in type that is not basic should be considered "primitive".^{[citation needed]}

Depending on the language and its implementation, primitive data types may or may not have a one-to-one correspondence with objects in the computer's memory. However, one usually expects operations on basic primitive data types to be the fastest language constructs there are.^{[citation needed]} Integer addition, for example, can be performed as a single machine instruction, and some processors offer specific instructions to process sequences of characters with a single instruction.^{[citation needed]} In particular, the C standard mentions that "a 'plain' int object has the natural size suggested by the architecture of the execution environment". This means that int is likely to be 32 bits long on a 32-bit architecture. Basic primitive types are almost always value types.

Most languages do not allow the behavior or capabilities of primitive (either built-in or basic) data types to be modified by programs. Exceptions include Smalltalk, which permits all data types to be extended within a program, adding to the operations that can be performed on them or even redefining the built-in operations.

Overview

The actual range of primitive data types that is available is dependent upon the specific programming language that is being used. For example, in C, strings are a composite but built-in data type, whereas in modern dialects of BASIC and in JavaScript, they are assimilated to a primitive data type that is both basic and built-in.

Classic basic primitive types may include:

Character (character, char);
Integer (integer, int, short, long, byte) with a variety of precisions;
Floating-point number (float, double, real, double precision);
Fixed-point number (fixed) with a variety of precisions and a programmer-selected scale.
Boolean, logical values true and false.
Reference (also called a pointer or handle), a small value referring to another object's address in memory, possibly a much larger one.

More sophisticated types which can be built-in include:

Tuple in ML, Python
List in Lisp
Complex number in Fortran, C (C99), Lisp, Python, Perl 6, D
Rational number in Lisp, Perl 6
Associative array in various guises, in Lisp, Perl, Python, Lua, D
First-class function, closure, continuation in languages that support functional programming such as Lisp, ML, Perl 6, D and C# 3.0

Specific primitive data types

Integer numbers

An integer data type can hold a whole number, but no fraction. Integers may be either signed (allowing negative values) or unsigned (nonnegative values only). Typical sizes of integers are:

Size	Names	Signed Range	Unsigned Range
8 bits	Byte	−128 to +127	0 to 255
16 bits	Word, short int	−32,768 to +32,767	0 to 65,535
32 bits	Double Word, long int (win32, win64, 32-bit Linux^[2])	−2,147,483,648 to +2,147,483,647	0 to 4,294,967,295
64 bits	long int (C in 64-bit linux^[2]), long long (C), long (Java, the signed integer variant only^[3])	−9,223,372,036,854,775,808 to +9,223,372,036,854,775,807	0 to 18,446,744,073,709,551,615
unlimited	Bignum

Literals for integers consist of a sequence of digits. Most programming languages disallow use of commas for digit grouping, although Fortran (77, 90, and above, fixed form source but not free form source) allows embedded spaces, and Perl, Ruby, Java and D allow embedded underscores. Negation is indicated by a minus sign (−) before the value. Examples of integer literals are:

42
10000
−233000

Booleans

A boolean type, typically denoted "bool" or "boolean", is typically a logical type that can be either "true" or "false". Although only one bit is necessary to accommodate the value set "true" and "false", programming languages typically implement boolean types as one or more bytes.

Most languages (Java, Pascal and Ada, e.g.) implement booleans adhering to the concept of boolean as a distinct logical type. Languages, though, may implicitly convert booleans to numeric types at times to give extended semantics to booleans and boolean expressions or to achieve backwards compatibility with earlier versions of the language. In C++, e.g., boolean values may be implicitly converted to integers, according to the mapping false → 0 and true → 1 (for example, true + true would be a valid expression evaluating to 2). The boolean type bool in C++ is considered an integral type and is a cross between numeric type and a logical type.

Floating-point numbers

A floating-point number represents a limited-precision rational number that may have a fractional part. These numbers are stored internally in a format equivalent to scientific notation, typically in binary but sometimes in decimal. Because floating-point numbers have limited precision, only a subset of real or rational numbers are exactly representable; other numbers can be represented only approximately.

Many languages have both a single precision (often called "float") and a double precision type.

Literals for floating point numbers include a decimal point, and typically use "e" or "E" to denote scientific notation. Examples of floating-point literals are:

20.0005
99.9
−5000.12
6.02e23

Some languages (e.g., Fortran, Python, D) also have a complex number type comprising two floating-point numbers: a real part and an imaginary part.

Fixed-point numbers

A fixed-point number represents a limited-precision rational number that may have a fractional part. These numbers are stored internally in a scaled-integer form, typically in binary but sometimes in decimal. Because fixed-point numbers have limited precision, only a subset of real or rational numbers are exactly representable; other numbers can be represented only approximately. Fixed-point numbers also tend to have a more limited range of values than floating point, and so the programmer must be careful to avoid overflow in intermediate calculations as well as the final results.

Characters and strings

A character type (typically called "char") may contain a single letter, digit, punctuation mark, symbol, formatting code, control code, or some other specialized code (e.g., a byte order mark). Some languages have two or more character types, for example a single-byte type for ASCII characters and a multi-byte type for Unicode characters. The term "character type" is normally used even for types whose values more precisely represent code units, for example a UTF-16 code unit as in Java and JavaScript.

Characters may be combined into strings. The string data can include numbers and other numerical symbols but will be treated as text.

In most languages, a string is equivalent to an array of characters or code units, but Java treats them as distinct types (java.lang.String and char[]). Other languages (such as Python, and many dialects of BASIC) have no separate character type; strings with a length of one are normally used to represent (single code unit) characters.

Literals for characters and strings are usually surrounded by quotation marks: sometimes, single quotes (') are used for characters and double quotes (") are used for strings.

Examples of character literals in C syntax are:

'A'
'4'
'$'
'\t' (tab character)

Examples of string literals in C syntax are:

"A"
"Hello World"

Numeric data type ranges

Each numeric data type has its maximum and minimum value known as the range. Attempting to store a number outside the range may lead to compiler/runtime errors, or to incorrect calculations (due to truncation) depending on the language being used.

The range of a variable is based on the number of bytes used to save the value, and an integer data type is usually able to store $2^{n}$ values (where $n$ is the number of bits that contribute to the value). For other data types (e.g. floating point values) the range is more complicated and will vary depending on the method used to store it. There are also some types that do not use entire bytes, e.g. a boolean that requires a single bit, and represents a binary value (although in practice a byte is often used, with the remaining 7 bits being redundant). Some programming languages (such as Ada and Pascal) also allow the opposite direction, that is, the programmer defines the range and precision needed to solve a given problem and the compiler chooses the most appropriate integer or floating point type automatically.

References

^ "C++ Data Types".
^ ^a ^b Fog, Agner (2010-02-16). "Calling conventions for different C++ compilers and operating systems: Chapter 3, Data Representation" (PDF). Retrieved 2010-08-30.
^ Owens, Sean R. (2009-11-05). "Java and unsigned int, unsigned short, unsigned byte, unsigned long, etc. (Or rather, the lack thereof)". Retrieved 2010-09-05.

[1] "C++ Data Types".

[agnerfog-2] Fog, Agner (2010-02-16). "Calling conventions for different C++ compilers and operating systems: Chapter 3, Data Representation" (PDF). Retrieved 2010-08-30.

[unsignedjava-3] Owens, Sean R. (2009-11-05). "Java and unsigned int, unsigned short, unsigned byte, unsigned long, etc. (Or rather, the lack thereof)". Retrieved 2010-09-05.

[1]

[2]

[3]

@@ Line 3: / Line 3: @@
 In [[computer science]], '''primitive data type''' is either of the following:{{Citation needed|date=May 2009}}
-* a ''basic type'' is a [[data type]] provided by a [[programming language]] as a basic building block. Most languages allow more complicated ''[[composite type]]s'' to be recursively constructed starting from basic types.
+* a ''basic type'' is a [[data type]] <ref>{{cite web
+ | title=C++ Data Types
+ | url=https://www.tutorialcup.com/cplusplus/data-types.htm
+}}</ref> provided by a [[programming language]] as a basic building block. Most languages allow more complicated ''[[composite type]]s'' to be recursively constructed starting from basic types.
 * a ''built-in type'' is a data type for which the programming language provides built-in support.

v t e Data types
Uninterpreted	Bit Byte Trit Tryte Word Bit array
Numeric	Arbitrary-precision or bignum Complex Decimal Fixed point Floating point Reduced precision Minifloat Half precision bfloat16 Single precision Double precision Quadruple precision Octuple precision Extended precision Long double Integer signedness Interval Rational
Pointer	Address physical virtual Reference
Text	Character String null-terminated
Composite	Algebraic data type generalized Array Associative array Class Dependent Equality Inductive Intersection List Object metaobject Option type Product Record or Struct Refinement Set Union tagged
Other	Boolean Bottom type Collection Enumerated type Exception Function type Opaque data type Recursive data type Semaphore Stream Strongly typed identifier Top type Type class Empty type Unit type Void
Related topics	Abstract data type Boxing Data structure Generic Kind metaclass Parametric polymorphism Primitive data type Interface Subtyping Type constructor Type conversion Type system Type theory Variable