Linear search

Class	Search algorithm
Worst-case performance	O(n)
Best-case performance	O(1)
Average performance	O(n)
Worst-case space complexity	O(1) iterative
Optimal	Yes

In computer science, linear search or sequential search is a method for finding an element within a list. It sequentially checks each element of the list until a match is found or the whole list has been searched.^[1]

A linear search runs in linear time in the worst case, and makes at most $n$ comparisons, where $n$ is the length of the list. If each element is equally likely to be searched for, then linear search has an average case of $\frac{n+1}{2}$ comparisons, but the average case can be affected if the search probabilities for each element vary. Linear search is rarely practical because other search algorithms and schemes, such as the binary search algorithm and hash tables, allow significantly faster searching for all but short lists.^[2]

Algorithm

A linear search sequentially checks each element of the list until it finds an element that matches the target value. If the algorithm reaches the end of the list, the search terminates unsuccessfully.^[1]

Basic algorithm

Given a list $L$ of $n$ elements with values or records $L_0, \ldots, L_{n-1}$, and target value $T$, the following subroutine uses linear search to find the index of the target $T$ in $L$.^[3]

1. Set $i$ to 0.
2. If $L_i = T$, the search terminates successfully; return $i$.
3. Increase $i$ by 1.
4. If $i < n$, go to step 2. Otherwise, the search terminates unsuccessfully.
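The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `linear_search` and the choice to return `None` on failure are assumptions made for this example. Unlike the step list, the loop tests the index before the first comparison, so an empty list is handled safely.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def linear_search(items: Sequence[T], target: T) -> Optional[int]:
    """Return the index of the first element equal to target, or None."""
    i = 0                          # step 1: start at the first index
    while i < len(items):          # step 4: continue while i is a valid index
        if items[i] == target:     # step 2: compare the current element
            return i               # successful search
        i += 1                     # step 3: advance to the next element
    return None                    # unsuccessful search


print(linear_search([4, 8, 15, 16], 15))
print(linear_search([4, 8, 15, 16], 7))
```

If the target occurs more than once, this sketch reports the first occurrence, because it stops at the earliest match.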

With a sentinel^[4]

The basic algorithm above makes two comparisons per iteration: one to check if $L_i$ equals $T$, and the other to check if $i$ still points to a valid index of the list. By adding an extra record $L_n$ to the list (a sentinel value) that equals the target, the second comparison can be deferred until the end of the search, making the algorithm faster. The search will reach the sentinel if the target is not contained within the list.^[5]

1. Set $i$ to 0.
2. If $L_i = T$, go to step 4.
3. Increase $i$ by 1 and go to step 2.
4. If $i < n$, the search terminates successfully; return $i$. Else, the search terminates unsuccessfully.
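A rough Python sketch of the sentinel variant is given below. The function name `sentinel_search` is hypothetical, and the sketch assumes the list may be extended temporarily by one element. It shows the control flow only: Python checks bounds on every index operation anyway, so the speed benefit described above would be expected in lower-level languages rather than here.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def sentinel_search(items: List[T], target: T) -> Optional[int]:
    """Linear search that appends the target as a sentinel record L_n."""
    n = len(items)
    items.append(target)               # sentinel: guarantees the loop stops
    try:
        i = 0                          # step 1
        while items[i] != target:      # step 2: the only test per iteration
            i += 1                     # step 3
    finally:
        items.pop()                    # restore the caller's list
    return i if i < n else None        # step 4: did we stop before the sentinel?


data = [3, 1, 4, 1, 5]
print(sentinel_search(data, 4))
print(sentinel_search(data, 9))
```

The `try`/`finally` is a design choice for this sketch so that the sentinel is removed even if a comparison raises an exception.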

In an ordered table

If the list is ordered such that $L_0 \leq L_1 \leq \cdots \leq L_{n-1}$, the search can establish the absence of the target more quickly by concluding the search once $L_i$ exceeds the target. This variation requires a sentinel that is greater than the target.^[6]

1. Set $i$ to 0.
2. If $L_i \geq T$, go to step 4.
3. Increase $i$ by 1 and go to step 2.
4. If $L_i = T$, the search terminates successfully; return $i$. Else, the search terminates unsuccessfully.
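The ordered variant might look like the following in Python. This sketch assumes finite numeric keys sorted in ascending order, so that `math.inf` can serve as a sentinel greater than any target; the function name `ordered_search` is hypothetical.

```python
import math
from typing import List, Optional


def ordered_search(sorted_items: List[float], target: float) -> Optional[int]:
    """Search an ascending list, stopping at the first element >= target."""
    sorted_items.append(math.inf)            # sentinel greater than any finite target
    try:
        i = 0                                # step 1
        while sorted_items[i] < target:      # step 2: stop once L_i >= T
            i += 1                           # step 3
        found = sorted_items[i] == target    # step 4: exact match or overshoot?
    finally:
        sorted_items.pop()                   # remove the sentinel again
    return i if found else None


print(ordered_search([1, 3, 5, 7], 5))
print(ordered_search([1, 3, 5, 7], 4))
```

In the second call the loop stops at the element 5, the first value that is not smaller than 4, without examining the rest of the list.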

Analysis

For a list with n items, the best case is when the value is equal to the first element of the list, in which case only one comparison is needed. The worst case is when the value is not in the list (or occurs only once at the end of the list), in which case n comparisons are needed.

If the value being sought occurs k times in the list, and all orderings of the list are equally likely, the expected number of comparisons is

$$\begin{cases} n & \text{if } k = 0 \\[5pt] \dfrac{n+1}{k+1} & \text{if } 1 \leq k \leq n. \end{cases}$$

For example, if the value being sought occurs once in the list, and all orderings of the list are equally likely, the expected number of comparisons is ${\frac {n+1}{2}}$ . However, if it is known that it occurs once, then at most n - 1 comparisons are needed, and the expected number of comparisons is

$$\frac{(n+2)(n-1)}{2n}$$

(for example, for n = 2 this is 1, corresponding to a single if-then-else construct).
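The two expressions above can be checked by brute force for small lists. The sketch below is an illustration only, and the helper names `expected_comparisons` and `expected_known_once` are made up for this example. The first helper averages the position of the first match over every way of placing $k$ copies of the value among $n$ slots. The second averages the cost when the value is known to occur exactly once, so the last element never needs to be compared.

```python
from fractions import Fraction
from itertools import combinations


def expected_comparisons(n: int, k: int) -> Fraction:
    """Average comparisons when the value occurs k times among n items."""
    if k == 0:
        return Fraction(n)                       # every element is examined
    placements = list(combinations(range(1, n + 1), k))
    # The search stops at the smallest (1-based) position holding the value.
    total = sum(min(p) for p in placements)
    return Fraction(total, len(placements))


def expected_known_once(n: int) -> Fraction:
    """Average comparisons when the value is known to occur exactly once."""
    # Position i costs i comparisons, except the last, which costs n - 1.
    costs = [min(i, n - 1) for i in range(1, n + 1)]
    return Fraction(sum(costs), n)


n, k = 6, 2
print(expected_comparisons(n, k) == Fraction(n + 1, k + 1))
print(expected_known_once(n) == Fraction((n + 2) * (n - 1), 2 * n))
```

Exact fractions are used here so that the comparison with the closed forms is not affected by floating-point rounding.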

Either way, asymptotically the worst-case cost and the expected cost of linear search are both O(n).

Non-uniform probabilities

The performance of linear search improves if the desired value is more likely to be near the beginning of the list than near its end. Therefore, if some values are much more likely to be searched for than others, it is desirable to place them at the beginning of the list.

In particular, when the list items are arranged in order of decreasing probability, and these probabilities are geometrically distributed, the expected cost of linear search is only O(1).^[7]
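To make the geometric case concrete, the following sketch computes the expected number of comparisons when the element at position $i$ (counting from 1) is requested with probability proportional to $r^{i-1}$. The function name `expected_cost_geometric`, the default ratio $r = 1/2$, and the assumption that the target is always present are choices made for this illustration.

```python
def expected_cost_geometric(n: int, ratio: float = 0.5) -> float:
    """Expected comparisons for a successful search with geometric weights."""
    weights = [ratio ** i for i in range(n)]      # unnormalised probabilities
    total = sum(weights)
    # Finding the element at 0-based index i takes i + 1 comparisons.
    return sum((i + 1) * w for i, w in enumerate(weights)) / total


for n in (10, 100, 1000):
    print(n, expected_cost_geometric(n))
```

With $r = 1/2$ the result stays below 2 however long the list becomes, which is the sense in which the expected cost is bounded by a constant.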

Application

Linear search is usually very simple to implement, and is practical when the list has only a few elements, or when performing a single search in an un-ordered list.

When many values have to be searched in the same list, it often pays to pre-process the list in order to use a faster method. For example, one may sort the list and use binary search, or build an efficient search data structure from it. Should the content of the list change frequently, repeated re-organization may be more trouble than it is worth.
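As one possible illustration of such pre-processing, the sketch below sorts the list once and then answers repeated membership queries with binary search from Python's standard `bisect` module. The helper names `build_index` and `contains` are hypothetical, and the sketch assumes the elements are mutually comparable.

```python
from bisect import bisect_left
from typing import List, Sequence


def build_index(items: Sequence[int]) -> List[int]:
    """One-off pre-processing step: return a sorted copy of the list."""
    return sorted(items)


def contains(sorted_items: List[int], target: int) -> bool:
    """Binary search for target in a list produced by build_index."""
    i = bisect_left(sorted_items, target)     # leftmost insertion point
    return i < len(sorted_items) and sorted_items[i] == target


index = build_index([42, 7, 19, 3, 28])
print([contains(index, q) for q in (19, 20, 3)])
```

The sort costs O(n log n) once, after which each query costs O(log n); whether this beats repeated linear searches depends on how many queries are made and how often the list changes.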

As a result, even though in theory other search algorithms (for instance binary search) may be faster than linear search, in practice it may not be worthwhile to use anything else even on medium-sized arrays (around 100 items or fewer). On larger arrays, other, faster search methods only make sense if the data is large enough, because the initial time needed to prepare (sort) the data is comparable to the time taken by many linear searches.^[4]

References

Citations

1. Knuth 1998, §6.1 ("Sequential search").
2. Knuth 1998, §6.2 ("Searching by Comparison Of Keys").
3. Knuth 1998, §6.1 ("Sequential search"), subsection "Algorithm B".
4. Horvath, Adam. "Binary search and linear search performance on the .NET and Mono platform". Retrieved 19 April 2013.
5. Knuth 1998, §6.1 ("Sequential search"), subsection "Algorithm Q".
6. Knuth 1998, §6.1 ("Sequential search"), subsection "Algorithm T".
7. Knuth, Donald (1997). "Section 6.1: Sequential Searching". Sorting and Searching. The Art of Computer Programming. Vol. 3 (3rd ed.). Addison-Wesley. pp. 396–408. ISBN 0-201-89685-0.

Works

Knuth, Donald (1998). Sorting and Searching. The Art of Computer Programming. Vol. 3 (2nd ed.). Reading, MA: Addison-Wesley Professional. ISBN 0-201-89685-0
