SPL (Search Processing Language)

SPL
Paradigm(s): Piped programming language
Designed by: Splunk
Appeared in: 2007
Stable release: SPL 5.0 / 2013
Influenced by: Unix piping, SQL
OS: Cross-platform
Website: Splunk Documentation

SPL (Search Processing Language) is a special-purpose programming language designed by Splunk for managing machine-generated big data. Originally based on Unix piping and SQL, its scope includes data searching, filtering, modification, manipulation, insertion, and deletion.

Language elements

A search in SPL is a series of commands and arguments chained together with the "|" (pipe) character, which takes the output of one command and feeds it into the next command to the right.

  search-args | cmd1 cmd-args | cmd2 cmd-args | ...

Search commands take indexed data and filter out unwanted information, extract more information, calculate values, transform results, and perform statistical analysis. The search results retrieved from the index can be thought of as a dynamically created table, and each search command redefines the shape of that table. Each indexed event is a row, with a column for each field value. Columns include basic information about the data as well as fields that are extracted dynamically at search time.
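
As a rough illustration of how each command reshapes the result table, consider the sketch below. It assumes web access events that carry bytes and host fields; those field names, and the kb and avg_kb names introduced here, are assumptions made for the example rather than anything defined above.

```
sourcetype=access_combined
| eval kb = bytes / 1024
| stats avg(kb) AS avg_kb by host
```

After the initial retrieval, each row is one event. The eval command adds a kb column to every row, and stats then collapses the table to one row per host with only the host and avg_kb columns.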

At the head of each search is an implied "search-the-index-for-events" command, which can be used to search for keywords (e.g., error), boolean expressions (e.g., (error OR failure) NOT success), phrases (e.g., "database error"), wildcards (e.g., fail*, which matches fail, fails, failure, etc.), field values (e.g., code=404), inequalities (e.g., code!=404 or code>200), and fields having any value or no value (e.g., code=* or NOT code=*). For example, the search:

   sourcetype="access_combined" error | top 10 uri

will retrieve indexed access_combined events from disk that contain the term "error" (ANDs are implied between search terms), and then for those events, report the top 10 most common URI values.
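
Several of these term types can be combined in one search. The following is a sketch only; it assumes the events have an extracted status field, which is not stated above.

```
sourcetype=access_combined (error OR fail*) NOT "database error" status>=500 | top 5 uri
```

This would retrieve events that contain either the keyword error or a term beginning with fail, exclude those containing the phrase "database error", keep only events whose status is 500 or higher, and report the five most common URI values among them.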

Subsearches

A subsearch is an argument to a command that runs its own search and returns its results to the parent command as the argument value. Subsearches are enclosed in square brackets. For example, the following search finds all syslog events from the user who had the most recent login error:

   sourcetype=syslog [search login error | return user]

Note that the subsearch returns one user value, because by default the "return" command returns only one value, but there are options for more (e.g., | return 5 user).
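
To make the substitution concrete, suppose the most recent login error belonged to a hypothetical user named bob. The subsearch would then evaluate to a single field-value term, and the outer search would run roughly as if it had been written:

```
sourcetype=syslog user="bob"
```

With "return 5 user", up to five values would be returned and joined with OR, so the outer search would behave roughly like sourcetype=syslog (user="bob") OR (user="alice") and so on. The user names here are invented for illustration.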

Commands

The most common commands used in SPL queries are:

  • chart/timechart - Returns results in a tabular output for (time-series) charting.
  • dedup - Removes subsequent results that match a specified criterion.
  • eval - Calculates an expression and stores the result in a field.
  • fields - Keeps or removes fields from search results.
  • head/tail - Returns the first/last N results.
  • lookup - Adds field values from an external source.
  • rename - Renames a specified field; wildcards can be used to specify multiple fields.
  • replace - Replaces values of specified fields with a specified new value.
  • rex - Specifies regular expression named groups to extract fields.
  • search - Filters results to those that match the search expression.
  • sort - Sorts search results by the specified fields.
  • stats - Provides statistics, grouped optionally by fields.
  • top/rare - Displays the most/least common values of a field.
  • transaction - Groups search results into transactions.

See the search reference documentation for the complete list of commands.
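
Several of the commands listed above are typically chained in a single search. The sketch below assumes access_combined events with uri and bytes fields; section and avg_bytes are hypothetical field names introduced only for this example.

```
sourcetype=access_combined
| rex field=uri "^/(?<section>[^/?]+)"
| stats count, avg(bytes) AS avg_bytes by section
| sort - count
| head 5
```

Here rex extracts the first path segment of each URI into section, stats counts events and averages bytes per section, sort orders the rows by descending count, and head keeps the first five rows.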

External links