= Pipeline Pilot =

- Title: Pipeline Pilot
- Developer: Dassault Systèmes (BIOVIA, formerly Accelrys)
- Latest Release Version: 18.1
- Programming Language: C++
- Operating System: Windows and Linux
- Genre: Visual and dataflow programming language
- License: Proprietary

Pipeline Pilot is a desktop software application developed by Dassault Systèmes for extract, transform, and load (ETL) processing and data analytics. The software has since broadened beyond these core functions to serve a range of scientific and industrial applications.

Pipeline Pilot uses a visual, dataflow programming interface in which users design workflows for data processing. Its functionality spans several domains, including cheminformatics, quantitative structure-activity relationship (QSAR) modeling, next-generation sequencing, image analysis, and text analytics. Pipeline Pilot was originally developed by SciTegic, which Accelrys acquired in 2004. In 2014, Dassault Systèmes acquired Accelrys and renamed the business BIOVIA.

== Uses ==
Pipeline Pilot is primarily used in industries that require extensive data processing and analysis, including life sciences, materials science, and engineering. The software allows users to create workflows by dragging and dropping functional components that automate data analysis tasks, integrate with databases, and perform various scientific computations. These workflows are referred to as "protocols" and can be shared and reused within teams or organizations.

The product supports multiple programming languages, including Python, .NET, MATLAB, Perl, SQL, Java, VBScript, and R, giving users flexibility in integrating custom code into their workflows. Pipeline Pilot also provides PilotScript, its own scripting language modeled on PL/SQL, for custom data manipulation within a workflow. Add-on modules cover specific scientific tasks, such as next-generation sequencing analysis, cheminformatics, and polymer property prediction.

==Overview==
The interface, known as the Pipeline Pilot Professional Client, allows users to create workflows by selecting and arranging individual data processing units called "components." These components perform a variety of functions such as loading, filtering, joining, or modifying data. Additional components can carry out more complex tasks, such as constructing regression models, training neural networks, or generating reports in formats like PDF.

Pipeline Pilot follows a component-based architecture where components serve as nodes in a workflow, connected by "pipes" that represent data flow in a directed graph. This framework enables the processing of data as it moves between the components.
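The component-and-pipe model can be illustrated outside Pipeline Pilot with a minimal Python sketch. It does not use any Pipeline Pilot API: records are modeled as dictionaries, components as generator functions, and every name (`read_records`, `keep_if`, `set_property`, `run_protocol`) is hypothetical. It also simplifies the directed graph to a single linear chain.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
Component = Callable[[Iterable[Record]], Iterator[Record]]


def read_records(rows: Iterable[Record]) -> Iterator[Record]:
    """Source component: emit a copy of each input row as a record."""
    for row in rows:
        yield dict(row)


def keep_if(predicate: Callable[[Record], bool]) -> Component:
    """Build a filter component that passes only records matching predicate."""
    def component(records: Iterable[Record]) -> Iterator[Record]:
        for record in records:
            if predicate(record):
                yield record
    return component


def set_property(name: str, fn: Callable[[Record], object]) -> Component:
    """Build a manipulator component that adds one computed property."""
    def component(records: Iterable[Record]) -> Iterator[Record]:
        for record in records:
            updated = dict(record)  # copy so upstream data is not mutated
            updated[name] = fn(record)
            yield updated
    return component


def run_protocol(source: Iterable[Record], components: List[Component]) -> List[Record]:
    """Connect components in order (the 'pipes') and drain the stream."""
    stream: Iterable[Record] = source
    for component in components:
        stream = component(stream)
    return list(stream)


rows = [
    {"name": "aspirin", "mol_weight": 180.16},
    {"name": "caffeine", "mol_weight": 194.19},
    {"name": "insulin", "mol_weight": 5808.0},
]

result = run_protocol(
    read_records(rows),
    [
        keep_if(lambda r: r["mol_weight"] < 500),
        set_property(
            "weight_class",
            lambda r: "light" if r["mol_weight"] < 190 else "heavy",
        ),
    ],
)
```

Because each component is a generator, records flow through the chain one at a time rather than being materialized between steps, which mirrors the idea of data being processed as it moves between components.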

Users have the flexibility to work with pre-installed components or develop custom ones within workflows, referred to as "protocols." Protocols, which consist of linked components, can be saved, reused, and shared, enabling streamlined data processing. The interface visualizes the connections between components, simplifying complex data workflows by presenting them as sequences of operations.

===Component collections===

Pipeline Pilot offers several add-ons called "collections," which are groups of specialized functions aimed at specific domains, such as genetic information processing or polymer analysis. These collections are available to users for an additional licensing fee.

The collections are organized into two main groups: science-specific and generic. The science-specific collections focus on areas like chemistry, biology, and materials modeling, while the generic collections provide tools for reporting, data analysis, and document search. Below is an overview of the available collections:

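| Group | Collection or category |
| Science-specific | ADMET |
| Science-specific | Cheminformatics |
| Science-specific | Biology |
| Science-specific | Sequence Analysis |
| Science-specific | Mass Spectrometry for Proteomics |
| Science-specific | Next Generation Sequencing |
| Science-specific | Materials Modeling & Simulation |
| Science-specific | Polymer Properties (Synthia) |
| Generic | Database & Application Integration |
| Generic | Imaging |
| Generic | Analysis & Statistics |
| Generic | Advanced Data Modeling |
| Generic | R Statistics |
| Generic | Document Search & Analysis |
| Generic | Text Analytics |
| Generic | Laboratory |
| Generic | Analytical Instrumentation |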

===Custom scripts===
Pipeline Pilot is commonly used for processing large and complex datasets, often exceeding 1 TB in size. Early in its development, Pipeline Pilot introduced a scripting language called "PilotScript," which lets users write short scripts that run inside a protocol. Support for additional programming languages was added over time, including Python, .NET, MATLAB, Perl, SQL, Java, VBScript, and R. These languages can be used through APIs that execute commands without requiring the graphical user interface.

PilotScript, a language modeled on PL/SQL, is used within specific components such as "Custom Manipulator (PilotScript)" or "Custom Filter (PilotScript)." The simple PilotScript statement below adds a property named "Hello", with the value "Hello World!", to each record passing through the component:

<syntaxhighlight lang="plpgsql"> Hello := "Hello World!"; </syntaxhighlight>
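For readers unfamiliar with PilotScript, the effect of the statement above can be approximated in plain Python. This is only an analogy under stated assumptions: it does not call any Pipeline Pilot API, records are modeled as dictionaries, and `custom_manipulator` is a hypothetical name.

```python
from typing import Dict, Iterable, Iterator

Record = Dict[str, object]


def custom_manipulator(records: Iterable[Record]) -> Iterator[Record]:
    """Add the property "Hello" to every record that passes through."""
    for record in records:
        updated = dict(record)  # copy so the input record is left unchanged
        updated["Hello"] = "Hello World!"
        yield updated


# Hypothetical input stream of two records.
records = [{"id": 1}, {"id": 2}]
output = list(custom_manipulator(records))
```

The assignment runs once per record, which corresponds to how the PilotScript statement applies to each record passing through the component.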
