Linux DM Multipath

DM-Multipathing (DM-MPIO) provides input/output (I/O) failover and load balancing for block devices in Linux.[1][2][3] Using the device mapper, multipathd provides the host-side logic to use multiple paths of a redundant network, giving continuous availability and higher-bandwidth connectivity between the host server and the block-level device.[4] DM-MPIO reroutes block I/O to an alternate path when a path fails. It can also balance the I/O load across all available paths, which are typically used in Fibre Channel (FC) and iSCSI SAN environments.[5] DM-MPIO is built on the device mapper, which provides the basic framework for mapping one block device onto another.

Considerations

When Linux DM-MPIO is used in a datacenter that also runs other operating systems and multipath solutions, two key aspects of path management must be considered:

  • Load balancing -- The workload is distributed across the available hardware components. The goal is to reduce I/O completion time, maximize throughput, and optimize resource use.
  • Path failover and recovery -- Redundant I/O channels are used to redirect application reads and writes when one or more paths are no longer available.
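
The two behaviours above can be illustrated with a minimal Python sketch. It is not the kernel's path selector: the Path and RoundRobinSelector names are hypothetical, the device names are placeholders, and plain round-robin is assumed as the balancing rule purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One route from the host to a LUN (hypothetical model)."""
    name: str
    failed: bool = False


class RoundRobinSelector:
    """Rotates I/O over the usable paths and skips failed ones."""

    def __init__(self, paths):
        self.paths = list(paths)
        self._next = 0  # index of the path to try first on the next request

    def select(self):
        n = len(self.paths)
        for offset in range(n):
            idx = (self._next + offset) % n
            path = self.paths[idx]
            if not path.failed:
                # Load balancing: the next request starts at the following path.
                self._next = (idx + 1) % n
                return path
        # Every path is marked failed, so the I/O cannot be routed.
        raise RuntimeError("no usable path")


if __name__ == "__main__":
    paths = [Path("sda"), Path("sdb")]  # placeholder device names
    selector = RoundRobinSelector(paths)
    print([selector.select().name for _ in range(4)])  # alternates between paths
    paths[0].failed = True                             # simulate a path failure
    print([selector.select().name for _ in range(2)])  # I/O is redirected
```

While both paths are healthy the selector alternates between them; once one path is marked failed, every request is redirected to the surviving path, and the failed path rejoins the rotation as soon as its flag is cleared.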

Components

[Figure: Simple multipath example]

DM-MPIO in Linux consists of kernel components and user-space components.

  • Kernel – device-mapper – a block subsystem that provides a layering mechanism for block devices.
  • User-space – multipath-tools – provides the tools to manage multipathed devices by instructing the device-mapper multipath module what to do. The tools consist of:
    • multipath: scans the system for multipathed devices, assembles them, and updates the device-mapper's map.[5]
    • multipathd: a daemon that waits for map events, then executes multipath and monitors the paths. It marks a path as failed when the path becomes faulty. Depending on the failback policy, it can reactivate the path.[5]
    • devmap_name: provides a meaningful device name to udev for devmaps.[5]
    • kpartx: maps linear devmaps to device partitions to make multipath maps partitionable.[5]
    • multipath.conf: the configuration file for the multipath daemon. It is used to override the built-in configuration table of multipathd.

Configuration file

The configuration file /etc/multipath.conf makes many DM-MPIO features user-configurable. Both the multipath command and the user-space daemon multipathd read this file. The file is consulted only while multipath devices are being configured, so changes must be made before running the multipath command. If the file is changed afterwards, multipath must be run again for the changes to take effect.

The multipath.conf file has five sections:[6]

  1. System-level defaults (defaults): The user can override the system-level defaults.
  2. Blacklisted devices (blacklist): The user specifies the devices that are not to be under the control of DM-MPIO.
  3. Blacklist exceptions (blacklist_exceptions): Specific devices to be treated as multipath devices even if they are listed in the blacklist.
  4. Storage-controller-specific settings (devices): User-specified configuration settings are applied to devices with matching "Vendor" and "Product" information.
  5. Device-specific settings (multipaths): Fine-tune the configuration settings for individual LUNs.
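
How the five sections interact can be sketched in Python. This is a simplified model, not the multipath-tools parser: the function names, the dictionary layout, the WWIDs, and the vendor and product strings are hypothetical, and vendor and product are compared exactly here for brevity. The precedence order used (per-LUN multipaths settings over devices settings over defaults) is an assumption of this sketch rather than something stated above.

```python
def is_multipath_candidate(wwid, blacklist, blacklist_exceptions):
    """A blacklisted device is skipped unless an exception re-admits it."""
    return wwid not in blacklist or wwid in blacklist_exceptions


def resolve_settings(defaults, devices, multipaths, vendor, product, wwid):
    """Layer the sections: defaults first, then devices, then multipaths."""
    settings = dict(defaults)
    for entry in devices:
        # Assumption: exact string match on vendor and product.
        if entry["vendor"] == vendor and entry["product"] == product:
            settings.update(entry["settings"])
    # Per-LUN settings are applied last, so they win over the other layers.
    settings.update(multipaths.get(wwid, {}))
    return settings


# Hypothetical configuration data mirroring the five sections.
defaults = {"path_grouping_policy": "failover", "failback": "manual"}
blacklist = {"wwid-local-disk", "wwid-0001"}
blacklist_exceptions = {"wwid-0001"}
devices = [
    {"vendor": "EXAMPLE", "product": "ARRAY1",
     "settings": {"path_grouping_policy": "multibus"}},
]
multipaths = {"wwid-0001": {"failback": "immediate"}}

if __name__ == "__main__":
    print(is_multipath_candidate("wwid-0001", blacklist, blacklist_exceptions))
    print(resolve_settings(defaults, devices, multipaths,
                           "EXAMPLE", "ARRAY1", "wwid-0001"))
```

In this model the LUN identified as wwid-0001 is blacklisted but re-admitted by an exception, takes its path grouping policy from the matching devices entry, and takes its failback setting from its own multipaths entry.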

Terminology

  • HBA: Host bus adapters provide the physical interface between the input/output (I/O) host bus of Fibre Channel devices and the underlying Fibre Channel network.[7]
  • Path: Connection from the server through the HBA to a specific LUN.
  • DM Path States: The device mapper's view of the path condition. Only two conditions are possible:
    • Active: The last I/O operation sent through this path successfully completed. Analogous to ready path state.
    • Failed: The last I/O operation sent through this path did not successfully complete. Analogous to faulty path state.
  • Failover: When a path is determined to be in a failed state, a path that is in the ready state is made active.[8]
  • Failback: When a failed path is determined to be active again, multipathd may fail back to that path, as determined by the failback policy.[9]
  • Failback Policy: One of three options, set in the multipath.conf configuration file.
    • Immediate: Fail back immediately to the highest-priority path.
    • Number of seconds: Wait the specified number of seconds to allow the I/O to stabilize, then fail back to the highest-priority path.
    • Manual: The failed path is not monitored; user intervention is required to fail back.
  • Active/Active: In a system that has two storage controllers, each controller can process I/O.[10]
  • Active/Passive: In a system that has two storage controllers, only one controller at a time is able to process I/O; the other (passive) controller is in standby mode.[10]
  • LUN: SCSI Logical Unit Number
  • WWID: Worldwide Identifier, an identifier for the multipath device that is guaranteed to be globally unique and unchanging.
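
The DM path states and the three failback policies defined above can be expressed as a short Python sketch. The function names and the seconds_since_recovery parameter are hypothetical and chosen for illustration; this shows the decision rule as described in the list, not the multipathd implementation.

```python
def dm_path_state(last_io_succeeded):
    """Device mapper view of a path: only 'active' or 'failed' is possible."""
    return "active" if last_io_succeeded else "failed"


def should_fail_back(policy, seconds_since_recovery):
    """Decide whether to return I/O to a recovered highest-priority path.

    policy is "immediate", "manual", or a number of seconds to wait.
    """
    if policy == "manual":
        # No automatic failback; an administrator has to intervene.
        return False
    if policy == "immediate":
        return True
    # Deferred failback: wait until the path has been stable long enough.
    return seconds_since_recovery >= int(policy)


if __name__ == "__main__":
    print(dm_path_state(False))            # path whose last I/O failed
    print(should_fail_back("immediate", 0))
    print(should_fail_back(30, 10))        # still inside the waiting period
    print(should_fail_back("manual", 3600))
```

Passing the delay as a number or as a numeric string gives the same result, because the value is converted with int() before the comparison.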

References

  1. Olien, David. "Linux Multipathing".
  2. Varoqui, Christophe. "The Linux multipath implementation".
  3. Oberg, Michael. "Exploration of Parallel Storage Architectures for a Blue Gene/L on the TeraGrid".
  4. van Vugt, Sander. A Practical Guide to XEN High Availability.
  5. SUSE. "Storage Administration Guide, SUSE Linux Enterprise Server 11 SP1". SLES11 Documentation, p. 49.
  6. Red Hat. "Using Device-Mapper Multipath". Using Device-Mapper Multipath.
  7. Gupta, Meeta (2002). Storage Area Network Fundamentals. Indianapolis, IN: Cisco Press. p. 81. ISBN 1-58705-065-X.
  8. Anderson, Michael. "SCSI Mid-Level Multipath".
  9. SUSE. "Storage Administration Guide, SLES11 Documentation". p. 73.
  10. CentOS. "Overview of DM-Multipath". Using Device-Mapper Multipath.