IBM SAN Volume Controller

From Wikipedia, the free encyclopedia

In computer data storage, the IBM SAN Volume Controller (SVC) is a block storage virtualization appliance that belongs to the IBM System Storage product family. SVC implements an indirection, or "virtualization", layer in a Fibre Channel storage area network (FC SAN).

Architecture

SVC is always deployed as a cluster of nodes. Each node is a 1U-high rack-mounted appliance based on an IBM System x server (the SVC machine type is 2145). Each node has four Fibre Channel ports and two 1 Gbit/s Ethernet ports (for management and iSCSI), with the option of two 10 Gbit/s Ethernet ports (for iSCSI), and is protected by a dedicated uninterruptible power supply. Each node runs a Linux kernel and a specialized Virtualization Storage Software environment that provides proprietary clustering capability. Each node also has a service controller with a two-row display and a five-button keyboard, used to configure, service and monitor the status of the node.

SVC is based on COMmodity PArts Storage System (Compass) architecture, developed at the IBM Almaden Research Center.[1] The majority of the software has been developed at the IBM Hursley Labs in the UK.

The SVC is a gateway device: it sits between the hosts and the storage arrays, presenting itself to hosts as a target and to arrays as an initiator. All Fibre Channel ports on the SVC act as both targets and initiators, and all Fibre Channel ports on all nodes must be zoned to each other to allow communication between the nodes and the transfer and mirroring of data.

An SVC cluster consists of up to four pairs of nodes. Each pair of nodes is called an I/O group and provides write data cache mirroring across the pair. If an I/O path fails, non-disruptive failover is performed only within the I/O group, by a multipath driver such as the IBM Subsystem Device Driver (SDD) software.[1]
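
The cluster layout described above can be modelled as a small data structure. The following is only an illustrative sketch, not SVC code: the class names `IOGroup` and `Cluster`, the method `path_for` and the node names are hypothetical, and the failover rule is a simplification of what a multipath driver does.

```python
from dataclasses import dataclass, field

MAX_IO_GROUPS = 4  # the text above allows up to 4 pairs of nodes


@dataclass
class IOGroup:
    """A pair of nodes; failover happens only between these two nodes."""
    nodes: tuple
    online: set = field(init=False, default_factory=set)

    def __post_init__(self):
        if len(self.nodes) != 2:
            raise ValueError("an I/O group is exactly one pair of nodes")
        self.online = set(self.nodes)

    def fail_node(self, name):
        """Mark one node of the pair as offline."""
        self.online.discard(name)

    def path_for(self, preferred):
        """Return the node that serves I/O: the preferred node if it is
        online, otherwise its partner in the same I/O group."""
        if preferred not in self.nodes:
            raise ValueError("node does not belong to this I/O group")
        if preferred in self.online:
            return preferred
        partner = self.nodes[1] if preferred == self.nodes[0] else self.nodes[0]
        if partner in self.online:
            return partner
        raise RuntimeError("both nodes of the I/O group are offline")


@dataclass
class Cluster:
    """A set of I/O groups managed as a single entity."""
    io_groups: list = field(default_factory=list)

    def add_io_group(self, group):
        if len(self.io_groups) >= MAX_IO_GROUPS:
            raise ValueError("a cluster holds at most 4 I/O groups")
        self.io_groups.append(group)
```

In this sketch, a volume served by one I/O group never fails over to a node in another I/O group, which mirrors the "inside I/O group only" rule stated above.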

[edit] Terminology

  • Node - a single 1U machine.
    SVC node models
    Type-model  Cache [GB]  FC speed [Gbit/s]  iSCSI speed [Gbit/s]    Based upon  Announced
    2145-4F2    4           2                  n/a                     x335        2 June 2003
    2145-8F2    8           2                  1                       x336        25 October 2005
    2145-8F4    8           4                  1                       x336        23 May 2006
    2145-8G4    8           4                  1                       x3550       22 May 2007
    2145-8A4    8           4                  1                       x3250       28 October 2008
    2145-CF8    24          8                  1                       x3550M2     20 October 2009
    2145-CG8    24          8                  1 (10 Gbit/s optional)  x3550M3     5 May 2011
  • I/O group - a pair of nodes which duplicate each other's write commands.
  • Cluster - a group of 1 to 4 I/O Groups, which are managed as a single entity.
    • Cluster IP address - a single IP address of a cluster that provides administrative interfaces via SSH and HTTPS.
    • Service IP address - an IP address used to service an individual node. Each node can have a service IP configured.
    • Configuration node - a single node that holds the cluster's configuration and has the assigned cluster IP address.
  • Master Console - a management GUI for SVC, based on WebSphere Application Server; installed not on any SVC node but on a separate machine.[1]
    • As of SVC code 6.x, a Master Console is no longer used. Web-based administration is done directly on the configuration node.
  • Virtual Disk (VDisk) - a unit of storage presented to the host. The release 6 GUI refers to a VDisk as a Volume.
  • Managed Disk (MDisk) - a unit of storage (a LUN) from a real, external disk array, virtualized by the SVC. An MDisk is the basis for creating an image mode VDisk.
  • Managed Disk Group (MDisk Group) - a group of one or more MDisks. The extents of the MDisks in an MDisk Group are the basis for creating a striped or sequential mode VDisk. The release 6 GUI refers to a Managed Disk Group as a Pool.
  • Extent - a discrete unit of storage; an MDisk is divided into extents, and a VDisk is formed from a set of extents.
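
The relationship between MDisks, extents and a striped VDisk can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the 16 MB extent size in the usage example and the plain round-robin allocation policy are all hypothetical and are not taken from the SVC implementation.

```python
from itertools import cycle


def carve_extents(mdisk_sizes_mb, extent_mb):
    """Divide each MDisk into extents; return {mdisk name: free extent indices}."""
    return {name: list(range(size // extent_mb))
            for name, size in mdisk_sizes_mb.items()}


def create_striped_vdisk(free, n_extents):
    """Take extents round-robin from the MDisks of one MDisk Group.

    Returns a list of (mdisk name, extent index); position i of the list
    is extent i of the VDisk. The round-robin policy is an assumption.
    """
    if n_extents > sum(len(extents) for extents in free.values()):
        raise ValueError("not enough free extents in the MDisk group")
    mapping = []
    names = cycle(sorted(free))
    while len(mapping) < n_extents:
        name = next(names)
        if free[name]:  # skip MDisks that have run out of extents
            mapping.append((name, free[name].pop(0)))
    return mapping


def locate(mapping, vdisk_offset_mb, extent_mb):
    """Translate a VDisk offset into (mdisk name, offset on that MDisk)."""
    mdisk, extent = mapping[vdisk_offset_mb // extent_mb]
    return mdisk, extent * extent_mb + vdisk_offset_mb % extent_mb
```

For example, with two MDisks of 64 MB and 48 MB and 16 MB extents, a five-extent VDisk alternates between the two MDisks, and `locate` resolves a host offset to the MDisk that actually holds the data.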

[edit] Software versions

Major releases (release code level format: version.release.mod.fix, V.R.M.F)
Release code level  Code build level
1.1.0.0             0.13.03070300
1.1.1.0             0.32.0311060000
1.2.0.0             0.53.0404190000
1.2.1.0             1.21.0410150000
2.1.0.0             2.16.0502180000
3.1.0.0             3.17.0511040000
4.1.0.0             4.25.0606010000
4.1.1.0             5.13.0611030000
4.2.0.0             6.17.0705210000
4.2.1.0             7.7.0711051000
4.3.0.0             8.16.0806230000
4.3.1.0             9.14.0811070000
5.1.0.0             17.8.0910292000
5.1.0.4             18.1.1005100000
5.1.0.5             18.1.1006120000
5.1.0.6             18.2.1007260000
5.1.0.7             18.2.1009060000
5.1.0.8             18.3.1011240000
5.1.0.9             18.3.1101260000
5.1.0.10            18.3.1104050000
5.1.0.11            18.3.1107290000
6.1.0.0             25.0.1011041000
6.1.0.1             25.1.1011090000
6.1.0.2             25.1.1011240000
6.1.0.3             25.2.1012061000
6.1.0.4             25.3.1012080000
6.1.0.5             25.3.1012240000
6.1.0.6             25.3.1101192000
6.1.0.7             25.3.1103030000
6.1.0.8             25.6.1105250000
6.1.0.9             25.6.1106030000
6.1.0.10            25.8.1108120000
6.2.0.0             36.0.1106030000
6.2.0.1             36.0.1106060000
6.2.0.2             36.3.1107080000
6.2.0.3             36.5.1109020000
6.2.0.4             36.7.1111040000
6.3.0.0             54.6.1111250000
6.3.0.1             54.6.1201270000
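
Because code levels follow the version.release.mod.fix scheme, they have to be compared field by field as numbers rather than as text; otherwise 5.1.0.10 would sort before 5.1.0.9. A minimal sketch of such a comparison is shown below; the helper names `parse_vrmf` and `is_newer` are hypothetical and not part of any IBM tool.

```python
def parse_vrmf(level):
    """Split a V.R.M.F code level such as '5.1.0.10' into a tuple of ints."""
    parts = level.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a V.R.M.F code level: {level!r}")
    return tuple(int(p) for p in parts)


def is_newer(candidate, installed):
    """True if candidate is a higher code level than installed."""
    return parse_vrmf(candidate) > parse_vrmf(installed)


levels = ["5.1.0.9", "5.1.0.10", "6.1.0.0", "4.3.1.0"]
ordered = sorted(levels, key=parse_vrmf)
```

Sorting with `key=parse_vrmf` places 5.1.0.9 before 5.1.0.10, whereas a plain string sort would reverse them.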

Performance

Release 4.3 of the SVC held the Storage Performance Council (SPC) world record for the SPC-1 performance benchmark, returning nearly 275K (274,997.58) IOPS. No faster storage subsystem had been benchmarked by the SPC at that time (October 2008).[2] The SPC-2 benchmark also returned a world-leading measurement of over 7 GB/s throughput.

With the release of version 5.1, new test results were published for a 4-node and a 6-node cluster with DS8700 as the back-end storage device. With this configuration, in March 2010 the IBM SVC broke its own record of 274,997.58 SPC-1 IOPS with 315,043.59 for the 4-node cluster and 380,489.30 for the 6-node cluster, records that stood until October 2011.[3] The full results and executive summaries can be reviewed at the SPC website.

Release 6.2 of the SVC held the Storage Performance Council (SPC) world record for the SPC-1 performance benchmark, returning over 500K (520,043.99) IOPS (I/Os per second) using 8 SVC nodes and Storwize V7000 as the back-end disk. No faster storage subsystem had been benchmarked by the SPC at that time (January 2012).[4]

Note: "Cache hit" or "bandwidth" performance numbers are usually much higher, e.g. "20 GB/s", but are relatively meaningless as they cannot be achieved in real-world scenarios.
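
The relative size of the SPC-1 improvements quoted in this section can be worked out with a few lines of arithmetic. The sketch below uses only the figures given above; the helper names are hypothetical, and the per-node values are a rough indication only, since the results were obtained with different back-end storage.

```python
# (label, SPC-1 IOPS, node count or None where the text gives no count)
SPC1_RESULTS = [
    ("Release 4.3, October 2008", 274_997.58, None),
    ("Release 5.1, 4-node cluster", 315_043.59, 4),
    ("Release 5.1, 6-node cluster", 380_489.30, 6),
    ("Release 6.2, 8 nodes", 520_043.99, 8),
]


def gain_percent(old, new):
    """Relative improvement of new over old, in percent."""
    return (new - old) / old * 100.0


def per_node(iops, nodes):
    """Average SPC-1 IOPS contributed by each node."""
    return iops / nodes


if __name__ == "__main__":
    baseline = SPC1_RESULTS[0][1]
    for label, iops, nodes in SPC1_RESULTS[1:]:
        print(f"{label}: {gain_percent(baseline, iops):+.1f}% vs release 4.3, "
              f"{per_node(iops, nodes):,.0f} IOPS per node")
```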

Features (2011)

Indirection or mapping from virtual LUN to physical LUN
Servers access SVC as if it were a storage controller. The SCSI LUNs they see represent virtual disks (volumes) which are allocated in SVC from a pool of storage made up from one or more managed disks (MDisks). A managed disk is simply a storage LUN provided by one of the storage controllers that SVC is virtualizing.
Data migration
SVC can move volumes from MDisk group to MDisk group, whilst maintaining I/O access to the data. MDisk groups can be shrunk or expanded by removing or adding hardware LUNs, while maintaining I/O access to the data. Both features can be used for seamless hardware migration. Migration from an old SVC model to the most recent model is also seamless and implies no copying of data.
Importing existing LUNs via a feature called Image Mode
"Image mode" is a one-to-one representation of an MDisk (managed LUN) which contains existing client data; such an MDisk can be seamlessly imported into or removed from an SVC cluster.
Fast-write cache
Writes from hosts are acknowledged once they have been committed into the SVC mirrored cache, but prior to being destaged to the underlying storage controllers. Data is protected by being replicated to the other node in an I/O group (node pair). Cache size is dependent on the model of SVC used. Fast-write cache is also used to increase performance in midrange storage configurations.
Auto tiering (Easy Tier)
SVC automatically selects the best storage hardware for each chunk of data, according to its access patterns. Cache-unfriendly "hot" data is dynamically moved to solid-state drives (SSDs), whereas cache-friendly "hot" data and any "cold" data are moved to economical spinning disks.
Solid-state drive (SSD) support
SVC can use any supported external SSD storage device or provide its own internal SSD slots, up to 32 per cluster. Easy Tier is automatically active when SSDs are mixed with spinning disks in hybrid MDisk groups.
Space-efficient features
LUN capacity is only used when new data is written to a LUN; this is also known as thin provisioning. Data blocks that contain only zeros are not physically allocated, unless non-zero data previously existed in them.
Thin provisioning is typically combined with the FlashCopy features detailed below to provide space-efficient snapshots.
Virtual Disk Mirroring
Provides the ability to make two copies of a LUN, implicitly on different storage controllers.
Stretched Cluster, also called Split I/O Group
A geographically distributed cluster layout leveraging the virtual disk mirroring feature across datacenters within 300 km distance. A stretched cluster presents one logical storage layer over synchronous distances for increased high availability. Unlike in classical mirroring, logical LUNs are writable on both sides (tandem) at the same time, removing the need for "failover", "role switch", or "site switch". The feature can be combined with Live Partition Mobility or VMotion to avoid any data transport (storage mobility or storage VMotion). Each side's SVC nodes also have access to the other side's physical storage hardware, removing the need for data rebuilds in case of simple node failures.
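
The fast-write cache behaviour described in the feature list above (acknowledge once both nodes of the pair hold the data, destage to the storage controller later) can be sketched as follows. This is a toy model under stated assumptions: the class names are hypothetical, a Python dict stands in for the back-end storage controller, and real cache management is far more involved.

```python
class Node:
    """One node of an I/O group, holding write data not yet destaged."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # logical block address -> data


class MirroredWriteCache:
    """Write cache mirrored across the two nodes of an I/O group."""

    def __init__(self, node_a, node_b, backend):
        self.nodes = (node_a, node_b)
        self.backend = backend  # dict standing in for the storage controller

    def write(self, lba, data):
        """Store the data on both nodes, then acknowledge the host.
        The back end is not touched at this point."""
        for node in self.nodes:
            node.cache[lba] = data
        return "ack"

    def read(self, lba):
        """Serve the newest data: from cache if present, else from the back end."""
        cache = self.nodes[0].cache
        return cache[lba] if lba in cache else self.backend.get(lba)

    def destage(self):
        """Flush cached writes to the back end and release both cache copies."""
        for lba, data in list(self.nodes[0].cache.items()):
            self.backend[lba] = data
            for node in self.nodes:
                node.cache.pop(lba, None)
```

Until `destage` runs, each write exists in two places, so the loss of a single node in this model does not lose acknowledged data.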

Licensed Features

The base license is charged per TB of MDisks or per number of physical disk drives in the underlying layer. Some optional features are licensed separately per TB:[1]

Metro Mirror - synchronous remote replication
This allows a remote disaster recovery site at a distance of up to about 300 km.[5]
Global Mirror - asynchronous remote replication
This allows a remote disaster recovery site at a distance of thousands of kilometres. Each Global Mirror relationship can be configured for high latency / low bandwidth or for high latency / high bandwidth connectivity, the latter allowing a consistent recovery point objective (RPO) below 1 second.
FlashCopy (FC)
This is used to create a disk snapshot of a single volume for backup or application testing. Snapshots require only the "delta" capacity unless created with fully provisioned target volumes. FlashCopy comes in three flavours: Snapshot, Clone and Backup volume. All are based on optimized copy-on-write technology, and may or may not remain linked to their source volume.
One source volume can have up to 256 simultaneous targets. Targets can be made incremental, and cascaded, tree-like dependency structures can be constructed. Targets can be re-applied to their source or to any other appropriate volume, even one of a different size (e.g. resetting any changes from a resize command).
Copy-on-write is based on a bitmap with a configurable grain size, as opposed to a journal.[1]
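
The bitmap-based copy-on-write scheme mentioned above can be illustrated with a short sketch. It is not FlashCopy code: the class name `CowSnapshot`, its methods and the in-memory `bytearray` volume are assumptions made for illustration. One bitmap entry per grain records whether the original content of that grain has already been preserved on the target.

```python
class CowSnapshot:
    """Point-in-time view of a volume, kept by copy-on-write per grain."""

    def __init__(self, source, grain_size):
        self.source = source  # bytearray acting as the live source volume
        self.grain = grain_size
        n_grains = -(-len(source) // grain_size)  # ceiling division
        self.copied = [False] * n_grains  # the bitmap, one flag per grain
        self.target = {}  # grain index -> preserved original bytes

    def _preserve(self, g):
        """Copy grain g to the target before its first overwrite."""
        if not self.copied[g]:
            start = g * self.grain
            self.target[g] = bytes(self.source[start:start + self.grain])
            self.copied[g] = True

    def write_source(self, offset, data):
        """Host write to the source: preserve affected grains, then write."""
        if offset < 0 or offset + len(data) > len(self.source):
            raise ValueError("write outside the volume")
        if data:
            first = offset // self.grain
            last = (offset + len(data) - 1) // self.grain
            for g in range(first, last + 1):
                self._preserve(g)
        self.source[offset:offset + len(data)] = data

    def read_snapshot(self, offset, length):
        """Read the point-in-time image: preserved grains come from the
        target, untouched grains are still read from the source."""
        out = bytearray()
        for pos in range(offset, offset + length):
            g = pos // self.grain
            if self.copied[g]:
                out.append(self.target[g][pos - g * self.grain])
            else:
                out.append(self.source[pos])
        return bytes(out)
```

A smaller grain size copies less data per first write but needs a larger bitmap; the snapshot only consumes capacity for grains that have changed since it was taken, which matches the "delta" capacity behaviour described above.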

Other products running SVC code

On 7 October 2010, IBM announced the IBM Storwize V7000.[6] This uses the SAN Volume Controller code base with internal storage to provide a mid-level storage subsystem.[7]
