Transitive data skew

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Yobot (talk | contribs) at 10:16, 7 January 2015 (Tagging using AWB (10703)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In distributed computing, transitive data skew is a data synchronization issue.

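It arises when data that would otherwise be evenly distributed is spread unevenly across a number of devices while it is in transit. If sorted data is being distributed across multiple devices and the column on which that data is sorted is the "key" used to identify the target device, consecutive rows tend to be routed to the same device, so at any moment during the transfer some devices hold far more data than others. The resulting transitive data skew may be self-correcting, since the imbalance can even out once all of the data has arrived.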
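The effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a description of any particular system: the range-partitioning function `route`, the helper `distribution_after`, and the figures of 4 devices and 1000 integer keys are all hypothetical choices made for the example.

```python
from collections import Counter


def route(key: int, num_devices: int, num_keys: int) -> int:
    """Hypothetical range partitioner: map a key in [0, num_keys) to a device index."""
    return key * num_devices // num_keys


def distribution_after(keys, sent, num_devices, num_keys):
    """Return the number of rows held by each device after the first `sent` keys are routed."""
    counts = Counter(route(k, num_devices, num_keys) for k in keys[:sent])
    return [counts.get(d, 0) for d in range(num_devices)]


NUM_DEVICES = 4   # assumed number of target devices
NUM_KEYS = 1000   # assumed number of rows, one per key

# The data is sorted on the same column that selects the target device.
sorted_keys = list(range(NUM_KEYS))

# A quarter of the way through the transfer, every row has gone to device 0.
midway = distribution_after(sorted_keys, 250, NUM_DEVICES, NUM_KEYS)

# Once the transfer completes, the rows are evenly spread again.
final = distribution_after(sorted_keys, NUM_KEYS, NUM_DEVICES, NUM_KEYS)

print(midway)  # [250, 0, 0, 0]
print(final)   # [250, 250, 250, 250]
```

In this sketch the skew exists only while the data is in transit: the intermediate snapshot is maximally uneven, while the final distribution is uniform, which is the sense in which the skew corrects itself.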