Transitive data skew


In distributed computing, transitive data skew is a data synchronization problem.

It arises when data that would otherwise be evenly distributed across a number of devices is unevenly distributed while it is in transit. If sorted data is being distributed across multiple devices, and the column on which the data is sorted is also the "key" used to identify the target device, the resulting transitive data skew may be self-correcting.
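The effect can be illustrated with a small simulation. The following is a minimal sketch, not a reference implementation: it assumes integer keys sent in sorted order and a simple range-partitioning rule, and the names target_device and load_after are hypothetical helpers introduced only for this example. Partway through the transfer the devices hold very different numbers of rows, while the loads are equal once every row has arrived.

```python
from collections import Counter

def target_device(key, num_keys, num_devices):
    """Range-partition: map a key in [0, num_keys) to a device index."""
    return key * num_devices // num_keys

def load_after(keys, sent, num_keys, num_devices):
    """Rows held by each device once the first `sent` rows have been transferred."""
    counts = Counter(target_device(k, num_keys, num_devices) for k in keys[:sent])
    return [counts.get(d, 0) for d in range(num_devices)]

NUM_KEYS, NUM_DEVICES = 12, 3
sorted_keys = list(range(NUM_KEYS))  # data is sent in key order

# Halfway through the transfer the load is skewed toward the first devices.
halfway = load_after(sorted_keys, 6, NUM_KEYS, NUM_DEVICES)
# After the full transfer every device holds the same number of rows.
finished = load_after(sorted_keys, NUM_KEYS, NUM_KEYS, NUM_DEVICES)

print(halfway)   # [4, 2, 0]
print(finished)  # [4, 4, 4]
```

In this sketch the skew exists only during the transfer, which is one way to read the "self-correcting" behaviour described above. Under a different key distribution or partitioning rule the final loads need not be equal.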