XOR the input block into the left half of the state.
Apply a 42-round unkeyed permutation (encryption function) to the state. This consists of 42 repetitions of:
Break the 1024-bit state into 256 4-bit blocks, and map each through one of two 4-bit S-boxes, the choice being made by a 256-bit round-dependent key schedule. Equivalently, combine each 4-bit block with a key bit, and map the resulting 5-bit value through a 5→4 bit S-box.
Mix adjacent 4-bit blocks using a maximum distance separable (MDS) code over GF(2^4).
Permute 4-bit blocks so that they will be adjacent to different blocks in following rounds.
The final half-round consists of an S-box substitution without a following MDS or permutation step.
XOR the input block into the right half of the state.
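The steps above can be sketched in Python as follows. This is a minimal illustration of the structure only, not a JH implementation: the 512-bit block size is inferred from "half of the 1024-bit state", the `permutation` argument is a hypothetical stand-in for the 42-round permutation, and the two S-box tables are illustrative values that should be checked against the JH specification before being relied on.

```python
STATE_BYTES = 128  # 1024-bit state
BLOCK_BYTES = 64   # one half of the state (assumed block size: 512 bits)

# Illustrative 4-bit S-boxes (stand-ins; verify against the specification).
S0 = [9, 0, 4, 11, 13, 12, 3, 15, 1, 10, 2, 6, 7, 5, 8, 14]
S1 = [3, 12, 6, 13, 5, 7, 1, 9, 15, 2, 0, 4, 11, 10, 14, 8]

# Equivalent 5->4 bit S-box: the key bit becomes the top bit of the index.
S5 = S0 + S1


def sbox_layer(nibbles, key_bits):
    """Map each 4-bit block through S0 or S1, chosen by its key bit."""
    return [(S1 if k else S0)[x] for x, k in zip(nibbles, key_bits)]


def sbox_layer_5to4(nibbles, key_bits):
    """Same layer, written as a single 5->4 bit table lookup."""
    return [S5[(k << 4) | x] for x, k in zip(nibbles, key_bits)]


def compress(state, block, permutation):
    """Three-step block processing: XOR left, permute, XOR right.

    `permutation` is a hypothetical callable taking and returning
    128 bytes; the real 42-round permutation is not implemented here.
    """
    if len(state) != STATE_BYTES or len(block) != BLOCK_BYTES:
        raise ValueError("state must be 128 bytes and block 64 bytes")
    s = bytearray(state)
    for i in range(BLOCK_BYTES):           # step 1: XOR into the left half
        s[i] ^= block[i]
    s = bytearray(permutation(bytes(s)))   # step 2: unkeyed permutation
    for i in range(BLOCK_BYTES):           # step 3: XOR into the right half
        s[BLOCK_BYTES + i] ^= block[i]
    return bytes(s)
```

The two S-box layer functions return the same result for every input, which is the equivalence the round description relies on. Passing the identity function as `permutation` is a convenient way to check the XOR plumbing in isolation.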
The resulting digest is the last 224, 256, 384 or 512 bits of the 1024-bit final value. The algorithm is well suited to a bit-sliced implementation using the SSE2 instruction set, giving speeds of 16.8 cycles per byte.
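To illustrate what bit slicing means for a state made of 4-bit blocks, the sketch below stores bit i of every block in one wide integer (a "plane"), so a single XOR acts on all 256 blocks at once. Python integers stand in for the 128-bit SSE2 registers a real implementation would use. The operation shown, multiplication by 2 in GF(2^4), is chosen only as a simple example of a per-block linear map, and the reduction polynomial x^4 + x + 1 is an assumption of this sketch.

```python
def mul2_table(x):
    """Multiply one 4-bit value by 2 in GF(2^4), modulo x^4 + x + 1 (assumed)."""
    x <<= 1
    if x & 0x10:
        x ^= 0x13  # clear bit 4 and add x + 1
    return x & 0xF


def to_planes(nibbles):
    """Transpose a list of 4-bit values into 4 bit planes."""
    planes = [0, 0, 0, 0]
    for i, x in enumerate(nibbles):
        for b in range(4):
            planes[b] |= ((x >> b) & 1) << i
    return planes


def from_planes(planes, count):
    """Inverse of to_planes for `count` 4-bit values."""
    return [
        sum(((planes[b] >> i) & 1) << b for b in range(4))
        for i in range(count)
    ]


def mul2_bitsliced(planes):
    """Multiply every 4-bit value by 2 at once, using one XOR in total."""
    b0, b1, b2, b3 = planes
    # (b3, b2, b1, b0) * x  ->  (b2, b1, b0 ^ b3, b3)
    return [b3, b0 ^ b3, b1, b2]


if __name__ == "__main__":
    blocks = [i % 16 for i in range(256)]  # 256 4-bit blocks
    sliced = from_planes(mul2_bitsliced(to_planes(blocks)), len(blocks))
    print(sliced == [mul2_table(x) for x in blocks])
```

The per-block version performs 256 separate multiplications, while the bit-sliced version handles all blocks with a register shuffle and one XOR, which is why this layout suits wide SIMD registers.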