Memory-bound function

Memory bound refers to a situation in which the time to complete a given computational problem is decided primarily by the amount of free memory required to hold the working data. This is in contrast to algorithms that are compute-bound, where the number of elementary computation steps is the deciding factor.

Memory and computation boundaries can sometimes be traded against each other, e.g. by saving and reusing preliminary results or using lookup tables.

Memory-bound functions and memory functions

Memory-bound functions and memory functions are related in that both involve extensive memory access, but a distinction exists between the two.

Memory functions use a dynamic programming technique called memoization in order to relieve the inefficiency of recursion that might occur. It is based on the simple idea of calculating and storing solutions to subproblems so that the solutions can be reused later without recalculating the subproblems again. The best known example that takes advantage of memoization is an algorithm that computes the Fibonacci numbers. The following pseudocode uses recursion and memoization, and runs in linear CPU time:

 Fibonacci (n)
 {
     for i = 0 to n-1
         results[i] = -1  // -1 means undefined

     return Fibonacci_Results (results, n);
 }

 Fibonacci_Results (results, n)
 {
     if (results[n] != -1)  // If it has been solved before,
         return results[n]  // look it up.
     if (n == 0)
         val = 0
     else if (n == 1)
         val = 1
     else
         val = Fibonacci_Results(results, n-2 ) + Fibonacci_Results(results, n-1)
     results[n] = val  // Save this result for re-use.

     return val
 }

Compare the above to an algorithm that uses only recursion, and runs in exponential CPU time:

 Recursive_Fibonacci (n)
 {
     if (n == 0)
         return 0
     if (n == 1)
         return 1

     return Recursive_Fibonacci (n-1) + Recursive_Fibonacci (n-2)
 }

While the recursive-only algorithm is simpler and more elegant than the algorithm that uses recursion and memoization, the latter has a significantly lower time complexity than the former.

The term "memory-bound function" has surfaced only recently and is used principally to describe a function that uses XOR and consists of a series of computations in which each computation depends on the previous computation. Whereas memory functions have long been an important actor in improving time complexity, memory-bound functions have seen far fewer applications. Recently^[when?], however, scientists have proposed a method using memory-bound functions as a means to discourage spammers from abusing resources, which could be a major breakthrough in that area.

Using memory-bound functions to prevent spam

Memory-bound functions might be useful in a proof-of-work system that could deter spam, which has become a problem of epidemic proportions on the Internet.

In 1992, IBM research scientists Cynthia Dwork and Moni Naor published a paper at CRYPTO 1992 titled Pricing via Processing or Combatting Junk Mail,^[1] suggesting a possibility of using CPU-bound functions to deter abusers from sending spam. The scheme was based on the idea that computer users are much more likely to abuse a resource if the cost of abusing the resource is negligible: the underlying reason spam has become so rampant is that sending an e-mail has minuscule cost for spammers.

Dwork and Naor proposed that spamming might be reduced by injecting an additional cost in the form of an expensive CPU computation: CPU-bound functions would consume CPU resources at the sender's machine for each message, thus preventing huge amounts of spam from being sent in a short period.

The basic scheme that protects against abuses is as follows:
Given a Sender, a Recipient, and an email Message. If Recipient has agreed beforehand to receive e-mail from Sender, then Message is transmitted in the usual way. Otherwise, Sender computes some function $G(Message)$ and sends $(Message, G(Message))$ to Recipient. Recipient checks if what it receives from Sender is of the form $(Message, G(Message))$ . If yes, Recipient accepts Message. Otherwise, Recipient rejects Message.

The function $G()$ is selected such that the verification by Recipient is relatively fast (e.g., taking a millisecond) and such that the computation by Sender is somewhat slow (involving at least several seconds). Therefore, Sender will be discouraged from sending Message to multiple recipients with no prior agreements: the cost in terms of both time and computing resources of computing $G()$ repeatedly will become very prohibitive for a spammer who intends to send many millions of e-mails.

The major problem of using the above scheme is that fast CPUs compute much faster than slow CPUs. Further, higher-end computer systems also have sophisticated pipelines and other advantageous features that facilitate computations. As a result, a spammer with a state-of-the-art system will hardly be affected by such deterrence while a typical user with a mediocre system will be adversely affected. If a computation takes a few seconds on a new PC, it may take a minute on an old PC, and several minutes on a PDA, which might be a nuisance for users of old PCs, but probably unacceptable for users of PDAs. The disparity in client CPU speed constitutes one of the prominent roadblocks to widespread adoption of any scheme based on a CPU-bound function. Therefore, researchers are concerned with finding functions that most computer systems will evaluate at about the same speed, so that high-end systems might evaluate these functions somewhat faster than low-end systems (2–10 times faster, but not 10–100 times faster) as CPU disparities might imply. These ratios are "egalitarian" enough for the intended applications: the functions are effective in discouraging abuses and do not add a prohibitive delay on legitimate interactions, across a wide range of systems.

The new egalitarian approach is to rely on memory-bound functions. As stated before, a memory-bound function is a function whose computation time is dominated by the time spent accessing memory. A memory-bound function accesses locations in a large region of memory in an unpredictable way, in such a way that using caches are not effective. In recent years, the speed of CPU has grown drastically, but there has been comparatively small progress in developing faster main memory. Since the ratios of memory latencies of machines built in the last five years is typically no greater than two, and almost always less than four, the memory-bound function will be egalitarian to most systems for the foreseeable future.

References

^ Dwork, Cynthia; Naor, Moni (1992). "Pricing via Processing or Combatting Junk Mail". Advances in Cryptology — CRYPTO' 92. Lecture Notes in Computer Science. Vol. 740. pp. 139–147. doi:10.1007/3-540-48071-4_10. ISBN 978-3-540-57340-1. (updated version of same)

Abadi, M., Burrows, M., Manasse, M., & Wobber, T. (2005, May). Moderately Hard, Memory-bound Functions, ACM Transactions on Internet Technology.
Dwork, C., Goldberg, A., & Naor, M. (2003). On Memory-Bound Functions for Fighting Spam, Advances in Cryptology.
Hellman, M. E. (1980). A Cryptanalytic Time-Memory Trade Off, IEEE Transactionson Information Theory.

External links

[1] Dwork, Cynthia; Naor, Moni (1992). "Pricing via Processing or Combatting Junk Mail". Advances in Cryptology — CRYPTO' 92. Lecture Notes in Computer Science. Vol. 740. pp. 139–147. doi:10.1007/3-540-48071-4_10. ISBN 978-3-540-57340-1. (updated version of same)

[1]