Jump to content

Memory-bound function

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Googol30 (talk | contribs) at 11:58, 7 May 2016 (size ≠ speed, and even if it did, "speed" is ambiguous as it could either mean latency or throughput). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Memory bound refers to a situation in which the time to complete a given computational problem is decided primarily by the amount of memory required to hold data. The application of memory bound functions could prove to be valuable in preventing spam, which has become a problem of epidemic proportions on the Internet.

Memory bound functions and memory functions

Memory bound functions and memory functions are related in that both involve extensive memory access, but a distinction exists between the two.

Memory functions use a dynamic programming technique called memoization in order to relieve the inefficiency of recursion that might occur. It is based on the simple idea of calculating and storing solutions to subproblems so that the solutions can be reused later without recalculating the subproblems again. The best known example that takes advantage of memoization is an algorithm that computes the Fibonacci numbers. The following pseudocode illustrates an algorithm that uses memoization, which runs in linear CPU time:

 Fibonacci (n)
 {
    for i = 0 to n-1
           results[i] = -1  // -1 means undefined
     return Fibonacci_Results (results, n);
 }
  Fibonacci_Results (results, n)
  {
     if (results[n] != -1) //check if it has already been solved before
        return results[n]
     if (n == 0)
           val = 0
     else if (n == 1)
           val = 1
     else
           val = Fibonacci_Results(results, n -2 ) + Fibonacci_Results(results, n -1)
     results[n] = val
     return val
  }

Compare the above to an algorithm that uses recursion, which runs in exponential CPU time:

  Recursive_Fibonacci (n)
  {	
     if (n == 0)
        return 0
     else if ( n == 1)
        return 1
     else
        return Recursive_Fibonacci (n -1) + Recursive_Fibonacci (n -2)
  }

While the recursive algorithm is simpler and more elegant than the algorithm that uses memoization, the latter has a significantly lower time complexity than the former. The term "memory bound function" has surfaced only recently and is used principally to describe a function that uses XOR and consists of a series of computations in which each computation depends on the previous computation. Whereas memory functions have long been an important actor in improving time complexity, memory bound functions have seen far fewer applications. Recently[when?], however, scientists have proposed a method using memory bound functions as a means to discourage spammers from abusing resources, which could be a major breakthrough in that area.

Using memory bound functions to prevent spam

In 1992, IBM research scientists Cynthia Dwork and Moni Naor published a paper titled Pricing via Processing or Combating Junk Mail, suggesting a possibility of using CPU bound functions to deter abusers from sending spam. The scheme was based on the idea that computer users are much more likely to abuse a resource if the cost of abusing the resource is negligible: the underlying reason spam has become so rampant is that sending an e-mail has minuscule cost for spammers.

Dwork and Naor proposed that spamming might be reduced by injecting an additional cost in the form of an expensive CPU computation: CPU bound functions would consume CPU resources at the sender's machine for each message, thus preventing huge amounts of spam from being sent in a short period. The basic scheme that protects against abuses is as follows:

Let S be sender, R be recipient, and M be an e-mail. If R has agreed beforehand to receive e-mail from S, then M is transmitted in the usual way. Otherwise, S computes some function G(M) and sends (M, G(M)) to R. R checks if what it receives from S is of the form (M, G(M)). If yes, R accepts M. Otherwise, R rejects M. The figure on the right depicts cases in which there were no prior agreements:

The function G() is selected such that the verification by R is relatively fast (taking a millisecond) and such that the computation by S is somewhat slow (involving at least several seconds). Therefore, S will be discouraged from sending M to multiple recipients with no prior agreements: the cost in terms of both time and computing resources of computing G() repeatedly will become very prohibitive for a spammer who intends to send many millions of e-mails.

The major problem of using the above scheme is that fast CPUs compute much faster than slow CPUs. Further, higher-end computer systems also have sophisticated pipelines and other advantageous features that facilitate computations. As a result, a spammer with a state-of-the-art system will hardly be affected by such deterrence while a typical user with a mediocre system will be adversely affected. If a computation takes a few seconds on a new PC, it may take a minute on an old PC, and several minutes on a PDA, which might be a nuisance for users of old PCs, but probably unacceptable for users of PDAs. The disparity in client CPU speed constitutes one of the prominent roadblocks to widespread adoption of any scheme based on a CPU bound function. Therefore, researchers are concerned with finding functions that most computer systems will evaluate at about the same speed, so that high-end systems might evaluate these functions somewhat faster than low-end systems (2-10 times faster, but not 10-100 faster) as CPU disparities might imply. These ratios are “egalitarian" enough for the intended applications: the functions are effective in discouraging abuses and do not add a prohibitive delay on legitimate interactions, across a wide range of systems.

The new egalitarian approach is to rely on memory bound functions. As stated before, a memory bound function is a function whose computation time is dominated by the time spent accessing memory. A memory bound function accesses locations in a large region of memory in an unpredictable way, in such a way that using caches are not effective. In recent years, the speed of CPU has grown drastically, but there has been comparatively small progress in developing faster main memory. Since the ratios of memory latencies of machines built in the last five years is typically no greater than two, and almost always less than four, the memory bound function will be egalitarian to most systems for the foreseeable future.

See also

References