Buddy memory allocation

Buddy memory allocation is a technique that divides memory into partitions in order to satisfy each memory request as closely as possible. It repeatedly splits blocks of memory in half to approximate a best fit. According to Knuth, the buddy system was invented in 1963 by Harry Markowitz, who won the 1990 Nobel Memorial Prize in Economics, and was independently developed by Knowlton (published 1965).

Implementation and consequences

Compared to the memory allocation techniques that modern operating systems use (such as paging), buddy memory allocation is relatively easy to implement and does not require a memory management unit (MMU). It can therefore be implemented, for example, on computers based on the Intel 80286 or earlier processors.

Compared with simpler dynamic allocation techniques, the buddy memory system suffers little external fragmentation, and recombining freed memory into larger blocks costs little overhead.

However, because every block size is a power of two, there may be a moderate amount of internal fragmentation: memory is wasted when a request is a little larger than one block size but much smaller than the next. For instance, a program that requests 66K of memory is allocated 128K, which wastes 62K. Internal fragmentation occurs when more memory than necessary is allocated to satisfy a request. External fragmentation occurs when enough memory is free to satisfy a request, but it is split into two or more chunks, none of which is big enough on its own.
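The arithmetic behind the 66K example can be checked with a minimal sketch. The function names and the default smallest block size of 1K are assumptions made for illustration, not part of any particular allocator.

```python
def buddy_block_size(request_k, min_block_k=1):
    """Smallest power-of-two multiple of min_block_k (in K) that holds request_k."""
    size = min_block_k
    while size < request_k:
        size *= 2  # block sizes only ever double
    return size


def internal_waste(request_k, min_block_k=1):
    """Memory (in K) allocated beyond what was requested."""
    return buddy_block_size(request_k, min_block_k) - request_k


print(buddy_block_size(66), internal_waste(66))  # 128 62
```

A request of exactly 64K wastes nothing, while 65K wastes 63K, which is why the waste can approach half of the allocated block.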

How it works

The buddy memory allocation technique allocates memory in blocks whose sizes are powers of 2, i.e. 2^x, where x is an integer (in the examples here sizes are measured in K, so x = 10 denotes a 1024K block). The programmer therefore has to decide on, or write code to obtain, the upper limit of x. For instance, if the system had 2000K of physical memory, the upper limit on x would be 10, since 2¹⁰ (1024K) is the biggest allocatable block. As a result, it is impossible to allocate all of the memory as a single chunk; the remaining 976K would have to be taken in smaller blocks.
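As a small sketch of the calculation above, the upper limit can be obtained from the total memory size, and the leftover memory can be broken into further power-of-two blocks. The function names are hypothetical, sizes are in K, and the decomposition shown is just one way the remaining 976K could be carved up.

```python
def upper_limit(total_k):
    """Largest x such that a block of 2**x K fits in total_k K."""
    return total_k.bit_length() - 1


def power_of_two_parts(total_k):
    """Split total_k K into power-of-two block sizes, largest first."""
    parts = []
    while total_k > 0:
        largest = 1 << (total_k.bit_length() - 1)
        parts.append(largest)
        total_k -= largest
    return parts


u = upper_limit(2000)
print(u, 2 ** u)                            # 10 1024
print(power_of_two_parts(2000 - 2 ** u))    # [512, 256, 128, 64, 16]
```

Whether the smallest leftover pieces are usable depends on the lower limit chosen for the allocator.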

After deciding on the upper limit (call it u), the programmer has to decide on the lower limit, i.e. the smallest memory block that can be allocated. This lower limit keeps down the overhead of recording which memory is used and which is free. If it did not exist, and many programs requested small blocks of memory such as 1K or 2K, the system would waste a lot of space tracking which blocks are allocated and which are free. Typically the lower limit is a moderate number (like 2, so that memory is allocated in 2² = 4K blocks): small enough to limit the space wasted in each allocation, but large enough to avoid excessive bookkeeping overhead. Call this lower limit l.
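To illustrate why the lower limit affects bookkeeping, the sketch below counts how many smallest-size blocks a region of 2^u K can be split into. It assumes, purely for illustration, that the allocator needs one record per smallest block; the function name and the choice of u = 10 are likewise assumptions.

```python
def smallest_block_count(u, l):
    """Number of 2**l K blocks in a fully split region of 2**u K."""
    return 2 ** (u - l)


# Lowering l by one doubles the number of blocks that may need tracking.
for l in (0, 2, 6):
    print(f"l = {l}: up to {smallest_block_count(10, l)} blocks of {2 ** l}K")
```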

With the limits fixed, consider what happens when programs request memory. Suppose that in this system l = 6, so that the smallest block is 2⁶ = 64K in size, and u = 10, so that the largest allocatable block is 2¹⁰ = 1024K in size. The following shows a possible state of the system after various memory requests.

Time	Memory blocks in address order (1024K in total)
t=0	1024K
t=1	A-64K	64K	128K	256K	512K
t=2	A-64K	64K	B-128K	256K	512K
t=3	A-64K	C-64K	B-128K	256K	512K
t=4	A-64K	C-64K	B-128K	D-128K	128K	512K
t=5	A-64K	64K	B-128K	D-128K	128K	512K
t=6	128K	B-128K	D-128K	128K	512K
t=7	256K	D-128K	128K	512K
t=8	1024K

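This allocation could have occurred in the following manner, with each step producing the state shown at the corresponding time t = 1 to t = 8:

1. Program A requests a block of memory between 34K and 64K in size
2. Program B requests a block of memory between 66K and 128K in size
3. Program C requests a block of memory between 35K and 64K in size
4. Program D requests a block of memory between 67K and 128K in size
5. Program C releases its memory
6. Program A releases its memory
7. Program B releases its memory
8. Program D releases its memory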

As the example shows, a memory request is handled as follows.

If memory is to be allocated

Look for a free memory slot of a suitable size (the smallest 2^k block that is at least as large as the requested memory, and no smaller than the lower limit)
1. If one is found, it is allocated to the program
2. If not, the system tries to make a suitable memory slot as follows:
  1. Split a free memory slot larger than the requested memory size in half (if no such slot exists, the request cannot be satisfied)
  2. If a half has the suitable size, or the lower limit has been reached, allocate it to the program
  3. Otherwise, split one of the halves again, repeating until a suitable memory slot is obtained
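The allocation steps above can be sketched with one free list per block size. This is a minimal illustration under stated assumptions, not a definitive implementation: the names order_for, allocate and free_lists are hypothetical, sizes and offsets are in K, and the first free block of a given size is always taken.

```python
def order_for(request_k, l):
    """Smallest k >= l such that a block of 2**k K holds the request."""
    k = l
    while (1 << k) < request_k:
        k += 1
    return k


def allocate(free_lists, request_k, l, u):
    """Return the offset (in K) of the allocated block, or None on failure.

    free_lists maps each order k to a list of offsets of free 2**k K blocks.
    """
    want = order_for(request_k, l)
    k = want
    # Look for the smallest free block that is large enough.
    while k <= u and not free_lists[k]:
        k += 1
    if k > u:
        return None  # request too large, or nothing suitable is free
    offset = free_lists[k].pop()
    # Split in half until the block has the wanted size; each time, the
    # upper half (the buddy) is put on the free list for its size.
    while k > want:
        k -= 1
        free_lists[k].append(offset + (1 << k))
    return offset


L, U = 6, 10
free_lists = {k: [] for k in range(L, U + 1)}
free_lists[U].append(0)                # one free 1024K block at offset 0
a = allocate(free_lists, 40, L, U)     # a request in the 34K..64K range
print(a, free_lists)
```

After this single request, the free lists hold one block each of 64K, 128K, 256K and 512K, which matches the state at t=1 in the example.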

If memory is to be freed

1. Free the block of memory
2. Look at the block's buddy (the neighbouring block of the same size that was split from the same parent block): is it free too?
3. If it is, combine the two, and go back to step 2, repeating until either the upper limit is reached (all memory is freed) or a buddy that is not free is encountered
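The freeing steps can be sketched in the same style. The sketch assumes that offsets are measured in K from the start of the managed region, so that every block of 2^k K starts at a multiple of its own size; under that assumption a block's buddy is found by flipping one bit of its offset. The names buddy_of, free_block and free_lists are hypothetical.

```python
def buddy_of(offset, k):
    """Offset of the buddy of the 2**k K block that starts at offset."""
    return offset ^ (1 << k)


def free_block(free_lists, offset, k, u):
    """Free the 2**k K block at offset, merging with free buddies.

    Returns the number of merges performed.
    """
    merges = 0
    while k < u:
        buddy = buddy_of(offset, k)
        if buddy not in free_lists[k]:
            break                        # buddy is in use or split: stop
        free_lists[k].remove(buddy)
        offset = min(offset, buddy)      # merged block starts at the lower half
        k += 1
        merges += 1
    free_lists[k].append(offset)
    return merges


# State at t=4 of the example: A at 0, C at 64, B at 128, D at 256 are in use;
# a 128K block at 384 and a 512K block at 512 are free.
free_lists = {6: [], 7: [384], 8: [], 9: [512], 10: []}
merge_counts = [
    free_block(free_lists, 64, 6, 10),   # C releases its memory
    free_block(free_lists, 0, 6, 10),    # A releases its memory
    free_block(free_lists, 128, 7, 10),  # B releases its memory
    free_block(free_lists, 256, 7, 10),  # D releases its memory
]
print(merge_counts, free_lists)
```

In this sketch the loop can run at most u - l times for one call, because each merge raises the block's order by one.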

This method of freeing memory is rather efficient, as compaction (the merging of free buddies) is done relatively quickly: freeing a block requires at most log2(2^u / 2^l) = u - l merges, i.e. 4 in the example above.

Typically, the buddy memory allocation system is implemented using a binary tree that records which split memory blocks are used and which are free.
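A minimal sketch of such a tree is given below, replaying the eight requests from the example. It is one possible representation, not a reference implementation: the names Node, allocate, release and layout are hypothetical, block sizes are passed already rounded up to a power of two, and release walks the whole tree for simplicity.

```python
class Node:
    """One block; a split block has two half-size children."""
    def __init__(self, size_k):
        self.size_k = size_k
        self.owner = None       # program holding the block, if any
        self.children = None    # None for a leaf, else [left, right]


def free_leaves(node):
    """Free, unsplit blocks in address order."""
    if node.children is None:
        return [] if node.owner else [node]
    return free_leaves(node.children[0]) + free_leaves(node.children[1])


def allocate(root, owner, want_k):
    """Give owner a block of want_k K; return True on success."""
    candidates = [n for n in free_leaves(root) if n.size_k >= want_k]
    if not candidates:
        return False
    node = min(candidates, key=lambda n: n.size_k)  # smallest suitable block
    while node.size_k > want_k:                     # split in half as needed
        half = node.size_k // 2
        node.children = [Node(half), Node(half)]
        node = node.children[0]
    node.owner = owner
    return True


def release(node, owner):
    """Free owner's block, merging buddies that are both free on the way up."""
    if node.children is None:
        if node.owner == owner:
            node.owner = None
        return
    for child in node.children:
        release(child, owner)
    if all(c.children is None and c.owner is None for c in node.children):
        node.children = None


def layout(node):
    """Blocks in address order, e.g. ['A-64K', '64K', ...]."""
    if node.children is None:
        label = f"{node.size_k}K"
        return [f"{node.owner}-{label}" if node.owner else label]
    return layout(node.children[0]) + layout(node.children[1])


root = Node(1024)
steps = [("alloc", "A", 64), ("alloc", "B", 128), ("alloc", "C", 64),
         ("alloc", "D", 128), ("free", "C"), ("free", "A"),
         ("free", "B"), ("free", "D")]
history = []
for t, step in enumerate(steps, start=1):
    if step[0] == "alloc":
        allocate(root, step[1], step[2])
    else:
        release(root, step[1])
    history.append(layout(root))
    print(f"t={t}", *history[-1], sep="\t")
```

Each printed row lists the blocks in address order, so the output can be compared directly with the states t=1 to t=8 in the example.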

However, the problem of internal fragmentation remains. In many situations, it is essential to minimize the amount of internal fragmentation. One common remedy is slab allocation, which can be layered on top of a coarser buddy allocator to serve small requests with less waste.

Algorithm

One possible version of the buddy allocation algorithm was described in detail by Donald Knuth in The Art of Computer Programming. The full algorithm is more involved than the outline given above.

References

Donald Knuth: The Art of Computer Programming, Volume 1: Fundamental Algorithms. Third Edition (Reading, Massachusetts: Addison-Wesley, 1997), pp. 435-455. ISBN 0-201-89683-4