Completely Fair Scheduler

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search
Completely Fair Scheduler
Original author(s)Ingo Molnár
Developer(s)Linux kernel developers
Written inC
Operating systemLinux kernel
Typeprocess scheduler
Location of the "Completely Fair Scheduler" (a process scheduler) in a simplified structure of the Linux kernel.

The Completely Fair Scheduler (CFS) is a process scheduler which was merged into the 2.6.23 (October 2007) release of the Linux kernel and is the default scheduler. It handles CPU resource allocation for executing processes, and aims to maximize overall CPU utilization while also maximizing interactive performance.

In contrast to the previous O(1) scheduler used in older Linux 2.6 kernels, the CFS scheduler implementation is not based on run queues. Instead, a red–black tree implements a "timeline" of future task execution. Additionally, the scheduler uses nanosecond granularity accounting, the atomic units by which an individual process' share of the CPU was allocated (thus making redundant the previous notion of timeslices). This precise knowledge also means that no specific heuristics are required to determine the interactivity of a process, for example.[1]

Like the old O(1) scheduler, CFS uses a concept called "sleeper fairness", which considers sleeping or waiting tasks equivalent to those on the runqueue. This means that interactive tasks which spend most of their time waiting for user input or other events get a comparable share of CPU time when they need it.


The data structure used for the scheduling algorithm is a red-black tree in which the nodes are scheduler-specific sched_entity structures. These are derived from the general task_struct process descriptor, with added scheduler elements.

The nodes are indexed by processor "execution time" in nanoseconds.[2]

A "maximum execution time" is also calculated for each process to represent the time the process would have expected to run on an "ideal processor". This is the time the process has been waiting to run, divided by the total number of processes.

When the scheduler is invoked to run a new process:

  1. The leftmost node of the scheduling tree is chosen (as it will have the lowest spent execution time), and sent for execution.
  2. If the process simply completes execution, it is removed from the system and scheduling tree.
  3. If the process reaches its maximum execution time or is otherwise stopped (voluntarily or via interrupt) it is reinserted into the scheduling tree based on its new spent execution time.
  4. The new leftmost node will then be selected from the tree, repeating the iteration.

If the process spends a lot of its time sleeping, then its spent time value is low and it automatically gets the priority boost when it finally needs it. Hence such tasks do not get less processor time than the tasks that are constantly running.

The fair queuing CFS scheduler has a scheduling complexity of O(log N), where N is the number of tasks in the runqueue. Choosing a task can be done in constant time, but reinserting a task after it has run requires O(log N) operations, because the runqueue is implemented as a red-black tree.


Con Kolivas's work with scheduling, most significantly his implementation of "fair scheduling" named Rotating Staircase Deadline, inspired Ingo Molnár to develop his CFS, as a replacement for the earlier O(1) scheduler, crediting Kolivas in his announcement.[3] CFS is an implementation of a well-studied, classic scheduling algorithm called weighted fair queuing.[4] Originally invented for packet networks, fair queuing had been previously applied to CPU scheduling under the name stride scheduling. CFS is the first implementation of a fair queuing process scheduler widely used in a general-purpose operating system.[4]

The Linux kernel received a patch for CFS in November 2010 for the 2.6.38 kernel that has made the scheduler "fairer" for use on desktops and workstations. Developed by Mike Galbraith using ideas suggested by Linus Torvalds, the patch implements a feature called autogrouping that significantly boosts interactive desktop performance.[5] The algorithm puts parent processes in the same task group as child processes.[6] (Task groups are tied to sessions created via the setsid() system call.[7]) This solved the problem of slow interactive response times on multi-core and multi-CPU (SMP) systems when they were performing other tasks that use many CPU-intensive threads in those tasks. A simple explanation is that, with this patch applied, one will be able to still watch a video, read email and perform other typical desktop activities without glitches or choppiness while, say, compiling the Linux kernel or encoding video.

In 2016, the Linux scheduler was patched for better multicore performance, based on the suggestions outlined in the paper, "The Linux Scheduler: A Decade of Wasted Cores".[8]

See also[edit]


  1. ^ Andrews, Jeremy (2007-04-18). "Linux: The Completely Fair Scheduler". KernelTrap. Archived from the original on 2007-04-19.
  2. ^ CFS description at
  3. ^ Molnár, Ingo (2007-04-13). "[patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]". linux-kernel (Mailing list).
  4. ^ a b Li, T.; Baumberger, D.; Hahn, S. (2009). "Efficient and scalable multiprocessor fair scheduling using distributed weighted round-robin" (PDF). ACM SIGPLAN Notices. 44 (4): 65. CiteSeerX doi:10.1145/1594835.1504188.
  5. ^ The ~200 Line Linux Kernel Patch That Does Wonders
  6. ^ Galbraith, Mike (2010-11-15). "[RFC/RFT PATCH v3] Re: [RFC/RFT PATCH v3] sched: automated per tty task groups [CFS]". linux-kernel (Mailing list).
  7. ^ Galbraith, Mike (2010-11-20). "[PATCH v4] sched: automated per session task groups". linux-kernel (Mailing list).
  8. ^ Lozi, Jean-Pierre; Lepers, Baptiste; Funston, Justin; Gaud, Fabien; Quema, Vivian; Fedorova, Alexandra. "The Linux Scheduler: A Decade of Wasted Cores" (PDF). EuroSys 2016. Retrieved 15 June 2019.

External links[edit]