EROS (The Extremely Reliable Operating System) is an operating system developed beginning in 1991 by The EROS Group, LLC, the Johns Hopkins University, and the University of Pennsylvania. Features include automatic data and process persistence, some preliminary real-time support, and capability-based security. EROS is purely a research operating system and was never deployed in real-world use. As of 2005, development has stopped in favor of two successor systems, CapROS and Coyotos.
The overriding goal of the EROS system (and its relatives) is to provide strong support at the operating system level for efficiently restructuring critical applications into small communicating components. Each component is isolated from the rest of the system and can communicate with the others only through protected interfaces. A "protected interface", in this context, is one that is enforced by the lowest-level part of the operating system (the kernel). The kernel is the only portion of the system that can move information from one process to another. It also has complete control of the machine and (if properly constructed) cannot be bypassed. In EROS, the kernel-provided mechanism by which one component names and invokes the services of another is the capability, which is invoked through inter-process communication (IPC). By enforcing capability-protected interfaces, the kernel ensures that all communications to a process arrive via an intentionally exported interface. It also ensures that no invocation is possible unless the invoking component holds a valid capability to the invokee. Protection in capability systems is achieved by restricting the propagation of capabilities from one component to another, often through a security policy known as confinement.
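The rule that no invocation is possible without a valid capability can be illustrated with a small model. The following Python sketch is not EROS code and does not reflect the EROS kernel interface; the names `Kernel`, `Capability`, `mint`, `grant`, and `invoke` are hypothetical, chosen only for illustration. It assumes that a process refers to a capability by a slot index that only the kernel can resolve, so a caller cannot name a target it was never given.

```python
from dataclasses import dataclass, field


class InvalidCapability(Exception):
    """Raised when a caller names a capability slot it does not hold."""


@dataclass(frozen=True)
class Capability:
    """Kernel-held reference to one exported interface of one process."""
    target: str      # name of the invokee (hypothetical naming scheme)
    interface: str   # exported entry point this capability designates


@dataclass
class Process:
    name: str
    handlers: dict = field(default_factory=dict)   # exported interface -> function
    cap_slots: list = field(default_factory=list)  # capabilities held by this process


class Kernel:
    """Toy kernel: the only code that moves a message between processes."""

    def __init__(self):
        self.processes = {}

    def spawn(self, name, handlers=None):
        proc = Process(name, dict(handlers or {}))
        self.processes[name] = proc
        return proc

    def mint(self, target, interface):
        # A capability can only designate an interface the target exports.
        if interface not in self.processes[target].handlers:
            raise KeyError(f"{target} does not export {interface}")
        return Capability(target, interface)

    def grant(self, holder, cap):
        # Propagation of capabilities happens only through the kernel.
        self.processes[holder].cap_slots.append(cap)

    def invoke(self, caller, slot, message):
        # The caller names a slot index, never a process, so it cannot
        # reach anything for which it holds no capability.
        slots = self.processes[caller].cap_slots
        if not 0 <= slot < len(slots):
            raise InvalidCapability(f"{caller} holds no capability in slot {slot}")
        cap = slots[slot]
        return self.processes[cap.target].handlers[cap.interface](message)


kernel = Kernel()
kernel.spawn("logger", {"append": lambda msg: f"logged: {msg}"})
kernel.spawn("client")
kernel.spawn("stranger")
kernel.grant("client", kernel.mint("logger", "append"))

result = kernel.invoke("client", 0, "hello")   # "logged: hello"
```

In this model the `stranger` process holds no capabilities, so any call to `kernel.invoke("stranger", 0, ...)` raises `InvalidCapability`. Restricting which calls to `grant` are permitted is where a policy such as confinement would be enforced.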
Capability systems naturally promote component-based software structure. This organizational approach is similar to the programming language concept of object-oriented programming, but occurs at larger granularity and does not include the concept of inheritance. When software is restructured in this way, several benefits emerge:
- The individual components are most naturally structured as event loops. Examples of systems that are commonly structured this way include flight control systems (see also DO-178B Software Considerations in Airborne Systems and Equipment Certification), and telephone switching systems (see 5ESS switch). Event-driven programming is chosen for these systems primarily because of simplicity and robustness, which are essential attributes in life-critical and mission-critical systems.
- Components become smaller and individually testable, which helps the implementor to more readily identify flaws and bugs.
- The isolation of each component from the others limits the scope of the damage that may occur when something goes wrong or the software misbehaves.
Collectively, these benefits lead to measurably more robust and secure systems. The Plessey System 250 was a capability-based system originally designed for use in telephony switches; its capability-based design was chosen specifically for reasons of robustness.
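As an illustration of the event-loop structure listed among the benefits above, the following Python sketch models a single component that takes one event at a time from its own inbox. It is a minimal sketch, not EROS code: the `Component` class and its methods are hypothetical, and in a real capability system the isolation is enforced by the kernel rather than by an exception handler.

```python
from collections import deque


class Component:
    """Small component structured as an event loop over a private inbox."""

    def __init__(self, name, handlers):
        self.name = name
        self.handlers = handlers   # event name -> handler function
        self.inbox = deque()       # pending (event, payload) pairs
        self.faults = []           # failures recorded locally

    def post(self, event, payload):
        self.inbox.append((event, payload))

    def run_until_idle(self):
        """Process queued events one at a time and collect the results."""
        results = []
        while self.inbox:
            event, payload = self.inbox.popleft()
            try:
                results.append(self.handlers[event](payload))
            except Exception as exc:
                # A failing event is recorded here and does not stop
                # the loop or affect any other component.
                self.faults.append((event, repr(exc)))
        return results


parser = Component("parser", {"parse": lambda text: int(text)})
parser.post("parse", "42")
parser.post("parse", "oops")   # malformed input
parser.post("parse", "7")
values = parser.run_until_idle()
```

Because the component is small and has a single entry point, it can be tested in isolation: here `values` is `[42, 7]` and the malformed input appears only in `parser.faults`.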
In contrast to many earlier systems, capabilities are the only mechanism for naming and using resources in EROS. Such a system is sometimes referred to as a pure capability system. The IBM AS/400 is an example of a commercially successful capability system, but it is not a pure capability system.
Pure capability architectures are supported by well-tested and mature mathematical security models. These have been used to formally demonstrate that capability-based systems can be made secure if implemented correctly. The so-called "safety property" has been shown to be decidable for pure capability systems (see Lipton). Confinement, which is the fundamental building block of isolation, has been formally verified to be enforceable by pure capability systems, and is reduced to practical implementation by the EROS "constructor" and the KeyKOS "factory". No comparable verification exists for any other primitive protection mechanism. There is a fundamental result in the literature showing that "safety" is mathematically undecidable in the general case (see HRU, but note that it is of course provable for an unbounded set of restricted cases). Of greater practical importance, safety has been shown to be false for all of the primitive protection mechanisms shipping in current commodity operating systems (see HRU). Safety is a necessary precondition to successful enforcement of any security policy.
In practical terms, this result means that it is not possible in principle to secure current commodity systems, but it is potentially possible to secure capability-based systems provided they are implemented with sufficient care. Neither EROS nor KeyKOS has ever been successfully penetrated, and their isolation mechanisms have never been successfully defeated by any inside attacker, but it is not known whether the two implementations were careful enough. One goal of the Coyotos project is to demonstrate, by applying software verification techniques, that component isolation and security have been definitively achieved.
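The idea behind a confinement check can be sketched in a few lines. The following Python model is a loose simplification and should not be read as the algorithm or interface of the EROS constructor or the KeyKOS factory. It assumes that each initial capability of a new component either conveys no outward authority (modeled by a hypothetical `writable=False` flag) or is itself a builder whose products are confined; anything else is reported as a "hole". The names `Cap`, `Constructor`, `holes`, and `is_confined` are illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cap:
    """Ordinary capability; `writable` models outward (mutating) authority."""
    name: str
    writable: bool


@dataclass
class Constructor:
    """Builder whose products start out holding `initial_caps`."""
    initial_caps: list = field(default_factory=list)   # Cap or Constructor

    def holes(self, _seen=None):
        """Return every initial capability that could leak information."""
        seen = set() if _seen is None else _seen
        if id(self) in seen:       # guard against cyclic builder graphs
            return []
        seen.add(id(self))
        found = []
        for cap in self.initial_caps:
            if isinstance(cap, Constructor):
                # A nested builder is safe only if its own products are.
                found.extend(cap.holes(seen))
            elif cap.writable:
                found.append(cap)
        return found

    def is_confined(self):
        return not self.holes()


program_image = Cap("program-image", writable=False)
network = Cap("network-endpoint", writable=True)

sealed = Constructor([program_image, Constructor([program_image])])
leaky = Constructor([program_image, Constructor([network])])
```

Under these assumptions `sealed.is_confined()` is true, while `leaky.holes()` reports the network capability reachable through the nested builder. A client could then refuse to use the product of `leaky` unless it has explicitly authorized that hole.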
The L4.sec system, which is a successor to the L4 microkernel family, is a capability-based system, and has been significantly influenced by the results of the EROS project. The influence is mutual, since the EROS work on high-performance invocation was motivated strongly by Jochen Liedtke's successes with the L4 microkernel family.
The EROS project started in 1991 as a clean-room reconstruction of an earlier system, KeyKOS. KeyKOS was an operating system developed by Key Logic, Inc., and was a direct continuation of work on the earlier GNOSIS (Great New Operating System In the Sky) system created by Tymshare, Inc. The KeyKOS system offered a degree of security and reliability that remains unduplicated today (2006). The circumstances surrounding Key Logic's unfortunate demise in 1991 made licensing KeyKOS impractical. Since KeyKOS did not run on popular commodity processors in any case, the decision was made to reconstruct it from the publicly available documentation.
By late 1992, it had become clear that processor architecture had changed significantly since the introduction of the capability idea, and it was no longer obvious that component-structured systems were practical. Microkernel-based systems, which similarly favor large numbers of processes and IPC, were facing severe performance challenges, and it was uncertain whether these could be successfully resolved. The x86 architecture was clearly emerging as the dominant architecture, but the high user/supervisor transition latency on the 386 and 486 presented serious challenges for process-based isolation. The EROS project was turning into a research effort, and it moved to the University of Pennsylvania to become the focus of Jonathan S. Shapiro's dissertation research. By 1999, a high-performance implementation for the Pentium processor had been demonstrated whose performance was directly competitive with that of the L4 microkernel family, which is known for its exceptional IPC speed. The EROS confinement mechanism had also been formally verified, in the process creating a general formal model for secure capability systems.
In 2000, Shapiro joined the faculty of Computer Science at Johns Hopkins University. At Hopkins, the goal was to show how to use the facilities provided by the EROS kernel to construct secure and defensible servers at application level. Funded by the Defense Advanced Research Projects Agency and the Air Force Research Laboratory, EROS was used as the basis for a trusted window system, a high-performance, defensible network stack, and the beginnings of a secure web browser. It was also used to explore the effectiveness of lightweight static checking. In 2003, some very challenging security issues were discovered that are intrinsic to any system architecture based on synchronous IPC primitives (notably including EROS and L4). Work on EROS halted in favor of Coyotos, which resolves these issues.
As of 2006, EROS and its successors are the only widely available capability systems that run on commodity hardware.
Work on EROS by the original group has halted, but there are two successor systems. The CapROS system is building directly from the EROS code base, while the Coyotos system is a successor system that addresses some of the architectural deficiencies of EROS, and is exploring (as research) the possibility of a fully verified operating system. Both CapROS and Coyotos are expected to be released in various commercial deployments.
- Verifying the EROS Confinement Mechanism
- Peter Lee: Proof-Carrying Code
- Differences Between Coyotos and EROS — A Quick Summary
- R. J. Lipton and L. Snyder. "A Linear Time Algorithm for Deciding Subject Security." Journal of the ACM, 24(3):455–464, 1977.
- Michael A. Harrison, W. L. Ruzzo and Jeffrey D. Ullman. "Protection in Operating Systems." Communications of the ACM, 19(8):461–471, August 1976.