Copy elision

In C++ computer programming, copy elision refers to a compiler optimization technique that eliminates unnecessary copying of objects.

The C++ language standard generally allows implementations to perform any optimization, provided the resulting program's observable behavior is the same as if the program were executed exactly as mandated by the standard. Beyond that, the standard also describes a few situations where copying can be eliminated even if this would alter the program's behavior, the most common being the return value optimization (see below). Another widely implemented optimization described in the C++ standard applies when a temporary object of class type is copied to an object of the same type.^[1]^[2] As a result, copy-initialization is usually equivalent to direct-initialization in terms of performance, but not in semantics; before C++17, copy-initialization still requires an accessible copy constructor.^[3] The optimization cannot be applied to a temporary object that has been bound to a reference.

Example

#include <iostream>

int n = 0;

struct C {
  explicit C(int) {}
  C(const C&) { ++n; }  // the copy constructor has a visible side effect
};                      // it modifies an object with static storage duration

int main() {
  C c1(42);      // direct-initialization, calls C::C(int)
  C c2 = C(42);  // copy-initialization, calls C::C(const C&) unless elided

  std::cout << n << std::endl;  // prints 0 if the copy was elided, 1 otherwise
}

According to the standard, copy elision may similarly be applied to objects being thrown and caught,^[4]^[5] but the C++03 wording leaves it unclear whether the optimization applies to both the copy from the thrown object to the exception object and the copy from the exception object to the object declared in the exception-declaration of the catch clause. It is also unclear whether this optimization applies only to temporary objects or to named objects as well.^[6] Given the following source code:

#include <iostream>

struct C {
  C() = default;
  C(const C&) { std::cout << "Hello World!\n"; }
};

void f() {
  C c;
  // Copying the named object c into the exception object.
  // It is unclear whether this copy may be elided (omitted).
  throw c;
}

int main() {
  try {
    f();
  } catch (C c) {
    // Copying the exception object into the object declared in the
    // exception-declaration. It is also unclear whether this copy
    // may be elided (omitted).
  }
}

Under a strict reading of C++03, a conforming compiler should therefore produce a program that prints "Hello World!" twice. The C++11 revision of the C++ standard addressed these issues, essentially allowing both the copy from the named object to the exception object and the copy into the object declared in the exception handler to be elided.^[6]

GCC provides the -fno-elide-constructors option to disable copy elision. This option is useful for observing the effects of return value optimization and other optimizations where copies are elided, by comparing a program's behavior with and without it. Disabling this important optimization is generally not recommended.

C++17 provides for "guaranteed copy elision": a prvalue is not materialized until needed, and it is then constructed directly into the storage of its final destination.^[7]

Return value optimization

In the context of the C++ programming language, return value optimization (RVO) is a compiler optimization that involves eliminating the temporary object created to hold a function's return value.^[8] The C++ standard allows RVO to change the observable behaviour of the resulting program.^[9]

Summary

In general, the C++ standard allows a compiler to perform any optimization, provided the resulting executable exhibits the same observable behaviour as if all the requirements of the standard had been fulfilled. This is commonly referred to as the "as-if rule".^[10]^[2] The term return value optimization refers to a special clause in the C++ standard that goes even further than the "as-if" rule: an implementation may omit a copy operation resulting from a return statement, even if the copy constructor has side effects.^[1]^[2]

The following example demonstrates a scenario where the implementation may eliminate one or both of the copies being made, even if the copy constructor has a visible side effect (printing text).^[1]^[2] The first copy that may be eliminated is the one where a nameless temporary C could be copied into the function f's return value. The second copy that may be eliminated is the copy of the temporary object returned by f to obj.

#include <iostream>

struct C {
  C() = default;
  C(const C&) { std::cout << "A copy was made.\n"; }
};

C f() {
  return C();
}

int main() {
  std::cout << "Hello World!\n";
  C obj = f();
}

Depending on the compiler and its settings, the resulting program may display any of the following outputs (since C++17, guaranteed copy elision permits only the last):

Hello World!
A copy was made.
A copy was made.

Hello World!
A copy was made.

Hello World!

Background

Returning an object of built-in type from a function usually carries little to no overhead, since the object typically fits in a CPU register. Returning a larger object of class type may require more expensive copying from one memory location to another. To avoid this, an implementation may create a hidden object in the caller's stack frame, and pass the address of this object to the function. The function's return value is then copied into the hidden object.^[11] Thus, code such as this:

struct Data {
  char bytes[16];
};

Data F() {
  Data result = {};
  // generate result
  return result;
}

int main() {
  Data d = F();
}

may generate code equivalent to this:

struct Data {
  char bytes[16];
};

Data* F(Data* _hiddenAddress) {
  Data result = {};
  // copy result into hidden object
  *_hiddenAddress = result;
  return _hiddenAddress;
}

int main() {
  Data _hidden;           // create hidden object
  Data d = *F(&_hidden);  // copy the result into d
}

which causes the Data object to be copied twice.

In the early stages of the evolution of C++, the language's inability to efficiently return an object of class type from a function was considered a weakness.^[12] Around 1991, Walter Bright implemented a technique to minimize copying, effectively replacing the hidden object and the named object inside the function with the object used for holding the result:^[13]

struct Data {
  char bytes[16];
};

void F(Data* p) {
  // generate result directly in *p
}

int main() {
  Data d;
  F(&d);
}

Bright implemented this optimization in his Zortech C++ compiler.^[12] This particular technique was later termed "named return value optimization" (NRVO), referring to the fact that the copying of a named object is elided.^[13]

Compiler support

Return value optimization is supported by most compilers.^[8]^[14]^[15] There may, however, be circumstances where the compiler is unable to perform the optimization. One common case is when a function may return different named objects depending on the path of execution:^[11]^[14]^[16]

#include <string>

std::string F(bool cond = false) {
  std::string first("first");
  std::string second("second");
  // The function may return one of two named objects
  // depending on its argument, so RVO might not be applied.
  return cond ? first : second;
}

int main() {
  std::string result = F();
}

External links

Copy elision on cppreference.com

References

1. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §12.8 Copying class objects [class.copy] para. 15
2. ISO/IEC (2003). "§ 12.8 Copying class objects [class.copy]". ISO/IEC 14882:2003(E): Programming Languages - C++ (PDF). para. 15. Archived from the original (PDF) on 2023-04-10. Retrieved 2024-02-26.
3. Sutter, Herb (2001). More Exceptional C++. Addison-Wesley.
4. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §15.1 Throwing an exception [except.throw] para. 5
5. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §15.3 Handling an exception [except.handle] para. 17
6. "C++ Standard Core Language Defect Reports". WG21. Retrieved 2009-03-27.
7. "Copy elision". cppreference.com. https://en.cppreference.com/w/cpp/language/copy_elision
8. Meyers, Scott (1995). More Effective C++. Addison-Wesley. ISBN 9780201633719.
9. Alexandrescu, Andrei (2003-02-01). "Move Constructors". Dr. Dobb's Journal. Retrieved 2009-03-25.
10. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §1.9 Program execution [intro.execution] para. 1
11. Bulka, Dov; David Mayhew (2000). Efficient C++. Addison-Wesley. ISBN 0-201-37950-3.
12. Lippman, Stan (2004-02-03). "The Name Return Value Optimization". Microsoft. Retrieved 2009-03-23.
13. "Glossary D Programming Language 2.0". Digital Mars. Retrieved 2009-03-23.
14. Shoukry, Ayman B. (October 2005). "Named Return Value Optimization in Visual C++ 2005". Microsoft. Retrieved 2009-03-20.
15. "Options Controlling C++ Dialect". GCC. 2001-03-17. Retrieved 2018-01-20.
16. Hinnant, Howard; et al. (2002-09-10). "N1377: A Proposal to Add Move Semantics Support to the C++ Language". WG21. Retrieved 2009-03-25.
