References in Modern C++

Introduction
C++ references have been a cornerstone of the language for decades, offering an alternative to pointers for indirect access to data. However, with the advent of Modern C++ (C++11 and beyond), the concept of references has been significantly expanded and refined. This post explores these modern reference features, specifically focusing on rvalue references, move semantics, and perfect forwarding. We’ll examine how these features address limitations in traditional C++ and contribute to more efficient and robust code.
Traditional References (Lvalue References)
In classic C++, references, now called lvalue references, act as aliases for existing objects. They provide a way to read and potentially modify the original object through a different name. They must be initialized when declared, and they cannot be reseated to refer to a different object afterwards. They are mainly used to avoid copying values that are passed into functions or returned from them (Stroustrup, 2013). Consider the following example (Listing 1), where the parameter x is passed by reference:
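As a minimal sketch of passing by reference, the snippet below uses a hypothetical function addOne and a hypothetical demo function; both names are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical helper: x is an lvalue reference, so it aliases the caller's variable.
void addOne(int& x) {
    ++x;  // modifies the caller's object, not a copy
}

// Hypothetical demo; call it from main() to try it out.
int demoPassByReference() {
    int value = 41;
    int& alias = value;        // a reference must be initialized when declared
    addOne(alias);             // equivalent to addOne(value)
    assert(&alias == &value);  // both names refer to the same object
    return value;              // 42
}
```

Because addOne takes int& rather than int, no copy of the argument is made and the change is visible to the caller.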
Key characteristics of lvalue references:
- Must be initialized when declared.
- Cannot be reassigned to refer to a different object.
- Provide direct access to the original object.
- Cannot bind to temporary objects (rvalues) unless the reference is a const lvalue reference.
Lvalues and Rvalues: A Deeper Dive
To fully understand modern C++ references, we need a clearer understanding of lvalues and rvalues.
- Lvalue: An expression that identifies an object with a persistent identity, one that outlives the expression in which it appears. It usually has a name, and you can generally take its address with the & operator. Examples: variables, dereferenced pointers, array elements, string literals.
- Rvalue: An expression that yields a temporary object or a value that is not tied to a persistent object. It is usually unnamed, and you cannot take its address with the & operator. Examples: temporary objects returned by value from functions, literals such as 42 or true (string literals are the exception and are lvalues), and the results of most arithmetic operators.
Consider this example in Listing 2:
The distinction can sometimes be confusing. For instance, the prefix increment operator (++value) returns an lvalue (because it modifies the original object and returns a reference to it), while the postfix increment operator (value++) returns an rvalue (because it needs to create a temporary copy of the original value before incrementing). Similarly, while value itself is an lvalue, the expression 7 + value is an rvalue because it represents a temporary result.
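The categories described above can be checked by the compiler. The sketch below relies on the fact that decltype((expr)) yields a reference type T& when expr is an lvalue and a plain type T when expr is a prvalue; the function name demoValueCategories is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical demo; call it from main() to try it out.
int demoValueCategories() {
    int value = 5;

    // The extra parentheses make decltype report the value category of the expression.
    static_assert(std::is_lvalue_reference<decltype((value))>::value,
                  "a variable name is an lvalue");
    static_assert(std::is_lvalue_reference<decltype((++value))>::value,
                  "prefix increment yields an lvalue");
    static_assert(!std::is_reference<decltype((value++))>::value,
                  "postfix increment yields a prvalue");
    static_assert(!std::is_reference<decltype((7 + value))>::value,
                  "an arithmetic result is a prvalue");
    static_assert(std::is_lvalue_reference<decltype(("text"))>::value,
                  "a string literal is an lvalue");

    int* p = &++value;      // allowed: the result of ++value is an lvalue (value is now 6)
    // int* q = &value++;   // would not compile: cannot take the address of a prvalue
    assert(p == &value);

    int&& sum = 7 + value;  // a prvalue can bind to an rvalue reference
    return sum;             // 13
}
```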
Constant Lvalue References Binding to Rvalues
The crucial exception to the rule that lvalue references cannot bind to rvalues is the const lvalue reference. A const lvalue reference can bind to an rvalue, and when it binds directly to a temporary object, the lifetime of that temporary is extended to match the lifetime of the reference.
In the example above (Listing 4), a const lvalue reference testRef binds to the temporary Test object returned by getTest(). The temporary object’s lifetime is extended to the lifetime of testRef, preventing its immediate destruction.
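A minimal sketch of this lifetime extension follows. The Test type here is a hypothetical stand-in that counts live instances so the effect can be observed; the names getTest and testRef follow the text, and everything else is an assumption.

```cpp
#include <cassert>

// Hypothetical Test type that counts how many instances are currently alive.
struct Test {
    static int alive;
    Test() { ++alive; }
    Test(const Test&) { ++alive; }
    ~Test() { --alive; }
};
int Test::alive = 0;

Test getTest() { return Test(); }  // returns a temporary (prvalue)

// Hypothetical demo; returns the number of live Test objects while the reference exists.
int demoLifetimeExtension() {
    int aliveWhileBound = 0;
    {
        const Test& testRef = getTest();  // the temporary now lives as long as testRef
        (void)testRef;                    // silence unused-variable warnings
        aliveWhileBound = Test::alive;    // the temporary has not been destroyed yet
        // Test& bad = getTest();         // would not compile: non-const lvalue reference to an rvalue
    }
    assert(Test::alive == 0);  // the temporary was destroyed when testRef went out of scope
    return aliveWhileBound;
}
```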
Rvalue References: The Modern Twist
Modern C++ introduces rvalue references, denoted by &&. These references are designed to bind to temporary objects, also known as rvalues. This seemingly small change unlocks powerful optimizations and new programming techniques (Ou, 2019). Crucially, rvalue references are mutable by default, allowing us to modify the temporary object.
Key features of rvalue references:
- Bind to temporary objects (rvalues).
- Allow modification of temporary objects.
- Essential for implementing move semantics.
- Cannot bind directly to lvalues; an explicit cast such as std::move is required.
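The binding rules listed above can be sketched as follows. The overload pair bindsAsRvalue and the function demoRvalueReference are hypothetical names used only to show which kind of reference is selected.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair: reports which reference kind the argument bound to.
bool bindsAsRvalue(const std::string&) { return false; }
bool bindsAsRvalue(std::string&&)      { return true; }

// Hypothetical demo; call it from main() to try it out.
std::string demoRvalueReference() {
    std::string name = "abc";

    std::string&& temp = name + "def";     // binds to the temporary result
    temp += "!";                           // the temporary can be modified through the reference

    // std::string&& bad = name;           // would not compile: name is an lvalue
    std::string&& cast = std::move(name);  // allowed after the cast; nothing has been moved yet
    assert(&cast == &name);                // cast is just another name for name

    return temp;                           // "abcdef!"
}
```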
Move Semantics: Avoiding Unnecessary Copies
One of the primary motivations for introducing rvalue references is to enable move semantics. Move semantics provide a way to transfer ownership of resources from one object to another, rather than performing a deep copy. This is particularly beneficial when dealing with large, complex objects (Hinnant et al., 2002). Move semantics greatly improves performance when copying temporary objects that are about to be destroyed, because the copy can be avoided by simply transferring resource ownership.
Consider a MyString class (Listing 6) that manages dynamically allocated character data:
In the MyString class:
- The move constructor MyString(MyString&& other) (Listing 6) takes an rvalue reference to another MyString object. Instead of allocating new memory and copying the string, it simply takes the pointer to the existing buffer from other. It then sets other.data to nullptr and other.length to 0 to prevent other from deleting the buffer when it goes out of scope. It is vital that the moved-from object is left in a valid and safe state.
- The move assignment operator operator=(MyString&& other) (Listing 6) performs a similar operation, but first it must free any resources already held by the object being assigned to. It then takes the resources from other, and sets other’s resources to a safe state (null pointer).
- std::move() (Listing 7): This function does not move anything by itself. It casts its argument to an rvalue so that overload resolution selects the move constructor or move assignment operator. After the move, the source object must still be in a valid state (Stroustrup & Sutter, 2020), but its value should not be relied upon. The main function (Listing 7) shows how the temporary object returned by createString() is used to initialize str1, and how the subsequent std::move(str1) transfers ownership to str2.
Without the move constructor and move assignment operator, the code would fall back to the copy constructor and copy assignment operator, which allocate new memory and copy the contents, and this is less efficient. Note that since C++17, copy elision is guaranteed when an object is initialized from a prvalue of the same type (for example, a function that returns a temporary by value), so the move constructor may not be invoked at all in that case. Eliding the copy of a named local variable on return (NRVO) remains an optional optimization.
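A minimal sketch of such a class is shown below. It follows the names used in the text (MyString, data, length, createString, str1, str2); the accessors size() and c_str() and the function demoMove are assumptions added so the effect of the move can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

class MyString {
public:
    explicit MyString(const char* s = "")
        : length(std::strlen(s)), data(new char[length + 1]) {
        std::memcpy(data, s, length + 1);  // copy the characters and the terminator
    }

    // Copy constructor: allocates a new buffer and copies the characters.
    MyString(const MyString& other)
        : length(other.length), data(new char[other.length + 1]) {
        std::memcpy(data, other.c_str(), length + 1);
    }

    // Move constructor: takes over the buffer and leaves other empty but valid.
    MyString(MyString&& other) noexcept
        : length(other.length), data(other.data) {
        other.data = nullptr;
        other.length = 0;
    }

    // Copy assignment: build the new buffer first, then release the old one.
    MyString& operator=(const MyString& other) {
        if (this != &other) {
            char* fresh = new char[other.length + 1];
            std::memcpy(fresh, other.c_str(), other.length + 1);
            delete[] data;
            data = fresh;
            length = other.length;
        }
        return *this;
    }

    // Move assignment: free the current buffer, then take over other's buffer.
    MyString& operator=(MyString&& other) noexcept {
        if (this != &other) {
            delete[] data;
            data = other.data;
            length = other.length;
            other.data = nullptr;
            other.length = 0;
        }
        return *this;
    }

    ~MyString() { delete[] data; }  // delete[] on nullptr is a no-op

    std::size_t size() const { return length; }
    const char* c_str() const { return data ? data : ""; }

private:
    std::size_t length;  // declared before data so it is initialized first
    char* data;
};

MyString createString() { return MyString("hello"); }

// Hypothetical demo: true if str2 took over str1's buffer without copying it.
bool demoMove() {
    MyString str1 = createString();   // initialized from a temporary
    const char* buffer = str1.c_str();
    MyString str2 = std::move(str1);  // move constructor transfers ownership
    return str2.c_str() == buffer && str1.size() == 0;
}
```

Marking the move operations noexcept is a design choice worth keeping: standard containers such as std::vector only move elements during reallocation when the move constructor cannot throw.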
Perfect Forwarding: Maintaining Type Information
Perfect forwarding is another key use case for rvalue references. It allows you to write template functions that can forward arguments to other functions while preserving their original type (lvalue or rvalue) and const/volatile qualifiers. This is often used in factory functions or generic wrappers (Meyers, 2012). Consider this example in Listing 8:
In this example:
- forwardValue is a template function that takes a universal reference T&& arg. A universal reference can bind to both lvalues and rvalues (Meyers, 2012). These are also sometimes called forwarding references.
- std::forward<T>(arg) is the key to perfect forwarding. It casts arg to an rvalue only when the original argument was an rvalue, in which case T is deduced as a non-reference type. When the original argument was an lvalue, T is deduced as an lvalue reference and std::forward returns an lvalue reference. This ensures that processValue receives the argument with its original type and value category. Without std::forward, arg would always be treated as an lvalue within forwardValue.
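The behavior described above can be sketched as follows. The names forwardValue and processValue come from the text; the std::string overloads, their return values, and the contrasting function passWithoutForward are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overloads that report which value category they received.
std::string processValue(const std::string&) { return "lvalue"; }
std::string processValue(std::string&&)      { return "rvalue"; }

// Forwards arg with its original value category.
template <typename T>
std::string forwardValue(T&& arg) {
    return processValue(std::forward<T>(arg));
}

// For contrast: arg is a named variable, so it is always passed on as an lvalue.
template <typename T>
std::string passWithoutForward(T&& arg) {
    return processValue(arg);
}

// Hypothetical demo; call it from main() to try it out.
bool demoForwarding() {
    std::string text = "hello";
    bool ok = forwardValue(text) == "lvalue";                    // T deduced as std::string&
    ok = ok && forwardValue(std::string("tmp")) == "rvalue";     // T deduced as std::string
    ok = ok && passWithoutForward(std::string("tmp")) == "lvalue";
    assert(ok);
    return ok;
}
```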
Reference Collapsing
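The behavior of std::forward is based on the reference collapsing rules (Table 1):
- T& & collapses to T&
- T& && collapses to T&
- T&& & collapses to T&
- T&& && collapses to T&&
These rules govern how references to references, which can only arise indirectly through template parameters, type aliases, or decltype, are resolved. Only an rvalue reference to an rvalue reference yields an rvalue reference; every other combination yields an lvalue reference. This is what makes universal references work: for an lvalue argument of type U, T is deduced as U&, so T&& collapses to U&, while for an rvalue argument, T is deduced as U, so T&& is U&&.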
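The collapsing rules can be verified at compile time. In the sketch below, the aliases LRef and RRef and the helper deducedAsLvalueRef are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <type_traits>

using LRef = int&;   // hypothetical alias for an lvalue reference type
using RRef = int&&;  // hypothetical alias for an rvalue reference type

// Adding a reference to an alias that is already a reference triggers collapsing.
static_assert(std::is_same<LRef&,  int&>::value,  "& and & collapse to &");
static_assert(std::is_same<LRef&&, int&>::value,  "& and && collapse to &");
static_assert(std::is_same<RRef&,  int&>::value,  "&& and & collapse to &");
static_assert(std::is_same<RRef&&, int&&>::value, "&& and && collapse to &&");

// Reports whether T was deduced as an lvalue reference for the given argument.
template <typename T>
constexpr bool deducedAsLvalueRef(T&&) {
    return std::is_lvalue_reference<T>::value;
}

// Hypothetical demo; call it from main() to try it out.
bool demoDeduction() {
    int x = 0;
    bool lvalueCase = deducedAsLvalueRef(x);  // T = int&, parameter type int& && -> int&
    bool rvalueCase = deducedAsLvalueRef(0);  // T = int,  parameter type int&&
    assert(lvalueCase && !rvalueCase);
    return lvalueCase && !rvalueCase;
}
```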
Named Rvalue References
It’s important to note that a named rvalue reference is treated as an lvalue within its scope. This is because the named variable has a memory location and can be referenced multiple times. To pass a named rvalue reference on to another function as an rvalue, you must use std::move, or std::forward in the case of a forwarding reference (Microsoft, 2016). This is illustrated in Listing 9:
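A minimal sketch of this rule follows. The Sink type, its counters, and the functions storeByCopy and storeByMove are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink that records which overload was selected.
struct Sink {
    std::vector<std::string> items;
    int copies = 0;
    int moves = 0;

    void add(const std::string& s) { items.push_back(s); ++copies; }
    void add(std::string&& s)      { items.push_back(std::move(s)); ++moves; }
};

// text is a named rvalue reference, so the expression text is an lvalue:
// the copying overload of add is selected.
void storeByCopy(Sink& sink, std::string&& text) {
    sink.add(text);
}

// std::move casts text back to an rvalue, so the moving overload is selected.
void storeByMove(Sink& sink, std::string&& text) {
    sink.add(std::move(text));
}

// Hypothetical demo; call it from main() to try it out.
bool demoNamedRvalueReference() {
    Sink sink;
    storeByCopy(sink, std::string("first"));
    storeByMove(sink, std::string("second"));
    assert(sink.items.size() == 2);
    return sink.copies == 1 && sink.moves == 1;
}
```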
Conclusion
Rvalue references, move semantics, and perfect forwarding are powerful tools in Modern C++. They enable significant performance optimizations and allow for more flexible and generic code. Understanding these concepts is essential for writing efficient and modern C++ applications. As of C++23, the core rules described here still apply, and these concepts remain vital for writing performant code, especially alongside newer features like coroutines and ranges. By leveraging these features, you can avoid unnecessary copying, improve resource management, and create more robust and maintainable code.
References
- Becker, T. (2013). C++ rvalue references explained. Retrieved from https://isocpp.org/blog/2012/10/classics-file-c-rvalue-references-explained
- Hinnant, H. E., Dimov, P., & Abrahams, D. (2002). A Proposal to add move semantics support to the C++ language. Retrieved from http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1377.htm
- Meyers, S. (2012). Universal references in C++11. Retrieved from https://isocpp.org/blog/2012/11/universal-references-in-c11-scott-meyers
- Microsoft. (2016). C++ language reference. Retrieved from https://learn.microsoft.com/en-us/cpp/cpp/references-cpp?view=msvc-170
- Ou, C. (2019). Modern C++ tutorial: C++11/14/17/20 on the fly. Retrieved from https://github.com/changkun/modern-cpp-tutorial
- Stroustrup, B. (2013). The C++ programming language (4th ed.). Addison-Wesley Professional.
- Stroustrup, B., & Sutter, H. (2020). C++ core guidelines. Retrieved from https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rc-move-semantic
AI Assistance Disclosure
This article was drafted with editorial assistance from AI and all code was reviewed and tested for accuracy.
Appendix
Code listing: Link to code examples