Borrowing and References in Rust
Rust’s ownership system is one of its most distinctive features. It revolves around the concepts of ownership and borrowing, which allow developers to manage resources efficiently and safely. It’s designed to prevent use-after-free bugs, double frees, data races, and other memory problems that are common in other programming languages.
Ownership refers to the idea that every value in Rust has a single owner. When the owner goes out of scope, the value is dropped and its memory is freed. Rust enforces this ownership model so that deallocation occurs automatically and reliably, without explicit deallocation calls. This makes memory leaks unlikely, although safe Rust can still leak memory, for example through reference cycles.
In comparison, borrowing refers to taking a reference to a resource from its owner. References are a way to access a resource without taking ownership of it, which makes it possible to share the resource between different parts of the program.
To demonstrate how borrowing works, take a look at the following example:
This is what your output should look like:
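The listing for this example is not shown, so here is a minimal sketch consistent with the description that follows. The variable names a and b come from the text; the value 5 and the exact println! format are assumptions.

```rust
fn main() {
    // `a` owns the value (the concrete value 5 is an assumption).
    let a = 5;
    // `b` borrows `a` through an immutable reference; `a` stays the owner.
    let b = &a;
    // Both the owner and the reference can be read.
    println!("a = {}, b = {}", a, b);
}
```

With these assumed values, the program prints a = 5, b = 5.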
In this example, b borrows the value of a. The & symbol creates a reference to a that can be followed to read its value, and the reference is immutable (more on this next).
Immutable and Mutable References
There are two types of references in Rust: immutable and mutable.
Immutable references allow read-only access to a resource. Immutable references are created using the & symbol and can be created multiple times, which means that multiple parts of the program can access the same resource at the same time.
Say you have a vector of integers, and you want to print each element in the vector. You can create an immutable reference to the vector using the following code:
Your output would look like this:
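Since the original code is not shown, the following is one possible sketch of printing a vector through an immutable reference. The name numbers and its contents are assumptions.

```rust
fn main() {
    // The contents of the vector are an assumption for illustration.
    let numbers = vec![1, 2, 3];

    // Iterating over `&numbers` borrows the vector instead of moving it.
    for n in &numbers {
        println!("{}", n);
    }

    // `numbers` is still usable here because it was only borrowed.
    println!("{} elements", numbers.len());
}
```

With these values, the loop prints 1, 2, and 3 on separate lines, followed by 3 elements.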
In comparison, mutable references are created using &mut and allow read and write access to a resource. However, there can only be one mutable reference to a resource at any given time. This ensures that only one part of the program can modify the resource at a time, which prevents data races.
For example, suppose you have a mutable vector of integers, and you want to modify its first element. In that case, you can create a mutable reference to the vector using the following code:
Here’s the output:
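The listing is not shown, so here is a minimal sketch that matches the explanation below. The vector contents and the name first are assumptions; the new value 6 comes from the text.

```rust
fn main() {
    // The initial contents are an assumption; the vector must be `mut`.
    let mut numbers = vec![1, 2, 3, 4, 5];

    // Take a mutable reference to the first element.
    let first = &mut numbers[0];
    // Dereference with `*` to write through the reference.
    *first = 6;

    println!("{:?}", numbers);
}
```

With these assumed contents, the program prints [6, 2, 3, 4, 5].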
In this example, you create a mutable reference to the first element of the vector using &mut. Then you modify the first element by dereferencing the reference with the * operator and setting its value to 6.
Rules for Borrowing
While borrowing is a powerful feature in Rust, it comes with a set of rules that must be followed to ensure memory safety and avoid data races. These rules include the following:
- Each resource can only have one mutable reference or any number of immutable references at a time.
- References must always be valid, which means that the resource being referenced must remain in scope for the entire lifetime of the reference.
- A mutable reference cannot exist at the same time as any other reference, mutable or immutable.
The Rust compiler enforces these rules at compile time, ensuring that your code is safe from data races and other memory-related bugs.
When you follow these rules, you’ll be able to write safer and more efficient code that takes advantage of Rust’s ownership system.
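The rules above can be sketched in a few lines. The variable names here are hypothetical, and the commented-out line shows a use that the compiler would reject.

```rust
fn main() {
    let mut text = String::from("hello");

    // Any number of immutable references may coexist.
    let r1 = &text;
    let r2 = &text;
    println!("{} and {}", r1, r2);

    // r1 and r2 are not used past this point, so a single mutable
    // reference is now allowed.
    let r3 = &mut text;
    r3.push_str(", world");
    println!("{}", r3);

    // Uncommenting the next line would be rejected (error E0502), because
    // r1 would then still be in use while r3 mutably borrows `text`:
    // println!("{}", r1);
}
```

The compiler treats a borrow as ending after its last use, which is why r3 can be created even though r1 and r2 are still in scope.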
Lifetimes
Lifetimes are a way of tracking the scope of a reference to an object in memory. In Rust, every value has one owner, and when the owner goes out of scope, the value is dropped, and its memory is freed. Lifetimes allow Rust to ensure that a reference to an object remains valid for as long as it’s needed.
In Rust, lifetimes are denoted using the 'a syntax, where 'a is a placeholder for the actual lifetime. The lifetime can be declared as a generic parameter on a function, struct, or trait using angle brackets. The following is an example:
Your output would look like this:
Here, a struct Path is defined with two fields, point_x and point_y, each of which is a reference to an i32. The i32 type represents a signed 32-bit integer in the range -2147483648 to 2147483647. The lifetime 'a specifies that the references must live at least as long as the instance of the struct.
Now, take a look at this example, which is similar to the code above but results in an error:
Can you see the error? Your output will look like this:
In this example, the compiler reports an error because temp goes out of scope while a reference to it is still in use. The error prevents any further use of p_y in the program because the value of temp has already been dropped. The issue arises because p_y is assigned a borrowed reference, &temp, which cannot outlive temp. In other words, the error occurs due to mismatched lifetimes.
Take a look at a modified version of the previous code snippet:
As you can see, this approach works because temp and p_y have the same lifetime, so temp exists for as long as the reference to it is used. This means that the reference to temp, assigned to p_y, remains valid and can be used throughout the program.
Lifetime Elision
Rust’s lifetime elision rules allow the compiler to infer lifetimes in specific situations, which reduces the amount of boilerplate code that is needed. There are three lifetime elision rules:
- Each parameter that is a reference gets its own lifetime parameter. In other words, a function with one parameter of type &T has a single lifetime parameter, as in fn foo<'a>(x: &'a T).
- If there is exactly one input lifetime parameter (for example, a single &self, &mut self, or &T parameter), that lifetime is assigned to all output lifetime parameters.
- If there are multiple input lifetime parameters but one of them is &self or &mut self, the lifetime of self is assigned to all output lifetime parameters.
Lifetime Annotation Syntax
Lifetime annotations don’t change how long any of the references live. Rather, they describe the relationships of the lifetimes of multiple references to each other without affecting the lifetimes. Just as functions can accept any type when the signature specifies a generic type parameter, functions can accept references with any lifetime by specifying a generic lifetime parameter.
Lifetime annotations have a slightly unusual syntax: the names of lifetime parameters must start with an apostrophe (') and are usually all lowercase and very short, like generic types. Most people use the name 'a for the first lifetime annotation. We place lifetime parameter annotations after the & of a reference, using a space to separate the annotation from the reference’s type.
Here are some examples: a reference to an i32 without a lifetime parameter (&i32), a reference to an i32 that has a lifetime parameter named 'a (&'a i32), and a mutable reference to an i32 that also has the lifetime 'a (&'a mut i32).
One lifetime annotation by itself doesn’t have much meaning, because the annotations are meant to tell Rust how generic lifetime parameters of multiple references relate to each other. Let’s examine how the lifetime annotations relate to each other in the context of the longest function.
Lifetime Annotations in Function Signatures
To use lifetime annotations in function signatures, we need to declare the generic lifetime parameters inside angle brackets between the function name and the parameter list, just as we did with generic type parameters.
We want the signature to express the following constraint: the returned reference will be valid as long as both the parameters are valid. This is the relationship between lifetimes of the parameters and the return value. We’ll name the lifetime 'a and then add it to each reference, as shown in Listing 10-21.
Filename: src/main.rs
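A sketch of the function this listing describes, with every reference in the signature annotated with 'a. The main function is an assumption about how the earlier listing calls longest, and the string values are illustrative.

```rust
// Both parameters and the return value share the lifetime 'a.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    // Assumed inputs: an owned String and a string literal.
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {}", result);
}
```

With these inputs, the program prints The longest string is abcd.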
Listing 10-21: The longest function definition specifying that all the references in the signature must have the same lifetime 'a
This code should compile and produce the result we want when we use it with the main function in Listing 10-19.
The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a. In practice, it means that the lifetime of the reference returned by the longest function is the same as the smaller of the lifetimes of the values referred to by the function arguments. These relationships are what we want Rust to use when analyzing this code.
Remember, when we specify the lifetime parameters in this function signature, we’re not changing the lifetimes of any values passed in or returned. Rather, we’re specifying that the borrow checker should reject any values that don’t adhere to these constraints. Note that the longest function doesn’t need to know exactly how long x and y will live, only that some scope can be substituted for 'a that will satisfy this signature.
When annotating lifetimes in functions, the annotations go in the function signature, not in the function body. The lifetime annotations become part of the contract of the function, much like the types in the signature. Having function signatures contain the lifetime contract means the analysis the Rust compiler does can be simpler. If there’s a problem with the way a function is annotated or the way it is called, the compiler errors can point to the part of our code and the constraints more precisely. If, instead, the Rust compiler made more inferences about what we intended the relationships of the lifetimes to be, the compiler might only be able to point to a use of our code many steps away from the cause of the problem.
When we pass concrete references to longest, the concrete lifetime that is substituted for 'a is the part of the scope of x that overlaps with the scope of y. In other words, the generic lifetime 'a will get the concrete lifetime that is equal to the smaller of the lifetimes of x and y. Because we’ve annotated the returned reference with the same lifetime parameter 'a, the returned reference will also be valid for the length of the smaller of the lifetimes of x and y.
Let’s look at how the lifetime annotations restrict the longest function by passing in references that have different concrete lifetimes. Listing 10-22 is a straightforward example.
Filename: src/main.rs
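A sketch of the scenario this listing describes: string1 lives in the outer scope, while string2 and result live in an inner scope. The value of string2 is an assumption; the value of string1 is taken from the output quoted below.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    // Valid until the end of the outer scope.
    let string1 = String::from("long string is long");

    {
        // Valid only until the end of this inner scope.
        let string2 = String::from("xyz");
        // `result` is used only while both strings are still alive.
        let result = longest(string1.as_str(), string2.as_str());
        println!("The longest string is {}", result);
    }
}
```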
Listing 10-22: Using the longest function with references to String values that have different concrete lifetimes
In this example, string1 is valid until the end of the outer scope, string2 is valid until the end of the inner scope, and result references something that is valid until the end of the inner scope. Run this code, and you’ll see that the borrow checker approves; it will compile and print The longest string is long string is long.
Next, let’s try an example that shows that the lifetime of the reference in result must be the smaller lifetime of the two arguments. We’ll move the declaration of the result variable outside the inner scope but leave the assignment of the value to the result variable inside the scope with string2. Then we’ll move the println! that uses result to outside the inner scope, after the inner scope has ended. The code in Listing 10-23 will not compile.
Filename: src/main.rs
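A sketch of the arrangement this listing describes. So that the sketch itself still compiles, the offending println! is left commented out and an extra println! inside the inner scope (an addition for illustration) shows a use that is accepted.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let string1 = String::from("long string is long");
    // Declared in the outer scope, assigned in the inner scope.
    let result;

    {
        let string2 = String::from("xyz");
        result = longest(string1.as_str(), string2.as_str());
        // Using `result` here is fine: `string2` is still alive.
        println!("Inside the inner scope: {}", result);
    } // `string2` is dropped here.

    // Uncommenting the next line makes the program fail to compile with
    // error E0597 (`string2` does not live long enough):
    // println!("The longest string is {}", result);
}
```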
Listing 10-23: Attempting to use result after string2 has gone out of scope
When we try to compile this code, we get this error:
The error shows that for result to be valid for the println! statement, string2 would need to be valid until the end of the outer scope. Rust knows this because we annotated the lifetimes of the function parameters and return values using the same lifetime parameter 'a.
Thinking in Terms of Lifetimes
The way in which you need to specify lifetime parameters depends on what your function is doing. For example, if we changed the implementation of the longest function to always return the first parameter rather than the longest string slice, we wouldn’t need to specify a lifetime on the y parameter. The following code will compile:
Filename: src/main.rs
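A sketch of such a function, where only x and the return type carry 'a. The main function is an illustrative assumption that shows the practical effect: the result may outlive the argument passed as y.

```rust
// Only `x` is tied to the return value; `y` gets no explicit lifetime.
// The attribute silences the unused-parameter warning for `y`.
#[allow(unused_variables)]
fn longest<'a>(x: &'a str, y: &str) -> &'a str {
    x
}

fn main() {
    let string1 = String::from("abcd");
    let result;

    {
        let string2 = String::from("xyz");
        // The borrow of `string2` is not connected to the return value.
        result = longest(string1.as_str(), string2.as_str());
    } // `string2` is dropped here, and `result` is still valid.

    println!("The first string is {}", result);
}
```

With these assumed inputs, the program prints The first string is abcd.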
We’ve specified a lifetime parameter 'a for the parameter x and the return type, but not for the parameter y, because the lifetime of y does not have any relationship with the lifetime of x or the return value.
When returning a reference from a function, the lifetime parameter for the return type needs to match the lifetime parameter for one of the parameters. If the reference returned does not refer to one of the parameters, it must refer to a value created within this function. However, this would be a dangling reference because the value will go out of scope at the end of the function. Consider this attempted implementation of the longest function that won’t compile:
Filename: src/main.rs
Here, even though we’ve specified a lifetime parameter 'a for the return type, this implementation will fail to compile because the return value lifetime is not related to the lifetime of the parameters at all. Here is the error message we get:
The problem is that result goes out of scope and gets cleaned up at the end of the longest function. We’re also trying to return a reference to result from the function. There is no way we can specify lifetime parameters that would change the dangling reference, and Rust won’t let us create a dangling reference. In this case, the best fix would be to return an owned data type rather than a reference so the calling function is then responsible for cleaning up the value.
Ultimately, lifetime syntax is about connecting the lifetimes of various parameters and return values of functions. Once they’re connected, Rust has enough information to allow memory-safe operations and disallow operations that would create dangling pointers or otherwise violate memory safety.
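The fix mentioned above, returning an owned value instead of a reference, might look like the following sketch. The function name longest_owned is hypothetical, and the commented-out code is an assumed shape for the failing variant.

```rust
// Assumed shape of the failing variant (rejected by the borrow checker,
// because it returns a reference to a value owned by the function):
//
// fn longest<'a>(x: &str, y: &str) -> &'a str {
//     let result = String::from("really long string");
//     result.as_str()
// }

// Returning an owned String hands responsibility for the value to the
// caller, so no lifetime annotation is needed on the return type.
fn longest_owned(x: &str, y: &str) -> String {
    let longer = if x.len() > y.len() { x } else { y };
    longer.to_string()
}

fn main() {
    let result = longest_owned("abcd", "xyz");
    println!("The longest string is {}", result);
}
```

The cost of this design is an allocation and a copy on each call, which is usually acceptable when no borrowed value can be returned.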
Lifetime Annotations in Struct Definitions
So far, the structs we’ve defined all hold owned types. We can define structs to hold references, but in that case we would need to add a lifetime annotation on every reference in the struct’s definition. Listing 10-24 has a struct named ImportantExcerpt that holds a string slice.
Filename: src/main.rs
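A sketch of the struct and the main function described by this listing. The names ImportantExcerpt, part, and novel come from the text; the novel text, the way the first sentence is extracted, and the final println! are assumptions.

```rust
// The struct holds a string slice, so it needs a lifetime parameter.
struct ImportantExcerpt<'a> {
    part: &'a str,
}

fn main() {
    // `novel` owns the text (the text itself is an assumption).
    let novel = String::from("Call me Ishmael. Some years ago...");

    // Borrow the first sentence from `novel`.
    let first_sentence = novel.split('.').next().expect("Could not find a '.'");

    // The instance cannot outlive `novel`, which it borrows from.
    let excerpt = ImportantExcerpt { part: first_sentence };
    println!("{}", excerpt.part);
}
```

With this text, the program prints Call me Ishmael.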
Listing 10-24: A struct that holds a reference, requiring a lifetime annotation
This struct has the single field part that holds a string slice, which is a reference. As with generic data types, we declare the name of the generic lifetime parameter inside angle brackets after the name of the struct so we can use the lifetime parameter in the body of the struct definition. This annotation means an instance of ImportantExcerpt can’t outlive the reference it holds in its part field.
The main function here creates an instance of the ImportantExcerpt struct that holds a reference to the first sentence of the String owned by the variable novel. The data in novel exists before the ImportantExcerpt instance is created. In addition, novel doesn’t go out of scope until after the ImportantExcerpt goes out of scope, so the reference in the ImportantExcerpt instance is valid.
Lifetime Elision
You’ve learned that every reference has a lifetime and that you need to specify lifetime parameters for functions or structs that use references. However, in Chapter 4 we had a function in Listing 4-9, shown again in Listing 10-25, that compiled without lifetime annotations.
Filename: src/lib.rs
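A sketch of a first_word function of the kind this listing refers to: it takes a string slice and returns a slice of the first word, with no lifetime annotations. The body is an assumption, and main is added only so that the sketch runs on its own.

```rust
// No lifetime annotations: the elision rules supply them.
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    // Return the slice up to the first space, if there is one.
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    // No space found: the whole string is one word.
    &s[..]
}

fn main() {
    println!("{}", first_word("hello world"));
}
```

The usage in main prints hello.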
Listing 10-25: A function we defined in Listing 4-9 that compiled without lifetime annotations, even though the parameter and return type are references
The reason this function compiles without lifetime annotations is historical: in early versions (pre-1.0) of Rust, this code wouldn’t have compiled because every reference needed an explicit lifetime. At that time, the function signature would have been written like this:
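For comparison, here is a sketch of the same function with the lifetime written out explicitly. Only the signature differs from the elided form; the body and main are the same assumptions as before.

```rust
// The explicit form: one input lifetime, reused for the output.
fn first_word<'a>(s: &'a str) -> &'a str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

fn main() {
    println!("{}", first_word("hello world"));
}
```

Current compilers accept both forms and treat them identically.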
After writing a lot of Rust code, the Rust team found that Rust programmers were entering the same lifetime annotations over and over in particular situations. These situations were predictable and followed a few deterministic patterns. The developers programmed these patterns into the compiler’s code so the borrow checker could infer the lifetimes in these situations and wouldn’t need explicit annotations.
This piece of Rust history is relevant because it’s possible that more deterministic patterns will emerge and be added to the compiler. In the future, even fewer lifetime annotations might be required.
The patterns programmed into Rust’s analysis of references are called the lifetime elision rules. These aren’t rules for programmers to follow; they’re a set of particular cases that the compiler will consider, and if your code fits these cases, you don’t need to write the lifetimes explicitly.
The elision rules don’t provide full inference. If Rust deterministically applies the rules but there is still ambiguity as to what lifetimes the references have, the compiler won’t guess what the lifetime of the remaining references should be. Instead of guessing, the compiler will give you an error that you can resolve by adding the lifetime annotations.
Lifetimes on function or method parameters are called input lifetimes, and lifetimes on return values are called output lifetimes.
The compiler uses three rules to figure out the lifetimes of the references when there aren’t explicit annotations. The first rule applies to input lifetimes, and the second and third rules apply to output lifetimes. If the compiler gets to the end of the three rules and there are still references for which it can’t figure out lifetimes, the compiler will stop with an error. These rules apply to fn definitions as well as impl blocks.
- The first rule is that the compiler assigns a lifetime parameter to each parameter that’s a reference. In other words, a function with one parameter gets one lifetime parameter: fn foo<'a>(x: &'a i32); a function with two parameters gets two separate lifetime parameters: fn foo<'a, 'b>(x: &'a i32, y: &'b i32); and so on.
- The second rule is that, if there is exactly one input lifetime parameter, that lifetime is assigned to all output lifetime parameters: fn foo<'a>(x: &'a i32) -> &'a i32.
- The third rule is that, if there are multiple input lifetime parameters, but one of them is &self or &mut self because this is a method, the lifetime of self is assigned to all output lifetime parameters. This third rule makes methods much nicer to read and write because fewer symbols are necessary.
Let’s pretend we’re the compiler. We’ll apply these rules to figure out the lifetimes of the references in the signature of the first_word function in Listing 10-25. The signature starts without any lifetimes associated with the references: fn first_word(s: &str) -> &str.
Then the compiler applies the first rule, which specifies that each parameter gets its own lifetime. We’ll call it 'a as usual, so now the signature is fn first_word<'a>(s: &'a str) -> &str.
The second rule applies because there is exactly one input lifetime. The second rule specifies that the lifetime of the one input parameter gets assigned to the output lifetime, so the signature is now fn first_word<'a>(s: &'a str) -> &'a str.
Now all the references in this function signature have lifetimes, and the compiler can continue its analysis without needing the programmer to annotate the lifetimes in this function signature.
Let’s look at another example, this time using the longest function that had no lifetime parameters when we started working with it in Listing 10-20: fn longest(x: &str, y: &str) -> &str.
Let’s apply the first rule: each parameter gets its own lifetime. This time we have two parameters instead of one, so we have two lifetimes: fn longest<'a, 'b>(x: &'a str, y: &'b str) -> &str.
You can see that the second rule doesn’t apply because there is more than one input lifetime. The third rule doesn’t apply either, because longest is a function rather than a method, so none of the parameters are self. After working through all three rules, we still haven’t figured out what the return type’s lifetime is. This is why we got an error trying to compile the code in Listing 10-20: the compiler worked through the lifetime elision rules but still couldn’t figure out all the lifetimes of the references in the signature.
Because the third rule really only applies in method signatures, we’ll look at lifetimes in that context next to see why the third rule means we don’t have to annotate lifetimes in method signatures very often.
Lifetime Annotations in Method Definitions
When we implement methods on a struct with lifetimes, we use the same syntax as that of generic type parameters shown in Listing 10-11. Where we declare and use the lifetime parameters depends on whether they’re related to the struct fields or the method parameters and return values.
Lifetime names for struct fields always need to be declared after the impl keyword and then used after the struct’s name, because those lifetimes are part of the struct’s type.
In method signatures inside the impl block, references might be tied to the lifetime of references in the struct’s fields, or they might be independent. In addition, the lifetime elision rules often make it so that lifetime annotations aren’t necessary in method signatures. Let’s look at some examples using the struct named ImportantExcerpt that we defined in Listing 10-24.
First, we’ll use a method named level whose only parameter is a reference to self and whose return value is an i32, which is not a reference to anything:
The lifetime parameter declaration after impl and its use after the type name are required, but we’re not required to annotate the lifetime of the reference to self because of the first elision rule.
Here is an example where the third lifetime elision rule applies:
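A sketch of both methods discussed in this section. The names level and announcement come from the text; the method name announce_and_return_part, the return value 3, and the printed message are assumptions.

```rust
struct ImportantExcerpt<'a> {
    part: &'a str,
}

// The lifetime is declared after `impl` and used after the type name.
impl<'a> ImportantExcerpt<'a> {
    // First elision rule: `&self` gets its own lifetime. The return
    // type is not a reference, so nothing else needs annotating.
    fn level(&self) -> i32 {
        3
    }

    // Third elision rule: with `&self` among the inputs, the returned
    // reference gets the lifetime of `&self`.
    fn announce_and_return_part(&self, announcement: &str) -> &str {
        println!("Attention please: {}", announcement);
        self.part
    }
}

fn main() {
    let excerpt = ImportantExcerpt { part: "Call me Ishmael" };
    println!("level = {}", excerpt.level());
    println!("part = {}", excerpt.announce_and_return_part("new excerpt"));
}
```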
There are two input lifetimes, so Rust applies the first lifetime elision rule and gives both &self and announcement their own lifetimes. Then, because one of the parameters is &self, the return type gets the lifetime of &self, and all lifetimes have been accounted for.
The Static Lifetime
One special lifetime we need to discuss is 'static, which denotes that the affected reference can live for the entire duration of the program. All string literals have the 'static lifetime, which we can annotate as follows: let s: &'static str = "I have a static lifetime.";
The text of this string is stored directly in the program’s binary, which is always available. Therefore, the lifetime of all string literals is 'static.
You might see suggestions to use the 'static lifetime in error messages. But before specifying 'static as the lifetime for a reference, think about whether the reference you have actually lives the entire lifetime of your program or not, and whether you want it to. Most of the time, an error message suggesting the 'static lifetime results from attempting to create a dangling reference or a mismatch of the available lifetimes. In such cases, the solution is fixing those problems, not specifying the 'static lifetime.
Generic Type Parameters, Trait Bounds, and Lifetimes Together
Let’s briefly look at the syntax of specifying generic type parameters, trait bounds, and lifetimes all in one function!
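A sketch of a function that combines all three. The parameter name ann, the type parameter T, the Display bound, and the where clause follow the description below; the function name longest_with_an_announcement and the printed message are assumptions.

```rust
use std::fmt::Display;

// The lifetime 'a and the type parameter T share one generic list.
fn longest_with_an_announcement<'a, T>(x: &'a str, y: &'a str, ann: T) -> &'a str
where
    T: Display,
{
    // `{}` requires T to implement Display.
    println!("Announcement! {}", ann);
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let result = longest_with_an_announcement("abcd", "xyz", "Today is someone's birthday!");
    println!("The longest string is {}", result);
}
```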
This is the longest function from Listing 10-21 that returns the longer of two string slices. But now it has an extra parameter named ann of the generic type T, which can be filled in by any type that implements the Display trait as specified by the where clause. This extra parameter will be printed using {}, which is why the Display trait bound is necessary. Because lifetimes are a type of generic, the declarations of the lifetime parameter 'a and the generic type parameter T go in the same list inside the angle brackets after the function name.