Understanding Ownership in Rust

I have been learning Rust using the Rust Book. These are the notes I took while reading the ownership chapter of the book.

What is Ownership

Ownership is Rust’s central feature: it is Rust’s approach to managing memory through a set of rules that the compiler checks at compile time. Before looking at those rules, it is important to understand the concepts of the stack and the heap.

Stack and Heap

  1. The stack and the heap are both parts of memory available to your code at runtime, but they are structured in different ways.
  2. The stack stores values in last in, first out order. Adding data is called pushing onto the stack, and removing data is called popping off the stack.
  3. All data stored on the stack must have a known, fixed size. Data whose size is unknown at compile time, or whose size might change, must be stored on the heap.
  4. The heap is less organized: when you put data on the heap, you request a certain amount of space. The memory allocator finds an empty spot in the heap that is big enough, marks it as being in use, and returns a pointer, which is the address of that location. This process is called allocating on the heap and is sometimes abbreviated as just allocating.
  5. Pushing values onto the stack is not considered allocating. Because the pointer to the heap is a known, fixed size, you can store the pointer on the stack, but when you want the actual data, you must follow the pointer.
  6. Pushing to the stack is faster than allocating on the heap because the allocator never has to search for a place to store new data; that location is always at the top of the stack. Comparatively, allocating space on the heap requires more work, because the allocator must first find a big enough space to hold the data and then perform bookkeeping to prepare for the next allocation.
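
To make the stack and heap distinction concrete, here is a minimal sketch. It uses Box, which is not covered in these notes and is assumed here only as a simple way to put a single value on the heap; the function name sum_stack_and_heap is hypothetical.

```rust
use std::mem::size_of;

// Adds a value stored on the stack to a value stored on the heap.
fn sum_stack_and_heap() -> i32 {
    // An i32 has a known, fixed size, so it is stored on the stack.
    let on_stack: i32 = 40;

    // Box::new allocates space for the value on the heap and returns a
    // pointer to it. The pointer itself is a fixed-size value on the stack.
    let on_heap: Box<i32> = Box::new(2);

    // To read the heap data we follow the pointer by dereferencing it.
    on_stack + *on_heap
}

fn main() {
    println!("sum = {}", sum_stack_and_heap());
    println!("a Box<i32> pointer takes {} bytes", size_of::<Box<i32>>());
}
```

The pointer held in on_heap has the same size whatever it points to, which is why it can be kept on the stack.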

Ownership rules

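To understand the ownership rules, we will use the String type, which supports text that is mutable and growable. The rules themselves are: each value has an owner, there can be only one owner at a time, and when the owner goes out of scope the value is dropped. Because the size of the text is not known at compile time, a String keeps its contents on the heap, which means: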

  1. Memory must be requested from the allocator at runtime, which is done when we call String::from.
  2. When we are done with the string, we need a way to return the memory to the allocator. In languages with a garbage collector, the collector keeps track of memory that is no longer used and cleans it up. In Rust, the memory is returned automatically once the variable that owns it goes out of scope:
{
    let s = String::from("hello"); // s is valid from this point onwards

    println!("{}", s);
} // the scope is over, and s is no longer valid
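
The cleanup at the end of a scope can be observed directly. The sketch below assumes a hypothetical Tracker type that records a message in a shared log when it is dropped. It relies on the Drop trait, RefCell, and a lifetime annotation, none of which are covered in these notes.

```rust
use std::cell::RefCell;

// Hypothetical helper type: writes to a shared log when it is dropped.
struct Tracker<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Tracker<'a> {
    // Rust calls this automatically when a Tracker goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

// Returns the recorded events in the order they happened.
fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _t = Tracker { name: "t", log: &log };
        log.borrow_mut().push(String::from("inside scope"));
    } // _t goes out of scope here, so drop runs before the next line
    log.borrow_mut().push(String::from("after scope"));
    log.into_inner()
}

fn main() {
    for line in run() {
        println!("{}", line);
    }
}
```

The "dropped t" entry lands between the other two entries, which shows that the cleanup happens exactly at the closing brace of the inner scope.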

Variables that have a known fixed size are pushed on to the stack:

let x = 5;
let y = x; //y is a copy of x and is assigned on the stack

But the situation changes when we use a String, for example:

let s1 = String::from("Hello");
let s2 = s1; 

The memory representation looks like this:

[Figure: memory representation of the String bound to s1. Image taken from the Rust Book.]

When s1 is assigned to s2, the String data on the stack is copied, i.e. the pointer, length, and capacity. The data on the heap that the pointer refers to is not copied. This representation looks like:

[Figure: s1 and s2 both pointing to the same data on the heap.]
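
One way to check that only the stack part of the String is copied is to compare the heap pointer, length, and capacity before and after the assignment. This is a small sketch; the function name moved_string_keeps_buffer is hypothetical.

```rust
// Returns true if the String bound to s2 uses the same heap buffer,
// length, and capacity that s1 had before the assignment.
fn moved_string_keeps_buffer() -> bool {
    let s1 = String::from("Hello");

    // Record the three values that live on the stack for s1.
    let ptr_before = s1.as_ptr();
    let len_before = s1.len();
    let cap_before = s1.capacity();

    // Only the pointer, length, and capacity are copied into s2.
    let s2 = s1;

    ptr_before == s2.as_ptr()
        && len_before == s2.len()
        && cap_before == s2.capacity()
}

fn main() {
    println!("same buffer after assignment: {}", moved_string_keeps_buffer());
}
```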

When a variable goes out of scope, Rust automatically calls the drop function to clean up the heap memory for that variable. If s1 and s2 were both still valid when they went out of scope, they would both try to free the same memory. This is known as a double free error and is a memory safety bug. Rust avoids it: the moment we assign s1 to s2, Rust considers s1 no longer valid, so nothing needs to be freed when s1 goes out of scope. Trying to use s1 after the assignment does not compile:

let s1 = String::from("hello");
let s2 = s1; // s1 is moved into s2: like a shallow copy, but s1 is also invalidated

println!("{}, world!", s1);

The compiler shows this error:

error[E0382]: borrow of moved value: `s1`
  --> src/main.rs:18:28
   |
15 |     let s1 = String::from("hello");
   |         -- move occurs because `s1` has type `String`, which does not implement the `Copy` trait
16 |     let s2 = s1;
   |              -- value moved here
17 | 
18 |     println!("{}, world!", s1);
   |                            ^^ value borrowed here after move

In this case, s1 was moved into s2.

If we want a deep copy, one that also copies the heap data, Rust provides a common method called clone, e.g.:

fn main() {
    let s1 = String::from("hello");
    let s2 = s1.clone(); //Deep Copy

    println!("s1 = {}, s2 = {}", s1, s2);
}
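
A short sketch of what the deep copy means in practice: after clone, the two Strings have separate heap data, so changing one does not affect the other. The function name clone_and_modify is hypothetical.

```rust
// Clones a String, modifies the clone, and returns both values.
fn clone_and_modify() -> (String, String) {
    let s1 = String::from("hello");

    // clone copies the heap data as well as the stack data.
    let mut s2 = s1.clone();

    // Changing the clone leaves the original untouched.
    s2.push_str(", world");

    (s1, s2)
}

fn main() {
    let (s1, s2) = clone_and_modify();
    println!("s1 = {}, s2 = {}", s1, s2);
}
```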

Stack-only data behaves differently, for example:

fn main() {
    let x = 5;
    let y = x;

    println!("x = {}, y = {}", x, y);
}

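Here, although x is assigned to y, x can still be used afterwards. Types such as integers have a size known at compile time and are stored entirely on the stack, so copies are quick to make and Rust copies the value instead of moving it. Such types implement the Copy trait.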
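
The same behaviour can be given to our own stack-only types. As a sketch, assuming a hypothetical Point struct made only of integers, deriving Copy lets the original be used after an assignment:

```rust
// Hypothetical struct whose fields are all stack-only integers, so it
// is allowed to derive Copy (Copy also requires Clone).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Copies a Point, modifies the copy, and returns both values.
fn copy_point() -> (Point, Point) {
    let p1 = Point { x: 1, y: 2 };

    // Because Point is Copy, this copies p1 instead of moving it.
    let mut p2 = p1;
    p2.x = 10;

    // p1 is still valid here and keeps its original value.
    (p1, p2)
}

fn main() {
    let (p1, p2) = copy_point();
    println!("p1 = {:?}, p2 = {:?}", p1, p2);
}
```

A struct that contains a String cannot derive Copy, because String itself does not implement Copy.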

Ownership and Functions

Consider the following example from the book:

fn main() {
    let s = String::from("hello");  // s comes into scope

    takes_ownership(s);             // s's value moves into the function...
                                    // ... and so is no longer valid here

    let x = 5;                      // x comes into scope

    makes_copy(x);                  // x would move into the function,
                                    // but i32 is Copy, so it's okay to still
                                    // use x afterward

} // Here, x goes out of scope, then s. But because s's value was moved, nothing
  // special happens.

fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{}", some_string);
} // Here, some_string goes out of scope and `drop` is called. The backing
  // memory is freed.

fn makes_copy(some_integer: i32) { // some_integer comes into scope
    println!("{}", some_integer);
} // Here, some_integer goes out of scope. Nothing special happens.

Functions that return values transfer ownership back to the caller, as in this example:

fn main() {
    let s1 = gives_ownership();         // gives_ownership moves its return
                                        // value into s1

    let s2 = String::from("hello");     // s2 comes into scope

    let s3 = takes_and_gives_back(s2);  // s2 is moved into
                                        // takes_and_gives_back, which also
                                        // moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 was moved, so nothing
  // happens. s1 goes out of scope and is dropped.

fn gives_ownership() -> String {             // gives_ownership will move its
                                             // return value into the function
                                             // that calls it

    let some_string = String::from("yours"); // some_string comes into scope

    some_string                              // some_string is returned and
                                             // moves out to the calling
                                             // function
}

// This function takes a String and returns one
fn takes_and_gives_back(a_string: String) -> String { // a_string comes into
                                                      // scope

    a_string  // a_string is returned and moves out to the calling function
}

The pattern used here is: assigning a value to another variable moves it. When a variable that includes data on the heap goes out of scope, the value will be cleaned up by drop unless the data has been moved to be owned by another variable.
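
Following this pattern, a function that only needs to inspect a String still has to hand it back if the caller wants to keep using it. One way to do that, sketched here with a hypothetical function named length_and_give_back, is to return the String together with the result in a tuple:

```rust
// Takes ownership of a String and returns it together with its length,
// so that the caller gets ownership back.
fn length_and_give_back(s: String) -> (String, usize) {
    let length = s.len(); // len() returns the length in bytes

    (s, length)
}

fn main() {
    let s1 = String::from("hello");

    // s1 is moved into the function; the returned String is bound to s2.
    let (s2, len) = length_and_give_back(s1);

    println!("The length of '{}' is {}.", s2, len);
}
```

This works, but passing the value in and out again is a lot of ceremony for such a simple operation.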

What if a function does not need ownership and just needs to use the value? Rust provides the option of passing references using the & operator. Here is an example that uses references:

fn main() {
    let s1 = String::from("hello");

    let len = calculate_length(&s1);

    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize {
    s.len()
}

From the above code, we understand the following:

  1. The &s1 syntax creates a reference that refers to the value of s1 but does not own it.
  2. Since it does not own it, the value it points to will not be dropped when the reference stops being used.
  3. Likewise, the signature of the function uses & to indicate that the type of the parameter s is a reference.
  4. The scope in which the variable s is valid is the same as any function parameter’s scope, but we don’t drop what the reference points to when s stops being used because we don’t have ownership.
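
Because references do not take ownership, several immutable references to the same value can exist at once, and the owner remains usable afterwards. A small sketch, with total_len as a hypothetical helper:

```rust
// Borrows two Strings and returns the sum of their lengths in bytes.
// Neither argument is dropped when the function returns.
fn total_len(a: &String, b: &String) -> usize {
    a.len() + b.len()
}

fn main() {
    let s = String::from("hello");

    // Two immutable references to the same String at the same time.
    let r1 = &s;
    let r2 = &s;

    println!("{} and {} have total length {}", r1, r2, total_len(r1, r2));

    // s still owns the value and can be used after the borrows end.
    println!("s is still usable: {}", s);
}
```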

We cannot modify a borrowed value through an immutable reference. The following will not compile:

fn main() {
    let s = String::from("hello");

    change(&s);
}

fn change(some_string: &String) {
    some_string.push_str(", world");
}

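This fails with an error like the following (the output comes from a local copy of the example in which the parameter is named s):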

 Compiling ownership v0.1.0 (/home/ankur/rust-projects/ownership)
error[E0596]: cannot borrow `*s` as mutable, as it is behind a `&` reference
  --> src/main.rs:39:5
   |
38 | fn change(s:&String){
   |             ------- help: consider changing this to be a mutable reference: `&mut String`
39 |     s.push_str(",world");
   |     ^ `s` is a `&` reference, so the data it refers to cannot be borrowed as mutable

error: aborting due to previous error

Mutable References

To correct the above code, we can use a mutable reference. This takes three changes:

  1. Declare the variable s as mut.
  2. Create a mutable reference with &mut s.
  3. Change the function signature to accept a mutable reference.

This is how it looks now:

fn main() {
    let mut s = String::from("hello");

    change(&mut s);
}

fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

The main restriction is that you can have only one mutable reference to a particular piece of data at a time. The benefit of this restriction is that Rust can prevent data races at compile time.
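
The restriction applies to mutable references that are alive at the same time, not to the total number ever created. As a sketch (the function name append_twice is hypothetical), a nested scope lets two mutable references be used one after the other:

```rust
// Modifies a String through two mutable references, one after the other.
fn append_twice() -> String {
    let mut s = String::from("hello");

    {
        let r1 = &mut s;
        r1.push_str(", world");
    } // r1 goes out of scope here, so a new mutable reference is allowed

    let r2 = &mut s;
    r2.push('!');

    // Holding both at once would be rejected with error E0499, e.g.:
    // let a = &mut s;
    // let b = &mut s;
    // println!("{}, {}", a, b);

    s
}

fn main() {
    println!("{}", append_twice());
}
```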

Dangling References

Rust prevents dangling references by making sure that data will not go out of scope before any reference to it does. The compiler rejects code that tries to create one, as can be seen below:

fn main() {
    let reference_to_nothing = dangle();
}

fn dangle() -> &String { // dangle returns a reference to a String
    let s = String::from("hello"); // s is a new String

    &s // we return a reference to the String, s
} // Here s goes out of scope and is dropped. Its memory goes away.
  // The compiler reports an error for this function.
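
One way to fix this, sketched here with a hypothetical function named no_dangle, is to return the String itself instead of a reference to it. Ownership then moves out to the caller and nothing is deallocated:

```rust
// Returns the String directly, so ownership moves to the caller.
fn no_dangle() -> String {
    let s = String::from("hello");

    s // s is moved out of the function, so it is not dropped here
}

fn main() {
    let owned = no_dangle();
    println!("{}", owned);
}
```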

The Slice Type

Slices let you refer to a contiguous sequence of elements in a collection rather than the whole collection. Let’s say we have a function that takes a string and returns the first word of the string. If the function does not find a space, it assumes that the whole string is one word.

There are two ways for the function to return the result:

  1. It can return the index at which it finds the first space as follows:
fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes(); //Converting the string to bytes

    for (i, &item) in bytes.iter().enumerate() { //iterating over array of bytes
        if item == b' ' { //checking for the byte that represents space
            return i; // returning the index of that
        }
    }

    s.len()
}

fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s); // word will get the value 5

    s.clear(); // this empties the String, making it equal to ""

    // word still has the value 5 here, but there's no more string that
    // we could meaningfully use the value 5 with. word is now totally invalid!
}

With this approach, the index in word can get out of sync with the String itself, and it does not make sense to use the value after the String is emptied.

  2. A better approach is to return a slice of the String:
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s);

    s.clear(); // error!

    println!("the first word is: {}", word);
}

Now we cannot clear the String: clear needs a mutable reference, but the String is still borrowed immutably through word, which is used afterwards in println!.
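
As a possible refinement, the parameter can be typed as &str instead of &String. This is a sketch of that variant, not the version used above. It accepts string literals as well as references to a String, because a &String is converted to &str automatically at the call site:

```rust
// Returns the first word of the input as a slice of the input.
// If there is no space, the whole input is treated as one word.
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[..i]; // slice from the start up to the first space
        }
    }

    s
}

fn main() {
    let owned = String::from("hello world");

    // Works on a reference to a String...
    println!("first word: {}", first_word(&owned));

    // ...and directly on a string literal, which is already a &str.
    println!("first word: {}", first_word("single"));
}
```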

That’s all there is to it for now!

References

  1. The Rust Programming Language (the Rust Book)