Chapter 13 - Functional Language Features: Iterators and Closures

Motivation for Closure


#![allow(unused)]
fn main() {
fn generate_workout(intensity: u32, random_number: u32) {
    if intensity < 25 {
        println!(
            "Today, do {} pushups!",
            simulated_expensive_calculation(intensity)
        );
        println!(
            "Next, do {} situps!",
            simulated_expensive_calculation(intensity)
        );
    } else {
        if random_number == 3 {
            println!("Take a break today! Remember to stay hydrated!");
        } else {
            println!(
                "Today, run for {} minutes!",
                simulated_expensive_calculation(intensity)
            );
        }
    }
}
}

Cons: In the above function, you call simulated_expensive_calculation twice in the first if block. Let's improve it:


#![allow(unused)]
fn main() {
fn generate_workout(intensity: u32, random_number: u32) {
    let expensive_result =
        simulated_expensive_calculation(intensity);

    if intensity < 25 {
        println!(
            "Today, do {} pushups!",
            expensive_result
        );
        println!(
            "Next, do {} situps!",
            expensive_result
        );
    } else {
        if random_number == 3 {
            println!("Take a break today! Remember to stay hydrated!");
        } else {
            println!(
                "Today, run for {} minutes!",
                expensive_result
            );
        }
    }
}
}

In the above implementation, the expensive computation is computed only once. Unfortantely for cases where intensity >= 25 && random_number == 3, we have to perform the expensive computation although it isn't required. Let's use closures here.

To define a closure, we start with a pair of vertical pipes (|), inside which we specify the parameters to the closure:


#![allow(unused)]
fn main() {
fn generate_workout(intensity: u32, random_number: u32) {
    let expensive_closure = |num| {
        println!("calculating slowly...");
        thread::sleep(Duration::from_secs(2));
        num
    };

    if intensity < 25 {
        println!(
            "Today, do {} pushups!",
            expensive_closure(intensity)
        );
        println!(
            "Next, do {} situps!",
            expensive_closure(intensity)
        );
    } else {
        if random_number == 3 {
            println!("Take a break today! Remember to stay hydrated!");
        } else {
            println!(
                "Today, run for {} minutes!",
                expensive_closure(intensity)
            );
        }
    }
}
}

However the above implementation has the same problem of the first variant. We could fix this problem by creating a variable local to that if block to hold the result of calling the closure, but closures provide us with another solution. Let's learn something more before finding out solution to the above problem.

Closure Type Inference and Annotation

Closures don’t require you to annotate the types of the parameters or the return value like fn functions do. But we can add type annotations if we want to increase explicitness and clarity at the cost of being more verbose than is strictly necessary.


#![allow(unused)]
fn main() {
let expensive_closure = |num: u32| -> u32 {
    println!("calculating slowly...");
    thread::sleep(Duration::from_secs(2));
    num
};
}

Closure definitions will have one concrete type inferred for each of their parameters and for their return value. The following code won't compile:


#![allow(unused)]
fn main() {
let example_closure = |x| x;

let s = example_closure(String::from("hello"));
let n = example_closure(5);
}

Storing Closures Using Generic Parameters and the Fn Traits

One solution to the above function generate_workout is to save the result of the expensive closure in a variable for reuse and use the variable in each place we need the result.

To make a struct that holds a closure, we need to specify the type of the closure, because a struct definition needs to know the types of each of its fields. Each closure instance has its own unique anonymous type: that is, even if two closures have the same signature, their types are still considered different.

The Fn traits are provided by the standard library. All closures implement at least one of the traits: Fn, FnMut, or FnOnce.


#![allow(unused)]
fn main() {
struct Cacher<T>
    where T: Fn(u32) -> u32
{
    calculation: T,
    value: Option<u32>,
}
}

The Cacher struct has a calculation field of the generic type T. The trait bounds on T specify that it’s a closure by using the Fn trait. Any closure we want to store in the calculation field must have one u32 parameter (specified within the parentheses after Fn) and must return a u32 (specified after the ->).


#![allow(unused)]
fn main() {
impl<T> Cacher<T>
    where T: Fn(u32) -> u32
{
    fn new(calculation: T) -> Cacher<T> {
        Cacher {
            calculation,
            value: None,
        }
    }

    fn value(&mut self, arg: u32) -> u32 {
        match self.value {
            Some(v) => v,
            None => {
                let v = (self.calculation)(arg);
                self.value = Some(v);
                v
            },
        }
    }
}
}

And now the implementation:


#![allow(unused)]
fn main() {
fn generate_workout(intensity: u32, random_number: u32) {
    let mut expensive_result = Cacher::new(|num| {
        println!("calculating slowly...");
        thread::sleep(Duration::from_secs(2));
        num
    });

    if intensity < 25 {
        println!(
            "Today, do {} pushups!",
            expensive_result.value(intensity)
        );
        println!(
            "Next, do {} situps!",
            expensive_result.value(intensity)
        );
    } else {
        if random_number == 3 {
            println!("Take a break today! Remember to stay hydrated!");
        } else {
            println!(
                "Today, run for {} minutes!",
                expensive_result.value(intensity)
            );
        }
    }
}
}

The above implementation doesn't suffer from any of the above cons discussed above. The function is computed only once when required.

But there is a problem with the above implementation. The code will fail (obviously) for this scenario:


#![allow(unused)]
fn main() {
#[test]
fn call_with_different_values() {
    let mut c = Cacher::new(|a| a);

    let v1 = c.value(1);
    let v2 = c.value(2);

    assert_eq!(v2, 2);
}
}

This problem can be fixed by changing the struct implementation to store the key and value mapping in a hashmap.

Capturing the Environment with Closures

In the above example, we used closures as inline anonymous functions. We can also use it to capture their environment and access variables from the scope in which they're defined.

fn main() {
    let x = 4;

    let equal_to_x = |z| z == x;

    let y = 4;

    assert!(equal_to_x(y));
}

whereas something like this will result in an compile error:

fn main() {
    let x = 4;

    fn equal_to_x(z: i32) -> bool { z == x }

    let y = 4;

    assert!(equal_to_x(y));
}

Closures can capture values from their environment in three ways, which directly map to the three ways a function can take a parameter: taking ownership, borrowing mutably, and borrowing immutably. These are encoded in the three Fn traits as follows:

  • FnOnce consumes the variables it captures from its enclosing scope, known as the closure’s environment. To consume the captured variables, the closure must take ownership of these variables and move them into the closure when it is defined. The Once part of the name represents the fact that the closure can’t take ownership of the same variables more than once, so it can be called only once.
  • FnMut can change the environment because it mutably borrows values.
  • Fn borrows values from the environment immutably.

When you create a closure, Rust infers which trait to use based on how the closure uses the values from the environment. All closures implement FnOnce because they can all be called at least once. Closures that don’t move the captured variables also implement FnMut, and closures that don’t need mutable access to the captured variables also implement Fn.

Reddit thread on usecase of FnOnce

If you want to force the closure to take ownership of the values it uses in the environment, you can use the move keyword before the parameter list. This technique is mostly useful when passing a closure to a new thread to move the data so it’s owned by the new thread. Example:

fn main() {
    let x = vec![1, 2, 3];

    let equal_to_x = move |z| z == x;

    println!("can't use x here: {:?}", x);

    let y = vec![1, 2, 3];

    assert!(equal_to_x(y));
}

The above program will result in compile error till you have the printlin statement in the code.

Iterators

Three forms of iteration:

  • iter() iterates over &T
fn main() {
    let v1 = vec![1, 2, 3];

    let v1_iter = v1.iter();
    println!("{:?}", v1);
    for v in v1_iter {
        println!("Got {}", v);
    }
    println!("{:?}", v1);
}
  • iter_mut iterates over &mut T
fn main() {
    let mut v1 = vec![1, 2, 3];

    let v1_iter: std::slice::IterMut<u8> = v1.iter_mut();
    for v in v1_iter {
        *v = *v + 2;
        println!("Got {}", v);
    }
    // println!("{:?}", v1); Uncommenting this results in compile error
}

The above results in a compile error because mutable references have one big restriction: you can have only one mutable reference to a particular piece of data in a particular scope. And in the above code, v1's mutable borrow has already happened and v1_iter has mutable reference to that in the scope. When you try to print it, you try to immutably borrow - but the mixing isn't permitted. So, you can overcome that like this:

fn main() {
    let mut v1 = vec![1, 2, 3];

    {
        let v1_iter: std::slice::IterMut<u8> = v1.iter_mut();
        for v in v1_iter {
            *v = *v + 2;
            println!("Got {}", v);
        }
    }
    println!("{:?}", v1);
}

Note that even this will work as after the for loop ends, the scope of the borrow ends:

fn main() {
    let mut v1 = vec![1, 2, 3];

    for v in v1.iter_mut() {
        *v = *v + 2;
        println!("Got {}", v);
    }
    println!("{:?}", v1);
}
  • into_iter() iterates over T
fn main() {
    let v1 = vec![1, 2, 3];

    let v1_iter: std::vec::IntoIter<u8> = v1.into_iter();
    for v in v1_iter {
        println!("Got {}", v);
    }
    // println!("{:?}", v1); Uncommenting this results in compile error
}

Note that if you restructure it like this, it still won't compile (the reason being v1 is borrowed):

fn main() {
    let v1 = vec![1, 2, 3];
    {
        let v1_iter: std::vec::IntoIter<u8> = v1.into_iter();
        for v in v1_iter {
            println!("Got {}", v);
        }
    }
    println!("{:?}", v1);
}

Other Examples

fn main() {
    let v1: [i32; 3] = [1, 2, 3];
    let v2: Vec<i32> = v1.iter().map(|x| x * 2).collect();
    let v3: Vec<&i32> = v1.iter().filter(|x| **x == 1).collect();
    println!("{:?}", v1);
    println!("{:?}", v2);
    println!("{:?}", v3);
}

Why does v3 is annotated with Vec<&i32> and not Vec<i32> and why does it has ** ?

In v3, we do vi.iter() which passes &i32 into filter. But the type of predicate in filter is FnMut(&Self::Item) -> Bool. So the type of x becomes &&i32. So, you do two de-references to get the value. That answers the second part of the question. The type is Vec<i32> as the type of predicate for map is FnMut(Self::Item) -> B whereas for filter it is FnMut(&Self::Item -> Bool). And hence the different type signature.

Different map variants:

fn main() {
    let mut v1: Vec<i32> = vec![1, 2, 3];
    let v2: Vec<i32> = v1.iter().map(|x| x * 2).collect();
    let v3: Vec<i32> = v1.iter_mut().map(|x| *x * 2).collect();
    let v4: Vec<()> = v1.iter_mut().map(|x| *x = *x * 2).collect();
    let v5: Vec<&mut i32> = v1
        .iter_mut()
        .map(|x| {
            *x = *x * 2;
            x
        }).collect();

    // println!("{:?}", v1); Uncommenting this will result in an compile error
    println!("{:?}", v2);
    println!("{:?}", v3);
    println!("{:?}", v4);
    println!("{:?}", v5);
}

Note that v4 style is not recommened. Uncommenting the line will result in compile error because v5 has a mutuable borrow on v1.

Different filter variations:


#![allow(unused)]
fn main() {
let v1: Vec<i32> = vec![1, 2, 3];
let v2: Vec<i32> = v1.into_iter().filter(|x| *x == 2).collect();
println!("{:?}", v2);
}

#![allow(unused)]
fn main() {
let v1: Vec<i32> = vec![1, 2, 3];
let v2: Vec<&i32> = v1.iter().filter(|&x| *x == 2).collect();
println!("{:?}", v2);
}

#![allow(unused)]
fn main() {
let mut v1: Vec<i32> = vec![1, 2, 3];
let v2: Vec<&mut i32> = v1.iter_mut().filter(|x| **x == 2).collect();
println!("{:?}", v2);
}

Note that there are two styles of coding: iterator and loops. Most rust programmers prefer iterator style. Also, there is no much performance difference between both of them.