Rust for Java developers

9.9.2020 | 35 minutes of reading time

Rust for Java developers – A step-by-step introduction

The Java ecosystem is vast and can solve almost any problem you throw at it. Yet its age shows in several parts, making it clunky and unattractive to some Java devs – devs who may be interested in Rust, one of the up-and-coming languages that compete for developer attention. In this blog post we examine what makes the languages similar – and what makes them different. The post offers a step-by-step guide through several core features and shows how many of the concepts of Java translate to Rust.

Like any programming language intended for real-life production usage, Rust offers far more than a single blog post can teach. This post aims at giving a first overview of Rust for Java developers. Those interested in the details and further reading can find more documentation in the Rust book. We will cover the following topics in this guide:

Simple Syntax: How to make the machine do what you mean

Syntax does not matter, you might say – until it does. After all, syntax determines what you look at all day long, and it will influence how you approach a problem in subtle ways. Both Rust and Java are imperative languages with object-oriented features. So at its most basic the syntax of Rust should feel familiar for a Java developer. Almost all concepts you regularly use in Java are available. They just happen to look a little different.

Objects and structs

public class Person implements Named {
  private String name;
  private int age;

  public Person(String name) {
    this.name = name;
    this.age = 18;
  }

  @Override public String getName() {
    return name;
  }
}

This code snippet should look familiar to most Java devs. A similar snippet of Rust might look akin to this:

pub struct Person {
    name: String,
    age: u32
}

impl Person {
    pub fn new(name: String) -> Self {
        Person { name, age: 18 }
    }
}

// Named is a trait defined elsewhere, e.g. trait Named { fn name(&self) -> String; }
impl Named for Person {
    fn name(&self) -> String {
        self.name.clone()
    }
}

This code looks both familiar and different from the Java code. The Java code “concentrates” all knowledge about what the class is. In contrast the Rust code consists of multiple blocks. Each of these blocks tells us about an aspect of the struct.

The struct itself

The first of these blocks is the actual definition of the struct. It defines what the struct looks like in memory. This block tells us the struct is public, and has two (implicitly private) fields. From this definition, the Rust compiler knows enough to be able to generate an instance of the struct. Yet this block does not yet tell us anything about what the struct can do.

Inherent implementation

The second block defines the “inherent implementation” of the struct. That phrase is quite a mouthful, but just means “things the struct can do by itself”. Think of the methods defined in a Java class with no matching interface or superclass method. In effect, any method you could not annotate with @Override is an inherent method.

In our example, we define a single inherent function. Functions are declared with the fn keyword. Java does not have a dedicated keyword to declare a function/method. In contrast, Rust requires this bit of syntax. The function declared is named new and returns Self. Self is a special type that can come in handy sometimes, especially once we start writing generic code. It just means “the current type”. Similarly, self (note the lowercase!) means the current object, and is the closest sibling to Java’s this. Methods and functions are very similar in Rust – methods are just functions which take some variant of self as their first argument.

Trait implementation

Finally, we have the implementation of Named. This trait corresponds to a Java interface. So, we need to provide a number of methods in order to fulfill the Named contract. Unlike Java, we do not write these methods mixed with the inherent ones. Instead, we create a new top-level block containing only the methods of a single trait. There are two reasons for this: A struct can actually implement multiple traits with conflicting methods defined. In Java, this would be a problem, since it would be impossible to tell which should be called. In Rust, both can coexist. Additionally, and more importantly, you can implement a trait in two locations: At the definition of the struct, and at the definition of the trait. This means that while in Java, you cannot make String implement your interface, in Rust it is perfectly possible to provide an implementation of your trait for String.
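As a minimal sketch of that last point, the following implements a trait for the standard String type. The definition of Named is an assumption here (the snippet above only shows its implementation), and greet is a hypothetical helper added for illustration.

```rust
// Assumed definition of the trait; the article only shows an implementation of it.
pub trait Named {
    fn name(&self) -> String;
}

// We own the trait, so we may implement it for a type we do not own.
impl Named for String {
    fn name(&self) -> String {
        self.clone()
    }
}

// Hypothetical helper: accepts anything that implements Named, including String.
pub fn greet<N: Named>(named: &N) -> String {
    format!("Hello, {}!", named.name())
}
```

Calling greet(&String::from("Ferris")) then works exactly as it would for a Person, even though String comes from the standard library.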

Variables, constants and calculating things

class Foo {
  double calculate(long x, double y, int z) {
    var delta = x < z ? 2 : -5;
    x += delta;

    var q = y * x;

    return q + z;
  }
}

This snippet might not seem exciting to most Java developers. In fact, there is not a lot going on. Just some basic arithmetic.

fn calculate(x: i64, y: f64, z: i32) -> f64 {
    let x = x + (if x < z as i64 { 2 } else { -5 });
    let q = y * x as f64; // Rust never converts between numeric types implicitly

    q + f64::from(z)
}

The corresponding Rust function looks very similar, but there are a few points worth considering. First, we see a bit of an odd declaration. x is declared as a parameter, and then re-declared by the let. This shadows the previous declaration: from the line after the let onward, only the calculated value is visible. Note that this does not change the value of the original x, which is immutable. Instead, it changes the meaning of the symbol.

Also noteworthy is that we just use an if for our check. An if with both a then and an else-case produces a value, just like the ternary operator in Java.

In fact, any block that ends with a value implicitly “returns” this value. This is the reason we can just close our function declaration with the expression q + z without having to write an explicit return. In fact, return is only necessary to return from a function early. Note the absence of a semicolon – adding one “destroys” the value, turning the expression into a statement.
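The two rules (if produces a value, and a block evaluates to its last expression) can be sketched in isolation. The function names below are made up for this illustration.

```rust
// Returns the sign of a number as -1, 0 or 1.
fn sign(x: i32) -> i32 {
    // The whole if/else chain is a single expression; its value is the function result.
    if x < 0 {
        -1
    } else if x == 0 {
        0
    } else {
        1
    }
}

fn block_value() -> i32 {
    // A plain block is an expression too: its last line, written without
    // a semicolon, becomes the value of the block.
    let doubled = {
        let base = 21;
        base * 2
    };
    doubled
}
```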

Iteration

fn loops(limit: u32, values: Vec<i32>, condition: bool) {
    // while (condition)
    while condition {

    }

    // for (var i : values)
    for i in values {

    }

    // for (var i = 0; i < limit; i++)
    for i in 0..limit {

    }

    // do-while must be simulated
    loop {

        if !condition {
            break
        }
    }
}

Iteration is done in a similar fashion as in Java – while loops are, in fact, almost completely unchanged. There is a handy abbreviation for the endless loop (simply called loop), and the for keyword does allow for iteration of “iterable things”. Java developers will know Iterable. The Rust equivalent is called IntoIterator.

But what about the classic Java for-loop? for (int i = 0; i < limit; i++) is a variant of the syntax we do not see on the Rust side. The secret here is the two dots in 0..limit. This constructs a type called Range which provides the required IntoIterator implementation. While this does not completely match up with all the capabilities of the “init-check-update for loop”, it does very elegantly cover the most common use. More complex cases will need to be written out using while.
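A few common loop shapes, sketched with ranges. The inclusive range 0..=limit and the rev and step_by adapters come from the standard library; they are not discussed in this post, and the function names are hypothetical.

```rust
// for (int i = 0; i < limit; i++) sum += i;
fn sum_below(limit: u32) -> u32 {
    let mut sum = 0;
    for i in 0..limit {
        sum += i;
    }
    sum
}

// for (int i = 0; i <= limit; i += 2): inclusive range combined with step_by
fn even_numbers_up_to(limit: u32) -> Vec<u32> {
    let mut result = Vec::new();
    for i in (0..=limit).step_by(2) {
        result.push(i);
    }
    result
}

// for (int i = limit - 1; i >= 0; i--): a reversed range
fn countdown(limit: u32) -> Vec<u32> {
    let mut result = Vec::new();
    for i in (0..limit).rev() {
        result.push(i);
    }
    result
}
```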

Match

fn match_it(x: Option<i32>, flag: bool) -> i32 {
    match x {
        None => 0,
        Some(3) => 3,
        Some(_) if !flag => 450,
        Some(x) if x > 900 => 900,
        _ => -1
    }
}

Roughly analogous to the switch expression in Java, match offers that functionality and more. Like a Java switch, it lets you select between different values in a single, concise expression. Unlike Java, the arms of a match can perform a lot more structural matching – in this case, we branch depending on whether an option value is present, on further constraints, and on a default case. Note that match checks for exhaustiveness – all cases need to be covered.

Did you catch the small concept we just snuck past you? The Some and None expressions are the two possible values of the enum called Option in Rust. Rust allows enum values to actually be complete structs of their own, including data fields – something that would not work in Java since enum values can only exist once. In this way, we have a convenient and safe way to model “something that may but need not exist” – if the object is present, it will be constructed as Some(value), otherwise as None, and the user may check which is which via a match.
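To make the idea of data-carrying enum values concrete, here is a small sketch. Shape, area and first_even are hypothetical names chosen for this illustration; only Option, Some and None are the real standard library items.

```rust
// Each variant can carry its own data, unlike the constants of a Java enum.
enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
    Point,
}

fn area(shape: &Shape) -> f64 {
    // match must cover every variant; leaving one out is a compile error.
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { width, height } => width * height,
        Shape::Point => 0.0,
    }
}

// Option is an ordinary enum of this kind, with the variants Some(value) and None.
fn first_even(values: &[i32]) -> Option<i32> {
    for &v in values {
        if v % 2 == 0 {
            return Some(v);
        }
    }
    None
}
```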

Life and death: No garbage collection

Java developers, you need to be brave. Rust does not have a garbage-collector. The older ones among you might have flashbacks to malloc/free, while the younger ones might scratch their heads on how the program is supposed to ever reclaim memory. Fortunately, there is a simple and elegant solution to the problem of when to destroy data in Rust. Every scope cleans up after itself and destroys all data that is no longer needed. Those of you with a C++ background may recall this approach as “RAII”.

What does this mean? Actually, it means something every Java developer probably finds intuitive: Your program reclaims memory once it has become unreachable. The key difference is that Rust does so immediately, instead of delaying it until a garbage collection.

Moving around objects

// placeholder: takes ownership of its argument and destroys it on return
fn do_something(_s: String) {}

fn destruction() -> String {
    let string1 = String::from("Hello World");
    let string2 = String::new();
    let string3 = string2; // string2 moved to string3, no longer valid

    for _ in 0..32 {
        let another_string = String::from("Yellow Submarine");
        do_something(another_string); // another_string "given away" here, no longer our concern
    }

    return string1;  // string1 returned to caller, survives past this method
    // string3 destroyed here
}

Unlike in Java, in Rust an object is not always a reference – when you declare a variable to be String in Java, what you actually express is “reference to a String”. There may be other references to the same string, in almost arbitrary parts of the program's memory. In contrast, if you say String in Rust, that is exactly what you get – the string itself, exclusive and not shared with anything else (at least, initially). If you pass a String to another function, store it in a struct or otherwise transfer it anywhere, you lose access to it yourself. The string2 becomes invalid as soon as it is assigned to another variable.

A single scope owns any object – either a structure or a variable on the stack. The program can move an object from scope to scope. In the example, another_string moves from the scope of destruction to the scope of do_something. That scope takes ownership and potentially destroys it. Similarly, string1 moves out of the function in the return statement, and thus passes into the ownership of whoever called it. Only string3 becomes unreachable once the function exits, and is destroyed.

There is an exception to this scheme. Any type that implements Copy is not moved when a value is reassigned – instead, it is copied (as the name might imply). The copy is an independent object with its own lifecycle. Clone is a similar trait, but does require you to explicitly “confirm” that you want a potentially expensive copy by calling a method.

In effect, Copy and Clone provide functions similar to the Cloneable interface of the JDK.
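The difference between the two traits can be sketched as follows. Point is a hypothetical type for this illustration; deriving Copy is only possible when every field is itself Copy.

```rust
// Copy may only be derived if all fields are Copy; Clone is required alongside it.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn copy_and_clone() -> (Point, Point, String, String) {
    let p1 = Point { x: 1, y: 2 };
    let p2 = p1; // Point is Copy: p1 is copied and stays valid

    let s1 = String::from("Yellow Submarine");
    let s2 = s1.clone(); // String is only Clone: the copy must be requested explicitly
    // Writing `let s3 = s1;` instead would move s1 and make it unusable afterwards.

    (p1, p2, s1, s2)
}
```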

Questions of ownership: references and mutability

The ownership scheme described in the previous section may seem simple and intuitive, but it has one major consequence: How would you write a function that does something to an object you want to use in the future, ideally without shuffling megabytes of data across your memory? The answer is “use references”.

Java and Rust: Their view on references

For Java, everything is a reference – well, almost everything. There are some primitive types, such as int or boolean. But any object type is always behind a reference, and thus indirectly accessible. Since everything is a reference anyway, you do not even declare anything to achieve this. That means, as you are probably aware, that once you allocate an object “somewhere” you can use it in arbitrary ways. The garbage collector will destroy it eventually.

That implies something both easy to understand and subtle: References can live an arbitrary time – they define how long the object lives, not the other way around. You can pass and store references wherever you want. The object lives long enough to ensure the references always remain valid.

As explained in the previous chapter, Rust maintains a clear ownership of the object. This allows the language to clean up an object immediately when it becomes unused. At this point, there can be no more references – otherwise, you would still be able to access an object past its death.

A reference can be introduced by the ref keyword, but can also be declared in the type of a variable, or derived ad-hoc by just taking the reference via the address operator. In general, the & operator turns a value into a reference. As part of a type, & declares the type to be a reference.

fn reference() -> u32 {
    let x: u32 = 10;
    let y = &x; // reference to x, type is &u32
    let z: &u32;

    {
        let short_lived: u32 = 82;
        z = &short_lived; // error: short_lived does not live long enough
    }

    *z
}

This code is invalid – and the Rust compiler tells us that short_lived does not live long enough. Fair enough. We can create references to another object in memory. In exchange, we need to ensure these references do not dangle after the death of the object.
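One possible way to make the snippet compile is to let the referenced value live in the outer scope. This is only a sketch; reference_fixed and long_lived are names invented for this illustration.

```rust
fn reference_fixed() -> u32 {
    let x: u32 = 10;
    let long_lived: u32 = 82; // declared in the outer scope this time
    let y = &x; // reference to x, type is &u32
    let z: &u32;

    {
        // The inner scope only borrows; the referenced value outlives it.
        z = &long_lived;
    }

    // Both referenced values are still alive here, so dereferencing is fine.
    *y + *z
}
```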

Shared pain – mutability and references

import java.util.ArrayList;
import java.util.Arrays;

class ConcurrentModification {
    public static void main(String[] args) {
        var list = new ArrayList<>(Arrays.asList("A", "B", "C"));

        for (var str : list) {
            System.out.println("Doubling " + str);
            list.add(str + str);
        }
    }
}

Many Java developers will have run into the bug illustrated in this code snippet. You are modifying an object currently in use. You run the code. Bam! ConcurrentModificationException. Surprisingly enough, the alternatives would be worse. An unexpected endless loop is usually harder to debug than a relatively clean exception. Actual concurrent access by many threads would be worse still. So it would be good to have the compiler enforce a bit of safety here.

fn endless_loop() {
    let mut vector = vec!["A".to_string(), "B".to_string(), "C".to_string()];

    for string in &vector {
        let mut new_string = string.clone();
        new_string.push_str(string);
        vector.push(new_string); // error: vector is still borrowed immutably by the loop
    }
}

This entire class of errors is not possible in Rust. A very simple rule prevents this: You can either have as many read-only references to an object as you want, or you can have a single reference that allows for modification. So the potentially endless loop in the previous example cannot happen in Rust. The iterator will demand an immutable reference to the list. That reference will block the creation of a mutable reference. However, we would need a mutable reference for push. Thus the compiler rejects the code sample.

Note that this code again sneakily introduces a new concept: mut. This modifier announces that a variable or reference can alter values. This is the opposite to the approach in Java. In Java, every variable is mutable, unless it is declared final.

Java is fine with final objects being altered internally. You can declare a final List and still add elements to it. In Rust, you cannot create a mut reference to a non-mut variable. If your Vec is not mutable, this also includes altering its content (usually, some exceptions exist). While this means you need to think a little more deeply about mutability on occasion, it at least prevents an UnsupportedOperationException.
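The doubling loop from the rejected example can be written so that reading and writing never overlap. This is one possible sketch, not the only way; doubled and additions are names introduced for this illustration.

```rust
fn doubled() -> Vec<String> {
    let mut vector = vec!["A".to_string(), "B".to_string(), "C".to_string()];

    // Step 1: read-only pass. The immutable borrow ends together with the loop.
    let mut additions = Vec::new();
    for string in &vector {
        additions.push(format!("{}{}", string, string));
    }

    // Step 2: no other reference is alive any more, so mutation is allowed.
    vector.extend(additions);
    vector
}
```

Because the new elements are collected first, the loop iterates over exactly the original three entries and cannot run forever.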

Java-like references in Rust: Rc and Arc

For many problems, the native approach in Rust is all we need – we allocate an object, do something with it, then destroy it once it has served its purpose. But sometimes, we want to have Java-like semantics. We want something to stay alive for as long as we are using it somewhere. Think of connection pools. We certainly want to share the pool among more than one object.

use std::rc::Rc;

struct Pool;
struct RequestContext {
    connection_pool: Rc<Pool>
}

fn share_the_love() -> Vec<RequestContext> {
    let mut result = Vec::new();
    let pool = Rc::new(Pool);

    for _ in 0..1000 {
        let connection_pool = pool.clone();
        result.push(RequestContext { connection_pool })
    }

    return result;
}

The Rc in this code sample means reference-counted. The Rc “wraps” around the actual object. It is cheap to clone, and can provide a reference to the actual object “behind” the Rc. Each of the RequestContext objects created can live for a different lifetime. The Rc can even be cloned and associated with something else entirely without affecting them – and no second Pool will be created.
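The counting itself can be observed with Rc::strong_count. In this sketch the size field is a hypothetical addition to Pool, only there so that the shared value has something to read.

```rust
use std::rc::Rc;

struct Pool {
    size: usize, // hypothetical field for illustration
}

fn count_owners() -> (usize, usize, usize) {
    let pool = Rc::new(Pool { size: 10 });
    let first = Rc::clone(&pool); // same as pool.clone(): copies the handle, not the Pool
    let after_first_clone = Rc::strong_count(&pool);

    {
        let _second = Rc::clone(&pool);
        // three handles exist inside this scope
    } // _second is dropped here and the count goes back down

    let after_scope = Rc::strong_count(&pool);
    (after_first_clone, after_scope, first.size)
}
```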

use std::cell::RefCell;
use std::rc::Rc;

fn bad_rc() {
    struct Container {
        contained: RefCell<Option<Rc<Container>>>
    }

    let outer = Rc::new(Container { contained: RefCell::new(None) });
    // store a second handle to the container inside itself, closing the cycle
    *outer.contained.borrow_mut() = Some(Rc::clone(&outer));
    // and the container lives forever
}

Reference-counting is a cheap strategy to manage lifetimes. It has many advantages, but it does have one major caveat – it cannot deal with cycles. In this example we create such a cycle. This object will live forever – the reference inside itself can keep it alive. In Java, this is not a problem, the garbage collector can ignore such internal references. In Rust, the outer Rc is destroyed, but the inner keeps the object alive. Note also the RefCell. This is one of the exceptions to the “deep mutability” rule mentioned earlier. Rc may want to protect us from altering the shared value (by only allowing an immutable reference). Nevertheless, RefCell stands ready to break this rule and allow us to shoot ourselves in the foot.

Rc is cheap and does as little as possible. It does not do the expensive logic to work in concurrent scenarios. If you prefer to work with multiple threads sharing data, you should use its close cousin Arc instead. Arc works exactly the same, but it does the additional synchronization to work safely across thread boundaries.
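A minimal sketch of sharing a value across threads with Arc follows. The size field is a hypothetical addition to Pool, and the closure syntax with move that thread::spawn requires is only covered in the closures section of this post.

```rust
use std::sync::Arc;
use std::thread;

struct Pool {
    size: usize, // hypothetical field for illustration
}

fn share_across_threads() -> usize {
    let pool = Arc::new(Pool { size: 10 });
    let mut handles = Vec::new();

    for _ in 0..4 {
        let pool = Arc::clone(&pool); // each thread gets its own handle
        // move transfers the cloned handle into the new thread
        handles.push(thread::spawn(move || pool.size));
    }

    // Wait for all threads and add up what they read.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Replacing Arc with Rc in this sketch makes the compiler reject it, since Rc is not allowed to cross thread boundaries.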

Inheriting the earth: traits and implementations

We learned what traits are way back in the beginning. They are the Rust analogue to Java interfaces. Other than the decision to have a trait implementation being an independent block, they look almost exactly the same. And for the most part, they can be used the same way. However, implementing interfaces only covers one of the two “class header” keywords of Java: implements. What about extends, the once-shining star of object-oriented programming that has fallen by the wayside over the years?

In short, it’s not part of the language for Rust. No concrete inheritance is possible. One of your structs may have a field of another struct and delegate some of its methods. You may implement AsRef or something similar for another struct. What you cannot do is override another struct's methods or treat one struct as another when assigning values.

What is possible is that one trait requires another to work. This is similar to extending an interface in Java – in order to implement the child trait, you also need to implement the parent trait. However, there is a small distinction. As always, each trait gets its own block.
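A sketch of one trait requiring another is shown below. Greeter is a hypothetical trait, and the definition of Named is assumed, since the post only shows an implementation of it.

```rust
// Assumed definition, mirroring the Named trait used at the beginning.
trait Named {
    fn name(&self) -> String;
}

// Greeter requires Named: every type implementing Greeter must also implement Named.
trait Greeter: Named {
    // A default method may rely on the methods of the required trait.
    fn greet(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

struct Person {
    name: String,
}

// Each trait gets its own impl block, even though one depends on the other.
impl Named for Person {
    fn name(&self) -> String {
        self.name.clone()
    }
}

impl Greeter for Person {}
```

Leaving out the impl Named block would make the compiler reject impl Greeter for Person.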

The chief use of Java interfaces is calling interface methods regardless of their implementation. The same is possible in Rust. This is called dynamic dispatch in Rust, and indicated by the dyn keyword.

fn stringify(something: &dyn AsRef<str>) -> String {
    String::from(something.as_ref())
}

fn call_stringify() {
    struct Foo;
    impl AsRef<str> for Foo {
        fn as_ref(&self) -> &str {
            "This is custom"
        }
    }
    // str itself is unsized and cannot become a trait object, so we pass a reference to the &str
    stringify(&"Hello World"); // &&str
    stringify(&String::new()); // &String
    stringify(&Foo); // &Foo
}

In this snippet, we see this capability in action: We define a single function, which can be invoked with references to any number of types that implement the trait AsRef. This is very convenient and aligns closely with what we expect to do with Java interfaces – pass an object by reference without necessarily knowing its exact type, merely specified by its behavior.

Putting things into boxes

The approach of “just passing a reference” works fine for dealing with parameters. It feels intuitive and very similar to what you would do in Java. It might not be the absolute fastest way to do things, but it usually serves well. However, sometimes we do not want to pass a parameter to a function – instead we want to return a value from a function.

// does not compile: the size of the return type is not known
fn return_dyn() -> dyn AsRef<str> {
    return "Hello World"
}

Unfortunately, while this looks like it “should work” from the point of view of a Java developer, Rust has some additional constraints. Namely, that ownership of the object is passed to the caller. Without going into too much technical detail, receiving ownership of an object means having an obligation to store that object, too. And to do that, we need to know one crucial detail: We need to know its size.

All Java objects live on a large heap, and their true size is actually pretty hard to determine. Rust has a different strategy: Rust wants to keep as much of its data as is sensible on the stack. When you allocate a struct, you actually put that many bytes on the stack. Just returning dyn Trait does not give enough information to accomplish that. After all, for all you know, there might be different implementations depending on some internal conditions. So for dynamic returns, the stack is out of the question.

fn return_dyn() -> Box<dyn AsRef<str>> {
    return Box::new("Hello World")
}

By using the type Box, we tell the compiler that our value should not be placed on the stack. Only a special kind of reference goes on the stack, the actual data starts out on the heap. The Box itself has a fixed size and can clean up the heap-placed object properly.
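Boxing pays off when the concrete type really does depend on a runtime condition. In this sketch, Custom, pick and text_of are hypothetical names invented for the illustration.

```rust
struct Custom;

impl AsRef<str> for Custom {
    fn as_ref(&self) -> &str {
        "This is custom"
    }
}

// The two branches return values of different types and sizes,
// so the result has to live behind a Box on the heap.
fn pick(custom: bool) -> Box<dyn AsRef<str>> {
    if custom {
        Box::new(Custom)
    } else {
        Box::new(String::from("Hello World"))
    }
}

// Reads the text through the trait object, whatever the concrete type is.
fn text_of(value: &dyn AsRef<str>) -> String {
    String::from(value.as_ref())
}
```

A caller can write text_of(&*pick(true)) without ever learning which concrete type was chosen.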

Not quite naming things

There is an alternative to boxing values. While boxing an object is very much in the style of Java, Rust is not eager to use much heap. After all, keeping track of the heap is comparatively slow and complex. Sometimes the reason to return a trait is merely to hide information. Frequently, developers do not want to change the type depending on some parameters, but instead just not expose such an implementation detail. Occasionally the type may also be hard to name, or have no name that can be written down.

fn return_impl() -> impl AsRef<str> {
    return "Hello World"
}

This looks very neat and tidy. It does not expose the implementation type, but instead just says “I return something that you can use as the trait”, without going into detail about what that something is. Beneath the metaphorical hood, though – the compiler knows. It knows and can optimize for the actual type, up to and including not doing a dynamic call at all.

Generally speaking: Generics

Pretty much all Java developers know at least the basics of generics: They are what makes Collection et al. work in a sensible fashion. Without generics (and pre-Java 5), all these types operated solely on objects. Under the hood, they still do this by removing all generic types and replacing them with the “upper bound”. Rust does not have a common supertype like Object, but still has generic types (you have seen a few of them in this article already).

Since Rust does not have a “common supertype”, it stands to reason that its approach must be different. And indeed, it is. Where Java creates the same code for all potential type parameters, Rust instead emits special code for each actual type parameter combination.

You can define constraints on type parameters in Java – and Rust works the same way. Where in Java, the syntax is T extends S, Rust has a somewhat less wordy alternative: T: S. Remember there is no way to “extend a struct” in Rust, so only traits can constrain a type. Multiple traits can be demanded by simply specifying Trait1 + Trait2, much like the Java Interface1 & Interface2 notation. However, since Rust traits are frequently much narrower than Java interfaces tend to be, you will encounter the plus-notation a lot more often.
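The plus-notation can be sketched with two standard library traits. The function describe_larger is a hypothetical example, not taken from the post.

```rust
use std::fmt::Display;

// T must implement both traits: PartialOrd for the comparison, Display for the output.
// The Java counterpart would have the shape <T extends Interface1 & Interface2>.
fn describe_larger<T: PartialOrd + Display>(a: T, b: T) -> String {
    let larger = if a > b { a } else { b };
    format!("larger value: {}", larger)
}
```

The same function works for integers, floats and strings alike, since all of them implement both traits.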

Alternatives to dynamic dispatch

use std::ops::Add;

fn wrap<A>(param: A) -> Vec<A> {
    let mut v = Vec::new();
    v.push(param);
    v
}

// Output = A states that adding two values of type A yields an A again
fn add_three<A: Add<Output = A>>(one: A, two: A, three: A) -> A {
    one.add(two).add(three)
}

fn example() {
    wrap("Hello World"); // calls wrap<&str>
    wrap(999); // calls wrap<i32>

    add_three(10, 20, 30); // calls add_three<i32>
    add_three(0.5, 0.9, 38.4); // calls add_three<f64>
}

The above snippet illustrates this pattern. We have two functions that take parameters of a number of types, and operate on them. However, the second example is actually interesting: We do use the plus operation of the Add trait. Yet, the code contains no dyn.

This is due to the difference in strategy mentioned before. When our add_three function is called, the compiler actually creates a different function for each A – and may even decide to inline some or all of these calls. For our example with 32-bit integers, there is no need to even call any functions at all to add them. The compiler can emit extremely high-performance machine code.

.add_three:
lea    (%rdi,%rsi,1),%eax
add    %edx,%eax
retq

Associated types vs. generics

Generics are a well-known concept to Java developers, and that concept translates well to Rust. There is a key difference, though: Java does not support implementing the same generic interface twice – even with different type parameters.

// does not compile: Comparable cannot be implemented with two different type arguments
class Twice implements Comparable<Twice>, Comparable<String> {

    public int compareTo(String o) {
        return 0;
    }

    public int compareTo(Twice o) {
        return 0;
    }
}

This may seem unexpected even to seasoned Java developers, but it has a good reason: Type erasure. Since the type parameter of Comparable is forgotten, the actual compareTo method has to have Object parameters. Only one method can have that exact signature, and it does not really have a chance to figure out which of the two compareTo methods to forward an argument to. In contrast, Rust allows two implementations of the same trait with different type parameters. The compiler generates both of them, and selects the “proper one” at each occurrence. There is no type erasure, and thus no need for a “hidden” forwarding method.
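As a sketch of what the Rust side allows, the standard PartialEq trait is implemented twice for the same struct below, once per type parameter. The Twice struct and the compare_both function are written for this illustration only.

```rust
struct Twice {
    value: String,
}

// First implementation: compare with another Twice.
impl PartialEq<Twice> for Twice {
    fn eq(&self, other: &Twice) -> bool {
        self.value == other.value
    }
}

// Second implementation of the same trait, with a different type parameter.
impl PartialEq<String> for Twice {
    fn eq(&self, other: &String) -> bool {
        &self.value == other
    }
}

fn compare_both() -> (bool, bool) {
    let a = Twice { value: String::from("x") };
    let b = Twice { value: String::from("x") };
    // The compiler selects the implementation from the type of the right-hand side.
    (a == b, a == String::from("y"))
}
```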

Sometimes this ability is a boon – the developer has more options and fewer chances to trip up. Sometimes, though, it is inconvenient. The IntoIterator trait is one such example. It should probably not be implemented multiple times. What would the type of the variable in a for loop be? For this reason, there is a way to move a type variable “into” the trait: Associated types.

trait AssociatedType {
    type TheType;
}

impl AssociatedType for i32 {
    type TheType = String;
}

fn mogrify<A: AssociatedType, B: AssociatedType<TheType=String>>(a: A, b: B) {

}

With an associated type, you do not have a type variable in the impl clause – and hence, you cannot implement the same trait twice. Thus, you gain much the same behavior as in Java. Only one implementation is possible. In Rust that is an intentional choice you can make, rather than a constraint of the language’s history.

There is one final bit of interesting code in the above example. Line 9, the signature of mogrify, shows how to refer to a trait with an associated type. If we do not need to know the type itself, we just write the trait bound as we usually would. But if we do need that knowledge, we can peek beneath the hood, and treat the associated type like a parameter. The syntax is slightly different from “normal” parameters. Associated types need to be specified as Name=Value rather than just by their position.

Functional thinking: Lambdas and closures

Lambdas have been part of Java for a long time now, first making their entrance with Java 8. They are essentially a shortcut to turn a function (method) into an object. Before Java 8 came along, that required a dedicated (often anonymous) class, and a lot of notation. It probably comes as no surprise that Rust offers much the same capability. In fact, even the notation should seem familiar to most Java developers.

fn double_and_summarize(input: &Vec<i32>) -> i32 {
    input.iter().map(|x| x * 2).fold(0, |a, b| a + b)
}

Other than some fine points in notation (lack of braces, …) the Rust code looks very similar to what we would write in Java. Things do get somewhat more interesting, when we look at the underpinnings of “functional style” code. Java uses the notion of a SAM interface. Effectively, any interface that only lacks a default implementation for a single method can serve as the target for a lambda expression. Rust is more explicit and arguably more limited than Java. There is a dedicated family of traits to represent functions.

Types of functions (and how to use them)

The “function” traits in Rust are special. You do not implement this family of traits by hand; the compiler implements them for every closure (and for ordinary functions). The traits have a somewhat special syntax themselves. They all have the form TraitName(argumentTypeList...) (-> Result)?

The “function family” contains three traits. Each closure you define automatically implements the most permissive one possible.

  • FnOnce is the “weakest” of these three families. You can invoke these functions at most once. The chief reason for this might be that the function receives ownership of an object, and destroys it once it completes.
  • The FnMut family does not have the same limitation, but it still is somewhat limited in its applicability. An implementation has the option to mutate its “receiver”. The receiver is analogous to the this in Java. However, an FnMut can be used in place of an FnOnce.
  • Fn is the most versatile class of functions. You can call them multiple times, and they neither mutate nor consume the state they capture. Essentially, these functions have no “memory”. An Fn closure can be used in place of the other two types.

struct SomeStruct;

fn invoke_once<F: FnOnce() -> SomeStruct>(function: F) -> SomeStruct {
    function()
}

fn invoke_mut<F: FnMut() -> SomeStruct>(function: &mut F) -> SomeStruct {
    function()
}

fn invoke<F: Fn() -> SomeStruct>(function: &F) -> SomeStruct {
    function()
}

fn invoke_with_once_closure() {
    let s = SomeStruct;
    let closure = || s;
    invoke_once(closure);
}

fn invoke_with_mut_closure() {
    let mut count = 0;
    let mut closure = || {
        count += 1;
        SomeStruct
    };
    invoke_mut(&mut closure);
    invoke_once(closure);
}

fn invoke_with_nonmut_closure() {
    let mut closure = || SomeStruct;
    invoke(&closure);
    invoke_mut(&mut closure);
    invoke_once(closure);
}

This example showcases the different closure types that can result. The first one (defined in invoke_with_once_closure) actively takes ownership of a variable, and thus is forced to implement the weakest of the three traits, FnOnce. The second example produces its own value on each invocation, so it is able to produce a value multiple times. However, it captures part of its calling environment. In order to be able to increment count, a &mut is implicitly created. Thus, the closure requires a mutable context itself.

This added complexity serves a rather simple purpose: Keeping track of what lives for how long. Imagine referencing a local variable in a closure, and having the containing block exit, thus destroying the value. This once more showcases the difference in design philosophy. Java has decided to cut down the complexity by omitting the trickier cases of FnMut and FnOnce. After all, all captured values must be “effectively final”.

Returning closures

While maybe not the most common use case, sometimes it is useful to return a closure.

class MakeRunnable {
    Runnable makeRunnable(Object o) {
        return () -> {
            runWith(o);
        };
    }
}

In Java, this is very elegant due to the SAM convention – you just return the interface you want your closure to implement. In the method body, you can write out a closure in the return statement. Simple.

fn make_runnable(a: SomeStruct) -> impl Fn() {
    move || runWith(&a)
}

Achieving the same in Rust is a little more complex. We need to give the compiler one more hint: The move keyword. Without this keyword, the closure would only borrow a, and the value a would die as soon as the call to make_runnable returned. The closure would then reference a dead value, so the compiler refuses to build such code. The move keyword tells the Rust compiler to move any captured variable into the ownership of the closure instead.

Also note that this function uses the impl Trait return type previously discussed. Without that syntax, we would need a named type after all, and would have to manually implement the closure functions.
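
As a small runnable sketch of the same idea, the function below returns a closure that owns its captured state. The name make_counter and the counting logic are assumptions made for illustration and are not part of the example above.

```rust
/// Returns a closure that owns `start` and counts upward from it.
/// `make_counter` is a hypothetical name used only for illustration.
fn make_counter(start: u32) -> impl FnMut() -> u32 {
    let mut current = start;
    // `move` transfers ownership of `current` into the closure, so the
    // closure stays valid after `make_counter` has returned.
    move || {
        current += 1;
        current
    }
}
```

Because the closure mutates what it captured, the return type is impl FnMut rather than impl Fn, and the caller has to bind it with let mut before calling it.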

When things go wrong: Error handling

Error handling is a pain for most developers. It can easily detract from the intent of the code. Error handling is also one of the most likely culprits for hard-to-follow logic. In the worst case the developer just forgoes error handling – with mysterious crashes at random times as a result. Any language worth its salt needs a user-friendly error handling strategy.

Here, the paths of Rust and Java diverge rather significantly. Java is a child of the 90s. The then-novel concept of exceptions takes center stage in its error handling strategy. Generally speaking, a method will throw an Exception to signal an error condition. That aborts the execution of the current method, and “skips back” on the stack to a matching handler.

Caring about Results

This is a very convenient model for the developer, only slightly hampered by the overhead of doing throws declarations. It is also very expensive to implement. Rust, much more than Java, cares a lot about performance. So it stands to reason that Rust would favor another way to handle errors over raising exceptions: Encoding the success or failure of an operation into the returned value. Similarly to the Optional type we know from Java, Rust defines the Result type.

fn f() -> Result<i32, SomeError> {
    unimplemented!()
}

In essence, the above code fragment expresses the same thing as this Java signature:

class SomeClass {
  int someMethod() throws SomeException {
    // ...
  }
}

The key difference here is that the failure does not propagate automatically up the stack: There is no need for special logic to find an exception handler. Perhaps most crucially, there is no stack trace – the functions all return normally, albeit with a result that indicates an error.

Now, this seems very error-prone at first glance. After all, it is very easy to just forget to check the result of a call, or discard it altogether. Thankfully, Rust offers a capability that Java lacks to compensate: a compiler designed to assist the developer in catching such mistakes. Rust can mark a returned value as “must use” (Result is marked this way), and the compiler emits a warning if you discard such a return value. A project can turn that warning into a hard error with #![deny(unused_must_use)].
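
A minimal sketch of how this looks in practice follows. The functions parse_age and double are hypothetical names chosen for illustration. With default compiler settings the discarded values show up as warnings, which is why the offending calls are left commented out.

```rust
use std::num::ParseIntError;

/// `Result` is marked `#[must_use]` in the standard library, so ignoring
/// the return value of this function triggers a compiler warning.
fn parse_age(input: &str) -> Result<u32, ParseIntError> {
    input.trim().parse::<u32>()
}

/// The attribute can also be placed on our own functions.
/// `double` is a hypothetical helper for illustration.
#[must_use]
fn double(value: u32) -> u32 {
    value * 2
}

fn demo() -> u32 {
    // parse_age("42"); // would warn: unused `Result` that must be used
    // double(21);      // would warn: unused return value that must be used
    let age = parse_age("42").unwrap_or(0); // the result is consumed explicitly
    double(age)
}
```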

The ? Operator

fn f() -> Result<i32, SomeError> {
    match function1() {
        Err(e) => Err(e),
        Ok(v) => {
            let mut sum = 0;
            for element in v {
                match function2(element) {
                    Err(e) => return Err(e),
                    Ok(intermediate) => match function3(intermediate) {
                        Err(e) => return Err(e),
                        Ok(next) => sum += next,
                    }
                }
            }
            Ok(sum)
        },
    }
}

That code is beyond ugly – it is borderline incomprehensible. Thankfully, a special kind of syntax exists to ease the pain of properly handling results: ?. This innocuous operator effectively serves as a shortcut to the statements above. If you use this try-operator, the code reads much like Java code, without using the much more expensive exception mechanism.

fn f2() -> Result<i32, SomeError> {
    let mut sum = 0;
    for element in function1()? {
        sum += function3(function2(element)?)?
    }

    Ok(sum)
}

Different types of errors

Not all errors are alike. After all, the Result type is parametrized over the error type as well as the result type. Error types may range from as simple as “something went wrong” to relatively complex structures with lots of helpful error-handling information. Therefore, it may be necessary to convert one kind of error into another. The ? operator already has support for this: If there is a From implementation that converts the actual error into the expected error, the operator will simply use it. Otherwise, some custom code may be necessary (such as calling map_err on the Result object).
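
Both conversion routes can be sketched as follows. AppError, parse_positive and parse_len are hypothetical names invented for this illustration. Only ParseIntError, From and map_err come from the standard library.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical application-level error type.
#[derive(Debug)]
enum AppError {
    InvalidNumber(ParseIntError),
    Negative(i64),
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::InvalidNumber(e) => write!(f, "invalid number: {}", e),
            AppError::Negative(n) => write!(f, "negative value: {}", n),
        }
    }
}

impl std::error::Error for AppError {}

/// This `From` implementation is what lets `?` convert automatically.
impl From<ParseIntError> for AppError {
    fn from(e: ParseIntError) -> Self {
        AppError::InvalidNumber(e)
    }
}

fn parse_positive(input: &str) -> Result<i64, AppError> {
    let n: i64 = input.trim().parse()?; // ParseIntError -> AppError via From
    if n < 0 {
        return Err(AppError::Negative(n));
    }
    Ok(n)
}

/// Without a matching `From` impl, `map_err` does the conversion by hand.
fn parse_len(input: &str) -> Result<usize, String> {
    input
        .trim()
        .parse::<usize>()
        .map_err(|e| format!("bad length: {}", e))
}
```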

Many libraries (“crates”) define an error type specific to that library – and some also offer a convenient shortcut on dealing with potentially failing operations: They define a type alias for Result which fixes the error parameter, so the user can save on typing the error parameter each time.
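
The standard library uses this pattern itself with std::io::Result. A sketch of what such an alias might look like in a library of your own is shown below. The module config, the type ConfigError and the function read_port are assumptions made up for this example.

```rust
mod config {
    use std::error::Error;
    use std::fmt;
    use std::result;

    /// Hypothetical error type of this imaginary library.
    #[derive(Debug, PartialEq)]
    pub struct ConfigError(pub String);

    impl fmt::Display for ConfigError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "config error: {}", self.0)
        }
    }

    impl Error for ConfigError {}

    /// Alias that fixes the error parameter, so callers only name the
    /// success type.
    pub type Result<T> = result::Result<T, ConfigError>;

    /// Parses a port number; the signature only mentions the success type.
    pub fn read_port(raw: &str) -> Result<u16> {
        raw.trim()
            .parse::<u16>()
            .map_err(|e| ConfigError(format!("invalid port '{}': {}", raw, e)))
    }
}
```

Keeping the alias inside its own module avoids shadowing the prelude's Result everywhere else in the crate.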

When all is lost

At the start of this chapter, we mentioned that Rust does not like to produce backtraces or deal with “abrupt exits” of functions. That is true, but it is not the whole picture. There is one more piece of the puzzle: panic. The panic! macro does exactly what its name implies. It gives up and runs away, much like a Java exception would. It is not the preferred way to handle things in Rust, and is mostly used for cases when the error is on the level of a failed assertion. In other words, your program should panic if it notices a bug by itself (such as an out-of-bounds array access). Panics are a debugging tool and not the proper way to handle errors.

You can actually “catch” a panic if you employ some functions in the standard library, but there is usually little benefit in doing so. Note that thankfully even a panic is a “controlled panic” – all cleanup is still done when each scope exits.
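
The following sketch illustrates both points using std::panic::catch_unwind. It assumes the default unwinding panic strategy (not panic = "abort"). Guard, risky, run_guarded and cleaned_up are hypothetical names for this illustration.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Records whether the guard's destructor ran.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

/// Stand-in for any resource that needs cleanup when its scope exits.
struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

fn risky() -> i32 {
    let _guard = Guard;
    let values = vec![1, 2, 3];
    let index = values.len(); // one past the last valid index
    values[index] // panics: index out of bounds
}

/// "Catches" the panic and turns it into an ordinary error value.
fn run_guarded() -> Result<i32, String> {
    panic::catch_unwind(risky).map_err(|_| "risky() panicked".to_string())
}

fn cleaned_up() -> bool {
    CLEANED_UP.load(Ordering::SeqCst)
}
```

The panic message is still printed to standard error, but run_guarded returns normally, and the destructor of the guard runs while the stack unwinds.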

Multiple ways of doing multiple things: How Rust and Java handle concurrency

Your phone probably has multiple cores, and any program not using more than one of them needs to ask itself: Why not? And consequently, parallel and concurrent programming has become ever more important.

Currently, there are two chief approaches to this: (Thread-based) parallel computation and concurrent execution. The venerable Thread API, and the much younger CompletionStage API provide these in Java. Both have close relatives in Rust, and both have one major constraint: the ability to share data safely between threads. With Java, this has always been an open issue: You can always share references freely. You just need to manage shared access properly. You also need to know what “properly” means in each case.

In Rust, it is very clear what may be shared between different, concurrent contexts: Anything that implements Sync. Similarly, anything that implements Send can be transferred between different threads. Remember the whole concept of ownership , though – an immutable reference might be Sync, but if its lifetime is not long enough to ensure all tasks you share it with are completed, you still cannot use it across multiple contexts.

The compiler will automatically implement the proper Send and Sync traits. Generally, the types you usually will interact with will be both. The reason is simple: Any type composed entirely of Send types will be Send itself, and the basic types are Send. The same holds true for Sync. Some exceptions apply, though – so be sure to check the full documentation.
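
A short sketch of how these marker traits surface in everyday code follows. The helpers assert_send, assert_sync, check_markers and parallel_sum are hypothetical names. Arc, Mutex and thread::spawn are the standard library types and functions.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Compile-time checks: a call only compiles if the type has the trait.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn check_markers() {
    assert_send::<Vec<i32>>();
    assert_sync::<Vec<i32>>();
    assert_send::<Arc<Mutex<u64>>>();
    assert_sync::<Arc<Mutex<u64>>>();
    // assert_send::<std::rc::Rc<i32>>(); // does not compile: Rc is not Send
}

/// Adds the numbers 1..=workers, one thread per number, through shared state.
fn parallel_sum(workers: u64) -> u64 {
    let total = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (1..=workers)
        .map(|n| {
            // Each thread gets its own handle to the same mutex.
            let total = Arc::clone(&total);
            thread::spawn(move || {
                *total.lock().unwrap() += n;
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    let sum = *total.lock().unwrap();
    sum
}
```

Swapping Arc for Rc in parallel_sum would be rejected at compile time, because thread::spawn requires its closure to be Send.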

Threading the needle

Threads have been here for a very long time – since the 90s, actually. They are essentially memory-sharing lightweight processes. Java makes it very simple to generate a new thread.

class Threads {
    private static final int SIZE = 500;
    int[] foo() throws Exception {
        var results = new int[SIZE];
        var threads = new Thread[SIZE];

        for (var i = 0; i < SIZE; i++) {
            final int threadNum = i;
            threads[threadNum] = new Thread(() -> {
                results[threadNum] = 2 * threadNum;
            });
            threads[i].start();
        }

        for (var thread: threads) {
            thread.join();
        }
        return results;
    }
}

Serviceable, but not exciting. The major problem here is that the threads are not able to effectively communicate their results back to the generating function, but otherwise, this is pretty easy to understand – no data is shared between the threads, after all.

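use std::error::Error;
use std::thread::{spawn, JoinHandle};

fn threads() -> Result<Vec<i32>, Box<dyn Error>> {
    let size = 400;
    let thread_results: Vec<JoinHandle<i32>> = (0..size).map(|i| {
        spawn(move || i * 2)
    }).collect();
    let mut result = Vec::new();

    for join in thread_results {
        // join() reports a panic as Box<dyn Any + Send>, which is not an
        // Error, so it is mapped to a message before `?` is applied.
        result.push(join.join().map_err(|_| "thread panicked")?);
    }

    Ok(result)
}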

Rust looks extremely similar, but offers a slight cherry on top – each thread has a JoinHandle that is generated by spawning (rather than keeping a mutable representation of the thread around). That JoinHandle allows only a few basic operations – way fewer than Thread, but it does allow waiting for the thread to complete, and to retrieve a result value.

Into the Future

Threads are great for simple parallelism – especially for server applications where each of the threads will see one request from start to finish. That model is, as you probably know, not the most efficient and responsive one. After all, the threads would block waiting for IO most of the time.

class Completeable {
    CompletionStage<SomeType> processing() {
        return loadDataFromDatabase()
                .thenCompose(data -> {
                   var written = writeToFileSystem(data);
                   var additionallyLoaded = loadMoreData(data);

                   var processWriteAdditionally = additionallyLoaded.thenCombine(written, (data2, ignore) -> data2);
                   return processWriteAdditionally.thenCompose(data2 -> callRestWebService(data, data2));
                });
    }
}

This Java code reads reasonably well, once you are familiar with the API – it chains together a number of async invocations and forces them all to be successful, producing a final result. All the details of the invocations are elided in this example, of course – but the sheer number of braces does lead to a bit of a headache.

use futures::join; // assumption: the join! macro from the futures crate

async fn processing() -> SomeType {
    let loaded = load_from_database().await;
    let write_op = write_to_fs(&loaded);
    let load_additional_op = load_more_data(&loaded);

    let (_, additional_loaded) = join!(write_op, load_additional_op);
    call_rest_service(&loaded, &additional_loaded).await
}

Rust has opted to extend its syntax, since async code is important and will only get more so in the future. The corresponding Rust code consequently looks a lot cleaner.

The special syntax is essentially just sugar, though – an async fn is a normal function that returns impl Future. The async modifier is not strictly required: it is shorthand for declaring such a function, a type that serves as the return type, and an implementation of the Future trait for that type. Without it, the code would look much like the Java code example.
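
To make the three pieces concrete, here is a hand-written sketch of a function, its return type and the Future implementation. Countdown, countdown, NoopWaker and block_on are hypothetical names. The tiny busy-polling block_on stands in for a real executor and is an assumption made only so the example is self-contained.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// The type that serves as the return type: a small state machine.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready("done")
        } else {
            self.remaining -= 1;
            cx.waker().wake_by_ref(); // ask the executor to poll again
            Poll::Pending
        }
    }
}

/// The ordinary function that hands out the future.
fn countdown(steps: u32) -> impl Future<Output = &'static str> {
    Countdown { remaining: steps }
}

/// A waker that does nothing; sufficient for a loop that polls constantly.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for illustration only.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut context = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut context) {
            return value;
        }
    }
}
```

An async fn lets the compiler generate a state machine like Countdown from straight-line code, which is roughly what the .await points in the listing above turn into.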

Conclusions

In this post, you learned some of the basics of Rust. Now, will Rust completely replace Java in the next five years? No, probably not. But it is a sleek new low-level language that has promise. It is blazingly fast, well-structured and generally fun and expressive. Plus, the language cares to support application programmers with some of the best diagnostics and language features I have seen in two decades of development. Best of all, it is amazingly safe, while still being low-level. Whole classes of common errors are completely eliminated by the language rules, which is no small feat.

So, when you are doing your next microservice, why not give Rust a chance? You might want to check out the Actix framework for your web server. If you want to delve deeper into the language, the Rust book is your first go-to resource. For those who regularly truck with sun.misc.Unsafe, a peek at the unsafe sub-language in the Rustonomicon might get the creative juices flowing.
