Rust was the first language in which I used tagged unions. While I can't say that I use Rust at the moment (not mature enough and too low-level for my current needs), I found pattern matching and tagged unions so beneficial and practical that now I refuse to use any language that doesn't have them. Currently I'm writing a bunch of code in F#.
I don't know what the language of the future will be but I can't imagine it not having an ML heritage.
ML-style pattern matching is neat. There's actually an entire formal logic of computation oriented around it called the pattern calculus, though it's still in relatively early research.
That said, the language with the most powerful pattern matching I've used so far has been Erlang. Just about everything there is intimately tied to it, giving a near-homoiconic correspondence between code and literal data. You can even pattern match on binary formats, like file headers or packet structures.
It seems like Algol-like languages are slowly absorbing lessons from ML. Rust is quite an evident example.
I needed to write a small AST in Java, and I actually had to write the definition in ML on a sheet of paper to see how to do it. It's so weird that so many languages don't have sum types.
What's too bad is that Java had a chance to handle alternatives properly as sum types when enums were being added. But instead of sum types, the designers chose to add them as yet another kind of "object": every variant must share a common constructor (and thus the exact same backing data) and the same methods, so all the variants are forced to have the same "shape". Very disappointing!
This has been my experience with Rust as well. Came for the C/C++ competitor, stayed for the type system. Also not using it for anything "real" yet, but excited to at some point.
In C++ (and C, for that matter) you use an enum + a union in a struct to implement these virtual structs; then all the criticisms about object slicing and not being able to pass by value go away. You do have to manage all of that manually, though, so Rust has the very nice advantage of language support for doing it cleanly.
"Virtual Structs" is a name for a potential feature that may expose some sort of structural inheritance or subtyping.
The name is not used for the current "enum" feature, which is one of the few algebraic data types in Rust (there are also "struct" and tuples; not sure what else qualifies).
This is part one: the good part of Rust's ADT/enum system. Next, I presume, is the bad part - namely, how it caused pain when implementing the DOM in Servo.
I'm hoping the next post will talk about this use case: Suppose I have an enum called "Canine" and I want each of its variants to implement a different "bark" method. Currently, as far as I know, I have to write a match statement dispatching "bark" to each "Canine" variant. If I have "bark" and "growl," I have to write two match statements, and so on for each method that needs to be dispatched to different variants.
So it's a lot of boilerplate. I think it can be slimmed down with a macro, but still.
You might think traits rather than enums are the way to go here. Sometimes that may be true. But often, an enum is far preferable because in Rust, a trait is not a type, but an enum is. That means you cannot, for example, have a "Vec<CanineTrait>," but you can have a "Vec<CanineEnum>."
A trait is a type, but the type does not have a size, which is the reason it is not possible to store a trait instance directly in a Vec (or any other place needing a sized type). It is possible to store a pointer to a trait object in a Vec, which would look like Vec<Box<CanineTrait>>.
Ah, my mistake. So would I be correct in saying that a trait is a type, but it does not necessarily implement the Sized trait, and if it does not, then it cannot be allocated directly on the stack?
Also, I've been wondering something about dereferencing a Box where the inner type is just a trait. If I dereference a Box<CanineTrait> and call "bark," how is the correct implementation found at runtime? Is there a vtable or something analogous?
Trait object pointers are "fat": they consist of both a pointer to the object and a pointer to the vtable (much like Go interfaces, if you've used those). Other DSTs are similar: pointers to slices consist of both a pointer to the elements and a length.
> If I dereference a Box<CanineTrait> and call "bark," how is the correct implementation found at runtime? Is there a vtable or something analogous?
I think a vtable. My understanding is that "trait objects" (boxed traits) are how you get dynamic dispatch in Rust. If instead you were talking about a `fn foo<T: CanineTrait>(x: T) { ... }` trait bound (the more common case), you get monomorphization and static dispatch.
I would say that the reason it is named `enum` is exactly not to surprise them. While learning Haskell, I had used `data` several times, but never grokked its true meaning. The naming of `enum` in Rust helped me finally understand what ADTs are, and thus also made me truly understand what Haskell's `data` is. I really appreciate Rust for trying to introduce a certain concept with the existing names.
Well, it shouldn't be surprising, as Rust's enum behaves pretty much like a C/C++ one if all cases are nullary. The only changes are syntactic ("EnumType::Variant" instead of just "Variant", unless you "use EnumType::*"). In fact, it's exactly because they didn't want to be surprising that they used "enum" instead of something else, like "data" in Haskell.
They already chose to try to appeal to those developers by calling it "enum". And that doesn't sound like a good motivation to me: how hard is it to learn something like algebraic data types, compared to learning about lifetimes? It's like the choice to use angle brackets for generics - if the intent is to be easy for programmers with a certain background, that only lures them in with a false sense of familiarity. Because something like angle brackets over square brackets is, again, a superficial and easy thing to learn compared to any of the semantic stuff.
Not in the same way. In Java, you can have an enum that attaches some data to each value, on a per-value basis. But you cannot have an enum where different instances of the same value carry different data.
That is, in Java, Enum.A is always Enum.A with the same data attached at all times. This is not the case for ADTs.
I agree that the name seems misleadingly specific: it's named after one particular construct that it subsumes.
In cases like this, I think it's better to pick a more generic name than the name of a specific construct being subsumed (as Haskell does with the `data` keyword for algebraic data type definitions).