Rust was the first language in which I used tagged unions. While I can't say that I use Rust at the moment (not mature enough and too low-level for my current needs), I found pattern matching and tagged unions so beneficial and practical that now I refuse to use any language that doesn't have them. Currently I'm writing a bunch of code in F#.
I don't know what the language of the future will be but I can't imagine it not having an ML heritage.
ML-style pattern matching is neat. There's actually an entire formal logic of computation oriented around it called the pattern calculus, though it's still in relatively early research.
That said, the language with the most powerful pattern matching I've used so far has been Erlang. Just about everything there is intimately tied to it, giving a near-homoiconic correspondence between code and literal data. You can even pattern match on binary formats, like file headers or packet structures.
It seems like Algol-like languages are slowly absorbing lessons from ML. Rust is quite an evident example.
I needed to write a small AST in Java, and I actually had to write the definition in ML on a sheet of paper to see how to do it. It's so weird that so many languages don't have sum types.
What's too bad is that Java had a chance to handle alternatives properly as sum types when enums were being added. But instead of sum types, the designers chose to add them as yet another kind of "object": every variant must share a common constructor (and thus the exact same backing data) and the same methods, so all the variants are forced to have the same "shape". Very disappointing!
This has been my experience with Rust as well. Came for the C/C++ competitor, stayed for the type system. Also not using it for anything "real" yet, but excited to at some point.
In C++ (and C, for that matter) you use an enum + a union in a struct to implement these virtual structs; then all the criticisms about object slicing and not being able to pass by value go away. You do have to manage all of that manually, though, so Rust has the very nice advantage of language support for doing it cleanly.
"Virtual Structs" is a name for a potential feature that may expose some sort of structural inheritance or subtyping.
The name is not used for the current "enum" feature, which is one of the few algebraic data types in Rust (there are also "struct" and tuples; not sure what else qualifies).
This is part one: the good part of Rust's ADT/enum system. Next, I presume, is the bad part - namely, how it caused pain when implementing the DOM in Servo.
I'm hoping the next post will talk about this use case: Suppose I have an enum called "Canine" and I want each of its variants to implement a different "bark" method. Currently, as far as I know, I have to write a match statement dispatching "bark" to each "Canine" variant. If I have "bark" and "growl," I have to write two match statements, and so on for each method that needs to be dispatched to different variants.
So it's a lot of boilerplate. I think it can be slimmed down with a macro, but still.
You might think traits rather than enums are the way to go here. Sometimes that may be true. But often, an enum is far preferable because in Rust, a trait is not a type, but an enum is. That means you cannot, for example, have a "Vec<CanineTrait>," but you can have a "Vec<CanineEnum>."
A trait is a type, but the type does not have a size, which is the reason it is not possible to store a trait instance directly in a Vec (or any other place needing a sized type). It is possible to store a pointer to a trait object in a Vec, which would look like Vec<Box<CanineTrait>>.
Ah, my mistake. So would I be correct in saying that a trait is a type, but it does not necessarily implement the Sized trait, and if it does not, then it cannot be allocated directly on the stack?
Also, I've been wondering something about dereferencing a Box where the inner type is just a trait. If I dereference a Box<CanineTrait> and call "bark," how is the correct implementation found at runtime? Is there a vtable or something analogous?
Trait object pointers are "fat": they consist of both a pointer to the object and a pointer to the vtable (much like Go interfaces, if you've used those). Other DSTs are similar: pointers to slices consist of both a pointer to the elements and a length.
> If I dereference a Box<CanineTrait> and call "bark," how is the correct implementation found at runtime? Is there a vtable or something analogous?
I think a vtable. My understanding is that "trait objects" (boxed traits) are how you get dynamic dispatch in Rust. If instead you were talking about a `fn foo<T: CanineTrait>(x: T) { ... }` trait bound (the more common case), you get monomorphization and static dispatch.
I would say that the reason it is named `enum` is exactly not to surprise them. While learning Haskell, I had used `data` several times, but never grokked its true meaning. The naming of `enum` in Rust helped me finally understand what ADTs are, and thus also made me truly understand what Haskell's `data` is. I really appreciate Rust for trying to introduce a certain concept with the existing names.
Well, it shouldn't be surprising, as Rust's enum behaves pretty much like a C/C++ one if all cases are nullary. The only changes are syntactic ("EnumType::Variant" instead of just "Variant", unless you "use EnumType::*"). In fact, it's exactly because they didn't want to be surprising that they used "enum" instead of something else, like "data" in Haskell.
They already chose to try to appeal to those developers by calling it "enum". And that doesn't sound like a good motivation to me: how hard is it to learn something like algebraic data types, compared to learning about lifetimes? It's like the choice to use angle brackets for generics - if the intent is to be easy for programmers with a certain background, that only lures them in with a false sense of familiarity. Because something like angle brackets over square brackets is, again, a superficial and easy thing to learn compared to any of the semantic stuff.
Not in the same way. In Java, you can have an enum that attaches some data to each value, on a per-value basis. But you cannot have an enum where different instances of the same value carry different data.
That is, in Java, Enum.A is always Enum.A with the same data attached at all times. This is not the case for ADTs.
I agree that the name seems misleadingly specific: it's named after one particular construct that it subsumes.
In cases like this, I think it's better to pick a more generic name than the name of a specific construct being subsumed (as Haskell does with the `data` keyword for algebraic data type definitions).