Two Ways To Do Dynamic Dispatch

Logan Smith

Rust and C++ both have built-in (but different-flavored) support for dynamic dispatch, and both also...

Video Transcript:

Hey, in this one I want to explain and then compare two approaches to dynamic dispatch that are used in Rust and C++. Each language has one of the approaches built in, and they made different choices about which one, and the other one can be handwritten. Well, actually both can be handwritten in both languages. And speaking of which, before we jump into the fancy animations, let me show you something cool. All right, so here we are on Compiler Explorer, and I just have this trait Speak with one method, speak, and I have cats and dogs

implementing it, just the tried-and-true example. They're down here expressing their thoughts, and we can see over in the output we have meow and woof. So I want to try something kind of interesting here: I want to try sketching out a type that is kind of like a reference that can refer to anything that implements the Speak trait. So it's going to be a concrete type with kind of a pointer or a reference inside that can either refer to a cat or a dog or any future implementer of Speak. And in order to

do this I'm going to use a technique called type erasure that I first learned how to do in C++, and here's one way it can look in Rust, although I don't recommend that you write it this way, as we'll see soon. So I'm going to start by adding a struct, and I'm going to call it AnythingSpeak because it's going to be able to refer to anything that is Speak. It's going to have a lifetime parameter because it's kind of a referency type, but the first thing I'm going to do is actually throw

that lifetime parameter into a PhantomData, because I want to be bound by this lifetime parameter but we're not actually going to be using a reference in here. We're going to be using a raw pointer, because we're going to be pointing at something that we don't know what it is and doing some tricky pointer casts, and that's a little bit easier on a raw pointer. So, speaking of which, our data is going to be a raw pointer; I'm using this tool NonNull from the standard library, which is just a raw pointer that's known

to not be null. And what type are we going to point at? Well, I don't know. I mean, that's the whole point of this thing: I don't know exactly what we're pointing at, I just know that it implements Speak. So I'm going to represent that using unit. Next we need a way to call speak on whatever we're pointing at, and we don't know how to call speak on what we're pointing at; we don't even know what we're pointing at. So one really good way to call into code that you don't know ahead of

time is using a function pointer. So I'm going to add a function pointer here, and I'm going to call it speak_thunk for reasons I'll explain soon. This is going to be a function that takes in our data, so our NonNull unit pointer, except that I'm also going to add unsafe. This is going to be an unsafe function pointer because the function that we store in here is going to have a very important precondition that the compiler is not going to be able to check for us, which is that the only

thing we can pass into this function is this data pointer. We can't pass in just any NonNull unit pointer; it must be this one, because we're going to create this function pointer and this data pointer in a way that's very tightly coupled, which we'll see in a second. So I'm going to sketch out an impl block now for AnythingSpeak, and I'm going to write a new function. This new function is very important, because in this new function we have a brief moment where we know how we are creating our AnythingSpeak value, so this is

our one chance to write down any information that we need about the type that we're pointing to before that information vanishes and we forget exactly what we're pointing to. So our data pointer is just going to be our T cast into a unit pointer, straightforward enough. Now what about our speak_thunk? Well, we're going to use the convenient fact that closures can implicitly convert into function pointers when they don't capture anything. And remember, we're designing this so that our speak_thunk takes the data pointer as a parameter. And what's it going to do with it?
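The closure-to-function-pointer conversion mentioned here can be seen in isolation. This is only a small sketch: the helper names apply and apply_unchecked are made up for illustration and are not from the video.

```rust
// A non-capturing closure coerces to a plain function pointer.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

// The same coercion works when the target is an `unsafe fn` pointer;
// calling through it then requires an `unsafe` block at the call site.
fn apply_unchecked(f: unsafe fn(i32) -> i32, x: i32) -> i32 {
    // SAFETY: the closures used in this sketch have no real preconditions.
    unsafe { f(x) }
}

fn main() {
    // The closure captures nothing, so it converts implicitly.
    let double: fn(i32) -> i32 = |x: i32| x * 2;
    println!("{}", apply(double, 21));
    println!("{}", apply_unchecked(|x: i32| x + 1, 41));

    // A capturing closure does not coerce; uncommenting this fails to compile:
    // let offset = 10;
    // let bad: fn(i32) -> i32 = |x: i32| x + offset;
}
```

The function pointer carries no state of its own, which is why anything the thunk needs has to arrive as a parameter.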

Well, inside this function that we're in right now we know what data is pointing to: we know that it's actually pointing to a T. So we can just cast our data pointer to a pointer to T, then dereference it, and then call speak on it. Now, this operation is unsafe right here because we're dereferencing a raw pointer, but it's safe to do because we know we just stored a T inside of our data pointer, and we know that we're only ever going to call the speak_thunk using data. And by the way, let me now

explain why I called this speak_thunk. So thunk is kind of a strange term with lots of different uses, but it usually involves the idea of wrapping some code in a thin wrapper function, either to delay evaluation or to do some light bookkeeping before or after running the real code. And that's basically what we're doing here, right? We essentially want to just call T's speak method directly, but we can't, because all we have is our pointer to unit. So we have to wrap T's speak in a thunk that casts the pointer to an appropriate

type before calling the real function. All right, so now let's try implementing Speak for AnythingSpeak. How's this going to work? Well, we have our speak_thunk and we have our pointer that we're supposed to pass into it, so all we have to do is call our speak_thunk using our data pointer. So let's try using this now. I'm going to get rid of these just so we can see more clearly what's going on, and I'm going to create an AnythingSpeak that is pointing at a cat. So you see we have meow over

here. Now I'm going to rebind that same value to a new AnythingSpeak that's pointing to a dog, and you see now we have meow and woof. So the same value of the same type is referring to two different implementers of Speak and is calling their speak functions appropriately. So we just implemented type erasure in Rust. But before we move on, I want to point out one more thing about this code that I don't really like, and that's the fact that if we came along later and added another function to this trait, now we have

to go down here and add another function pointer here. Suddenly we're going to have to store a function pointer for each and every method in this trait if we want to be able to use it from an AnythingSpeak. So the size of our AnythingSpeak struct is linear in the number of methods in the trait, and that's bad news, especially if there are a lot of methods that aren't used very often, or just lots of methods in general; think of the Iterator trait. And so I'm going to get around that by

adding another layer of indirection here. So instead of storing all of the function pointers inline right here, I'm going to factor them out into a separate struct that just holds all the function pointers, and then inside of AnythingSpeak we'll just point at that. That way our AnythingSpeak stays constant-sized, so we can just copy it around and stuff, and the table of functions stays behind a pointer. So I'm going to add one more struct here, and it's going to be called SpeakFunctions, and I'll go ahead and move our speak_thunk into

it, and, you know, TODO: add yell_thunk. And now in here we're just going to have a pointer to this. Now what type should we use for the pointer? Well, you might be tempted to reach for Box or something, but I don't really like the idea of allocating and deallocating memory every time we create an AnythingSpeak, and also it's kind of just unnecessary, because every time I have an AnythingSpeak that's pointing at a dog, for example, they're all going to have the same stuff inside of the SpeakFunctions struct. So it would be

nice if they could all share it somehow. So I'm going to be a little bit audacious here and just say I'm going to have a 'static reference to a SpeakFunctions. And if you're thinking, how the heck is this going to work with the 'static lifetime, you're not alone. Let's see how it works. So down here I'm not storing my speak_thunk directly inside my struct anymore; it's going to be inside of my functions struct. So here we have SpeakFunctions, and this field is now inside of SpeakFunctions, and this is the

wrong type; I need a reference, so why don't I just take a reference to this. And now this works. This is complaining about this, and there we go, there's our meow and woof. So if you've never seen this before, this should blow your mind a little bit. We just created this value right here, and then claimed that it has static lifetime and took a reference to it, and the compiler agrees; the compiler is fine with this. This blew my mind the first time I saw it. This is a feature called constant promotion, or sometimes

I've heard it called rvalue static promotion, where if you have a value that's built up of things that are all known at compile time, so they're just constants (in this case, like a closure that has no captures), none of these things depends on any runtime inputs; they're all available at compile time. So when we take a reference to it, the compiler sees that it doesn't need any runtime information to construct this value for us, and so it just says, sure, I'll stick that in static memory and I'll give you a reference to it

that lives as long as you need it to, including for the rest of the lifetime of the program. This lets us express exactly what we want, which is kind of a constant that's associated with each type (think like a variable template from C++) that we can stick a pointer to in all of our AnythingSpeaks. And now our AnythingSpeaks are constant-sized, and we can add as many methods as we want to the Speak trait and not worry about them blowing up. So I think this is a pretty straightforward but also clever

and interesting implementation of type erasure in Rust. But again, I do not recommend that you actually write this code in Rust, and the reason is that there's a much, much easier way to do it in Rust, so let's talk about that. You may have noticed by now that while writing AnythingSpeak we actually just implemented this type, namely a reference to a Speak trait object, which I'll just call dyn Speak for convenience. And I mean literally: a &dyn Speak is essentially identical to our struct with two pointers in it. We use type

erasure to forget exactly what type we were pointing to, just knowing that it implemented the Speak trait, and we created a reference to it, and along with that reference we wrote down information about how to use it that could be retrieved at runtime. This is exactly the same thing that the Rust compiler does for you when you create a value of type &dyn Speak. A reference to a trait object in Rust is a wide pointer, which basically just means a pointer bundled with some additional information. In this case the additional information is another

pointer, to a table of function pointers and other stuff that can be used to operate on our type-erased value dynamically. This isn't the only type of wide pointer in Rust; you may know slice references are also wide pointers, where the extra data is the slice's length. But for our purposes today, when I say wide pointer I'll just be referring to this kind. So not only are references to trait objects wide, but raw pointers are too, including the raw pointers inside of NonNull and Box, making, for example, a Box<dyn Speak> itself be a wide pointer with

no additional indirection or anything, which I think is neat. But we've only talked about half the equation so far. The other half is our SpeakFunctions struct, which is our table of function pointers, which the compiler must also generate for us so it can point wide pointers at it. This auto-generated struct is called a vtable, short for virtual method table, and it's quite similar to what we hand-wrote, with some slight but important differences. For one, notice that I mentioned Box, which is an owning pointer, which needs to have a way to destroy

the type-erased value. To support that, the compiler also automatically gives the vtable a pointer to some code for running the value's destructor, which includes the Drop implementation, if any, along with the drop glue, which is the auto-generated stuff that recursively drops the value's fields. It also puts the size and alignment in there, for reasons. And in contrast to our vtable implementation, the compiler gets to remove a level of indirection. We had to wrap our calls into the trait methods in thunks so we could cast our type-erased pointer to the correct type.
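To make the comparison concrete, here is a minimal sketch of the hand-rolled version described earlier, next to the built-in &dyn Speak. It is a reconstruction from the spoken walkthrough, not the code shown on screen: the spellings AnythingSpeak, SpeakFunctions, and speak_thunk are renderings of the spoken names, and speak returns a String here (an assumption, since the video's version prints) so the results can be checked.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ptr::NonNull;

trait Speak {
    // Assumption: returns a String instead of printing, so it can be tested.
    fn speak(&self) -> String;
}

struct Cat;
struct Dog;

impl Speak for Cat {
    fn speak(&self) -> String {
        "meow".to_string()
    }
}

impl Speak for Dog {
    fn speak(&self) -> String {
        "woof".to_string()
    }
}

// Hand-written table of thunks; one shared instance per concrete type.
struct SpeakFunctions {
    speak_thunk: unsafe fn(NonNull<()>) -> String,
}

// Hand-rolled "wide pointer": an erased data pointer plus a table pointer.
#[derive(Clone, Copy)]
struct AnythingSpeak<'a> {
    _lifetime: PhantomData<&'a ()>,
    data: NonNull<()>,
    functions: &'static SpeakFunctions,
}

impl<'a> AnythingSpeak<'a> {
    fn new<T: Speak>(t: &'a T) -> Self {
        AnythingSpeak {
            _lifetime: PhantomData,
            // Forget the concrete type: &T becomes a pointer to unit.
            data: NonNull::from(t).cast(),
            // Constant promotion lets this temporary live in static memory.
            functions: &SpeakFunctions {
                // The thunk casts back to T before calling the real method.
                speak_thunk: |data: NonNull<()>| unsafe {
                    data.cast::<T>().as_ref().speak()
                },
            },
        }
    }
}

impl Speak for AnythingSpeak<'_> {
    fn speak(&self) -> String {
        // SAFETY: `data` and `functions` were created together in `new`,
        // so the thunk receives a pointer to the type it expects.
        unsafe { (self.functions.speak_thunk)(self.data) }
    }
}

fn main() {
    let (cat, dog) = (Cat, Dog);

    let mut hand_rolled = AnythingSpeak::new(&cat);
    println!("{}", hand_rolled.speak());
    hand_rolled = AnythingSpeak::new(&dog);
    println!("{}", hand_rolled.speak());

    // The built-in equivalent: an unsizing coercion to a trait object.
    let built_in: &dyn Speak = &dog;
    println!("{}", built_in.speak());

    // Both are two pointers wide, and so is an owning Box<dyn Speak>.
    assert_eq!(size_of::<AnythingSpeak<'static>>(), size_of::<&dyn Speak>());
    assert_eq!(size_of::<Box<dyn Speak>>(), 2 * size_of::<usize>());
}
```

The hand-rolled table has no drop, size, or alignment entries, which is one reason it only works as a borrowing pointer and not as an owning one like Box.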

But we only had to do that to play by the rules of the type system. Those pointer casts are no-ops at the ABI level, and the compiler, because it's the compiler, can just skip them. So we've already sort of walked through how this works, but let's quickly see what happens when we use trait object wide pointers, the language feature, rather than our hand-rolled version. So here's our Speak trait, and implementing Speak we have an Animal with some data in it, just to make the visualization in a second make more sense. So when we

create an Animal and use it directly, nothing at all special or magic or dynamic happens. There are no vtables, no wide pointers; we're calling a method on an Animal that is statically resolved. Only when we ask to use the Speak trait dynamically does all this magic kick in. The way we ask to use Speak dynamically is by performing what's called an unsizing cast, or coercion, into this dyn Speak type. At the moment the compiler sees us do this, it creates a vtable for Animal and sticks it in static memory somewhere. Now when I

say the moment it sees us do this, I'm talking about at compile time. The vtable is generated statically at compile time when the compiler sees code that needs it, but no sooner. It then sticks a pointer to that vtable in the wide pointer that we've named da here, and then this call to speak is resolved dynamically through the vtable. Visualizing what this looks like in memory, here's our Animal, which is a 32-bit integer. When we perform our unsizing cast, we create this wide pointer, which points to our Animal and our vtable, respectively.

When we call speak using this wide pointer, we dereference the value and the virtual function separately. Note that there's no data dependency between the value and the vpointer; in other words, we could get the address of speak without necessarily dereferencing the value pointer. I don't have any numbers on this or anything, but I'll offer the observation that in the world of CPU pipelining and speculative execution, this structure might give your CPU the opportunity to do some really cool stuff about speeding up this call. So that's the built-in support that Rust provides for

dynamic dispatch. Now I'd like to look at the built-in support that C++ provides for dynamic dispatch. To be clear, you can implement what we just saw in C++, but you have to hand-write it; when you ask the compiler to write dynamic dispatch for you, it gives you a different approach. Let's look at how it does this stuff. Unlike the type-class-based polymorphism from Rust, C++ runtime polymorphism is more rooted in classical object-oriented class hierarchies. This class here, Animal, has one member function, speak, but it's not runtime polymorphic yet. In

order to make it runtime polymorphic, I have to mark it virtual. Note that I make this decision when I'm writing the class, not when I'm using it; there's no way to use a member function polymorphically that wasn't explicitly declared as runtime polymorphic. So the moment I declare a member function virtual in this class, that is when the compiler creates a vtable, and it sticks a hidden pointer to the vtable at the beginning of the data layout of the class. The vtable has some metadata in it; here I'm showing some base class offset

stuff and a pointer to RTTI, which is runtime type information that helps power dynamic casts and stuff like that, and then a pointer to my speak virtual function. Notice that unlike in Rust, this vtable does not automatically have an entry for the destructor of Animal. If you plan on destroying your object polymorphically, like if you want to destroy a unique_ptr to Base that's really pointing to an object of type Derived, you need to manually mark your destructor as virtual or else you get undefined behavior. It would be nice if this were automatic, but it's technically

possible to use these features correctly without a virtual destructor, so C++ makes it opt-in. Also, now that we've mentioned the destructor, we need to follow the rule of three here; I pretty much always just delete my copy operations for polymorphic types. Notice how I said polymorphic types there: we've truly baked polymorphism into the type itself rather than the usage of the type. As soon as I declare a function, I need to decide whether it's to be used polymorphically or not, and I can't change my decision without changing the definition of the class. This

approach is intrusive in a big way. Let's see it visualized. So here's my class before I add a virtual function. When I add one, the compiler generates a vtable and inserts a pointer at the beginning of my class layout. Notice that in this case, because I just had an int before, that actually tripled the size and doubled the alignment of my class, and the larger alignment actually means the size was basically quadrupled, not tripled. That's a cost paid for every single instance of my class, whether it ends up being used polymorphically or not.
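The size arithmetic can be reproduced by writing the hidden pointer out by hand. This is a Rust stand-in for the C++ layout, not the real thing: the struct names are hypothetical, the actual C++ layout is decided by the compiler's ABI, and the numbers in the comments assume a typical 64-bit target with a 4-byte int.

```rust
use std::mem::{align_of, size_of};

// The class before adding a virtual function: just an int.
#[allow(dead_code)]
#[repr(C)]
struct PlainAnimal {
    age: i32,
}

// The class after: a vtable pointer is placed ahead of the data.
// `vptr` stands in for the pointer the C++ compiler would insert.
#[allow(dead_code)]
#[repr(C)]
struct VirtualAnimal {
    vptr: *const (),
    age: i32,
}

fn main() {
    // On a 64-bit target: 4 bytes, aligned to 4.
    println!(
        "plain:   size {} align {}",
        size_of::<PlainAnimal>(),
        align_of::<PlainAnimal>()
    );
    // On a 64-bit target: 8 + 4 = 12 bytes of fields, but alignment 8
    // rounds the size up to 16, so four times the original.
    println!(
        "virtual: size {} align {}",
        size_of::<VirtualAnimal>(),
        align_of::<VirtualAnimal>()
    );
}
```

The padding is why the size lands on two pointer widths rather than one pointer plus one int.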

If I am going to use it polymorphically, that happens through a pointer or a reference. Here I'll call speak on my pointer, and notice that unlike with wide pointers, there is a data dependency here, where I have to finish dereferencing the pointer to my object before I even have the vpointer. So, at least on paper, I'm bottlenecked on fetching the code for the speak function until I have the object itself first. So this is how C++ does dynamic dispatch, and it does it this way for some valid reasons. For example, you can

imagine the language doing some tricks like Rust, like using thin pointers for normal objects but wide pointers for polymorphic types. But that doesn't work, because C++ also supports pointers to incomplete types, and it would be impossible to know if a pointer to incomplete type should be thin or wide, since you don't know if the type is polymorphic or not. So C++ kind of has to implement dynamic dispatch how I've shown here, using this intrusive vpointer approach. So to recap, let's make a chart thing with the two approaches we've seen. So

first off, wide pointers are not intrusive, meaning you can add runtime polymorphism and new interfaces after the fact to a value that didn't necessarily agree ahead of time that it was going to be used polymorphically. The intrusive approach is, of course, intrusive, meaning if you want to add or remove runtime polymorphism or interfaces, you need to modify the original type. For the stack size of a pointer, wide pointers are double the size of regular pointers, whereas a pointer to an intrusive object is just a regular pointer wide. For indirections when calling a virtual

function, there are two indirections with wide pointers: one jump to the vtable, and then one jump from the vtable to the function, with no data dependency on the value itself, although in the likely case that you need it in the method you will need to dereference that too. For the intrusive one, you have to jump to the actual object, then to the vtable, then from the vtable to the function, so that's three. For the memory overhead of the vpointer, for wide pointers you only pay it on pointers that you are actually

going to use dynamically, although it is attached to each and every pointer, whereas for the intrusive vpointer it's inside every object, whether that object will be used dynamically or not. So obviously my own opinion has been leaking through this entire time, and to no one's surprise I personally feel that wide pointers are a more elegant solution to the problem of dynamic dispatch in general. But there are exceptions to every generalization. For example, here's a Rust Result which is using type erasure for its error type, which is common in application code. Because this Box<dyn Error> is

a wide pointer, the error variant of this Result is two pointers wide. So that's two pointers this Result type has to lug around in order to power the error path of our code, which is probably a path that's less likely to be taken. It would be nice if we could make sure our code is maximally optimized for the happy path instead. This is one benefit of switching to the anyhow crate and using its error type: it acts a lot like Box<dyn Error>, but miraculously is only a thin pointer wide. So how does anyhow::Error

do that? Well, internally it holds this pointer to this type ErrorImpl. Okay, well, what's that? ErrorImpl is a value that lives out on the heap; it's a struct consisting of a vpointer followed by some object data. This should look very familiar now. This vpointer points at this type ErrorVTable, which is a struct of function pointers, just as we'd expect. It starts with a pointer to drop, which should be no surprise, and then it has a number of other functions. But this overall structure is identical to the C++ intrusive

vpointer approach, and it's a good choice for this use case because it optimizes the stack size of our error type, under the assumption that if an error occurs you're already on the cold path anyway, so it's fine to pay a little extra cost to get at the error itself. Now what about the C++ world? Well, my impression is that the zeitgeist is moving, or has moved, away from liking this intrusive vpointer approach and more toward wide pointers. For example, here are two types in the standard library that provide type erasure over different

concepts. In the standard library shipped with the Clang compiler, std::function from C++11 is implemented using inheritance and ordinary virtual functions behind the scenes, and consequently dispatching uses an intrusive vpointer. But there's a newer, more experimental implementation that looks much more like wide pointers that you can opt into if you use this ABI flag. std::any, on the other hand, is a much newer library component, and it was written using a wide-pointer-esque approach in the first place. There's also an amazing library called Dyno, written by Louis Dionne, who is coincidentally one

of the libc++ maintainers, that helps automate writing your own type-erased objects using wide pointers. So my point is that code in both of these languages can benefit from both of these approaches to dynamic dispatch, and it's worth understanding them both so you can make the best decision for your code. Beyond that, I have more to say about this topic; I actually cut several minutes of this video out for the sake of time, and I might cut those into a follow-up video. But for now, I want to leave you with this quote from

Sean Parent, and I'm going to link one of his talks in the description that really opened my eyes about this topic. Otherwise, I'd love to hear what you think about wide pointers versus intrusive vpointers in the comments, and I'll see you next time.