Damian and I have been intelligently screaming at each other on perl6-language, finally really fleshing out the Manhattan distance MMD debate. It has settled down recently, after I discovered that Damian had been arguing against something that was almost, but not quite, entirely unlike my algorithm. I looked back through the message history and realized that I had been arguing for this algorithm the whole time, and I had never said what it was! The most visible place I put it was at the end of a document in the pugs repository.
Anyway, I implemented my algorithm in a Perl 5 module, Class::Multimethods::Pure (the algorithm is also described there). The only current problem is that it is turtle slow (or whatever the expression is). I’ve never been very good or interested in writing locally fast code: I prefer abstraction to raw speed. However, if I’m seriously proposing this algorithm for Perl 6, I’d better have a good algorithm for dispatching in sublinear time. And algorithm design happens to be one of the things I am interested in.
A multimethod is composed of a set of multimethod variants, which are essentially lists of parameters together with the corresponding code. Currently, when the module compiles, it sorts the set of variants into an ordering of singular sets. That is, I go along the list of such sets, and for each set:
- If it has no elements that match the input parameters, I move on to the next set (or die “no method found” if there are no more sets).
- If it has exactly one matching element, I succeed and call that variant.
- If it has more than one matching element, I die with an ambiguity error.
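To make the scheme concrete, here is a minimal sketch in Python rather than Perl. This is not the module’s actual code: `isinstance` stands in for “does”, and the names (`more_specific`, `layered`, `dispatch`) are my own illustrative inventions.

```python
# Sketch of dispatch through an ordering of "singular sets".
# isinstance stands in for "does"; names are illustrative only.

class A: pass
class B(A): pass   # B does A

def more_specific(p, q):
    # p is more specific than q if every parameter type of p is a
    # subtype of the corresponding one in q, and they are not identical.
    return (all(issubclass(s, t) for s, t in zip(p, q))
            and any(s is not t for s, t in zip(p, q)))

def layered(variants):
    # Sort the variants into an ordering of sets, most specific first:
    # if a is more specific than b, a's set comes earlier than b's.
    remaining = list(variants)
    layers = []
    while remaining:
        top = [v for v in remaining
               if not any(more_specific(u[0], v[0])
                          for u in remaining if u is not v)]
        layers.append(top)
        remaining = [v for v in remaining if v not in top]
    return layers

def dispatch(layers, args):
    for s in layers:
        matches = [v for v in s
                   if all(isinstance(a, t) for a, t in zip(args, v[0]))]
        if len(matches) == 1:
            return matches[0][1](*args)          # unique match: call it
        if len(matches) > 1:
            raise TypeError("ambiguous call")    # more than one element matches
    raise TypeError("no method found")           # ran out of sets

variants = [((A, A), lambda x, y: 1),
            ((A, B), lambda x, y: 2),
            ((B, A), lambda x, y: 3),
            ((B, B), lambda x, y: 4)]
layers = layered(variants)
```

With these four variants the layers come out as [(B, B)], then [(A, B), (B, A)], then [(A, A)]: the most specific variants are checked first, exactly as described above.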
So that means that whenever a variant a is more specific than a variant b (see the docs above for a precise description of what that means), b is necessarily in a later set than a. So the most specific variants are way up front, and that means that the more generic the variant you end up dispatching to, the slower the algorithm gets. That is probably unavoidable. Here’s the real problem: by the time you get to the last set, you’ve asked each “does” question many more times than necessary. Keep in mind that such a question can involve a subtype condition, which can (but shouldn’t) involve heavy computation.
The approach that I’m working out now is to build a DFA where the states are questions about “does” relationships and the transitions are “true” and “false”. I want to construct a DFA that asks as few questions as possible in order to determine whether there is a unique matching variant. The win is this: if you ask “does the first argument do C?” and the answer is yes, and C is a subtype of A, then you already know the answer to “does the first argument do A?”. Likewise, if you ask whether the first argument does A and the answer is no, then you already know that it doesn’t do C.
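Here is a sketch of that answer propagation in Python. The subtype table is a toy assumption (a single relationship, C does A), and the code is my own model of the idea, not part of any existing module.

```python
# Close a set of "does" answers under subtyping, so the DFA never
# needs to ask a question whose answer is already implied.

SUBTYPES = {("C", "A")}   # (sub, super): every C does A -- toy assumption
TYPES = {"A", "C"}

def does(sub, sup):
    return sub == sup or (sub, sup) in SUBTYPES

def propagate(known):
    # known maps (argument, type) -> bool.  A "yes" for a subtype also
    # answers "yes" for its supertypes; a "no" for a supertype also
    # answers "no" for its subtypes.
    out = dict(known)
    changed = True
    while changed:
        changed = False
        for (arg, t), ans in list(out.items()):
            for u in TYPES:
                if (arg, u) in out:
                    continue
                if ans and does(t, u):
                    out[(arg, u)] = True    # arg does t, t does u => arg does u
                    changed = True
                elif not ans and does(u, t):
                    out[(arg, u)] = False   # u does t, arg doesn't do t => arg can't do u
                    changed = True
    return out
```

Note that the implication only runs one way: learning that an argument does A says nothing about whether it does C, so that question would still have to be asked.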
That’s about as much as I know for sure. I’ve tried a few times to find a method of constructing these DFAs, to no avail. I’ve tried having each state be a set of candidate methods, with each question narrowing the set until you reach a dominated set (where one method in the set is more specific than all the others). But consider this example:
B does A

variant 1: foo(A, A)
variant 2: foo(A, B)
variant 3: foo(B, A)
variant 4: foo(B, B)
If you ask “does argument 1 do B?” and the answer is yes, you get the set {2, 3, 4}: variant 1 is dominated by variant 3, which now matches. From the set {2, 3, 4}, if you ask “does argument 2 do B?”, you still get the set {2, 3, 4}. Obviously variant 4 is correct, but a “set of viable variants” approach doesn’t keep enough information to tell you that.
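Here is a sketch that makes the failure concrete. It models the narrowing step in Python (my own illustrative code, not the module’s, and it assumes every argument is at least known to do A): a candidate is dropped only when some candidate that definitely matches is more specific than it. Both single-question states come out as {2, 3, 4}, even though the two answers together pin down variant 4.

```python
# Model of "set of viable variants" narrowing for the example above.

class A: pass
class B(A): pass   # B does A

VARIANTS = {1: (A, A), 2: (A, B), 3: (B, A), 4: (B, B)}

def known_type(known, i):
    return known.get(i, A)     # assumption: every argument does A

def definitely_matches(params, known):
    return all(issubclass(known_type(known, i), t)
               for i, t in enumerate(params))

def more_specific(p, q):
    return (all(issubclass(s, t) for s, t in zip(p, q))
            and any(s is not t for s, t in zip(p, q)))

def narrow(candidates, known):
    # Drop every candidate dominated by one that definitely matches.
    return {n for n in candidates
            if not any(definitely_matches(VARIANTS[m], known)
                       and more_specific(VARIANTS[m], VARIANTS[n])
                       for m in candidates)}

# Ask "does argument 1 do B?" (yes): variant 1 is dominated by 3.
after_q1 = narrow({1, 2, 3, 4}, {0: B})            # {2, 3, 4}
# From that set, ask "does argument 2 do B?" -- but the set-only
# state has forgotten argument 1's answer, so nothing else drops.
after_q2 = narrow(after_q1, {1: B})                # still {2, 3, 4}
# With both answers remembered, variant 4 matches and dominates all.
with_both = narrow({1, 2, 3, 4}, {0: B, 1: B})     # {4}
```

A DFA state keyed only by the candidate set merges paths that ought to stay distinct: reaching {2, 3, 4} via argument 1’s answer and via argument 2’s answer are different situations, but the set alone can’t tell them apart.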
If anybody has any ideas for this, I’d love to hear them.