Tag Archives: perl6

Why designing a language and designing a command line are different

If you’re not familiar with them, take a moment to check out some of Damian Conway’s modules. Good examples of what I’ll be referencing in this post are Perl6::Form and Text::Balanced.

Damian’s modules are the ultimate in user-friendliness, or to put it another way, DWIMminess (Do-What-I-Meaniness). He has relatively few functions, each of which is very full featured. Each takes a variety of kinds of arguments, and interprets what you probably wanted to do with that kind of data. They’re very “smart”, in a sense. And you could imagine that this would enable a very easy programming style, because the module is interpreting what you want to do along the way.

Another example of DWIMminess is the entire Perl 6 language. Well, it’s DWIM factor has been decreasing recently, but usually features are introduced as though they were in a Damian module and then slowly simmered into something a little stupider (describing the behavior of the functions, not the decision to make them behave that way). Why do we do that to Perl 6? Why might Damianesque interfaces not be as nice for programmers as it would seem?

First, let me make a case for “smart” interfaces. I want my command line to Do What I Mean as much as possible. I use this nice little program on “my taskbar called Deskbar, in which I can type a variety of things and it uses heuristics to determine what I wanted. For example, if I type “mplayer ~/movie.avi”, it will open mplayer and play that movie. If I type “gmail.com”, it will open my web browser to Google mail. If I type “dict persephone”, it will open a terminal with the definition of “persephone”. And it is wonderful; I don’t think it has ever done the wrong thing.

My case is that that is not a good thing to do when programming larger applications. Many folks would agree with me, but that doesn’t matter. I’ll bet they don’t know why they agree with me.

It’s about the information available to the programmer. When we’re writing code, usually we want to be as general as possible—to make as few assumptions as possible—when we’re programming part of a large application. We want to be able to work on as many forms of data as possible, and be agnostic to it if we can. For example, an object which maps strings to Foos doesn’t need to know anything about Foos to do that, so you could reuse that implementation for an object which maps strings to anythings.

On the other hand, when we’re on the command line, we know everything. We know what command we’re running, which arguments we’re passing, what error message it told us when we tried the same thing thirty seconds ago, whether the files we’re passing exist, whether the arguments we’re passing have spaces in them, etc. etc. We can quickly detect a failure, and we can also verify that it looks right pretty quicky. If you type “open -f http://google.com”, you don’t expect it to delete your hard drive. This opens something somehow, and it’s going to operate primarily on http://google.com. Who knows what -f means… it doesn’t really matter, since you can probably see what it means pretty quickly. But if you typed “$cmd -f $arg”, you can’t see what -f means pretty quickly, because it means something different for each value of $cmd. So even after you think your code is sturdy, some $cmd may come along someday and cause something totally unexpected to happen.

My stance is that DWIMminess is good in principle. More programming languages should do it. But you have to be very careful when you are dealing with complex programming problems. Essentially, the “I” in “DWIM” is what you need to worry about. “I” is the person who wrote the code: whatever it looks like to them, that’s what it should do. And that explains my all-too-common position in Perl 6 arguments of “DWIM ONLY AT COMPILE TIME!”. At compile time, the decision about what to do is made based only on information the programmer has, so if it’s a good DWIMmer, it will make the same decision the programmer would have made. Don’t choose behavior based on what the runtime values are, since that can cause two different interpretations of what IM in the same lexical position, i.e. your DWIM thinks that the programmer meant two different things by the same line of code. While that is an interesting literary device, I don’t think I’ll face much resistance convincing you that that’s a bad thing in programming.

Keep in mind, this is neither a case against polymorphism nor is polymorphism (including multimethods) a case against this. I am a big fan of strongly-typed polymorphism, like C++’s and Java’s1. If you write a->add(b), you don’t have to know what exactly a is, but you do have to know whether it’s a number or whether it’s a set. The word “add” means pretty different things on numbers and sets, and the programmer should know which one he’s talking about. Thus, in order to call add, a must be declared as something that has an add method, say “Set”. Then when something derives from “Set” and defines its own “add”, it knows that it is talking about adding an object to a collection, not adding a number to another, so the programmer’s original intent by a->add(b) is preserved.

1C++ and Java were not examples of languages that got DWIM right, they were just the most common example of strongly typed polymorphism. C++ and Java both lack another very important feature of abstraction: the ability to ignore what isn’t important. Haskell, which I didn’t want to mention for fear of sounding like a zealot, has strongly typed polymorphism and the ability to ignore what isn’t important (called parametric polymorphism): it puts “What I Mean” in the naming of the methods; i.e. you can’t have add both mean adding numbers and adding to collections, their types would be incompatible. So you’d have to have add and addToCollection or something, which is why the programmer can leave out the types and still be perfectly clear about what he means. Dynamically typed languages like Perl and Python have the ability to ignore what isn’t important, but give up the ability to say exactly what you mean all the time. Some runtime data may come into your function one day and break it, by trying to add numbers when you meant add to a collection. The method case doesn’t break much, because good method naming and a little bit of argument arity checking will quickly find your problem for you when you run your test suite. Operators are a different story, though: if you overload operators too heavily (which is done at runtime in those languages), it can do the wrong thing and you won’t catch it until 1,000 lines later. That’s because operators have a fixed arity, and many object types will overload them, so where is it going to die?

Perl 6 Rules: Elementary Compositionality and More Contrived Vocabulary

Larry Wall made a change to Synopsis 5, section “Longest-token matching” splitting the alternation operator | in two: | and ||. The idea is that || works just like the old one: try to match the left any way you can, and if that fails, then try the right. This makes sense to me, and the double-or name makes sense too (that’s how it works in perl code). On the other hand, the single | operator has some wildly strange semantics:

So for regex matching purposes we define we define token patterns as those patterns containing no whitespace that can be matched without side-effects or self-reference.

To that end, every regex in Perl 6 is required to be able to distinguish its “pure” patterns from its actions, and return its list of initial token patterns (transitively including the token patterns of any subrule called by the “pure” part of that regex, but not including any subrule more than once, since that would involve self reference). A logical alternation using | then takes two or more of these lists and dispatches to the alternative that matches the longest token prefix. This may or may not be the alternative that comes first lexically.

He’s going for two things here: raw speed and dwimming on the longest-token rule. I think he hit the first one pretty much smack on, and missed the second one, shooting himself in the ankle instead.

This longest token rule violates something which I just made up now called elementary compositionality. Even though it is nothing more than a figment of my mind at the moment, I believe it to be very important. It guarantees that you won’t get into trouble when you try to refactor; and let’s face it, refactoring is what modern software engineering is all about.

Let me try to define in very precise terms what elementary compositionality is. First, let’s define the language of a rule to be the set of all strings that it matches exactly. That is, $a ε Lang(rule) iff $a ~~ /^<rule>$/. Then, to borrow a term from model theory, two rules are elementarily equivalent if they have exactly the same language. Finally, elementary compositionality is a property of a matching strategy where: if a, a', b, and c are rules, where b is elementarily equivalent to c, and a' is obtained by replacing every occurence of b with c in a, then a is elementarily equivalent to a'.

Stated less formally: if you replace every call to a rule with a call to an elementarily equivalent rule, nothing changes.

Now for the pragmatics. Perl 6 rules are very rich, and we can’t expect elementary compositionality everywhere. For example, the <commit> directive violates it, and it can be violated if you rename your captures (another thing that bugs me: there is no encapsulation in rules, but that is a discussion for another day). But those are predictable somewhat. I mean, you added <commit> precisely to change large-scale matching semantics: that’s what it’s for. Also, people are accessing your $<reslut>, so if you call it your $<mother> instead, people aren’t going to be able to access it. The renaming thing is also fairly refactor-friendly, since you can rename and rebind captures as you wish. However, what is unpredictable (and also extremely hard to remedy) is the following: say you have a rule which matches any IPv4 address:

    token byte { <[0-1]> <digit>**{2} | 2 <[0-4]> <digit> | 25<[0-5]> }
    token ip { <byte> \. <byte> \. <byte> \. <byte> }

You realize what an ugly and unclear way to write byte that is, so you rewrite it:

    token byte { (<digit>**{3}) { fail unless $1 < 256 } }

Ahh, much nicer. Except one problem. The rule:

    rule literal { <float> | <ip> }

has just changed semantics! On the input 123.456.789.012, literal used to match the entire string as an IP, but now it matches 123.456 only as a float (then probably fails due to ratcheting when the grammar doesn’t accept .789 as the next thing).

Now, unfortunately, your little refactor has just disabled an optimization. The early, ugly byte was expressed as a regular expression, so it could be combined with ip the other common prefixes of literal (and sibling rules) into a DFA. The new, pretty byte cannot be compiled that way. I think this is one of the reasons Larry went with the “automatically detect tokens” approach. But I don’t think this is sufficient reason to violate elementary compositionality everywhere (to drive home the point: run this example in reverse, and in trying to optimize your grammar, you have also changed semantics).

The solution I have in mind at the moment, and I admit it is not totally optimal, is to have the user specify what he thinks the tokens are, so that the compiler can really match the longest token, not the longest “token-looking-ish thing”. The reason that is not optimal is because sometimes it is nice to inline syntax into rules without having to go through the token abstraction, and we’re not really sure which ones of those to call tokens. Maybe we could make up a simple heuristic, such as all constant strings are considered tokens (at the rule level, not the transitive closure level), which isn’t too different from the first version of the | / || separation. But the point is that I should be able to take control from Perl’s dwimmery, say “no, you’re wrong”, and ensure that I can refactor without fear from then on.

No monotypes

Perl 6 has a very rich type system, even though it’s not done. As far as my knowledge goes (which in this area is not very far compared to some of the lambdacamels), it makes the fewest assumptions and restrictions of any type system of any other language. I’ll be writing today about a statement that seems obvious, but has some very interesting consequences (much like “the speed of light is constant in all reference frames”). That statement is “types are just sets”.

I used that statement when arguing with Damian about multimethod semantics. The argument was against derivation metrics, in how it is possible to measure with a finite number how much one infinite set differs from another (assuming that they both have the same infinite cardinality). However, in my argument, I was claiming that types have nothing special over sets, otherwise they might have some information in them that allows you to measure distance. So, let’s look at what that means.

For one, you can no longer talk about the “type of” a certain value. When you ask that question, you’re asking “what set does this value belong to?”. And of course there are infinitely many of them. So the best you can do is ask “does this value belong to this set?”1.

(For the following example, assume that Perl 6 is pure-functional: that is, no side-effects allowed. We’ll incorporate those later.)

Consider this function:

    sub foo (::T --> ::T) {...}

In Perl 6, that means that foo is a function that takes a value of type T and returns a value of type T, for any type T. Let’s look inside that definition more mathematically: for every set T, x ∈ T ⇒ foo(x) ∈ T. You might as well not write the body of the function, for the only function that this could possibly be is the identity! (Note that the only function in Haskell with this signature is, in fact, id.)

If we are to write expressive type signatures, it’s pretty clear that we need to steal Haskell’s type classes. A type class is just a set of types, or a set of sets. It itself is a type. For instance, Numeric could be a type class that says that its member types obey closure under addition, multiplication, etc., along with things that Perl can’t prove, like associativity under addition and multiplication. This is why you have to declare type class instances. Then we can declare (pseudosyntax):

    sub double (::T --> ::T | ::T (in) Numeric) {...}

And have it mean something other than the identity. It probably adds the argument to itself and returns it. Since ::T obeys the laws of Numeric you’re allowed to add it to itself and get the same thing back.

How do we define such type classes, then? How about like this:

    role Numeric ::T {
        multi infix:<+> (::T, ::T --> ::T) {...}

(Incidentally, that implies that you might define normal classes like so:

    class Dog $fido {
        method bark () { say "$fido.name() says 'woof'" }
        method name () { "Fido" }

Which is an interesting, er, “solution” to the $self problem.)

And of course, this implies that you can probably have type-class classes, whatever those are.

1You might think about asking “what is the smallest set that this value belongs to?”, but then you’d quickly realize that the answer you’d get would be the singleton set of that value :-).

More Multimethod Madness

Damian and I have been intelligently screaming at each other on perl6-language, finally really fleshing out the Manhattan distance MMD debate. It has settled down recently, after I discovered that Damian had been arguing against something that was almost, but not quite, entirely unlike my algorithm. I looked through the message history and realized that I had been arguing this algorithm this whole time, and I never said what it was! The most visible place I put it was in the pugs repository at the end of a document.

Anyway, I implemented my algorithm in a Perl 5 module, Class::Multimethods::Pure (the algorithm is also described there). The only current problem is that it is turtle slow (or whatever the expression is). I’ve never been very good or interested in writing locally fast code: I prefer abstraction to raw speed. However, if I’m seriously proposing this algorithm for Perl 6, I’d better have a good algorithm for dispatching in sublinear time. And algorithm design happens to be one of the things I am interested in.

A multimethod is composed of a set of multimethod variants, which are essentially lists of parameters together with the corresponding code. Currently, when the module compiles, it sorts the set of variants into an ordering of singular sets. That is, I go along the list of such sets, and for each set:

  • If it has no elements which match the input parameters, I move on to the next set (or die “no method found” if there are no more sets).
  • If it has exactly one matching set, I succeed and call that variant.
  • If it has more than one element, I die with an ambiguity error.

So that means that whenever a variant a is more specific (see the docs above for a precise description of what that means) than a variant b, b is necessarily in a later set than variant a. So the most specific variants are way up front. And that means that the more generic method you’re going to dispatch, the slower the algorithm gets. That is probably unavoidable. Here’s the problem: By the time you get to the last set, you’ve asked each “does” many too many times. Keep in mind that such questions can involve a subtype condition, which can (but shouldn’t) involve heavy computation.

The approach that I’m working out now is to build a DFA where the states are questions about “does” relationships and the transitions are “true” or “false”. I want to construct a DFA that asks as few questions as possible in order to determine whether there is a unique matching variant. The way I see this as a win is that if you ask the question “Does the first argument do C?”, then if so and C is a subset of A, you already know the answer to “Does the first argument do A?”. Likewise, if you ask if the first argument does A and it’s false, then you already know that it doesn’t do C.

That’s about as much as I know for sure. I’ve tried a few times to find an method of constructing these DFAs with no avail. I’ve tried having each state be a set of candidate methods, and each question narrow the state set until you have a dominated set (where there is one method in the set that is more specific than all the others). But consider this example:

    B does A
    variant 1: foo(A, A)
    variant 2: foo(A, B)
    variant 3: foo(B, A)
    variant 4: foo(B, B)

If you ask “does argument 1 do B” then you get the set 234. From the set 234, if you ask “does argument 2 do B”, you get the set 234. Obviously variant 4 is correct, but a “set of viable variants” approach doesn’t keep enough information to to tell you that.

If anybody has any ideas for this, I’d love to hear them.

Question(s) for Larry

Here are some questions for Larry before so I can ask when he wakes up:

  • Why does BUILD bind named-style parameters at the end? If it bound at the beginning, I could do consistency/sanity checking on them within the body of BUILD. At the moment, I need to specify them in the signature of BUILD to do that, which is redundant.

That’s all for today (I may continue adding to this list).

Perl 6 Progress

Enter my very first purely Perl 6 post. I’ve been working on the PIL implementation in Perl 6. PIL is an intermediate, retargetable semantics tree that represents the basic calculus of Perl 6. We compile Perl 6 into PIL, and then we can target PIL to whatver. PIL is very simple: only one page of Haskell, three pages of Perl 6.

I wrote up the Haskell PIL spec in Perl 6 and now I am writing a Perl 5 backend for PIL. The PIL tree I will hand generate for now, and it should be pretty easy to translate Pugs’s PIL tree to mine (they’re exactly equivalent barring language barrier structures), so that Pugs can be my parser. It’s going pretty well (and very quickly), but I have to implement all Perl 6 operations as subs in Perl 5. It is unlikely that I will be able to convert to “canonical Perl 5”. But that’s okay, as long as I can run most of[1] Perl 6 in Perl 5, and can use what I generate seemlessly from Perl 5.

Meanwhile, we decided that Perl 6 is back to zero-deref instead of infinite-deref, just like Perl 5. This broke half of Pugs, so I’m “assisting” Autrijus in finding all the bugs it created (by writing correct code). The extremely large test suite is not working very well here, because many tests assumed infinite deref.

See my progress here: Perl-Compiler.

Logic.pm preview

Due to a request from several people, I am releasing a not-yet-CPAN-ready module called Logic.pm. The development of the idea to write this module is long and boring. But it’s my blog, not yours, so I’ll talk about it.

Many of the problems that I have been solving recently in programming have involved backtracking: rule matching, a simple-minded expert system, pattern matching in argument lists (the first and third are for Perl 6). So I decided to write a backtracking engine module that would help me out. Since I couldn’t find a good namespace to put it in, it was just going to be called Backtracker. I began that on saturday at about 1:00 pm, and finished at about 2:00 pm. You see, the backtracking algorithm is very simple in concept, and a module that implements it doesn’t buy you much. Essentially all a backtracker is is an object interface, plus two combinators: and and or.

Unsatisfied with my result, I decided that I was going to include a proof-of-concept implementation of my Argument Patterns proposal. In my prior art research, I downloaded gprolog, and started playing around. When I saw:

  | ?- append(X, Y, [1,2,3,4]).

  X = []
  Y = [1,2,3,4]

  X = [1]
  Y = [2,3,4]

  X = [1,2]
  Y = [3,4]

  X = [1,2,3]
  Y = [4]

  X = [1,2,3,4]
  Y = []

I was totally impressed. I decided that I must implement prolog unification in my engine. One hour and 135 lines of code later, I had finished Logic::Data::Unify. Wow. It’s amazing how simple and elegant the extremely powerful unification algorithm is.

Anyway, the status of the module: the logic subsystem works pretty well. It’s low on primitives, but those are pretty easy to write. There’s no syntax; you write everything out in essentially syntax-tree form (see examples/append.pl). There’s no documentation, but there are tests (and more than the one-test-for-every-ten-lines-of-code ratio that I like to maintain). The design is solid, but needs a little refactoring, as I’m seeing some limitations as far as custom backtracking control.

Here’s where I’m going: I have to implement some sort of global statefulness so you don’t have to pass that pesky $state object around. Then I’ll try to make some intuitive operator syntax based on overload (but that is a large challenge given the available operators). Maybe I’ll punt and make it look like lisp with the head moved out front. Then I plan to implement junctions using Logic, and finally multimethod dispatch. Then it’s CPAN time.

Two modules in One day

I just wrote and uploaded Perl6::Attributes to CPAN. It came out of frustration for code like this:

    sub populate {
        my ($self, $n) = @_;
        for (1..$n) {
            push @{$self->{organisms}}, Organism->new(rand($self->{width}), rand($self->{height}));

Which I can now write like this:

    sub populate {
        my ($self, $n) = @_;
        for (1..$n) {
            push @.organisms, Organism->new(rand($.width), rand($.height));

Much nicer, no?