
LogicGrowsOnTrees

From haskell-cafe

Like many other packages on Hackage, LogicGrowsOnTrees provides an implementation of logic programming using MonadPlus;  in this sense it is nothing new.  What sets it apart is that it has been designed from the beginning to work in a distributed environment, allowing it to be parallelized over large numbers of processors with no shared memory.  The benchmarks I have run (using the N-Queens problem with 17-19 queens) showed essentially perfect speed-up all the way up to 256 cores, and the only reason why this number is not larger is because I haven’t had the opportunity to run tests on a larger cluster. 
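For a feel of the style, here is an N-queens search written against a bare MonadPlus interface. This is my own sketch, not code from the package, but since the package’s search monad is itself a MonadPlus, a function like this should run there unchanged (and in the plain list monad at ghci):

```haskell
import Control.Monad (MonadPlus, foldM, guard, msum)

-- N-queens against any MonadPlus: the result is one column position
-- per row.  A sketch of mine for illustration, not the package's code.
queens :: MonadPlus m => Int -> m [Int]
queens n = foldM place [] [1 .. n]
  where
    place rows _ = do
      col <- msum (map return [1 .. n])  -- nondeterministic choice of column
      guard (safe col rows)
      return (col : rows)
    -- safe when the new queen shares no column or diagonal with any
    -- queen already placed (d is the vertical distance between rows)
    safe col rows =
      and [col /= c && abs (col - c) /= d | (d, c) <- zip [1 ..] rows]

-- In the list monad: length (queens 8 :: [[Int]]) == 92
```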

If the benchmarks are fair, I think this is a big deal.  

I have a couple of projects on the back burner that use combinatorial search.  This is awesome.


Playground Programming

Every now and then, I have a small idea about a development environment feature I’d like to see. At that point, I usually say to myself, “make a prototype to see if I like it and/or show the world by example”, and start putting pressure on myself to make it. But of course, there are so many ideas and so little time, and at the end of the day, instead of producing a prototype, I manage only to produce some guilt.

This time, I’m just going to share my idea without any self-obligation to make it.

I’m working on Chrome’s build system at Google. We are switching the build scripts to a new system that uses an ingenious testing scheme I’ve never seen before (though it seems like the kind of thing that would be well known). For each build script, we have a few test inputs to run it on. The tests run all of our scripts on all of their test inputs, but rather than executing the commands, they simply record the commands that would have been run into “test expectation” files, which we then check into source control.

Checking in these auto-generated files is the ingenious part. Now, when we want to change or refactor anything about the system, we simply make the change, regenerate the expectations, and do a git diff. This will tell us what the effects of our change are. If it’s supposed to be a refactor, then there should be no expectation diffs. If it’s supposed to change something, we can look through the diffs and make sure that it changed exactly what it was supposed to. These expectation files are a form of specification, except they live at the opposite end of the development chain.
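Here is a tiny sketch of the idea in Haskell, with all names and commands made up (this is not Chrome’s actual system): the script interpreter takes a mode, and in record mode it appends the commands it would have run to an expectation file instead of executing them.

```haskell
import System.Process (callCommand)

data Mode = Run | Record FilePath

-- In Run mode, execute the command; in Record mode, append it to the
-- expectation file that gets checked into source control.
step :: Mode -> String -> IO ()
step Run           cmd = callCommand cmd
step (Record path) cmd = appendFile path (cmd ++ "\n")

-- A hypothetical build script, expressed as a list of commands.
buildFoo :: Mode -> IO ()
buildFoo mode =
  mapM_ (step mode)
    [ "gcc -c foo.c -o foo.o"
    , "gcc foo.o -o foo"
    ]

-- Regenerate expectations with: buildFoo (Record "foo.expected")
-- A pure refactor is one where `git diff` over the .expected files is empty.
```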

This fits in nicely with a Haskell development flow that I often use. The way it usually goes: I write a function, get it to compile cleanly, then go to ghci and try it on a few conspicuous inputs. Does it work with an empty list? What about an infinite list? (If the output is also infinite, I trim it to sanity-check.) I give it enough examples that I have pretty high confidence it’s correct. Then I move on, assuming it’s correct, and do the same with the next function.
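A made-up session in that style, poking at a little pairAdjacent function (both the function and the transcript are mine, for illustration):

```haskell
pairAdjacent :: [a] -> [(a, a)]
pairAdjacent xs = zip xs (drop 1 xs)

-- ghci> pairAdjacent ([] :: [Int])    -- does it work on an empty list?
-- []
-- ghci> take 3 (pairAdjacent [1..])   -- an infinite list?  trim the output
-- [(1,2),(2,3),(3,4)]
```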

I really enjoy this way of working. It’s “hands on”.

What if my environment recorded my playground session, so that whenever I changed a function, I could see its impact? It would mark the specific cases that changed, so that I could make sure I’m not regressing. It’s almost the same as unit tests, but with a friendlier workflow and less overhead (reading rather than writing). Maybe a little less intentional and therefore a little more error-prone, but it would be less error-prone than the regression testing strategy I currently use for my small projects (read: none).
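A minimal sketch of what the recorded session might look like as data, with all names hypothetical (a real tool would capture the entries automatically and evaluate expressions with something like the hint library):

```haskell
-- Each playground entry pairs an expression with the output it
-- produced when last blessed.
recorded :: [(String, String)]
recorded =
  [ ("pairAdjacent [1,2,3]",        "[(1,2),(2,3)]")
  , ("take 3 (pairAdjacent [1..])", "[(1,2),(2,3),(3,4)]")
  ]

-- After an edit, re-evaluate every entry and flag the ones whose
-- output changed.  The evaluator is left abstract here.
replay :: (String -> String) -> IO ()
replay eval = mapM_ report recorded
  where
    report (expr, old)
      | new == old = putStrLn ("     ok  " ++ expr)
      | otherwise  = putStrLn ("CHANGED  " ++ expr ++ ": "
                               ++ old ++ " -> " ++ new)
      where new = eval expr
```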

It’s bothersome to me that this is hard to make. It seems like such a simple idea. Maybe it is easy and I’m just missing something.

Algebraic and Analytic Programming

The professor began my undergrad number theory class by drawing a distinction between algebra and analysis, two major themes in mathematics. This distinction has been discussed elsewhere, and seems to be rather slippery (to mathematicians at least, because it evades precise definition).  My professor seemed to approach it from a synesthetic perspective — it’s about the feel of it.  Algebra is rigid, geometric (think polyhedra), perfect.  The results are beautiful, compact, and eternal.  By contrast, analysis is messy and malleable.  Theorems have lots of assumptions which aren’t always satisfied, but analysts use them anyway and hope (and check later) that the assumptions really do hold up.  Perelman’s famous proof of the Poincaré conjecture, as I understand it, is essentially an example of going back and checking analytic assumptions.  Analysis often makes precise, and works with, the notion of “good enough”: two things don’t have to be equal; they only need to converge toward each other with a sufficiently small error term.

I have been thinking about this distinction in the realm of programming.  As a Haskell programmer, I spend most of my time on programming with an algebraic feel.  I like to perfect my modules, making them beautiful and eternal, built up from definitions that are compact and each obviously correct.  I take care with my modules when I first write them, and then rarely touch them again (except to update them with dependency patches that the community helpfully provides).  This is in harmony with the current practice of denotative programming, which strives to give mathematical meaning to programs and thus make them easy to reason about.  This meaning has, so far, always been of an algebraic nature.

What a jolt I felt when I began work at Google.  The programming that happens here feels quite different — much more like the analytic feeling (I presume — I mostly studied algebraic areas of math in school, so I have less experience).  Here the codebase and its dependencies are constantly in motion, gaining requirements, changing direction.  “Good enough” is good enough; we don’t need beautiful, eternal results.  It’s messy, it’s malleable. We use automated tests to keep things within appropriate error bounds — proofs and obviously-correct code would be intractable.  We don’t need perfect abstraction boundaries — we can go dig into a dependency and change its assumptions to fit our needs.

Much of the ideological disagreement within the Haskell community and between nearby communities happens across this line.  Unit tests are not good enough for algebraists; proofs are crazy to an analyst.  QuickCheck strikes a nice balance; it’s fuzzy unit tests for the algebraist.  It gives compact, simple, meaningful specifications for the fuzzy business of testing.  I wonder, can we find a dual middle-ground?  I have never seen an analytic proof of software correctness.  Can we say with mathematical certainty that our software is good enough, and what would such a proof look like?
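For reference, here is a property in the QuickCheck style, the kind of compact, meaningful specification I mean (a standard textbook example, not tied to any particular codebase):

```haskell
import Test.QuickCheck

-- An algebraic law, checked fuzzily on random inputs: reversing a
-- list twice is the identity.
prop_reverseTwice :: [Int] -> Bool
prop_reverseTwice xs = reverse (reverse xs) == xs

-- ghci> quickCheck prop_reverseTwice
-- +++ OK, passed 100 tests.
```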

UPDATE: Here’s a lovely related post, via Lu Zeng: “Algebra vs. Analysis predicts eating corn?”