A lot of software engineering can be described as the art of skillful procrastination. When we’re designing some software, we would like to put off as many decisions as possible as long as possible, while still making progress toward the goal. This procrastination leads to good abstraction, because any code we wrote before we made said decision is independent of that decision, which means the decision can be changed later, leading to “maintainability”.
I would describe the philosophy of the Dana project as: make the right decision or keep thinking. It is a philosophy that only a project without a time limit can afford. This constraint forces reconsideration of the very fundamentals of computer science, since I believe much of our modern infrastructure is built on wrong decisions. I don’t ever expect Dana to be finished; instead, it is a unifying project within which some right decisions may be found.
When Dana becomes something runnable, it will look something like an hourglass: a large body of purely functional code on top, all represented eventually upon an extremely simple core calculus, below which layers of complexity are again added to adapt to existing operating systems and software. The layers above the core are only incidentally related to the layers below it, and I hope by studying whatever the eventual form of the upper layers, we can find ways to rip out the needless complexity below and replace it with something more fitting.
This is all very abstract, and it is hardly clearer in my mind than it is in this picture. However, I’m getting a good sense for what that skinny core is and how it relates to the rest of the system. That is what I describe for the rest of this article.
Despite what we pretend to believe, computer programs are more than their denotational semantics. Hard as we try, there will never be a system in which λx. x and λx. log2(2x) are indistinguishable (over, say, the naturals) due to execution performance. The finiteness of memory and non-instantaneousness of computations are things we will always have to deal with. Using a system which ignores these issues as a fundamental backbone would be a wrong decision.
However, Dana’s mission is to be a purely functional operating system. A key feature of reasoning pure functional programming is that of extensionality: if two functions give the same outputs for the same inputs, they are the same function. So here, λx. x and λx. log2(2x) must be the same function. To throw that away would be to give into just another imprecise, operational notion of computer programs.
At first sight, these two points directly contradict each other. How do we reconcile them?
I rather loosely used the term “same” above. What I mean by that word is that, if two things are the same, then you can safely substitute one for the other in all contexts and not observe a difference. Again there is imprecision in this definition: what do I mean by “observe a difference”? And therein lies our solution.
The core calculus is not something which is executed—not seriously, at least. Instead, it’s a way to define your observations. We come back to our two function “λx. x” and “λx. log2(2x)”—mind the quotes. These are represented in our core calculus not as the functions themselves, but as terms in an abstract syntax tree, in which they are obviously not the same. Call them f and g, respectively. Elsewhere in the core, we might define functions on syntax trees, such as Meaning and Performance. Then we can prove that Meaning(f) = Meaning(g) but Performance(f) ≠ Performance(g). These functions make explicit the varying parameters that cause the two implementations to vary.
This syntax tree might then be passed along to an interpreter, which executes it, respecting the Meaning and Performance functions. This can be proven if the interpreter, too, is implemented on top of the core calculus. But eventually something must leave the system to get to the drivers, and we can’t use the core calculus to prove its correctness anymore. Many methods can be used to convince ourselves of the correctness of the driver, but we can’t do it within Dana—an unfortunate fundamental necessity.
The point is, though, that everything above the core can be proven correct, and it is all coherent: the code we are proving things about is the very same code we’re running, and we don’t have to worry about translation errors. If we can trust the drivers, then we can trust the code.