Semantic Design

Here’s a post I’ve been wanting to write for a while. This is an exposition of a unique, powerful, and (retrospectively) obvious software design approach. I owe the whole of the Haskell language for helping me toward this philosophy, but it is much broader in scope than Haskell. In particular I owe Conal Elliott for being explicit about it with the Anygma team earlier this month: spelling out a reason I see beauty in Haskell and in FRP, and laying out a technique for designing systems with that sort of beauty.

This is not an alternative to e.g. waterfall or agile. I see it as orthogonal to those, and more technical. I’d say it’s an alternative to Design Patterns. However, I ask readers to approach it as something new to be incorporated rather than a replacement for or attack on Design Patterns.

I’ll just start with the big idea: make the types and objects in your program mean something. Usually there is some rhyme and reason for creating classes the way we do—e.g. a list is not just a convenient grouping of procedures; it conceptually represents a list. Semantic Design takes that to the extreme. We first decide exactly what a list is, and then we make sure that every operation we expose is consistent with that idea.

To demonstrate, I’ll use the example that Conal used when he presented this to us (in seminar form): images. Here’s an exercise for you: define “image” precisely. What is the essence of an image? I’m talking about the concept itself, not the way you would represent it in memory (but if you actually think of the concept of an image as a 2D array of pixels, that’s valid).

Here is a small selection of the examples we came up with in the “seminar”:

  1. A rectangular array of colors (for a suitable definition of color);
  2. A function taking a pair of real numbers and returning a possibly transparent color;
  3. A list of colored geometric primitives (polygons, lines, circles), where objects later in the list are shown on top of objects earlier;
  4. Any object with a draw method (our formalization of this very software-oriented concept was a function from screens to screens).

Hopefully these examples hinted at the idea that these are conceptual or mathematical definitions, not software ones. The only reason I say “mathematical” is because I view math mostly as a tool for talking with precision; I don’t see defining your concepts in terms of classical real numbers, vector spaces, and functions as necessary, just convenient.

A bit of jargon: one of these concepts is called a semantic domain (of Image). The semantic domain is the conceptual analog to the software abstraction we’re creating.

With each of these semantic domains, we can start to imagine what kinds of operations we’d like to perform on them. In considering these operations, another big idea enters the picture: The semantic domain should be composable. This is the hardest of the axioms for me to explain. Roughly it means we should be able to build more complex objects from simpler ones. This gives us an exponential increase in the expressiveness of our system, so we don’t have to implement much to get a lot of power.

A straightforward composition operation on images is to put one on top of another. Imagine what that would look like for each of the semantic domains:

  1. The resulting image at coordinate (x,y) is an appropriate blend of the bottom image at (x,y) and the top image at (x,y).
  2. Same as above, except x and y are real numbers.
  3. The resulting image is the bottom image concatenated with the top image.
  4. The resulting image’s draw() method is { bottom.draw(); top.draw(); }.

We would also like translation, because that together with composition gives us a fair amount of power (we can now composite images together in any layout). Here are definition sketches for translation in each of the semantic domains (where (tx,ty) is the amount we wish to translate):

  1. The pixel at (x,y) in the resulting image is a blend of the pixels around (x-tx,y-ty) (if we can translate by non-integral amounts) if it is in the array’s range. If not, it’s transparent.
  2. The resulting image at coordinate (x,y) is the original at coordinate (x-tx, y-ty).
  3. Translate each of the primitives in the list by (tx,ty).
  4. Unclear how to define translation in this domain!
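Domain no. 2 is concrete enough to transcribe directly. Here is a minimal Haskell sketch of it; the names (Point, Color, Image, over, translate) are mine for illustration, not from any real library, and I assume premultiplied alpha so that blending is one linear formula per channel:

```haskell
-- Semantic domain no. 2: an image *is* a function from a pair of
-- reals (approximated here by Doubles) to a possibly transparent color.
type Point = (Double, Double)

-- Premultiplied alpha (an assumption of this sketch) keeps "over"
-- compositing down to a single linear blend per channel.
data Color = Color { red, green, blue, alpha :: Double }
  deriving (Eq, Show)

type Image = Point -> Color

-- Composition: the result at a point blends the top image over the
-- bottom image at that same point.
over :: Image -> Image -> Image
over top bottom p = blend (top p) (bottom p)
  where
    blend (Color r g b a) (Color r' g' b' a') =
      Color (r + r' * (1 - a))
            (g + g' * (1 - a))
            (b + b' * (1 - a))
            (a + a' * (1 - a))

-- Translation: the result at (x,y) is the original at (x-tx, y-ty).
translate :: (Double, Double) -> Image -> Image
translate (tx, ty) img (x, y) = img (x - tx, y - ty)
```

Note how each definition is a near-verbatim transcription of its semantic description; there is almost nothing to verify beyond reading it.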

Let’s stop analyzing operations for now. Usually I just do this quickly in my head, nothing formal (which is why I didn’t write it in formal mathematical language). But it seems like we have eliminated model no. 4 just by looking at the operations we want. Translation doesn’t mean anything when an image is defined by its draw method. This is an interesting development, since it would typically not be considered bad style to start writing an image library thusly:

abstract class Image {
    public void Draw();
}

(Or an interface—six of one, half dozen of the other)

To carry on with no. 4, we would have to expand our semantic domain. Maybe no. 4 would become “any object with a draw method and a translate method”. But this is obviously getting out of hand without getting us anywhere semantically. Why is it important to get somewhere semantically? This gets us to our last axiom: The semantic domain should be as simple as possible.

The motivation for this should be obvious. The simpler the semantic domain, the easier it is to reason about the implementation, both from the outside and from the inside. From the outside (the user of a library, for example), you can predict the behavior of everything you’re doing since the semantic domain is easy to understand, and we coded our library to correspond exactly to the domain. From the inside, it is easy to verify that our implementations match the semantic definitions, since they are simple. I felt like I went around in a circle in that paragraph, but I hope this is obvious.

This metric excludes no. 4 from our list of candidates, since the semantic domain gets more and more complicated with each new operation we want. Further, I argue it excludes no. 1. I used the informality “appropriate blend”, and while practitioners will know what I mean, the fact that it was too inconvenient to specify exactly is a red flag. On the other hand, nos. 2 and 3 both have a simple, exact specification.

Investigating operations further, it becomes clear that no. 2 is simpler than no. 3. Scaling a function by (sx,sy) is simple: (x,y) → (x/sx,y/sy), whereas scaling a primitive needs a switch on the primitive type, and consideration of its position and size. In most cases we could have intuited this just by looking at the domains themselves and not the operations, since the domain of no. 2 is simpler. No. 3 has to talk about primitives, and a precise definition would have to list the properties of each primitive, whereas no. 2 is just a function.
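That one-line scaling formula transcribes just as directly. A self-contained sketch, with Image polymorphic in the value type so we needn’t fix a color representation (all names here are illustrative):

```haskell
-- Scaling domain no. 2 by (sx, sy): the result at (x, y) is the
-- original at (x/sx, y/sy).
type Point = (Double, Double)
type Image c = Point -> c   -- polymorphic in the "color" type

scale :: (Double, Double) -> Image c -> Image c
scale (sx, sy) img (x, y) = img (x / sx, y / sy)
```

One line, versus a case analysis per primitive type for domain no. 3.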

I’m not saying no. 2 is the best, this is just the kind of reasoning we do in Semantic Design. There could be a better domain for images, but I think no. 2 is the best of these four. We must be creative in coming up with these, and we must take care that they are expressive enough for our requirements.

Once a semantic domain is settled on, we can implement. The implementation’s representation may be very similar to the semantic domain, or it may be very different. Sometimes we have to play all sorts of tricks to get an efficient implementation. But the point is that whatever representation we choose, and however we implement the operations on it, those operations must all have precise, meaningful definitions in the semantic domain. For example, if we choose no. 2, then it is meaningless to talk about an operation that returns the number of pixels in the image, because our domain has no idea what a pixel is. That doesn’t stop our representation from being a rectangular array (but other things, like supporting accurate arbitrary zooming, do).

But it is reasonable not to expose the full power of the semantic domain through the code interface. You can imagine that if this image library is for drawing on hardware using OpenGL, then supporting an arbitrary function from reals to colors may be a speed nightmare, so we just won’t expose that. The operations we expose simply need be a subset of the operations expressible on the domain.

In summary, my interpretation is: Semantic Design is about creating abstractions that don’t leak. It gives me confidence that the approach can be taken to the very extreme, where you model your semantic domain in a proof checker and prove that your code matches it. In typical software design we cannot begin to prove code is correct, because we haven’t any idea what “correct” means! If the abstractions leak, then your implementation has a bug—a place where it does not correspond to the semantic domain. Thus, when done correctly, a semantically designed abstraction cannot leak.

There’s still plenty to say on the subject, but this post is already too long. This was an informal introduction, and most of this stuff can be formalized for the more mathematically-minded. In addition, there is another beautiful closely related concept (also due to Conal) called “typeclass morphisms”, which at the moment have a very Haskell-centric interpretation. But I think I’m close to a more general meaning for those (probably will end up being something from category theory).

Remember the principles: objects and types mean something, the semantic domain should be composable, and the semantic domain (and operations) should be as simple as possible.


20 thoughts on “Semantic Design”

  1. nice post Luke!

    If I look back over my recent years of software that I’ve written, the subsystems/frameworks that have been the cleanest and most successful match this concept of “semantic design”. It’s nice to have a name for it. I guess in the Haskell world, the product of a good semantic design process will often end up being a combinator library?!

  2. Nice post, but I’d like to mention that you cheated in your example argument: (2) has the same drawback as (1) (“an appropriate blend”), yet only for (1) do you consider it a show-stopper.

  3. maltem: To some extent, yes. I hand-waved over the entire discussion of color for the sake of brevity. An appropriate blend could mean a lot of things. For example, it is quite simple if our colors are booleans (representing bitmaps); it’s also pretty simple if we have a single bit of transparency. It’s more complex if we use an alpha channel. I just didn’t want to talk about all these choices.

    I can still use it to argue against no. 1, since whatever that complexity is, no. 1 has it as well, on top of the spatial blending complexity it already has. But it does make no. 2 a bit weaker.

    Indeed, the model we settled on was more general than no. 2, and quite a bit “cleaner”, IMO. I don’t know if it’s simpler. We separated the concepts of “image” and “color”: an image is a field of values over R2, and what would be considered a typical image is an Image Color. Then the blending complexity is factored out into a parameterization, such as:

    over :: (a -> a -> a) -> Image a -> Image a -> Image a
    

    Or alternatively:

    instance Monoid a => Monoid (Image a)
    instance Monoid Color
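
    A sketch of what the Image instance might look like, lifted pointwise (a guess at the intent, not the Anygma code; Image is a newtype so the instances can be written, and modern GHC factors Monoid through Semigroup):

```haskell
-- Pointwise lifting: an image over any monoid is itself a monoid,
-- with "over"-style layering as the combining operation.
newtype Image a = Image { sample :: (Double, Double) -> a }

instance Semigroup a => Semigroup (Image a) where
  Image f <> Image g = Image (\p -> f p <> g p)

instance Monoid a => Monoid (Image a) where
  mempty = Image (const mempty)
```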
    
  4. But an image is a replaying of previously experienced visual qualia inside my brain that makes me feel a certain way. But an image is a binary file in a format acceptable to my document preparation system (for some operations, and not others). But an image is a textual phrase evoking a concept. But an image is a hexagonal array of samples of light intensities near several different wavelengths. But an image is a certain dispersal of amorphous grains of silver in a thin layer of gelatine. But an image is…all these things and more.

    I’ve grown to become very nervous of programmers who want to or (worse yet) think they have captured the “essence” of anything in their code.

    Who is going to work with these images and what do they want to achieve? For me, the “semantics” of the domain of a program is in the change of behaviour of the users subsequent to them using the program.

    In many domains (or human life where programs are useful) there simply isn’t any single, consistent, universally applicable, precisely describable definition of the terms in the domain. So, any appeal to such—particularly any based on the “precision” of mathematics—in the name of “correctness” is wishful thinking.

  5. keith — perhaps. Your concerns are somewhat philosophical in nature, so to bring it back to the real world:

    I didn’t spell this out very well, but certainly “image” is not all those things you wrote. What we needed from image: solid colors, cropping, translation, zooming, …, those are the pieces of image we wish to capture. I might go so far as to say it’s not about capturing “image” at all, but about capturing all those operations in a simple, precise model.

    The ultimate goal of Semantic Design is to make our software easy to reason about and easy to manipulate. FRP talks about an easy way to program and manipulate real-time software; it doesn’t philosophize about the nature of time itself.

    In terms of the domains where there is no precisely describable definition of the terms in the domain, I would introduce some new terms which do have precise definitions and do SD on those, then build the rest of the program as a (hopefully small) translation from the imprecise terms to the precise ones.

    But eventually, all software is about precision. The computer does not let you wave your hands and say “you know what I mean”. So as long as we’re being precise, we might as well be simple, consistent, and composable while we’re at it.

    The other extreme—and this is not a uniform bash on all other techniques, it’s just the opposite end of the spectrum (which is unfortunately all too common)—is a purely operational description of your program, where in general the only way to understand what some code does is to understand all the code before it in execution order.

    In summary, I consider SD more about keeping your program internally consistent and easy to reason about, as opposed to making a mapping to the real world. In fact, the whole nature of software abstraction is to abstract away the real world, so we don’t have to think about things like the discreteness of colors and computer screens; so we can pretend the world is simpler than it is.

  6. What we needed from image: solid colors, cropping, translation, zooming, …, [...] I might go so far as to say it’s not about capturing “image” at all, but about capturing all those operations in a simple, precise model.

    Exactly. What “image” means in your context is exactly the subject of what your users want to do—and not what “an image” “is”. So you build some software to enable them to do that.

    As far as making the types and objects in your program mean something, that’s a very fine principle. The question is to whom—you the programmer, or the user? The trouble is that users (in my experience) do tend think about their domains in operational terms. They are after all only paying us so that they can get their stuff done. That emphatically does not mean that we should code in ways only understandable in operational terms.

    The operational extreme is problematic. I find the other extreme problematic too, and the problem is that this precision which programmers so want for their “internal consistency” leaks out into the world of the user, where it doesn’t have much of a place.

  7. Ah, as usual I think we’re agreeing more than we thought. But these are good points.

    The question is to whom—you the programmer, or the user?

    While I’m investigating how far I can take this in other areas, for this post I am firmly talking about me, the programmer. Or if I’m designing a library, the programmer and the library user. Whether the end user sees it is a question of the product you’re making, not what design strategy you use. But semantic consistency has the most value embedded in expressive, Turing-complete systems, where there’s already more than enough complexity to worry about.

    If the consistency is leaking out to the user’s world, that is an error in the design. I didn’t say it explicitly, but while simplicity and consistency are very important, they don’t mean anything if the domain isn’t expressive enough to express what you need it to. For example, domain no. 3 is not expressive enough for a general-purpose image library, since we cannot represent a typical raster image in any sane way (maybe a huge list of squares, but… yuck).

    Like Design Patterns, this is a design technique, not a design dogma. There now and forever will be glue code, serving as the liaison between two conflicting world models. SD code is easy for programs to work with—it’s easy to verify, easy to rewrite, easy to extend, easy to glue. (To be fair, it’s hard to tweak, if the tweak is not consistent with the domain. Tweaks will typically look like extensions rather than modifications.)

    Depending on how closely the end product corresponds to the domain, there may be a lot of glue between them. But that glue can be SD’d too, building more and more complex domains by composing simple ones.

    Hmm, I considered an SQL database query engine. SQL is not very simple or consistent, so it seemed like a good candidate for something that may not admit this approach. The way I would tackle that at first glance would be to gradually simplify SQL to a simple relational model (functions on sets of tuples), through several intermediate models, each removing a little complexity. First remove the operational aspects, separating queries which do things from ones which ask things. Then straightforwardly refine the ask queries to a simple relational model, and define queries which do things in terms of ones that ask (i.e. a query which does something is simply one that asks something and returns the updated set of relations). Then all the important bits, such as the optimizer, can be implemented on the simple relational model. Who knows if that would work; it’s just my first sketch (and I don’t know all that much about SQL to boot).
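
    As a first approximation of that “functions on sets of tuples” model, a toy sketch; Relation, select, project, and join are my names, and this ignores everything SQL-specific (nulls, ordering, aggregation):

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- A tuple associates column names with values; a relation is a set
-- of such tuples. Purely a toy model, strings standing in for values.
type Tuple    = [(String, String)]
type Relation = Set Tuple

-- SELECT ... WHERE p
select :: (Tuple -> Bool) -> Relation -> Relation
select = Set.filter

-- SELECT col1, col2, ...
project :: [String] -> Relation -> Relation
project cols = Set.map (filter ((`elem` cols) . fst))

-- Natural join: pair up tuples that agree on their shared columns.
join :: Relation -> Relation -> Relation
join r s = Set.fromList
  [ t ++ filter (`notElem` t) u
  | t <- Set.toList r
  , u <- Set.toList s
  , and [ v == v' | (c, v) <- t, (c', v') <- u, c == c' ] ]
```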

    Okay, enough rambling…

  8. Great post! I love the concept of abstraction leak, a semantic analog to a space leak.

    I’ve got some code for reading images from ppm, using bilinear interpolation, that I hope to upload in a day or so. Hope you find it useful. R2 -> color is a great semantic design for images.

  9. Whether the end user sees it is a question of the product you’re making, not what design strategy you use.

    It would be nice if the design strategies that we chose could be cleanly decoupled from the user’s experience, but I’m not convinced that they are, generally.

    For example, when I teach design to Java programmers I try to wean them off the set/get pattern. I do this for several reasons, but one is to try to eliminate mutable state (which we know to be a source of much pain). This causes a large proportion of mainstream OO programmers to come to a complete stop: they can’t even begin to imagine how they could build systems without setters and getters. It often turns out that this is because they have a very strongly ingrained idea of an “application” as a grids-and-buttons interface onto CRUD queries onto tables. And how could that possibly work without poking values into objects? Here we see HCI, programming, design and modelling concepts all mashed up together. I suspect that this kind of phenomenon is a lot more common than a lot of technologists care to think.

    Meanwhile:

    for this post I am firmly talking about me, the programmer. Or if I’m designing a library, the programmer and the library user[...]

    And yet you are putting all this emphasis on the domain. Doesn’t the domain belong to the user?

    For example:

    domain no. 3 is not expressive enough for a general-purpose image library, since we cannot represent a typical raster image in any sane way

    And why would you want to? If the users of your application are primarily interested in structured graphics output to SVG, then it’s perhaps not a problem that you can’t represent rasters well. Are they or aren’t they? Did you ask? It’s not clear to me that a practitioner of Semantic Design would believe that they ever had a reason to bother themselves with the grubby details of what any actual user would ever actually do. And that’s why I’m nervous to see the idea promoted.

    When programmers start talking about building “general purpose” this, that and the other, it too often turns out that what they mean is “I guessed a bunch of things that of course will be useful some time in the future, which I could do all by myself because I am so clever.” What guidance would you give to avoid that syndrome? Semantic Design seems as if would play right into the hands of such a project-wrecker. The next thing you know they’d be building a “framework”…

  10. The example definitions given are not at all semantically complete: there are lots of other types of images, such as non-planar, non-rectangular, non-color (infrared or HSL), and other-dimensional. Using definitions 1-4, it seems like they cover just “a 2-dimensional matrix of sets containing color values and an alpha channel, or an ordered list of 2-dimensional geometric primitives with a uniform color value and an alpha channel located in the same plane.” This is pretty far from all types of images.

  11. Hello,

    I got to this post from the domain-driven design group on Yahoo (little link to here). This is interesting, as the poster mentioned. I didn’t read all of the posts, but it seems the context that semantic design applies to is somewhat troublesome for some (such as Keith). This is where domain-driven design is somewhat better defined (the name, that is). It concerns the specific domain you are working in. So the image would be a 2D image for a particular domain and a replaying of this-that-or-the-other in another domain. The domain dictates the context.

    Just my 2c worth.

    Regards,
    Eben

  12. I think this is the same idea as Behaviour Driven Development. I also think that semantic design is implied by the original thesis on the refactoring browser.
    In that thesis they say that a refactoring is a change in code that does not change the meaning of the program.
    Thus, unit tests define meaning in a program at some level.
    At the same time, this has strong connections to domain specific languages, etc

  13. Thus, when done correctly, a semantically designed abstraction cannot leak.

    Often a leaking abstraction refers to performance characteristics of the implementation showing through, and is not entirely avoidable. An often-recommended solution to this is to have two (semi-)orthogonal interfaces, one for normal use and another for tuning performance.

    For example, domain no. 3 is not expressive enough for a general-purpose image library, since we cannot represent a typical raster image in any sane way (maybe a huge list of squares, but… yuck).

    You could have a pixel grid as a geometric element, conceptually consisting of a grid of unicolored squares. This is common enough in vector image formats.

    The “function from real pairs to colors” approach also has some problems as a description of the domain. First, you don’t really mean real numbers, unless you have something more powerful than a Turing machine. Rational numbers, floating point numbers or fixed point numbers would be reasonable choices. You could also conceivably use computable reals, but as even equality comparison is undecidable for them, they are not very practical.

    Less pedantically(?), you can’t compare two images for equality, or e.g. answer whether a polygonal area is empty (well, for limited-precision numbers you can as they are finitely enumerable, but that’s not very practical for the usual suspects such as 64- or even 32-bit floats or fixnums). You can approximate these things by e.g. random sampling (possibly with a deterministic seed or quasirandom numbers), but while that’s an interesting approach and likely to be useful for many purposes, it’s not entirely satisfactory in general. Making the equality comparison work deterministically and with sufficient efficiency and accuracy so that e.g. affine transforms preserve equality may not be possible, for instance.

    A draw method draws on something, i.e. a graphics context. You translate the draw method by translating the graphics context.

    Indeed. Thinking of a draw method as a mapping from a graphics context and an image to a list of geometric primitives, you can transform it to representation 3 without losing anything except possibly efficient encoding. An example of this happening would be transforming a PostScript file into SVG form or fonts into polygons. The draw method might have other parameters though, so it would really be a function from parameters to drawing functions, and you lose the parametrization when doing the conversion. With fonts, this means that you lose hinting.

  14. Rauli: re reals: I disagree, at least in principle. I will end up agreeing, but this is a good way to communicate some points which may not have been clear.

    I can compare equality on two functions f,g : R2 → Color. If f(x,y) = g(x,y) for every x,y, then they’re equal, if not, then they’re not. The point here is that it’s not a computer program doing the comparing. f and g are not programmatic objects, they are mathematical objects.

    In fact we could use no. 2 as our semantic domain and no. 3 as our representation simultaneously (with positions being determined by Doubles). I’ll bet that there is some clever algorithm that can compare lists of primitives for rendering equality. And in that case, we have correctly modeled equality in no. 2.

    The interface to our semantic domain need not expose the full expressiveness of the domain. Our interface need only be consistent with the semantic domain.

    But you are also right in a way. If we expose this function or any like it:

     mkImage :: (Point -> Color) -> Image
    

    Then we can no longer compare equality. Such functions are tempting to expose with such an expressive domain. But that really is a choice, and we must understand what the consequences are of exposing such a powerful function.

    But the reason the semantic domain does not use Doubles is because the semantic domain of Double is very complicated. Rationals may be a good choice, actually, if you don’t want to do things like take integrals (since, if I remember my math correctly, integration is not meaningful on functions only defined on the rationals).

    But you are making me uneasy about using reals for the semantic domain when doubles are used in the representation, because it is leaky.

  15. If the formalization of the draw method is a function from screens to screens, then it is equivalent to number 2, given a sufficiently rich algebra of screens (as you have for reals and colors).

  16. Very nice discussion. I must say, this is the same motivation for the C++ Concepts extension to be added to the next version of the standard. Those are actually built into the language and help with the generic programming approach gaining ground in the C++ community.

    What I’m trying to say really is that I don’t think this is new, but it has a different name in other circles. :)

  17. Hi Luke. The intuitions and general design ideas that you expressed are excellent, but you need to be careful about the details. In particular, (1) you need to be sure that your model (a term that I prefer over “semantic domain”) is rich enough to capture everything of interest, and (2) you need to be willing to accept that your implementation, often because of real-world approximations, may not be completely faithful to the model. I have written a couple of papers that address these issues, for example: (a) Polymorphic Temporal Media, and (b) Functional Reactive Programming from First Principles, both of which can be found here:

    http://haskell.org/yale/publications.html

    I have a more recent version of the first paper if you are interested.

    I hope this is helpful.

    -Paul Hudak

  18. Paul: Your second point is in the process of making itself clear to me at the moment. As my experience increases (especially with Haskell), I’m starting to realize the power of more rigorous mathematical reasoning in programming. So my Inner Mathematician says, “yes, I am willing to accept that my implementation may not be completely faithful to the model, if only it is possible to quantify by how much it is not faithful!”

    That is, the model says I can make a semantically nil refactor, but in reality it may introduce an error. I want to be able to quantify what kind and how much error there will be. This seems like a generalization of numerical analysis, unfortunately one of the most nit-picky and unenjoyable subjects for me… :-/

    I have successfully evangelized Haskell to two of my friends now (yay!), and I was talking about how the true joy of programming in Haskell comes from the certainty of reasoning. One of them showed me a sample implementation of an exercise he was doing:

    my_map f [] = []
    my_map f lst =
       if length lst > 0 then
           my_map f (tail lst)
       else
           f (head list)
    

    Despite this being nonsense, I can look at this immediately and say you can drop the else clause:

    my_map f [] = []
    my_map f lst = my_map f (tail lst)
    

    You can drop the else clause and it’s exactly equivalent (hmm… come to think of it it’s not—slightly more strict—pretend it is for the sake of this argument :-). In the languages the two of us came from, there are always these little caveats: “it’s exactly the same as long as lst isn’t a reference”, “it’s exactly the same as long as lst hasn’t overloaded the length method to be tricksy”, etc. The joy of Haskell lies in the absence of these little pathological cases, such that you can be absolutely sure what you’re doing is correct.
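
    (For reference, the definition the exercise was presumably after is the usual structural recursion; my_map is the name from the sample above:)

```haskell
-- Apply f to every element of the list, recursing on its structure.
my_map :: (a -> b) -> [a] -> [b]
my_map _ []       = []
my_map f (x : xs) = f x : my_map f xs
```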

    That’s what makes me uneasy about the implementation not being faithful to the model: that absolute reasoning properties be destroyed. The happy medium may lie in generalized error analysis, but I get a gut impression that the happy medium lifts the semantic domain to be just a little more complicated, so that it models imprecision and maintains a perfect correspondence with the implementation.

    </braindump>
