I recently revamped my graphics-drawingcombinators module to have a handsome denotational semantics. I may write a post going into full detail later, but the executive summary is that *Image a = R ^{2} → (Color, a)*. Due to TCM, I get semantics for Image’s Functor instance from this, and if Color is a monoid I get the semantics for the Applicative and Monoid instances as well. OpenGL combines colors by alpha blending, so I happily defined my monoid instance as alpha blending:

mempty = Color 0 0 0 0 mappend c@(Color _ _ _ a) c' = a*c + (1-a)*c'

It doesn’t take a genius to see that this violates two of the three monoid laws (the only one it satisfies is left unit: `mempty `mappend` x = x`). This is embarrassing. My new rigorous denotational semantics has a pretty awesome flaw.

To make this right, let’s redefine Color not as something which is drawn, but as a transformation on alpha-free colors. I do this because functions make great monoids: composition is always associative and identity is always a unit. But it’s not just any function, it’s an alpha blending function. So we say that *f* is a “Color” if there exists constants *a* and *x* such that *f(c) = a x + (1-a) c*, which I will write as *f = [a; x]*. Here *a* is a scalar and *x* is another alpha-free color like *c*. We would really like it if the composition of two Colors were also a Color. Let’s try:

f(g(c)) = [fa;fx]([ga;gx](c)) = fa fx + (1 - fa) ([ga;gx](c)) = fa fx + (1 - fa) (ga gx + (1 - ga) c) = fa fx + (1 - fa) ga gx + (1 - fa) (1 - ga) c = fa fx + (1 - fa) ga gx + (1 - fa - ga + fa ga) c = (fa fx + (1 - fa) ga gx) + (1 - (fa + ga - fa ga)) c

It looks like the “alpha” of our composition is *(fa + ga – fa ga)*. Let’s call this *a’*. Now we have to get the left side of the addition into the form *a’ r* for some r. Let’s just hack it: multiply and divide^{1} by *a’*.

= a' (fa fx + (1 - fa) ga gx) / a' + (1 - a') c

And then we can read off the answer:

[fa;fx] . [ga;gx] = [a' ; (fa fx + (1 - fa) ga gx) / a'] where a' = fa + ga - fa ga

For the identity, we need:

a x + (1 - a) c = c

Which is satisfied for *a = 0* with *x = * anything, so we can use [0,0].

Because we derived this from composition and identity, the laws *must* hold. The mathematically untrusting may wish to check this.

And there you have it: my new Color monoid which actually satisfies the laws. This is *the* way to compose alpha-blended colors — it reflects what happens if you draw one after the other, blending as you go. This is the math that pre-blends any segment of that sequence.

I should have known that OpenGL’s colors were transformations all along, since the color of an object that you see can depend on the color you used to clear the screen.

^{1} But what if *(fa + ga – fa ga) = 0*? Fortunately, the only place this happens when these variables are between 0 and 1 is fa = ga = 0, which means both f and g are the identity color.

http://www.alvyray.com/Memos/4_comp.pdf

I love this clever argument of using functions to derive a monoid! :-)

> this happens when these variables are between 0 and 1 is fa = ga = 0

Not obvious for me so:

proof:let’s solve x+y-xy=0 with x in [0;1] and y in [0;1].

Let a = x+y-xy = x(1-y) + 1y.

By construction, a is in [x;1] since y is in [0;1].

So x <= a. Thus 0 <= x <= a = 0 so x = 0.

So a = y = 0.

I took advantage of a variation on this in the mid (late?) 90s after tripping over it in a paper somewhere; IDT used the approach in the internals of a video card that was under development at that time. Obviously, it is a lot easier to do now with shaders, since we since evolved away from the fixed pipeline.

The variation (ab)uses the monoid you define here, and works reasonably well when you have comparatively few translucencies in the scene or where they tend to be fairly heavily opaque. It introduces artifacts, but only in the farthest composited polys which have the least contribution to the pixel value anyways:

You can approximate the result of accumulating color from depth-sorted alpha translucent texture fragments by just storing a weighted average distance and pre-blended color for the alpha transparent component rendered so far, and pre- or post- compose new alpha translucent values by adding them based on the comparison of their actual distance. Calculating weighted distance is tricky depending on if you are z- or w- buffering, but overall it works quite well.

While that result isn’t truly monoidal, you get surprisingly few artifacts, and it requires little space in your g-buffer. I haven’t seen that approach used outside of the demo scene, but I’m not exactly active in graphics circles these days, so I wouldn’t know.

An alternative is to use pre-composed alpha, then the constraint on a color f is that f(c) = x + a c.

> f(g(c)) = fx + fa * (gx + ga * c)

> = (fx + fa * gx) + (fa * ga) * c

This gives a simpler formula for composition, especially because it saves a division. An ordinary color [a;x] can be converted to a pre-composed one as simply (a*x,1-a).

Another small advantage is that pre-composed colors allow some cheap over-saturation effects.

You might want to read this: http://home.comcast.net/~tom_forsyth/blog.wiki.html#%5B%5BPremultiplied%20alpha%5D%5D

Thanks for all three references to the premultiplied alpha representation. That is very clever. I don’t think I can use it in graphics-drawingcombinators, because the common case is not color compositing, but rather coloration of OpenGL primitives using eg. glColor4d, which does not take a premultiplied representation. In common usage of the library, the monoid instance is not used at all, but it is there because it defines the semantics of images drawn with OpenGL, so it must agree. But I’ll keep it in mind for future endeavors.

Thanks for the proof, sauf. My “proof” was to do an ImplicitPlot in Mathematica and use my eyes. Nice to have it a bit more rigorous :-)

Should this line “= fa fx + (1 – fa) ga gx + (1 – fa – ga + fa ga)” not have a “c” at the end?

Dougal: Fixed, thanks!

Heh, yes. Premultiplied alpha. Just notice that in your “[fa;fx] . [ga;gx] = [a' ; (fa fx + (1 - fa) ga gx) / a']” formula, fx is always directly multiplied to fa, and gx to ga, so it’s more convenient (and as Twan implies, more efficient) to keep them premultiplied.

I have to say it’s old news to me now, as I came to this same conclusion way back for the following: (http://jcabs-rumblings.com/GDC2001.html). Your derivation is pretty neat, though. Back then I was just mashing formulas together pseudo-randomly until I found with the best one, Darwin-style :-).

You say “It doesn’t take a genius to see that this violates two of the three monoid laws…”

But have you ever thought…maybe it actually *does* take a genius? And…apparently there are several of you around. I feel woefully behind the curve.

(BTW, randomly found your blog and thought I’d try to read it. Your writing is wonderful, although I only comprehend about 25% of it!)

“this violates two of the three monoid laws” – The monoid laws are like the most basic laws of addition. In other words, a monoid is an accumulation operation, like the addition operation on numbers. It defines the operation itself [+], and the unit of the operation [0] (see more here: http://blog.sigfpe.com/2009/01/haskell-monoids-and-their-uses.html). The laws are:

1: 0 + a = 0

2: a + 0 = 0

3: (a + b) + c = a + (b + c)

Those laws must hold for any a, b and c. They state that the unit of the monoid doesn’t add anything when accumulated, and that a series of accumulations can be grouped in any whichever way without affecting the result.

Now, Luke defines a color as [c;a], where c is the color (RGB components or whichever similar representation, but for the sake of exposition, can be just a single value resulting in B&W colors) and a is the alpha value used for accumulating colors such that:

0 = [0;0]

[c0;a0] + [c1;a1] = [c0*a0 + c1*(1-a0);a0*a0 + a1*(1-a0)]

This is alpha-blending composition of colors. It’s arguably the most common in computer graphics. So, the three laws:

1: [0;0] + [c;a] = [0*0 + c*(1-0);0*0 + a*(1-0)] = [c;a]

2: [c;a] + [0;0] = [c*a + 0*(1-a);a*a + 0*(1-a)] = [c*a;a*a] != [c;a]

3: ([c0;a0] + [c1;a1]) + [c2;a2] = [c0*a0 + c1*(1-a0);a0*a0 + a1*(1-a0)] + [c2;a2] = … this gets big quick, but in the end it won’t be the same as [c0;a0] + ([c1;a1] + [c2;a2])

So there you go. Hope this helps. In other words, alpha-blending is not associative, so it can’t be a monoid.

Several people here pointed out that, if we store colors as pre-multiplied by their own alpha [c*a;a] = [p;a], then it becomes associative! Alpha blending can now be defined as:

[p0;a0] + [p1;a1] = [p0 + p1*(1-a0);a0 + a1*(1-a0)]

If you now try it like I did above, you’ll see that the three laws do hold true.

Now, the wise among us might contend that we could have defined alpha blending originally as:

[c0;a0] + [c1;a1] = [(c0*a0 + c1*a1(1-a0)) / (a0 + a1*(1-a0));a0 + a1*(1-a0)]

That’s equivalent to pre-multiplied alpha, but without keeping the color multiplied by its own alpha (we do that during the operation, and then undo it at the end via the division. There’s one corner case: when a0 and a1 are both zero, the color is indeterminate (0/0). That’s actually sort-of reasonable: any color with zero alpha, as far as alpha-blending is concerned, is irrelevant (fully transparent and invisible). So keeping the colors pre-multiplied by their alphas is both a convenience (no corner case anymore) and more efficient (less operations to perform).

One great use of this associative alpha is in applying special effects in a videogame. Normally, you draw the scene, and then lay down the special effects one layer at as time over it, with the front-most effects drawn last. But that’s very costly (in one case, I’ve seen every pixel drawn-over 60+ times with effects layers because of this). So sometimes it’s better to accumulate the special effects in a lower-resolution image (less pixels means better performance, and for special effects you often can’t tell the difference), and then layer them over the scene in one single pass. But that means grouping the alpha-blending differently, which does require the associative property (the third one).

As for why “the monoid laws”… they give some meaning to the monoid operation. They allow you to define functions on monoids that are meaningful because they can rely on the laws to hold. “mconcat :: Monoid a => [a] -> a” is the canonical accumulation function for monoids, for instance, and cannot be defined in general without relying on the laws to hold.