Tag Archives: glop

Glop Refactor: Floaters, Actors, Input, …

I took a long walk last night at 9:00 to ponder some of the design decisions of Glop. I knew something had to be done about the Input subsystem, and some other things needed thinking through as well. Here’s what I came up with. I came back and implemented half of it.

First, there’s a new concept called a “Floater”. It’s just like an Actor, except, well, except nothing. It’s an Actor. I renamed Actors to Floaters to give a new conceptual feel to them. They’re lightweight, and they’re more useful than for just things on the screen. A Floater is, generically, a fire-and-forget object associated with the current game state.

I then rethought the input system using floaters. Instead of having an “input manager” in the current state, which has hooks into all the input subsystems, the current state just has a heap. This heap isn’t specialized; it’s just something that you can put stuff in. And then I split the input into “backend” and “frontend”. There’s only one backend at the moment (it’s possible that there will never be another in the core): Glop::Input::Backend::SDL. The frontends connect to the backend, which stores itself in the heap (or not; that’s up to the implementation, though Backend::SDL does) and forwards events along to them. The frontends translate those events into user callbacks.

What this is essentially doing is decoupling the kernel and the input. Each implementation can do what it needs to do, without having external management. This was a good choice.
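To make the backend/frontend split concrete, here is a minimal sketch in plain Perl. It does not use Glop or SDL at all: the package names Sketch::Backend and Sketch::Keyboard, the heap key input_backend, and the event hash layout are all made up for illustration, and the real Glop classes may look quite different.

```perl
use strict;
use warnings;

# Hypothetical stand-ins; none of these package names are Glop's real classes.
package Sketch::Backend {
    sub new {
        my ($class, $heap) = @_;
        my $self = bless { listeners => [] }, $class;
        $heap->{input_backend} = $self;   # the backend parks itself in the state's heap
        return $self;
    }
    sub attach {
        my ($self, $frontend) = @_;
        push @{ $self->{listeners} }, $frontend;
        return;
    }
    # Forward one raw event (a plain hash here) to every attached frontend.
    sub dispatch {
        my ($self, $event) = @_;
        $_->handle($event) for @{ $self->{listeners} };
        return;
    }
}

package Sketch::Keyboard {
    sub new {
        my ($class, $backend, %callbacks) = @_;
        my $self = bless { callbacks => {%callbacks} }, $class;
        $backend->attach($self);
        return $self;
    }
    # Translate a raw event into the user's callback, ignoring anything else.
    sub handle {
        my ($self, $event) = @_;
        return unless $event->{type} eq 'keydown';
        my $cb = $self->{callbacks}{ $event->{key} } or return;
        $cb->();
        return;
    }
}

package main;

my %heap;                                   # the current state's unspecialized heap
my $backend = Sketch::Backend->new(\%heap);
my $x = 0;
Sketch::Keyboard->new($backend, left => sub { $x = -1 }, right => sub { $x = 1 });

$heap{input_backend}->dispatch({ type => 'keydown', key => 'left' });
print "x = $x\n";
```

Nothing outside the backend knows who is listening, and nothing outside the frontend knows what the user asked for, which is the decoupling described above.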

Okay, Actors are called Floaters, so an Actor is something different now. What is it? It’s a kind of Floater, but it’s smart. Some actors want a geometric body associated with them, some of them want a physical body, some of them want to be able to be clicked on. So Actors can compose roles which are each of these things. If an Actor composes Actor::Role::Body, then it automatically has a body, without any work on the part of the user. If an Actor composes Actor::Role::Geom, then the user has to say what kind of geom, and then it just works.
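One way to picture role composition, sketched in plain Perl with invented names (Sketch::Actor, Sketch::Role::Body, Sketch::Role::Geom, and the compose and has_role methods are assumptions, not the Actor::Role API). The Body role needs no arguments; the Geom role refuses to compose until it is told what kind of geom to make.

```perl
use strict;
use warnings;

# Hypothetical names throughout; this is not Glop's Actor::Role API.
package Sketch::Role::Body {
    # Supplies a default physical body with no input from the user.
    sub compose {
        my ($class, $actor) = @_;
        $actor->{body} = { mass => 1 };
        return;
    }
}

package Sketch::Role::Geom {
    # A geom has no sensible default, so the caller must name its kind.
    sub compose {
        my ($class, $actor, %args) = @_;
        die "Geom role needs a 'kind'\n" unless defined $args{kind};
        $actor->{geom} = { kind => $args{kind} };
        return;
    }
}

package Sketch::Actor {
    sub new {
        my ($class, @specs) = @_;            # each spec is [ RoleClass, %args ]
        my $self = bless { roles => [] }, $class;
        for my $spec (@specs) {
            my ($role, %args) = @$spec;
            $role->compose($self, %args);
            push @{ $self->{roles} }, $role;
        }
        return $self;
    }
    sub has_role {
        my ($self, $role) = @_;
        return scalar grep { $_ eq $role } @{ $self->{roles} };
    }
}

package main;

my $ball = Sketch::Actor->new(
    ['Sketch::Role::Body'],
    ['Sketch::Role::Geom', kind => 'sphere'],
);
print "geom kind: $ball->{geom}{kind}\n";
```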

Transient Channels

Transient Queues have failed. I wrote a simple demo the other day of a little pixar-like lamp guy who moves his joints when you push the arrow keys. I had to make him move them smoothly, lest he explode due to the integration error. I could have used a transient queue to do exactly what I wanted, but I didn’t. Why not? Let’s look at the transformation of some code to using them:

    use Class::Closure;
    sub CLASS {
        my $x = 0;
        Glop->input->Keyboard->register(
            left => sub { $x = -1; }
        );
    }

When I want to smoothly move $x to -1, it turns into:

    use Class::Closure;
    sub CLASS {
        my $x = 0;
        my $queue = Glop::TransientQueue->new;
        Glop->input->Keyboard->register(
            left => sub { $queue->push(Timed Glop::Transient (1, sub { $x -= STEP })) }
        );
        method step => sub { $queue->run; };
    }

That’s not really that much more, is it? Only three lines to type. But one must realize that my vision of Glop is not to reduce typing, it’s to reduce brain strain. If I were worried about the poor programmer’s fingers, I wouldn’t have them typing Glop all over the place (except to advertise for my cult). But three lines are a lot compared to, say, Glop->input->Keyboard->register, which is just “register some callbacks.” In order to use a TransientQueue you have to 1) create a TransientQueue object in your class, 2) push on a transient, and 3) add a step function to call the queue every frame. That’s a lot of conceptual load for a simple task. And it’s a one-time load, but it was enough to encourage me not to use it, and I wrote the damn software.

So, I’m killing them. They are no longer part of Glop. What I’m adding in their place are transient channels. They’re the exact same thing, except that you don’t have to create them or call run on them. They’re integrated into the Glop kernel. Here is the change using channels instead of queues:

    use Class::Closure;
    sub CLASS {
        my $x = 0;
        Glop->input->Keyboard->register(
            left => sub { Timed Glop::Transient (1, sub { $x -= STEP }) }
        );
    }

You hardly have to know what’s going on. They just do their job, and one second from now, they’re done and they’re gone. If you need the queue’s queueing behavior, then you do still have to create an object. But you don’t have to call run on it yourself; that’s done automatically. So even when you need the more sophisticated behavior, one conceptual strain point is gone. But most of the time, it’s all gone, and significantly easier than doing it yourself.
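For a feel of what “integrated into the kernel” could mean, here is a rough stand-alone sketch. The names timed and kernel_step are invented, the callback receiving the frame’s time step as an argument is my assumption (the snippets above use a STEP constant instead), and the real kernel surely does more than this.

```perl
use strict;
use warnings;

# Hypothetical sketch: a kernel-owned list of live transients.
my @channel;

# Register a callback that should run every frame for $duration seconds.
sub timed {
    my ($duration, $code) = @_;
    push @channel, { left => $duration, code => $code };
    return;
}

# What the kernel would do once per frame; $dt is the frame time in seconds.
sub kernel_step {
    my ($dt) = @_;
    for my $t (@channel) {
        my $slice = $dt < $t->{left} ? $dt : $t->{left};   # don't overrun the end
        $t->{code}->($slice);
        $t->{left} -= $slice;
    }
    @channel = grep { $_->{left} > 1e-9 } @channel;   # finished transients vanish
    return;
}

# User code: the key handler only says what should happen, nothing else.
my $x = 0;
my $on_left = sub { timed(1, sub { my ($step) = @_; $x -= $step }) };

$on_left->();
kernel_step(0.25) for 1 .. 6;    # 1.5 simulated seconds at 4 frames per second
printf "x = %.2f, live transients = %d\n", $x, scalar @channel;
```

The user never creates the list and never steps it; both of those jobs belong to whoever owns the main loop.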

Speeding up OpenGL calls

A game with decent graphics (not even “good” by today’s standards) will crank through something on the order of 50,000 OpenGL calls per frame. Maintaining a 30 FPS framerate, then, would require 1,500,000 calls per second. According to my benchmark, on my box, perl is capable of making 2,100,000 do-nothing sub calls per second (regardless of whether it’s to perl or C code). So in a program like that, 71% of the processing power will be spent simply making the OpenGL calls. That leaves 29% for everything else, including whatever processing OpenGL does within its functions on the main CPU, which simply won’t do.
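The arithmetic is easy to redo with your own numbers. This little script recomputes the budget from the figures quoted above and then takes a crude measurement of do-nothing sub calls on whatever machine runs it; the measured rate will of course differ from the 2,100,000 quoted, and the loop itself adds some overhead, so treat it as a ballpark.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my $calls_per_frame = 50_000;
my $fps             = 30;
my $needed          = $calls_per_frame * $fps;   # calls per second the game must make

# The capacity figure quoted in the text; swap in your own measurement.
my $quoted_capacity = 2_100_000;
my $quoted_share    = $needed / $quoted_capacity;
printf "needed: %d calls/s, share of CPU at the quoted rate: %.0f%%\n",
    $needed, 100 * $quoted_share;

# Rough measurement of do-nothing sub calls per second on this machine.
sub noop { return }
my $n     = 1_000_000;
my $start = time;
noop() for 1 .. $n;
my $elapsed  = time - $start;
my $measured = $elapsed > 0 ? $n / $elapsed : 0;
printf "measured here: %.0f calls/s\n", $measured if $measured;
```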

So I’ve been thinking about ways to speed these up. Display lists are clearly one solution, but they’re not versatile enough. What if you wanted to cycle the colors or textures on a large structure you’re building (something a friend of mine turns out to be doing)? You can’t use a display list, since you need to set the colors in between the calls.

What we’re battling here is the symbol table lookup and the pure op dispatch speed. We can’t get away from the symbol table lookup as long as we’re in Perl code[1]. It’s possible to get op dispatch overhead really low, as Parrot has so kindly pointed out. So I was thinking about a pipeline display list structure. Let me explain by example:

    my $list = pipeline {
        glcall qw{glPushMatrix};
        glcall qw{glTranslate 1 0 0};
        glcall qw{glColor $ $ $}; # extract three numbers from the pipeline
        prism_pipeline();         # might extract more numbers
        glcall qw{glPopMatrix};
    };
    my ($r, $g, $b) = @_;
    while (1) {
        $r += STEP * 0.1;
        $g += STEP * 0.3;
        $b += STEP * 0.7;
        feed sin $r, sin $g, sin $b;
        prism_feed();   # generate the numbers that prism_pipeline asked for
    }

We’ve turned four calls into two, and with more complex figures, who-knows-how-many into who-knows-how-few. The pipeline code would be compiled into a quick CGP loop like Parrot uses (verging on the speed of a JIT), and it would just pull numbers out of the pipeline as it needed them. Perhaps we could even build in some simple branching instructions, and, well, maybe just use Parrot, but use a Parrot that’s ready for production.

But that’s the beauty of the interface. When Parrot does become available for production, we could just slap it in place of the original CGP loop, and then we have all of parrot’s neato stuff along with it.
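Here is a toy version of the build-once, feed-many idea that runs without OpenGL. Everything in it is a stand-in: the %gl table holds recorder subs instead of real GL entry points, and pipeline takes a list of rows rather than a block. Being pure Perl, it demonstrates the shape of the interface and not the speedup, which would have to come from a compiled inner loop.

```perl
use strict;
use warnings;

# Stand-ins for OpenGL entry points; they only record what was called.
my @trace;
my %gl = map { my $name = $_; ($name => sub { push @trace, join ' ', $name, @_ }) }
    qw(glPushMatrix glTranslate glColor glPopMatrix);

# Build a pipeline from [name, args...] rows. An argument of '$' is a slot
# to be filled from the feed at run time; anything else is a constant.
# Function names are resolved once, here, rather than on every frame.
sub pipeline {
    my @ops = map { my ($name, @args) = @$_; [ $gl{$name}, @args ] } @_;
    return sub {
        my @feed = @_;
        for my $op (@ops) {
            my ($code, @args) = @$op;
            my @filled = map { $_ eq '$' ? shift(@feed) : $_ } @args;
            $code->(@filled);
        }
        return;
    };
}

my $list = pipeline(
    [qw(glPushMatrix)],
    [qw(glTranslate 1 0 0)],
    [qw(glColor $ $ $)],     # three numbers come from the feed
    [qw(glPopMatrix)],
);

$list->(0.5, 0.25, 1);       # one call from user code per frame instead of four
print "$_\n" for @trace;
```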

Now, one problem with the interface I just showed is that it lexically separates the pipeline creation and the pipeline feed — a maintenance nightmare for sure. I wonder if I could stick it in a heredoc and actually parse out the piped bits (or hey, maybe even a source filter…). Naturally, making an OpenGL call when there’s no pipeline in scope would just call it, rather than compiling anything. But I wonder if that would give people the impression that they could do more than they actually could. Perhaps I could make one conditional call like:

    if ($BUILDING) {
        glcall qw{glPushMatrix};
        # ...
    }
    else {
        feed $r, $g, $b;
    }

And that would just become an idiom. I could call if ($BUILDING) something else to make it less confusing, like:

    draw {
        glcall qw{glPushMatrix};
        # ...
    }
    with {
        feed $r, $g, $b;
    };

And the more you separate those, the more speed you get (I of course don’t want it that way, but o/~ you can’t always get what you want o/~). Then instead of making draw and with sub(&)s, I could source filter them out into the if structure above, since it’s faster (you see that Perl’s speed becomes a pain here). Of course, there’s another advantage to that, which is that I can just dynamically redefine the regular OpenGL names so it looks like regular GL code, instead of that glcall qw{} incantation.

Ideas? I don’t want to have to use Inline::C in my games just to get them to draw fast.

[1] Sure, I could use a source filter and turn the function calls into, say, opcode numbers, but then you still have to do the lookup on whatever function you’re using to send the numbers.

Glop Actor Hierarchy

Just as POE has Components, Glop has Actors. And unlike the rest of the design, you use Actors by inheriting from them (well, you can use them directly, too). A simple Actor that will come with Glop will be the 2D::Circle actor, which is good enough to toss into a scene and watch it bounce around. Of course, there will need to be things for it to bounce against. That’s what 2D::Wall is for. It is not, by default, affected by gravity, but it registers collisions.

I envisage a program in which a ball bounces around the screen, in its entirety, as such:

    use Glop qw{-fullscreen -2d},
            -view => [ 0, 0, 32, 24 ];   # set the viewport to (0,0)-(32,24)
    $KERNEL->world->gravity(v(0, -9.8));
    $KERNEL->add(Glop::Actor::2D::Wall->new(v(0,0), v(32, 1)));   # floor
    $KERNEL->add(Glop::Actor::2D::Wall->new(v(0,0), v(1, 24)));   # left
    $KERNEL->add(Glop::Actor::2D::Wall->new(v(31,0), v(32,24)));  # right

Note, here, that the program responds to an OS quit event, and quits if you push escape. It just runs otherwise. Also note that the keyboard handling is already implemented, as well as the $KERNEL->add and $KERNEL->run methods. The next thing I’ll be working on is integrating ODE, followed by the collision detector. After that’s done, I’ll probably have less than two weeks to prepare my presentation, so, um, yeah, I’ll be doing that.

Animations and Transient Callbacks

Animations in games are too hard. Usually they consist of creating some convoluted state machine, where the state is what they’re supposed to be doing, and the data is for how long they’ve been doing it. Now this is a fine model, but most of the time it isn’t formalized, and thus becomes a large switch or cascade of ifs. Hard to read, hard to maintain. Something needs to be done here.

Glop solves the problem using a queue of transient callbacks, basically a callback that knows when it’s not supposed to be called anymore. There’s a level of indirection around the queue, so some object can take it, store it away, and replace it with its own queue. You’ll see why this is important in a minute.

For example, let’s say you’re writing tetris. You’re dealing with a piece that, when “k” is pressed, you want to rotate clockwise. But you want it to rotate smoothly and not “jump” like many tetris implementations do. In the traditional “obscure state machine” model, you put it in a rotating state and record when it started rotating. You rotate it by the difference in seconds between the current time and the time it started, and when the difference exceeds π/2 you put it back into the rest state.

That seems to work. But what happens if the user presses the rotate button again while it’s rotating? Then you’ve got a problem. You can either ignore it, or set the initial time to “now” so that it jumps back and starts over. But there’s not a good way to make it rotate again after it’s done, which is the behavior I’d want as a game designer (an unskilled one, but I figure at least some of my ideas are sane).

With the transient callback queue it’s easy. You enqueue a “timeslice” functor whenever the button is pressed. The timeslice functor keeps calling the closure given to it (with the amount of time passed since its starting) every frame, and then quits when it’s done. Sounds familiar, but the interesting part is that you can enqueue it. Once it’s done, the next one starts and rotates it again. It looks something like this in code:

    method rotate_cw => sub {
        my ($self) = @_;
        enqueue $self Glop::Transient::timeslice(
                               pi/2, sub { $theta += $STEP });
    };

That’s it! No complicated state machines; just a simple closure and a length of time. $_[0] inside the closure would refer to the amount of time passed since the timeslice reached the front of the queue, but we turned out not to need it here.
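If you want to see the queueing behavior without installing anything, this plain-Perl sketch imitates it. Sketch::Queue and its enqueue, run, and pending methods are invented for the example, and passing the per-frame step as a second argument to the closure is my assumption (the snippet above reads it from $STEP instead).

```perl
use strict;
use warnings;

# Hypothetical sketch, not Glop::Transient: a timeslice is a duration plus a
# closure, and the queue only ever runs the transient at its front.
package Sketch::Queue {
    sub new {
        my ($class) = @_;
        return bless { items => [] }, $class;
    }
    sub enqueue {
        my ($self, $duration, $code) = @_;
        push @{ $self->{items} }, { duration => $duration, elapsed => 0, code => $code };
        return;
    }
    # Advance by $dt seconds, spilling leftover time into the next transient.
    sub run {
        my ($self, $dt) = @_;
        while ($dt > 1e-12 and @{ $self->{items} }) {
            my $front     = $self->{items}[0];
            my $remaining = $front->{duration} - $front->{elapsed};
            my $slice     = $dt < $remaining ? $dt : $remaining;
            $front->{elapsed} += $slice;
            $front->{code}->($front->{elapsed}, $slice);   # (time so far, this step)
            $dt -= $slice;
            shift @{ $self->{items} }
                if $front->{duration} - $front->{elapsed} <= 1e-12;
        }
        return;
    }
    sub pending { my ($self) = @_; return scalar @{ $self->{items} } }
}

package main;

my $pi    = 4 * atan2(1, 1);
my $theta = 0;
my $queue = Sketch::Queue->new;

# Two quick presses of the rotate key: the second rotation waits its turn.
$queue->enqueue($pi / 2, sub { my (undef, $step) = @_; $theta += $step }) for 1 .. 2;

$queue->run(0.1) for 1 .. 40;   # 4 simulated seconds, more than enough for pi
printf "theta = %.4f (pi = %.4f), pending = %d\n", $theta, $pi, $queue->pending;
```

Pressing the key mid-rotation neither gets ignored nor restarts the animation; the piece simply ends up a half turn further along.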

Now, what happens if you say “if they push down while it’s rotating, I want it to smoothly move down by one immediately.” To boot, you probably want the downs to queue independently of the rotates. It’s easy to do that, if a little more abstract.

    method BUILD => sub {
        my ($self) = @_;
        ($downq, $rotq) = $self->queue->fork;
    };

That splits the queue into two different ones (you can split it into three just the same way, using the magic of Want.pm). Now you just use the two different queues and it all works nicely.

There are more kinds of transients than just the timeslice, as you may have guessed. There’s Transient::single, which just executes it once. There’s Transient::forever which, well, isn’t really transient. There’s Transient::indef (for indefinite; I may need to come up with a better name[1]), which goes until you say last TRANSIENT (which can be nicely shortened to just last in the absence of an inner loop). There may be others, and you can certainly add your own.

And that’s Glop’s take on animation. Now I have all the necessary information to start implementing the Actor base class.

[1] There needs to be a standardization of interface at some point. I figure that can wait, however, as I’m not yet sure what all the spiffy interfaces are going to be. Nobody’s using Glop yet, and if they do, it’s their funeral since it’s very early in development. By OSCON it should be in a fairly interface-stable state. After all, I have the Glop::Draw->Circle style; the Glop::Transient::timeslice style; and the just plain drawing {...} style. There needs to be some rhyme and reason to why I use each of these, and I may toss the second one altogether, so it becomes Glop::Transient->Timeslice.

Glop Input Design

Into the plethora of modules included in Glop I add POE. I’m going to use it under the hood, and try to hide it from the user as much as possible (well, in that postmodern way of exposing it when people want it). You’ll note that I’m coupling the design with all the libraries I’m including with it. I’m not certain that I’m making the right decision in doing so, but building abstraction layers around these kinds of modules is not a simple task, and sometimes wouldn’t accomplish anything. Especially for things like POE, where the module basically is its abstraction layer.

Anyway, on to the title of this post, the input design. By “design” here, for the first time, it doesn’t mean “just an idea.” It means what I’m doing, and what I’ll be coding soon. Also, in this input design we begin to silhouette the rest of the design, which is quite exciting.

Mouse Interaction

First, every screen object implements an interface that the Glop kernel knows how to use. Among other things, this will include update (called every frame), draw, and select_draw. The latter method is the one we’re interested in, and it by default just registers a picker id and calls draw.

Whenever the mouse is clicked, the kernel runs through all the objects in the scene and calls select_draw on them. It then finds the root of the select stack and calls select on it. This method should just call select on the next item in the stack, and if you inherit from Glop::Actor (or whatever it turns out to be), it does. Then, presumably, the leaf objects will override this behavior and do something specific.

If you don’t need to do something as complex as the select_draw stuff, you can put it in a mode where it just reports all mouse click positions to a callback. In fact, the select_draw mode is just the default callback to this routine.

Keyboard Interaction

Keyboard interaction is completely different. All you do is register a callback with the global keyboard handler for a specific event. Give it an option whether you want to be called every time it changes, or every frame reporting the state. The latter mode means that you don’t have to keep track of the keystate when you’re doing an acceleration-driven object. It does that for you (well, more accurately, SDL does that for you, and it just passes along the information). The keycodes are strings with a familiar syntax: "x" (the key x), "C-x" or "C_x" (control-x), "Alt" (the obvious), etc.


    package Missile;
    use Glop;
    use Class::Closure;   # if you please
    sub CLASS {
        extends Glop::Actor;
        has(my $p) = v;   # v is short for v(0,0), which is short for v(0,0,0);
        has(my $v) = v;

        register Glop::Input::Keyboard
            'report',  # report every frame
            LEFT    => sub { $v += $STEP * v(-1,0) if $_[0] },
            RIGHT   => sub { $v += $STEP * v(1,0) if $_[0] },
            UP      => sub { $v += $STEP * v(0,1) if $_[0] },
            DOWN    => sub { $v += $STEP * v(0,-1) if $_[0] };
        # or for this insanely common case, there's a special semantic
        register Glop::Input::Keyboard
            qw(report keypad) => sub { $v += $STEP * $_[1] if $_[0] };

        method update => sub {  # This shouldn't be necessary if you're using ODE masses
                                # more on that later
            $p += $v;
        };
        method draw => sub {
            drawing {    # Just a nicer way of saying glPush/PopMatrix
                Glop::Draw->Circle;    # default unit radius
            };
        };
        method select => sub {
            # see the animations post later this month for how this will work
        };
    }

Glop and PDL

As my third Glop example program I wrote a visualization of a quantum particle propagating in a crystal lattice[1]. I did it all in plain old Perl to start, and it was painfully slow: 1/2 a frame per second on a 32 x 24 lattice. Uh oh, this kind of speed simply won’t do for any serious game.

I rewrote my computation sub in C using the glory that is Inline::C, and it blazed along, even on a lattice four times the size (64 x 48). But then I noticed numerical stability problems. One particle sitting there without propagation would first be 100% likely to be there, then 150%, then 200%, then 300%… While I’d certainly like to measure it and find that it was three times more than always likely to be there, I’m pretty sure the real world doesn’t do this (what happens when there is propagation and it’s 100% likely to be in two places at once?).

I switched over to a Runge-Kutta integrator for the single particle case (which requires four computations of the derivative per frame), and that fixed the stability problems. But then propagation was still unstable, and in order to use RK on that you have to do it to the whole lattice four times—not just each position in the lattice four times.
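The runaway probability in the no-propagation case is easy to reproduce. With a single site the equation reduces to iħ dψ/dt = Aψ, and each Euler step multiplies |ψ|² by exactly 1 + (A·dt/ħ)², so it can only grow. The sketch below sets A/ħ = 1 and picks dt = 0.1 and 100 steps, all arbitrary choices of mine, and stores ψ as a [re, im] pair; it is an illustration of the effect, not the code from the demo.

```perl
use strict;
use warnings;

# Single site with A/hbar = 1 (an arbitrary choice): d(psi)/dt = -i * psi.
# psi is stored as [re, im], and -i * (re + i*im) = im - i*re.
sub deriv { my ($psi) = @_; return [ $psi->[1], -$psi->[0] ] }

# psi + scale * k, componentwise.
sub axpy {
    my ($psi, $scale, $k) = @_;
    return [ map { $psi->[$_] + $scale * $k->[$_] } 0, 1 ];
}

sub euler_step {
    my ($psi, $dt) = @_;
    return axpy($psi, $dt, deriv($psi));
}

# Classic fourth-order Runge-Kutta: four derivative evaluations per step.
sub rk4_step {
    my ($psi, $dt) = @_;
    my $k1 = deriv($psi);
    my $k2 = deriv(axpy($psi, $dt / 2, $k1));
    my $k3 = deriv(axpy($psi, $dt / 2, $k2));
    my $k4 = deriv(axpy($psi, $dt, $k3));
    my @sum = map { $k1->[$_] + 2 * $k2->[$_] + 2 * $k3->[$_] + $k4->[$_] } 0, 1;
    return axpy($psi, $dt / 6, \@sum);
}

sub probability { my ($psi) = @_; return $psi->[0]**2 + $psi->[1]**2 }

my ($dt, $steps) = (0.1, 100);
my ($euler, $rk4) = ([1, 0], [1, 0]);
for (1 .. $steps) {
    $euler = euler_step($euler, $dt);
    $rk4   = rk4_step($rk4, $dt);
}
printf "Euler: %.4f   RK4: %.6f\n", probability($euler), probability($rk4);
```

RK4 is not exactly norm-preserving either, but its drift per step is tiny by comparison, which is why the single-particle case looked fixed.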

So I scrapped all my C code; I wasn’t about to write that complex process in C, and it would have gotten unbearably slow yet again. I looked at PDL, since I had used it once for matrix computations during my Conjoint Analysis[2] project. I studied up a bit, and implemented the RK method on the whole lattice: a computationally expensive process. PDL flew through it, giving me the same frame rates as the Euler integration gave for my C implementation. That’s amazing! I heard it was fast, but geez, that’s excellent.

I’m considering adding PDL to the Glop distribution, since numerical lattices are ubiquitous in games. I won’t if I don’t end up using it from the core, but I won’t hesitate to add it if I do want it in the core. Well, maybe I will. If it turns out that it’s easy to do something both in pure Perl and from PDL, I might provide a scan for PDL and use the pure Perl implementation if it’s not available.

[1] For the interested, the equation for zero-energy propagation on a one dimensional lattice is:

iħ ∂<x|ψ>/∂t = A <x|ψ> - 1/2 A <x-1|ψ> - 1/2 A <x+1|ψ>

Where <x|ψ> is the amplitude that the particle is at position x, ħ can be treated as an arbitrary constant when you’re doing a computational simulation, A is an arbitrary constant, and i is, of course, the square root of -1. If you’re doing a multidimensional lattice, just add more of the negative terms, and change the 1/2 to 1/(the number of connected crystals). This is just a special case of a Hamiltonian integration. If you want to learn Quantum Mechanics, you must read Volume 3 of Feynman’s Lectures on Physics. It’s the absolute best.

[2] Adaptive Conjoint Analysis is a method that uses linear algebra to balance trade-offs between preferences. It’s a good way to figure out what consumers actually want, rather than what they think they want.


I’ve released the first segment of Glop to the CPAN, and it’s called Class::Closure. It’s not directly related to games, but it is in a yak-shaving sort of way. Perl’s class syntax sucks, most Perl programmers agree on that. Both creating a basic class and referring to its members is a pain. So I wrote Class::Closure, which fixes that without using a source filter (if you want it with an ugly source filter, see my Perl6::Classes).

But another thing that games require more than anything else is speed. So I did some benchmarks on Class::Closure, and found that its dispatch was about 4 times slower than the generic object model. Acceptable, but not optimal. I did a few caching tests, and one of them proved promising. If I cached the methods in the class’s symbol table, it turned out to be faster than the traditional object model. So I switched over the whole module to create a new symbol table for each object instead of storing it in a hash, and it’s about twice as fast as the traditional model. As an added bonus, it cleaned up the inheritance scheme somewhat.
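The trick of giving each object its own symbol table can be shown in a few lines. To be clear, this is an illustration of the general technique and not Class::Closure’s actual source; make_object, make_counter, and the Sketch::Anon::_N package naming are all invented here.

```perl
use strict;
use warnings;

my $serial = 0;

# Build an object whose methods are closures installed in a package created
# just for it, so Perl's ordinary method dispatch finds them directly.
sub make_object {
    my (%methods) = @_;
    my $package = 'Sketch::Anon::_' . ++$serial;    # hypothetical naming scheme
    {
        no strict 'refs';
        *{"${package}::$_"} = $methods{$_} for keys %methods;
    }
    my $anchor;                                     # the object carries no data itself
    return bless \$anchor, $package;
}

sub make_counter {
    my $count = 0;                       # private state lives in the closures
    return make_object(
        inc   => sub { return ++$count },
        value => sub { return $count },
    );
}

my ($a_counter, $b_counter) = (make_counter(), make_counter());
$a_counter->inc for 1 .. 3;
$b_counter->inc;
printf "a = %d, b = %d\n", $a_counter->value, $b_counter->value;
```

One cost worth knowing about: in this naive form the generated packages are never cleaned up, so a program that creates and discards many objects would want to reclaim them somehow.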

Auf Wiedersehen, C(?:\+\+)?

I just got SDL_perl working. And you know what that means… C++: so long and good riddance. I shall now be doing my graphical programming in Perl.

I will, of course, still need to tolerate that vile language until the completion of Glop. After that, it will be used as a kind of “assembly”, which I’ll use to streamline the computationally intensive portions of my games when they start suffering, and for that purpose I don’t mind it one bit.

SDL_perl was a pain in the ass to get working on both Mac and Windows. For Mac, I ended up getting the full blown SDL+Perl framework distribution, which still didn’t compile. I had to fish through the source and change all occurrences of #include <sdl.h> to #include <SDL/sdl.h>, and likewise with OpenGL. A semi-pain.

On Windows, I learned how to use ppm which comes with the ActiveState distro (my, how that works better than perl -MCPAN -eshell), and then hunted for a working SDL_Perl and OpenGL ppm. Worse yet, SDL::OpenGL doesn’t seem to work; I have to use the older and dirtier OpenGL.pm. And even worse, it doesn’t even break, it just decides not to export (or contain for that matter) any functions. What a pain.

One of Glop’s goals is to make it dead simple to install. I now see how ambitious a goal that is.

Oh, yes, and I’d love to have provided links to all these things, since they were kind of hard to find, but Comcrap’s DNS seems to be broken for the moment. I’ll scan through and fill them in when it comes back.