Tag Archives: algorithms

Sword Gestures

It has been two weeks since I have done the slightest amount of work on the sword game. Kudos to Ben for keeping it moving along. Anyway, it’s time for some brainstorming.

Our current idea (as of last week) for a control scheme for the sword game is (1) to have it in slow motion, for strategic rather than strictly action play, and (2) to control the sword based on gestures, of sorts. Not pre-programmed gestures like (ugh) Black & White; more like drawings of what you want the sword to do. Draw a Z and you’ll get a Zorro motion. I like this idea a lot, and I think it has a lot of potential. So I have to tackle the problem: given a gesture, how do you figure out what to do with the sword?

Well, we want it to be sort of vague and interpretive. We also want it to be natural, the way a human would interpret when given that gesture. These two things point me straight toward setting up descriptive equations, filling in known information, and solving for the rest of the details. But what kind of information does the gesture specify?

We know the motion starts about where the sword currently is when you draw the gesture, and that it ends heading in the same direction the gesture does. We can intuit velocities, but I think it’s better to let the equations figure those out. We also know a few points on the curve (as many as we need, really). We’re trying to solve for the force to exert at every step. Oh, and we know we have a maximum force we are allowed to exert, which should play a role in the solving process (so the solution doesn’t try to do anything impossible).

A Bezier-like spline might suffice. We are given a curve, i.e. a bunch of points, and we’re trying to find a smooth interpolant. Let’s look at what information we have at each point:

  • A 2D position at each point.
  • A 3D position and velocity at the first point.
  • (perhaps) A time at the last point.

Not very much information. And the information we need:

  • A 3D force for the handle at each point.
  • A 3D force for the tip at each point.

I’m making the assumption that we linearly interpolate the forces between each point, so the force half-way between two points is the average of the forces at each of the two points. So we have 6 unknowns. We need 6 equations (for each point).
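To spell that interpolation out, with s running from 0 to 1 between control points k and k+1:

    F(s) = (1 − s)·F_k + s·F_(k+1),    0 ≤ s ≤ 1

so F(1/2) is just the average of F_k and F_(k+1), as promised.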

How do we interpret the 2D data? I’m going to say for simplicity’s sake that we put a plane in front of the fighter and require that the sword pass through the corresponding point. This may be the weakest assumption of the algorithm. And there’s another problem with this approach: in order to compute that, we need the position and orientation of the sword at each point, another five unknowns in exchange for only two equations!

Hmm, as I hash this approach out, it seems to be breaking down. Maybe I should scrap it and come at the problem from a different angle.

Generalization

Okay, here’s the latest on my type inference algorithm: (1) it rocks, (2) it is still having trouble with generalization. My current example is the following (using ML-style reference syntax):

    fix \f x { x := !x + 1; if 10 < !x then x else f x }

That fix business is how you implement recursion under the hood. Now, the correct type of this function is:

    ^a ^b (b -> b) [ b <: Ref a ] [ Int <: a <: Num ]

This type may not even be obvious to type theorists. Let me explain: the caret (^) means “the greatest lower bound over”, which can be read roughly as “for all”. The form of such a type is ^variable (type_involving_variable) [constraints_for_variable]. That circumfix syntax is awkward, since the constraints for a are way at the end of the expression. Still, let’s try to read it: for all a and for all b, the type is a function which takes a b and returns a b, as long as b is a subtype of a reference to an a, and as long as a is at least an Int and at most a Num. Yowch, that’s a mouthful. The paraphrase is: this function takes any reference to a number (at least an integer; i.e. no “just even numbers” subtypes or anything) and returns that reference (in this case, changing its contents first). You can see that that is correct given the code.

I manually entered the equations for that expression into my type inferencer (it doesn’t do AST traversal yet, but that part is trivial). I looked over the two-odd pages of output for the reduced equations, and they were right on. That’s the (1) it rocks part. Then it tried to generalize those equations, and this is what it came up with:

    ^a ^b (b -> b) [ b <: Ref Int, b <: Ref a ] [ Int <: a <: Num ]

It looks pretty close. There’s just this pesky little b <: Ref Int clause, which is wrong! It says that whatever you pass this function has to be a subtype of Ref Int. But a Ref Num is not a subtype of a Ref Int; in fact, Ref x is only a subtype of Ref Int when x is itself Int. Some type checkers would actually type it this way, saying that the function’s type is Ref Int -> Ref Int, but for mine, that is not acceptable. So what’s going wrong with the algorithm?

Okay, so the generalization algorithm is given a bunch of equations involving type variables and other types. For each variable, the algorithm wants to find either (a) a direct substitution, where you can safely replace that variable with another type, or (b) a minimal set of constraints on that variable which must form acyclic dependencies (cyclic dependencies would lead to an infinite loop when we generalize over those variables).

It currently does this by finding all constraints on each variable, and then gradually reducing them. That is, it looks through every equation, and when it sees an equation that looks like a <: v, where v is a variable, it adds a as a “lower bound”, and when it sees v <: a, it adds a as an “upper bound”. What you end up with is a mapping from variables to their bounds, like, say: a = { Int | Num }, b = { a | c }, c = { Ref b | a, Num, Str }, etc. Then it reduces them by eliminating any lower bounds which are lower than other lower bounds. That is, if you had a = { Int, Num | }, it would eliminate Int, because Num being a lower bound already implies that Int is. It does this for variables and other types alike, consulting the equations to see what is a subtype of what.

After that, it looks for substitutions. That is, if you have a constraint that looks like a = { b | } or a = { | b } or a = { b | b }, then we say “okay, just bind a to b” and we substitute b for a everywhere (including in the other constraints). Then it goes back to the reduction phase. It alternates between these two phases until there is nothing left to do. You end up with a substitution and constraints that meet the requirements.
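In code, one round of that reduce/substitute alternation might look like this (a minimal sketch over a toy type representation; subQ, which stands in for consulting the equations, and the other names are illustrative, not my inferencer’s real ones):

    -- Toy representation; subQ answers "is a <: b?" from the equations.
    data Type = TVar String | TCon String deriving (Eq, Show)

    -- Bounds for one variable: { lowers | uppers }.
    data Bounds = Bounds { lowers :: [Type], uppers :: [Type] } deriving Show

    -- Reduction phase: drop any lower bound implied by a greater lower
    -- bound (Int goes away if Num is also a lower bound), and dually
    -- drop any upper bound implied by a lesser upper bound.
    reduceBounds :: (Type -> Type -> Bool) -> Bounds -> Bounds
    reduceBounds subQ (Bounds ls us) = Bounds (keepGreatest ls) (keepLeast us)
      where
        keepGreatest ts = [t | t <- ts, not (any (\u -> t /= u && subQ t u) ts)]
        keepLeast    ts = [t | t <- ts, not (any (\u -> t /= u && subQ u t) ts)]

    -- Substitution phase: the three trigger patterns described above.
    substitution :: Bounds -> Maybe Type
    substitution (Bounds [t] [])           = Just t
    substitution (Bounds []  [t])          = Just t
    substitution (Bounds [t] [u]) | t == u = Just t
    substitution _                         = Nothing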

But I think the fact that it creates substitutions out of a = { b | } and a = { | b } is wrong. a = { b | b } is correct; that means that a and b are the same type. But a = { b | } says only that a could be b, or any supertype of b, so making a substitution out of it is wrong. But I can’t just take out those two rules, because then we end up generalizing over every variable we solved for (the types end up looking like ^a ^b ^c ^d ^e ^f ^g ^h ^i ...).

Something that I have noticed, but haven’t figured out how to incorporate yet, is that there are reductions you can make based on what kind of generalization you are doing (greatest lower bound, which is typical, or least upper bound, which is quite uncommon but still supported). For example, ^a (a) [ Int <: a ], the greatest lower bound of all types which are supertypes of Int, is obviously just Int. I’m wondering if I can take that idea and run with it to get a good generalization. However, it is not entirely simple, because ^a (a -> a) [ Int <: a ] does not reduce to Int -> Int; in fact it is not an arrow type at all, which means it must be written as a greatest lower bound! I proved something similar back when I was first exploring this calculus.

So I suppose I could say that when you try to generalize over a variable, you look at the type in which it is used. If it only appears in covariant position (on the right of an arrow, or not involved in an arrow), then substitute its greatest lower bound if it has one. If it only appears in contravariant position (on the left of an arrow), substitute its least upper bound if it has one. If it appears in both positions (appearing as an argument to e.g. Ref counts as both positions), then you cannot substitute it, so just generalize. That position check is sketched below.
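Here is that check over a toy type language (the constructors are illustrative, not my real AST):

    -- Toy type language: variables, arrows, Ref, and atomic types.
    data Ty = TyVar String | Arrow Ty Ty | Ref Ty | Atom String

    data Position = Covariant | Contravariant | Invariant deriving (Eq, Show)

    flipPos :: Position -> Position
    flipPos Covariant     = Contravariant
    flipPos Contravariant = Covariant
    flipPos Invariant     = Invariant

    -- Every occurrence of variable v in a type, tagged with its position.
    -- Arguments to Ref count as both positions, i.e. invariant.
    positions :: String -> Ty -> [Position]
    positions v (TyVar u) | u == v    = [Covariant]
                          | otherwise = []
    positions v (Arrow a b) = map flipPos (positions v a) ++ positions v b
    positions v (Ref t)     = map (const Invariant) (positions v t)
    positions _ (Atom _)    = []

    -- Only Covariant occurrences:     substitute the greatest lower bound.
    -- Only Contravariant occurrences: substitute the least upper bound.
    -- Anything else:                  just generalize.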

Let’s take our example from above. I’ll be working from this dump of my algorithm’s thought process, namely the equations section at the beginning. We are trying to generalize to the greatest lower bound of 20 (that is the variable I assigned to the result). 20 is directly involved in only one equation, 107 <: 20, so we can safely replace 20 by 107. Now we are trying to find the greatest lower bound of 107. 107 has several upper bounds, but we don’t care about those (unless it had no lower bounds; then we would). It has only one lower bound, namely (10 -> 19). Now here’s where it gets tricky. I have implemented it (several times, each followed by a head slap and a delete) so that it just tries to find the least upper bound of 10 and the greatest lower bound of 19, but that doesn’t work, because, say, 19 could depend on 10. Only after you’re sure that they’re independent can you do that. So… what do we do now?

Complete Lattice Type System

I implemented a simple-minded type inferencer for meme (all monotypes, no polymorphism). Then I started implementing “Builtin” AST nodes so that I could have polymorphism in specific cases, like if. That got me wondering whether I could do general polymorphism, as long as it was declared. I ran through the type inference algorithm by hand for hours, trying out different features to handle that case (an extremely tedious process). Finally I took my simple type inferencer and added a bunch of features to experiment with in code, rather than by hand. It went through several iterations, the first being that universal types instantiate themselves on the left side of a <: and “generalize” on the right. This generalization was similar to instantiation, but the result would become an “uppercase” variable, which would need to be instantiated again if it ever passed through a function argument.

Of course, that first iteration was a miserable failure. After messing around with it and racking my brain a bit more, I eventually came up with a way to do suprema and infima à la my last type theory post. And after fixing a few key things, IT WORKED. That is to say, it worked on my example. But it wasn’t a totally trivial example; it exercised the key thing I wanted it to do. Namely, this expression: (\id -> (id 1, id "hi")) ((\x -> x) :: forall a. a -> a), which is better than Haskell manages :-).
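For contrast, the only way I know to get GHC to accept something like that is to move the annotation onto the function’s parameter (a sketch, using GHC’s rank-2 types):

    {-# LANGUAGE RankNTypes #-}

    -- GHC accepts this only once the parameter itself carries the
    -- polymorphic annotation; it will not infer it from the argument.
    apply :: (forall a. a -> a) -> (Int, String)
    apply f = (f 1, f "hi")

    main :: IO ()
    main = print (apply (\x -> x))   -- prints (1,"hi")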

I made it so you mark the type variables you are solving for as either supremum or infimum types. Function parameters can safely be infima, and function return values can safely be suprema. All the rest, I think, can be marked singular (i.e. standing for an atomic or an arrow type). Then there are a few laws that reduction must follow with these types. Namely: limit types (the name I give to infima, suprema, and universal types together) cannot instantiate in equations involving other limit types (this makes sense mathematically; if the two types are equal, and they truly are nonsingular, then there is in fact no valid instantiation). Then there are rules for instantiating limit types when they are in equations with singular types.

However, the algorithm isn’t totally clean. It does some hackish stuff to make sure that it terminates (on this example; I’m nowhere near proving that the algorithm always terminates). It marks instantiated variables as “I have been instantiated, don’t instantiate me again”, which I don’t consider a very clean or correct solution. However, it does seem to work (and I’ve tried giving the algorithm input that I thought would make it loop forever, such as self-referential types, and it didn’t).

Anyway, I’m not going to go too in-depth at the moment. I have to get to bed. Suffice it to say that it’s very cool. The source code is here (needs GHC 6.6 to run). To see that the output worked, note that the result of id 1 above is the variable v4, and the result of id "hi" is the variable v6. Somewhere in the big spew of output, you’ll see the lines Int <: v4 and Str <: v6, and (very important) not Int <: v6 or vice versa.

Oh right, the contest

I mentioned the ACM programming contest that I competed in on Saturday. The team was made up of Jude Allred, Paul Steinbrecker, and me, on a team quite creatively named “University of Colorado Team 3”. The competition lasted 5 hours, and our task was to solve 7 difficult algorithmic problems.

Assuming that only the best programmers competed in the contest, that the contest actually measures programming ability, and that all programmers live in Colorado, we are the best programmers in the world. If you instead assume that all programmers live in North America in Colorado’s time zone, then we are only the 4th best programmers in the world. If you remove either of the previous two assumptions (but why would you want to do that?) then I have no idea.

For record keeping, since this is decent resume material: there were 12 competitors in the state and 45 in the region (time zone). We solved 5 problems (the three who beat us in the region had each solved 6, and everybody below us solved at most 4). We programmed all the entries in C++. If I have any advice for future competitors it is this: Know Dijkstra’s Algorithm (sketched below). I think there were three problems which essentially reduced to it.
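For what it’s worth, here is a compact sketch of it (in Haskell rather than the C++ we used in the contest, with a priority queue faked by a Data.Set of (distance, vertex) pairs):

    import qualified Data.Map as Map
    import qualified Data.Set as Set

    type Graph = Map.Map Int [(Int, Int)]   -- vertex -> [(neighbor, weight)]

    -- Shortest distance from src to every reachable vertex.
    dijkstra :: Graph -> Int -> Map.Map Int Int
    dijkstra g src = go (Set.singleton (0, src)) (Map.singleton src 0)
      where
        go frontier dist = case Set.minView frontier of
          Nothing -> dist
          Just ((d, v), rest)
            | Map.lookup v dist /= Just d -> go rest dist   -- stale entry, skip
            | otherwise ->
                let relax (f, dm) (u, w) = case Map.lookup u dm of
                      Just old | old <= d + w -> (f, dm)    -- no improvement
                      _ -> (Set.insert (d + w, u) f, Map.insert u (d + w) dm)
                    (frontier', dist') =
                      foldl relax (rest, dist) (Map.findWithDefault [] v g)
                in go frontier' dist'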

Telegnosis Cryptography

Usually, a programming project needs to have problems that require clever or interesting solutions in order to keep my interest. That’s one of the reasons I have mostly lost interest in the Ichor codebase (though progress also sparks my interest, so I could regain it for a little while). The reason I have been interested in telegnosis for so long is because I have added a design constraint to the project which creates interesting problems where there would otherwise be none.

Every popular multiplayer online game has had a problem with cheating, especially the open source ones. I’m not saying Telegnosis will be popular (I’d like it to be, but board games usually don’t catch on). Nonetheless, I want cheating to be impossible in Telegnosis. The exact design constraint: it must be impossible to write a client or server for the game which is capable of cheating without being caught by another player using the standard client.

The Telegnosis client currently enforces this in all but one area: time. I haven’t come up with a way to securely record times yet, assuming one is possible at all. I am planning on refactoring the crypto logic out from the rest of the game into a “crypto gaming” module. So, to concretify my ideas, I’ll jot down what I’m doing and what I plan to do in this entry. I hope a cryptologist reads this, so that any flaws in my algorithms can be pointed out.


Low-Overhead Weak Pointers

Abstract: The usual linked-backreference method of implementing weak pointers has awkward copying semantics and involves a linear-time search whenever a pointer is deallocated. This entry describes a better weak pointer implementation method.

What is a weak pointer? Consider this C++ code:

    Foo* x = new Foo;
    Foo* y = x;
    delete x;
    cout << y;   // prints the old address; y is now dangling

This outputs the location at which a Foo was allocated, but that value is now as good as garbage, because that Foo was deallocated. It would be useful to have y automatically set to NULL so that code could check whether the object is still valid. A pointer that is set to NULL when its target is deallocated is called a weak pointer.

Weak pointers are usually implemented by storing, in the target, a linked list of the locations that should be set to NULL when the object is deleted. Consider the bookkeeping of this implementation: whenever a pointer is copied, a new reference must be added to the list, and whenever a pointer is deallocated (e.g. the pointer itself goes out of scope), the list must be searched in linear time to delete it from the list. That hurts. Obviously there are ways to speed it up, but the asymptotic bound is still there.

Here is my solution: First, set up a small object pool with entries that look like this:

    struct Entry {
        unsigned int age;   // incremented each time this cell is freed
        union {
            void* ptr;      // the object, while this cell is allocated
            int next;       // free-list link, while this cell is free
        };
    };

This is just the typical small object pool entry with an extra field: age, which is initialized to zero when the pool is created. If you don’t know what a small object pool is, then the next paragraph is for you!

A small object pool is a way of quickly allocating memory of a predefined size. Most allocators today use this method for sufficiently small objects (say 64 bytes or less). You implement it by allocating a big array of Entrys and initializing them to form a sort of linked list. That is, pool[0].next = 1; pool[1].next = 2; etc. This linked list forms a “free list”, a list of unallocated space. The pool also keeps track of the first free entry in the list (call it head), in this case 0. When it is asked to allocate some space, it allocates head, and then sets head to the next entry in the linked list: pool[head].next. When it is asked to deallocate a location, it sets that location’s .next to head and head in turn to that location. It’s just standard linked list manipulation with a slightly clever twist, which gives you minimal space overhead and constant-time allocation and deallocation.

Okay, so about this age field. A weak pointer, instead of storing a location in memory, stores an index into this pointer pool. A weak pointer also has an age field which refers to the age of the cell it is pointing to when the pointer was created. To access what the pointer is pointing to, you look up that index in the pool. To see whether the object has been destroyed, you check the pointer’s age against the cell’s age. If the two ages are equal, then the pointer is valid, otherwise the object has been destroyed while the pointer was referring to it.

To allocate a weak pointer to an object, first allocate a space in the pool. Set that space’s ptr field to the object, set the pointer’s index to the index of the new space, and set the pointer’s age to the new space’s age. When you delete the object, free it from the pool and increment that cell’s age field. If a new object is allocated there in the future, it will have a greater age so that the pointers to the old object won’t be fooled into thinking the new one is the old one.
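To make the bookkeeping concrete, here is a minimal model of the whole scheme, free list and all (in Haskell rather than C++, with boxed payloads standing in for raw pointers, and no pool-exhaustion check):

    import Data.Array.IO
    import Data.IORef

    data Pool a = Pool
        { ages    :: IOUArray Int Int       -- age stamp per cell
        , slots   :: IOArray  Int (Maybe a) -- payload, or Nothing when free
        , nexts   :: IOUArray Int Int       -- free-list links
        , headRef :: IORef Int              -- index of the first free cell
        }

    data WeakPtr = WeakPtr { wpIndex :: Int, wpAge :: Int }

    newPool :: Int -> IO (Pool a)
    newPool n = do
        ag <- newArray (0, n - 1) 0
        sl <- newArray (0, n - 1) Nothing
        nx <- newListArray (0, n - 1) ([1 .. n - 1] ++ [-1])  -- 0 -> 1 -> ...
        h  <- newIORef 0
        return (Pool ag sl nx h)

    -- Allocate: pop the head of the free list; constant time.
    allocate :: Pool a -> a -> IO WeakPtr
    allocate p x = do
        i <- readIORef (headRef p)
        readArray (nexts p) i >>= writeIORef (headRef p)
        writeArray (slots p) i (Just x)
        a <- readArray (ages p) i
        return (WeakPtr i a)

    -- Free: push the cell back onto the free list and bump its age,
    -- invalidating any outstanding pointers to it; constant time.
    free :: Pool a -> WeakPtr -> IO ()
    free p (WeakPtr i _) = do
        h <- readIORef (headRef p)
        writeArray (nexts p) i h
        writeIORef (headRef p) i
        writeArray (slots p) i Nothing
        a <- readArray (ages p) i
        writeArray (ages p) i (a + 1)

    -- Dereference: valid only while the two ages still agree.
    deref :: Pool a -> WeakPtr -> IO (Maybe a)
    deref p (WeakPtr i a) = do
        a' <- readArray (ages p) i
        if a == a' then readArray (slots p) i else return Nothing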

This scheme is only guaranteed to work as long as you don’t allocate 2^(8·sizeof(int)) times in the same location. While that is a huge number and it is unlikely that any real application would do that, it may be more likely than you think. If the weak pointers follow a cycle of alloc, delete, alloc, delete, then the same space is used over and over again and the rest of the pool is untouched. It is conceivable that you could do that four billion times. A solution is to check on deallocation whether an increment would roll over, and if so, invalidate that pool entry—never use it again. This creates a memory leak, however it would be so gradual (in the worst case, 8 bytes every 4 billion allocations on 32-bit machines) that it would be negligible. I feel silly even addressing the memory leak issue.

If you’re writing this in C++ as I am, you’ll want to hide all this behind a pointer wrapper class. Either that or you’ll want to use the one I wrote. I will post a link as soon as I write it.

Kernel for the logic engine

In Haskell, the list monad provides a way to do non-deterministic, backtracking computations. For example, you can generate all sums of two numbers between 1 and 5 like so:

    sums = do
        x <- [1..5]    -- let x be a number between 1 and 5
        y <- [1..5]    -- let y be a number between 1 and 5
        return (x+y)  -- return their sum

It will proceed like so: let x be 1; let y be 1; output 1+1; let y be 2; output 1+2; let y be 3; output 1+3; …; let x be 2; let y be 1; output 2+1; …. If you don’t see how cool that is, that’s okay, but trust me that it’s damn cool. Especially so when you combine this with conditions (e.g. I could have generated only the prime sums) and arbitrary recursion.

However, when you enter the realm of the infinite, this is not quite ideal. For example, let’s say you’re generating ordered pairs like so:

    pairs = do
        x <- [1..]     -- let x be a number >= 1
        y <- [1..]     -- let y be a number >= 1
        return (x,y)   -- return the ordered pair

The enumeration proceeds as before: let x be 1; let y be 1; return (1,1); let y be 2; return (1,2); let y be 3; return (1,3); …. If you’re observant, then you’ll notice that x will never get to be 2, or any higher number. The entire (reachable) list of ordered pairs has first element 1. If you’re going to be searching through the generated possibilities for the pair (3,4), your algorithm will go into an infinite loop. (Maybe you are starting to see the parallel to my last post). Technically, the order type of the returned list is greater than ω, so some elements will take an infinite amount of time to reach.

So what we would like is a list monad where the returned results always come in a list of order type ω, that is, every element can be reached in an (arbitrarily large) finite amount of time. I’ve written one. The fundamental idea is that of the standard well-ordering of the rationals. Consider this grid:

1,1  1,2  1,3  1,4  1,5  ...
2,1  2,2  2,3  2,4  2,5  ...
3,1  3,2  3,3  3,4  3,5  ...
...

You can put this into a list of order type ω by traversing the positive diagonals: 1,1; 2,1; 1,2; 3,1; 2,2; 1,3; 4,1; 3,2; 2,3; 1,4; …

So basically, whenever the algorithm joins a list of lists, it does it not by concatenating them (as the standard list monad does), but by “diagonalizing” them together. This ensures that if you only feed it lists of order type ω, it will output a list of order type ω.
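Here’s a minimal sketch of one way to write that join (not necessarily exactly what I wrote):

    -- Interleave a (possibly infinite) list of (possibly infinite)
    -- lists so that every element lands at a finite position.
    diagonal :: [[a]] -> [a]
    diagonal = concat . stripes
      where
        -- stripes yields the anti-diagonals:
        -- [[x00], [x01, x10], [x02, x11, x20], ...]
        stripes []             = []
        stripes ([] : xss)     = stripes xss
        stripes ((x:xs) : xss) = [x] : zipCons xs (stripes xss)

        -- Cons each element of the first list onto successive stripes.
        zipCons []     yss        = yss
        zipCons xs     []         = map (: []) xs
        zipCons (x:xs) (ys : yss) = (x : ys) : zipCons xs yss

    newtype Omega a = Omega { runOmega :: [a] }

    instance Functor Omega where
        fmap f (Omega xs) = Omega (map f xs)

    instance Applicative Omega where
        pure x = Omega [x]
        Omega fs <*> Omega xs = Omega (diagonal [map f xs | f <- fs])

    instance Monad Omega where
        Omega xs >>= f = Omega (diagonal (map (runOmega . f) xs))

The problematic example then becomes:

    pairs = runOmega $ do
        x <- Omega [1..]
        y <- Omega [1..]
        return (x, y)
    -- take 6 pairs == [(1,1),(1,2),(2,1),(1,3),(2,2),(3,1)]

and every pair, (3,4) included, shows up after finitely many steps.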

Incidentally, this provides a kernel for the logic engine I was talking about. This will enumerate all logical sentences in order type omega, so that if you look through the list for sentences that are true, you will eventually get to a true statement if there is one.

Logically Backwards and Forwards

I tried to ask Ovid (the author of AI::Prolog) about forward-chaining logic, but I realized that I didn’t really know what I was asking. Let this post serve as a brainstorming session to figure that out.

A smart game AI would like to be able to make inferences based on what any particular NPC knows. For instance, if I own a house, and it contains a couch, it would be reasonable to assume that I own that couch. That’s easy using backward chaining logic in Prolog:

    owns(X,Y) :- owns(X,Z), contains(Z,Y).

But there’s also the idea that knowledge changes. For instance, I’d like to specify something like (pseudo-syntax):

    owns(me,X) --> !owns(me,X)  { GET MAD! }

That is, if a piece of knowledge enters the database that causes owns(me,X) to go from true to false for a particular X, obviously it has been stolen. I’ll say that the new knowledge invokes a predicate transition. So one way to do this would be, for every transition hook, to poll every so often and check whether the predicate has changed. That sucks though. The way it should work is the opposite of general backward-chaining logic. It’s much more like dataflow.

Hmm… that defines the problem pretty well. I’m just wondering if that’s the way I want to do it. Is it transitions that I am really interested in? I think it is. If it weren’t, then I suppose I would just query the expression and bind the value out.

So in order to define such a transition, I need to tell owns/2 that when it changes, I need to update and possibly act. Huh, that’s interesting. It seems to be a knowledge base problem more than a combinator problem. The only thing I suppose I’m wondering is if there’s a better way to update knowledge than to re-run the rule.

Well, that defined the problem. How’s that for unstructured prose?

More Multimethod Madness

Damian and I have been intelligently screaming at each other on perl6-language, finally really fleshing out the Manhattan distance MMD debate. It has settled down recently, after I discovered that Damian had been arguing against something that was almost, but not quite, entirely unlike my algorithm. I looked through the message history and realized that I had been arguing this algorithm this whole time, and I never said what it was! The most visible place I put it was in the pugs repository at the end of a document.

Anyway, I implemented my algorithm in a Perl 5 module, Class::Multimethods::Pure (the algorithm is also described there). The only current problem is that it is turtle slow (or whatever the expression is). I’ve never been very good at, or interested in, writing locally fast code: I prefer abstraction to raw speed. However, if I’m seriously proposing this algorithm for Perl 6, I’d better have a good algorithm for dispatching in sublinear time. And algorithm design happens to be one of the things I am interested in.

A multimethod is composed of a set of multimethod variants, which are essentially lists of parameters together with the corresponding code. Currently, when the module compiles, it sorts the set of variants into an ordered list of sets. That is, I go along the list of such sets, and for each set:

  • If it has no elements which match the input parameters, I move on to the next set (or die “no method found” if there are no more sets).
  • If it has exactly one matching element, I succeed and call that variant.
  • If it has more than one element, I die with an ambiguity error.

So that means that whenever a variant a is more specific (see the docs above for a precise description of what that means) than a variant b, b is necessarily in a later set than a. So the most specific variants are way up front, and the more generic the call you’re dispatching, the slower the algorithm gets. That is probably unavoidable. Here’s the problem: by the time you get to the last set, you’ve asked each “does” question far too many times. Keep in mind that such questions can involve a subtype condition, which can (but shouldn’t) involve heavy computation.
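Concretely, the dispatch loop is something like this (a sketch; matches stands in for the per-variant parameter check, i.e. all those “does” questions):

    -- Walk the ordered sets: the first set with any matches must contain
    -- a unique most specific variant, or else the call is ambiguous.
    dispatch :: (v -> args -> Bool) -> [[v]] -> args -> Either String v
    dispatch matches sets args = go sets
      where
        go []       = Left "no method found"
        go (s : ss) = case [v | v <- s, matches v args] of
            []  -> go ss       -- nothing here; fall through to more generic sets
            [v] -> Right v     -- unique match: call it
            _   -> Left "ambiguous method call"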

The approach that I’m working out now is to build a DFA where the states are questions about “does” relationships and the transitions are “true” or “false”. I want to construct a DFA that asks as few questions as possible in order to determine whether there is a unique matching variant. The way this becomes a win: if you ask “Does the first argument do C?” and the answer is yes, and C is a subtype of A, then you already know the answer to “Does the first argument do A?”. Likewise, if you ask whether the first argument does A and the answer is no, then you already know that it doesn’t do C.

That’s about as much as I know for sure. I’ve tried a few times to find a method of constructing these DFAs, to no avail. I’ve tried having each state be a set of candidate variants, with each question narrowing the set until you have a dominated set (one where a single variant is more specific than all the others). But consider this example:

    B does A
    variant 1: foo(A, A)
    variant 2: foo(A, B)
    variant 3: foo(B, A)
    variant 4: foo(B, B)

If you ask “does argument 1 do B” then you get the set 234. From the set 234, if you ask “does argument 2 do B”, you get the set 234. Obviously variant 4 is correct, but a “set of viable variants” approach doesn’t keep enough information to tell you that.

If anybody has any ideas for this, I’d love to hear them.

Adventures in Beat Detection

I have been playing around with beat detection, because two games that I have been working on would really benefit from strong beat detection. I think I’m on to something with this one (maybe), so I’ll post my methods and results here. I don’t have any schoolin’ in Fourier analysis, so things I’m doing may seem naive to the experienced Fouriereer. If you’re one of these, please please please point me in the right direction.

I came up with this algorithm on the bus to school this morning. When I got home I couldn’t wait to try it on some real data with Mathematica. I describe my findings here:

I started with this 5-second sound sample (sorry about the WAViness… that’s the format I loaded it in as). It’s an excerpt from Rachmaninoff’s Symphonic Dances, which I chose because it has a wide frequency spectrum but is still very rhythmic. Mathematica plots the samples like so:

    In[1]:=   wav = Import["symphonic_dances_fourier_field.wav"];
    In[2]:=   samples = wav[[1,1]];
    In[3]:=   ListPlot[samples]

Listen to the sample and notice how you can see the beats in the wave. These are raw intensities, and they are what a naive detector would use to pick the beats out. This example isn’t good enough to tell whether my beat detector is any better than a naive one.

Next I compute the “fourier field” for a particular window size, in this case 8192 samples (about 0.18 seconds). This takes successive windows at a given step (in this case 256 samples), computes their Fourier transforms, and sticks them in a two-dimensional array indexed by [time; frequency]. Here’s the code:

    In[4]:=   fourierfieldtime[data_, window_, time_] :=
                  Fourier[data[[Range[time-window, time+window-1] ]] ]
    In[5]:=   fourierfield[data_, window_, step_] :=
                  Table[fourierfieldtime[data, window, i],
                      {i, window+1, Length[data]-window, step}]
    In[6]:=   field = fourierfield[samples, 4096, 256];
    In[7]:=   ListDensityPlot[Abs[field], Mesh -> False]

The next thing I do is take each frequency band (which gives that frequency’s “presence” over the time of the song), do another Fourier transform on it, and put the result in a table. This should tell me how each frequency is pulsating in time, on the theory that most frequencies repeat themselves with the beat. Note that I take the absolute value of the previous transform, disregarding the phase. Otherwise I would just make a round trip and get a strangely-averaged and phase-rotated version of what I started with.

    In[8]:=   timefield[data_] :=
                  Table[Fourier[Abs[data[[All, i]] ] ], {i, Length[data[[1]] ]}]
    In[9]:=   times = timefield[field];
    In[10]:=  ListDensityPlot[Abs[times], Mesh -> False]

Finally I add all of these together across bands to get a rough measure of how strongly each pulse frequency is represented. We’re only interested in the very low frequencies (of each frequency)—things that happen between 1 and 40 times during the sample—so I only plot those.

    In[11]:=  compose[data_] := Plus @@ data
    In[12]:=  beats = compose[times];
    In[13]:=  ListPlot[Abs[beats[[Range[1, 40] ]] ],
                  PlotJoined -> True, PlotRange -> {0, 320}]

This is our final analysis, so let’s look at it carefully. What is it saying? We want to look at the dominant frequencies, the ones with the strongest correlation. Frequency one is of no interest; it is always high in phaseless Fourier transforms. The largest is six, followed by four. What happens six times during this sample? The half notes, which are the strongest orchestra hits. I can’t figure out four yet. The next one is twelve, which corresponds to the quarter notes: that’s the one we’re really interested in. I can’t quite tell how we’re going to communicate that to a program reading this data. We might just report the five highest levels or something, and let the program determine what kind of tempo range it is looking for.

The final step in the process is to do an inverse fourier transform on this data, to get back something that resembles our beats over time (a very dumbed-down version of the song, as it were). This is pretty cool:

    In[14]:= orig = InverseFourier[beats];
    In[15]:= ListPlot[orig, PlotJoined -> True]

You can see each of the strong beats that you associate with the song. I can’t tell whether the beats occur at the tips of those or at the heavy turnarounds right before the tips. I’m inclined to say the latter. But this information that we’ve reconstructed could easily have been created by a dumb volume analysis. Next week (when I finally have some time again, damn homework—I’m procrastinating a huge project as I speak), I’ll try it on a harder song, a baroque piece or some jazz. My ultimate goal in this project is to get it to beat-detect jazz, which has a very implicit pulse rather than a strong beat.