IO-free splittable supply

I know it has been a while since I posted, and I’m sorry to say this isn’t a terribly deep or interesting one.

I like the value-supply package on hackage. It takes (essentially) an infinite list of things and arranges them into an infinite tree randomly… but a particularly useful kind of random. The first one you look at just happens to be the first element of the list, the second one you look at comes from the second element, etc.

This sort of magic isn’t useful for its magic, it’s just useful because you can use something like Int and not run out of bits too soon. All I ever use it for is unique labels in computations that would be made too strict by the state monad. As a contrived example of such a computation:

    allocateLabel :: State Label Label
    
    labelList :: State Label [Label]
    labelList = liftM2 (:) allocateLabel labelList

    bad :: State Label [(Label,Label)]
    bad = liftM2 zip labelList labelList

The problem is that we don't know what label to start with on the second labelList in bad... we never finish using them up in the first one. But if all you care about is uniqueness of labels, rather than their order, this computation is easily definable as a function of a Supply from value-supply:

    labelList :: Supply Label -> [Label]
    labelList sup = let (a,b) = split2 sup in supplyValue a : labelList b

    good :: Supply Label -> [(Label,Label)]
    good sup = let (a,b) = split2 sup in zip (labelList a) (labelList b)

Great. Only problem is, we can only make Supplys from the IO monad. As it should be, since the supply uses a nasty form of nondeterminism to assign elements to its trees.

But, if all you care about is uniqueness, which is the case for me more often than not, you can squash IO out of the interface (not the implementation, unfortunately). Like this:

    runSupply :: (forall a. Eq a => Supply a -> b) -> b
    runSupply f = f (unsafePerformIO (newSupply 0 succ))

In my opinion, this is the best possible interface for such a thing. Value-supply is only to be used when all you care about is uniqueness, not ordering or the precise values. Well, this interface encodes that exactly. When you use this function, all you are allowed to care about is uniqueness, any other concern is hidden behind the type system. It is also impossible to confuse indices from different supplies.

And this is a perfectly well-behaved function. You could demonstrate that by giving a pure implementation of Supply [Bool] (which just traces the path to the given leaf from the root). The unsafe business is merely an optimization (and for some use cases, like labelList above, a very potent one).

Hooray. I get my favorite evil module, with all of its evil safely contained in a jar that they call a box for some reason.

About these ads

5 thoughts on “IO-free splittable supply

  1. It’s great to see a proper type for runSupply.
    Another thing I often want is to have a fast mapping from the unique values to something. To have a fast map you need more than Eq. Perhaps a mapping can be built into the pure interface?

  2. Random numbers may be generated with the similar interface. Also I think that “Node” data constructor may be exposed safely thus simplifying the interface.

  3. re: Random Numbers. Yes, they can, but how on earth do you specify the purity invariant like we did for unique supplies? The only computations you could pass are those that are invariant under choice of random numbers. Some numerical methods do that, but convincing a computer of the fact is another story… I suspect a type signature that would make pure splittable supplies for random numbers correct would require a quite sophisticated proof technology. Probably more advanced than we have seen with Coq or Agda.

  4. re: Random Numbers.

    > Yes, they can, but how on earth do you specify the purity invariant like we did for unique supplies?

    The same question is valid when generating random numbers in a functional language by threading state. A method of generating is rarely specified and varies. Also a seed is obtained from IO and also varies. So a result of a program which uses random numbers varies.

    I don’t know how. Maybe, such an invariant should be probabilistic?

  5. Great tecnique!

    However, I’m not sure whether it’s as useful as I would like it to be. What about the use case of creating an AST of some intermediate language / bytecode / etc. that makes use of unique identifiers for variable names?

    Since the type of runSupply mandates that b may not depend on the label type a, we can't return said AST inside the result b. Or put differently, every operation on the AST (printing on screen, compiling to machine code, etc.) must happen inside the argument of runSupply.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s