Every now and then, I have a small idea about a development environment feature I’d like to see. At that point, I usually say to myself, “make a prototype to see if I like it and/or show the world by example”, and start putting pressure on myself to make it. But of course, there are so many ideas and so little time, and at the end of the day, instead of producing a prototype, I manage just to produce some guilt.
This time, I’m just going to share my idea without any self-obligation to make it.
I’m working on Chrome’s build system at Google. We are switching the build scripts to a new system which uses an ingenious testing system that I’ve never seen before (though it seems like the kind of thing that would be well-known). For each build script, we have a few test inputs to run it on. The tests run all of our scripts on all of their test inputs, but rather than running the commands, they simply record the commands that would have been run into “test expectation” files, which we then check into source control.
Checking in these auto-generated files is the ingenious part. Now, when we want to change or refactor anything about the system, we simply make the change, regenerate the expectations, and do a git diff. This will tell us what the effects of our change are. If it’s supposed to be a refactor, then there should be no expectation diffs. If it’s supposed to change something, we can look through the diffs and make sure that it changed exactly what it was supposed to. These expectation files are a form of specification, except they live at the opposite end of the development chain.
This fits in nicely with a Haskell development flow that I often use. The way it usually goes: I write a function, get it to compile cleanly, then I go to ghci and try it on a few conspicuous inputs. Does it work with an empty list? What about an infinite list (and I trim the output if the output is also infinite to sanity check). I give it enough examples that I have a pretty high confidence that it’s correct. Then I move on, assuming it’s correct, and do the same with the next function.
I really enjoy this way of working. It’s “hands on”.
What if my environment recorded my playground session, so that whenever I changed a function, I could see its impact? It would mark the specific cases that changed, so that I could make sure I’m not regressing. It’s almost the same as unit tests, but with a friendlier workflow and less overhead (reading rather than writing). Maybe a little less intentional and therefore a little more error-prone, but it would be less error-prone than the regression testing strategy I currently use for my small projects (read: none).
It’s bothersome to me that this is hard to make. It seems like such a simple idea. Maybe it is easy and I’m just missing something.