I built a raycaster in Haskell using codeworld-api over the weekend, and put the code up on sr.ht. Raycasting is not a particularly impressive technique these days: John Carmack did it 17 years ago in a cave! With a box of scraps! With texture mapping, while wrangling the IBM VGA, and while getting acceptable performance out of an 80386. (Aside: the Wolfenstein 3D Black Book is a fantastic read.)
Despite being obsolete, raycasting is a fun first step. I mostly followed F. Permadi’s tutorial from 1996, which does a pretty good job of deriving the maths rather than just hitting you with walls of C. I was struggling to get the actual ray casting to work right, and eventually found the grid traversal algorithm of Amanatides and Woo. This post is about how I eventually got it working, and a reflection on the thought processes that got me there.
Computer graphics seems to be one of those areas where you can get things almost right and still end up with total garbage, which is exactly what I was getting after I’d set up all the data structures and dove into trying to render a first-person image. After a while, I stopped shotgun debugging and did what I should have done from the start: systematically verified my assumptions.
I stepped back and wrote a top-down visualiser to show the map and the player’s location and facing. It showed me the map upside-down and back-to-front, and the player marker somewhere totally unexpected. I had made several bogus assumptions, but the most critical was that CodeWorld’s screen coordinates go up with increasing y, but just about every graphics tutorial uses coordinate systems which go down with increasing y. I had also decided that increasing θ meant left turns in some parts of the code, and right turns in others.
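The fix was mostly a matter of writing the conventions down exactly once. As a rough illustration (these helpers are invented for this post, not taken from the repo, and the map size is made up), something like the following pins down the grid-to-world conversion and the meaning of θ in one place:

```haskell
-- Invented helpers for illustration (not the repo's actual code).
-- Keeping every coordinate conversion in one place makes the
-- conventions explicit: CodeWorld's y axis points up, while the map
-- is indexed with row 0 at the top.

type GridPos  = (Int, Int)       -- (row, col), row 0 at the top of the map
type WorldPos = (Double, Double) -- x to the right, y up, one unit per cell

mapRows :: Int
mapRows = 16  -- assumed map height, just for the example

-- Centre of a map cell, in world coordinates (note the y flip).
gridToWorld :: GridPos -> WorldPos
gridToWorld (row, col) =
  ( fromIntegral col + 0.5
  , fromIntegral (mapRows - 1 - row) + 0.5 )

-- One convention for angles, used everywhere: θ = 0 faces along +x,
-- and increasing θ turns left (anticlockwise).
facingVector :: Double -> (Double, Double)
facingVector theta = (cos theta, sin theta)
```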
After ironing out all the coordinate systems (world, map grid, camera) so they pointed in the same directions, I still had a lot of trouble getting sensible raycasting working. I didn’t know whether my renderer was dodgy or whether the rays I was casting were funky, and both were reasonably fiddly pieces of code to get right. How could I bisect the problem?
Ancient Unix Wisdom: Make it work, make it right, make it fast.
I commented out my failed attempts at fancy raycasting algorithms, and did the simplest thing I could think of: cast rays by stepping them forward 0.01 units until they hit a wall. The result looks a bit dodgy, but its errors are predictable and easily understood:
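The stepping loop really is as dumb as it sounds. Here is a sketch of the idea (with a hypothetical wallAt lookup standing in for the real map, and arbitrary constants, rather than the repo’s actual code):

```haskell
-- A sketch of the brute-force caster (hypothetical wallAt lookup;
-- the constants are arbitrary). March the ray forward in small steps
-- until it lands inside a wall cell, and report how far we travelled.
castNaive
  :: ((Int, Int) -> Bool)  -- is this grid cell a wall?
  -> (Double, Double)      -- ray origin
  -> (Double, Double)      -- unit direction vector
  -> Double                -- distance to the first wall hit
castNaive wallAt (px, py) (vx, vy) = go 0
  where
    step    = 0.01
    maxDist = 100            -- give up eventually so rays can miss
    go t
      | wallAt (floor (px + t * vx), floor (py + t * vy)) = t
      | t > maxDist = maxDist
      | otherwise   = go (t + step)
```

The reported hit can overshoot the true wall boundary by up to one step, which is why the errors are so predictable.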
Having reasonably sound raycasting gave me the confidence to vary the renderer until it was right, and I took a little detour to add basic shading based on z-distance. Even with dodgy walls, it adds a lot to the scene:
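The shading itself is nothing fancy. As an illustrative sketch (again, not the repo’s actual code), it just scales the colour channels by a factor that falls off with distance:

```haskell
-- Illustrative shading, not the repo's actual code: scale each colour
-- channel by a factor that falls off with distance, so far walls fade
-- towards (but never quite reach) black.
shade :: Double                   -- distance from the camera
      -> (Double, Double, Double) -- base colour (r, g, b), each in [0, 1]
      -> (Double, Double, Double)
shade dist (r, g, b) = (k * r, k * g, k * b)
  where
    falloff = 0.15                        -- made-up tuning constant
    k       = max 0.2 (1 / (1 + falloff * dist))
```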
Now that I had a working renderer, I could go back and fix up the ray casting. It turns out there’s a real difference between reading an algorithm in a paper, versus implementing it, versus grokking it. I “knew” that already, but it was instructive to live through a fresh lesson about it.
Trying to describe the extra knowledge in grokking isn’t easy. It’s something like the difference between knowing what the implementation says versus what the implementation means. Amanatides’ algorithm doesn’t use abs, but it makes a good example: abs (x2 - x1) says “compute the absolute value of the difference between x2 and x1”. Grokking that means something like looking at it and thinking “distance”, without any conscious intermediates.
A ray from an origin point p⃗ in a direction v⃗ has the equation r⃗ = p⃗ + tv⃗, where t ∈ [0, ∞). Amanatides’ algorithm implicitly steps through the values of t that intersect grid lines, making it faster and more precise than my naive version. Implementing the algorithm meant I could find the grid cell that the ray hits; grokking the algorithm meant I could reconstruct the value of t that hit the wall. Once I had that, I had proper distances, and the rendered scene looked a lot better (you can see on the map that rays no longer penetrate the walls):
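Here’s a sketch of that traversal, written out from the paper rather than copied from the repo (wallAt and the maxT bail-out are invented for the example). The tMax accumulators are exactly the values of t at which the ray crosses grid lines, so the t of the wall hit falls out of the loop for free:

```haskell
-- A sketch of the Amanatides–Woo traversal, written out from the paper
-- (wallAt and maxT are invented for the example). Besides the wall
-- cell, it returns the t at which the ray entered that cell; with a
-- unit-length direction, that t is the distance you want for shading.
castDDA
  :: ((Int, Int) -> Bool)       -- is this grid cell a wall?
  -> (Double, Double)           -- ray origin p
  -> (Double, Double)           -- ray direction v
  -> Maybe ((Int, Int), Double) -- wall cell hit, and the t of the hit
castDDA wallAt (px, py) (vx, vy) = go x0 y0 tMaxX0 tMaxY0
  where
    x0 = floor px :: Int
    y0 = floor py :: Int

    stepX = if vx > 0 then 1 else -1 :: Int
    stepY = if vy > 0 then 1 else -1 :: Int

    -- t needed to cross one whole cell along each axis
    tDeltaX = if vx == 0 then 1 / 0 else abs (1 / vx)
    tDeltaY = if vy == 0 then 1 / 0 else abs (1 / vy)

    -- t at which the ray first crosses a vertical / horizontal grid line
    tMaxX0
      | vx > 0    = (fromIntegral x0 + 1 - px) / vx
      | vx < 0    = (fromIntegral x0 - px) / vx
      | otherwise = 1 / 0
    tMaxY0
      | vy > 0    = (fromIntegral y0 + 1 - py) / vy
      | vy < 0    = (fromIntegral y0 - py) / vy
      | otherwise = 1 / 0

    maxT = 100  -- give up eventually, so rays can miss everything

    go x y tMaxX tMaxY
      | min tMaxX tMaxY > maxT = Nothing
      | tMaxX < tMaxY =
          let x' = x + stepX
          in if wallAt (x', y)
               then Just ((x', y), tMaxX)
               else go x' y (tMaxX + tDeltaX) tMaxY
      | otherwise =
          let y' = y + stepY
          in if wallAt (x, y')
               then Just ((x, y'), tMaxY)
               else go x y' tMaxX (tMaxY + tDeltaY)
```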
What did I learn from all this, beyond how to cast rays like it’s 1992?
You might not have enough time to validate all your assumptions, but it seems useful to at least have some idea of which ones are the riskiest (where risk = shakiness * badness-if-wrong).
Having visibility over the data you’re manipulating is very useful: it reduces the amount of code your data has to rattle through before you can tell whether or not things worked. Top-down map visualisation roughly halved the amount of code I had to verify at each debugging step.
Ancient Unix Wisdom: When in doubt, use brute force. Using a dumb-but-correct algorithm let me verify the moderately difficult part of the code (the renderer), and then use a correct renderer to verify the most difficult part of the code (the ray caster).
Diagrams, diagrams, diagrams! I always tell my students to draw diagrams, and this exercise reinforced my belief that they are a Good Idea. Humans are visual creatures: while we have to write one-dimensional code to instruct the computer, we can use all that visual ability to understand problems before coding them up.
My recall of high-school trigonometry is decent, but I’m not fluent in actually using it (was I ever?). I had to draw lots of little triangles to derive the answers to simple questions like “you know a and θ, find h”. If I do further graphics work, I’d expect to move quite slowly through the mathematics of it all. Building that rote fluency is critical to working at any practical pace, and takes time.
All told, it was a lot of fun, and I’m looking forward to doing more.