> Most anti-aliasing techniques do boundary smoothing by treating each pixel as a little square and visualizing the fraction of the pixel area that is contained inside the outline.
That essay is written from a particular perspective and pretends that it is the clearly right perspective. Whether it's valid to consider pixels to be little squares depends on whether we're talking about display pixels or image sensor pixels and whether we're trying to resample or interpolate photographic data or trying to create data (pixel art, fonts) for a specific display medium. Sometimes there simply isn't an underlying continuous field to be point-sampled from. You'll never devise a good way to render text on an LCD with that article's mindset (not too surprising, since it's from 1995).
The paper is old, but it's actually a fairly trivial application of signal processing techniques that have been well-understood since the 40s. For sub-pixel anti-aliasing, the approach would be to:
1. Construct the underlying continuous field. This is just a function f(x,y) that returns 1 if the point is within the text and 0 otherwise.
2. Convolve f with an anti-aliasing filter. The filter could be tall and skinny to account for the fact that the horizontal resolution is 3x the vertical resolution.
3. Sample the resulting image at sub-pixel positions to produce the red, green, and blue values.
In the special case where the anti-aliasing filter is a box filter, this is exactly the same as computing the average for each subpixel. For the technique proposed in the article, the filter kernel would be the sum of six shifted impulses (Dirac deltas).
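A minimal sketch of that box-filter special case (the function names and the 4x4 sample grid are my own, assuming an RGB-stripe display and a coverage function f as in step 1):

```python
import numpy as np

def subpixel_aa_box(f, width, height, samples=4):
    """Box-filter special case of steps 1-3: each subpixel's value is the
    average of f over that subpixel's own region, 1/3 pixel wide and
    1 pixel tall (the tall, skinny kernel), estimated on a samples^2 grid."""
    out = np.zeros((height, width, 3))
    for y in range(height):
        for x in range(width):
            for c in range(3):  # R, G, B stripes, left to right
                acc = 0.0
                for sy in range(samples):
                    for sx in range(samples):
                        # sample position inside this subpixel's rectangle
                        px = x + (c + (sx + 0.5) / samples) / 3.0
                        py = y + (sy + 0.5) / samples
                        acc += f(px, py)
                out[y, x, c] = acc / samples**2
    return out

# e.g. a filled disc standing in for a glyph outline:
# coverage = lambda x, y: 1.0 if (x - 8) ** 2 + (y - 8) ** 2 < 36.0 else 0.0
# img = subpixel_aa_box(coverage, 16, 16)
```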
Anyways, I liked the article and wasn't trying to be critical of it. The convolution approach described above is of theoretical interest, but implementing it with any non-trivial kernel in real-time is almost certainly intractable. What I meant was that every implementation of anti-aliased vector graphics is a kludge, and it's pretty easy to coerce aliasing artifacts out of all of them using zone plates as inputs.
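For the record, a zone plate is trivial to generate; the cos(pi*r^2/n) form below puts the Nyquist frequency exactly at the image edge, so any shortcut in a renderer's filtering shows up as spurious moiré rings (a sketch; the normalization is my own):

```python
import numpy as np

def zone_plate(n=512):
    """Test chart whose radial frequency grows linearly with r and reaches
    the Nyquist limit at the edge; aliasing appears as concentric rings
    folding back instead of ever-finer circles."""
    y, x = np.mgrid[-n // 2 : n // 2, -n // 2 : n // 2]
    return 0.5 + 0.5 * np.cos(np.pi * (x * x + y * y) / n)
```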
I certainly didn't mean to imply that the signal processing perspective was untenable in the modern world of actually-rectangular pixels, but what you describe is really a post-hoc shoehorning of square-pixel thinking into the signal processing framework. And you still haven't accounted for pixel-oriented font hinting, or for pixel-first design of bitmap fonts and graphics that gives leeway to the underlying shapes in order to maximize legibility when rendered onto a pixel grid. The signal processing perspective can offer some valuable insight, but it's a pretty bad choice as an overriding mode of thought for computer graphics.
Sure, for bitmap fonts or pixel hinting the signal processing framework doesn't provide much insight. However, the word "aliasing" itself refers to a concept from signal processing, and in my opinion, it's easiest to think of anti-aliasing from the signal processing perspective.
For example, look at the images in [1] (also a rather old paper). The box filter results (i.e. where each pixel value is set to the fraction of its area that is covered) are less than ideal.
For what it's worth, you can find a nice detailed description of Microsoft's approach to sub-pixel anti-aliasing in "Optimal Filtering for Patterned Displays" [1]. There is also a follow-on, "Displaced Filtering for Patterned Displays" [2].
Interestingly, both papers feature an aliased zone plate. :-)
Sensor pixels and display pixels also aren’t little squares, and treating them as such (whether for font rendering, photo capture, rendering line drawings, or any other purpose) is pretty much always worse than treating pixels as a discrete approximation of a continuous image. Unfortunately 2D approximation is inherently more complex than 1D approximation, so you inevitably get some artifacts even when you do fancy computationally expensive math, and the choice is about which type of artifacts to privilege.
If you want to get really fancy, you could base all your calculations on the precise region (with a kinda fuzzy boundary) where light is collected by a sensor pixel or emitted by a display pixel, but the advantage over pretending the pixel is a jinc function or whatever [cf. https://en.wikipedia.org/wiki/Sombrero_function] is going to be marginal.
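(For reference, the sombrero function is cheap to evaluate; a sketch using SciPy's Bessel function, with the somb name and normalization taken from that Wikipedia page:)

```python
import numpy as np
from scipy.special import j1  # Bessel function of the first kind, order 1

def somb(r):
    """somb(r) = 2*J1(pi*r)/(pi*r), the radially symmetric analogue of
    sinc; somb(0) = 1 from the small-argument limit J1(x) ~ x/2."""
    r = np.asarray(r, dtype=float)
    x = np.pi * np.maximum(np.abs(r), 1e-12)  # dodge 0/0 at the origin
    return np.where(r == 0.0, 1.0, 2.0 * j1(x) / x)
```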
> Sensor pixels and display pixels also aren’t little squares [...]
They're pretty damn close, modulo the Bayer pattern on most sensors and the RGB stripe arrangement on most displays. Calling an LCD's subpixels rectangles is certainly an approximation that's valid on the scale of the distance from one pixel to the next.
> [...] and treating them as such (whether for font rendering, photo capture, rendering line drawings, or any other purpose) is pretty much always worse than treating pixels as a discrete approximation of a continuous image.
Whether treating those pixels as rectangles or points is worse depends as much on the software/analytic approach you're using as on the physical reality of their rectangular geometry.
> Unfortunately 2D approximation is inherently more complex than 1D approximation, so you inevitably get some artifacts even when you do fancy computationally expensive math, and the choice is about which type of artifacts to privilege.
True, if you're unjustifiably constraining yourself to treating every computer graphics problem as a generic signal processing problem. Bresenham's algorithm is radically simpler than anything involving Bessel functions and also happens to work very well in the real world, both in terms of speed and visual quality. Adding antialiasing to it leaves you with something that's still extremely simple and is easy to explain in terms of pixels. An exhortation to never treat pixels as little squares is just plain wrong.
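For reference, the whole algorithm fits in a dozen lines (a sketch of the standard all-octant integer form):

```python
def bresenham(x0, y0, x1, y1):
    """Integer-only rasterization: an accumulated error term decides, at
    each step along the major axis, whether to also step the minor axis."""
    dx, sx = abs(x1 - x0), 1 if x0 < x1 else -1
    dy, sy = -abs(y1 - y0), 1 if y0 < y1 else -1
    err = dx + dy
    pts = []
    while True:
        pts.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:  # error says: step in x
            err += dy
            x0 += sx
        if e2 <= dx:  # error says: step in y
            err += dx
            y0 += sy
    return pts

# bresenham(0, 0, 7, 3)
# -> [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3), (7, 3)]
```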
Bresenham’s line algorithm works pretty well for how simple it is (especially assuming you are rendering on a CPU, circa 1970 – with GPUs available on every device it’s an anachronism which only persists through historical inertia), but rendering lines using supersampling on some not-so-rectangular grid and then using a high-quality antialiasing filter to integrate the samples looks strictly better every time, especially if you have a large number of thin lines. If you’re just rendering a couple simple shapes it probably doesn’t matter too much. If you’re trying to render a map or something then using better techniques makes a big difference.
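A rough sketch of that pipeline's second half (the names and the tent kernel are mine; swap in Gaussian or Lanczos taps for a higher-quality filter):

```python
import numpy as np

def downsample_aa(hi, factor=4):
    """Integrate a supersampled (factor x resolution) grayscale image down
    to display resolution: separable convolution with a triangle filter
    spanning roughly two output pixels, then decimation."""
    ramp = np.arange(1, factor + 1, dtype=float)
    taps = np.concatenate([ramp, ramp[-2::-1]])  # 1..factor..1 triangle
    taps /= taps.sum()
    rows = np.apply_along_axis(np.convolve, 1, hi, taps, mode="same")
    blur = np.apply_along_axis(np.convolve, 0, rows, taps, mode="same")
    return blur[factor // 2 :: factor, factor // 2 :: factor]
```

Render the lines at factor-times resolution first (jittered or rotated-grid sample positions cover the "not-so-rectangular grid" part), then feed the result through this.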
[Unfortunately, even in the best case antialiased slightly diagonal straight lines look pretty shitty on a pixel display, regardless of what technique you use, up until you get to a pretty high resolution. Just an inherent issue with pixel grids.]
Note that this is a kludge [1].
[1] A Pixel Is Not A Little Square: http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf