MLAA (MorphoLogical AntiAliasing) on the GPU using Direct3D9.0, October 2010, by Nicolas Vizerie
=================================================================================================

1. About :
==========


MLAA (MorphoLogical AntiAliasing) is a recent technique developed by Intel that applies antialiasing to an image
by using a shape recognition strategy, after edge detection has been performed on the image. It can compete
with MSAA quality-wise.
The original paper is here : http://visual-computing.intel-research.net/papers/2009/mlaa/mlaa.pdf
The original technique is not well suited to a GPU using pixel shaders alone, so some adaptation was needed.
The reason, in short, is that the algorithm scans edges and patches pixels based on the edge length and on the configuration at the edge extremities.
Edge extremities can be far from the current pixel, so using a pixel shader (pure parallel model) requires each pixel to recompute the distance from itself
to the edge extremities. For an edge of length N, the per-pixel cost becomes O(N), which can lead to performance problems.
The obvious solution is to compute a bilateral distance texture, that is, a texture storing for each pixel its distance to the edge extremities.
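
To make the cost concrete, here is a minimal CPU sketch in Python (not the shader code from the .zip) of the naive per-pixel scan on a single row. The function name distance_to_left_end and the list-of-booleans edge representation are assumptions made for illustration only.

```python
def distance_to_left_end(edge_row, x):
    """Walk left from pixel x while pixels are flagged as edge.

    Returns (distance, fetches): the number of edge pixels crossed and
    the number of reads performed, which stands in for texture fetches.
    """
    distance = 0
    fetches = 0
    pos = x - 1
    while pos >= 0:
        fetches += 1
        if not edge_row[pos]:
            break
        distance += 1
        pos -= 1
    return distance, fetches


# An edge spanning a whole 64-pixel row: pixel x needs x fetches.
edge_row = [True] * 64
total_fetches = sum(distance_to_left_end(edge_row, x)[1]
                    for x in range(len(edge_row)))
```

Every pixel repeats work that its neighbour has already done, so the total grows quadratically with the edge length. A precomputed distance texture is meant to avoid exactly that.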
The algorithm unfolds as follows :
- Detect the edges of the image based on color difference (Z and normal deltas could also be used if 3D data is available, which can make
a huge difference in quality). Store in R a boolean indicating horizontal edges, and in G another boolean indicating vertical edges.
An RGB565 texture is well suited for this. During this operation, set stencil to 1 where an edge was found (using a pixel discard).
- Scan edges in the cardinal directions until the edge ends or an orthogonal edge is found. Do this for up to 4 pixels (only for pixels with
stencil = 1, for speedup). Store the distances in an RGBA8 texture (one component per cardinal direction).
- For each direction, propagate the distance. Let D(x) be the current distance for pixel x (which can be 4 at most for now).
Update the distance 4 times, by doing D'(x) <- D(x) + D(x + D(x)).
The max distance is now 16. If the initial distance was less than 4, propagation was already complete and nothing has to be done.
- Repeat the previous step : the max distance is now 64, unless the initial distance is less than 16 (no-op in this case).
- Repeat the previous step : the max distance is now 255 (the max distance that can be stored in a byte), unless the initial distance is less than 64 (no-op in this case).
- Perform the final blend (see mlaa.ps in the .zip for details).
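
The scan and propagation steps above can be sketched on the CPU as follows. This is a simplified Python illustration under several assumptions, not the actual shader code: it handles a single row and a single direction (rightwards), ignores orthogonal edges and the stencil test, and reads "update the distance 4 times" as four accumulating fetches from the previous pass. The function names are hypothetical.

```python
def initial_scan(edge_row, max_steps=4):
    """Per pixel, count consecutive edge pixels to the right, capped at max_steps."""
    n = len(edge_row)
    dist = []
    for x in range(n):
        d = 0
        while d < max_steps and x + d < n and edge_row[x + d]:
            d += 1
        dist.append(d)
    return dist


def propagate(prev, jumps=4, cap=255):
    """One propagation pass: accumulate `jumps` fetches D(x + d) from `prev`.

    If a fetched distance is below the previous maximum, the edge ended
    there and the following fetches add 0 (the no-op case).
    """
    n = len(prev)
    out = []
    for x in range(n):
        d = 0
        for _ in range(jumps):
            if x + d >= n:      # ran off the row: nothing more to add
                break
            d += prev[x + d]
        out.append(min(d, cap))  # a byte cannot hold more than 255
    return out


def distance_right(edge_row):
    """Initial scan (max 4), then three passes: max 16, 64, then 255."""
    d = initial_scan(edge_row)
    for _ in range(3):
        d = propagate(d)
    return d


# A 40-pixel edge starting at x = 2: the first edge pixel ends up with 40.
row = [False] * 2 + [True] * 40 + [False] * 3
distances = distance_right(row)
```

Each pass costs a fixed number of fetches per pixel whatever the edge length, which is the point of the approach. The other cardinal directions would be handled the same way, one per texture component.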
As hinted by www.iryokufx.com/mlaa , bilinear filtering can be used to speed things up. I used it during edge scanning (to test 2 edges in a single
texture fetch) and in the final blend.
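
The idea of testing 2 edges in a single fetch can be illustrated with a small Python model of a 1D bilinear fetch. This is only a sketch of the principle, assuming edge flags stored as 0.0 or 1.0, texel i centered at coordinate i + 0.5, and clamp addressing. The helper names are hypothetical and the actual shader may organize its fetches differently.

```python
import math


def sample_bilinear_1d(texels, u):
    """Linearly filtered fetch at continuous coordinate u (texel i is centered at i + 0.5)."""
    pos = u - 0.5
    i = math.floor(pos)
    frac = pos - i
    last = len(texels) - 1
    left = texels[min(max(i, 0), last)]       # clamp addressing
    right = texels[min(max(i + 1, 0), last)]
    return left * (1.0 - frac) + right * frac


def both_edges(texels, i):
    """True if texels i and i + 1 are both edges, using one filtered fetch.

    Sampling halfway between the two texel centers returns their average,
    which equals 1.0 only when both flags are set.
    """
    return sample_bilinear_1d(texels, i + 1.0) == 1.0


edges = [1.0, 1.0, 0.0, 1.0]
```

With such a fetch, a scan can advance 2 pixels per texture read as long as the result stays at 1.0.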
I borrowed the idea of encoding distances in an RGBA8 texture from here : igm.univ-mlv.fr/~biri/mlaa-gpu/MLAAGPU.pdf, though the idea
of using a bilateral texture does not come from that paper. I also tried using a lookup table to encode blend weights as they did, but in my case
it was slower (I guess this is because the ALU/TEX ratio on my GPU must be higher).
On an nVidia 8700MGT, processing an 800x600 image takes 6.3 ms. I guess a desktop GPU would do much better.


Controls : 
==========
Enable Antialias : toggles MLAA, to show the quality improvement it brings
Use bilateral distance texture : this is the default. The second option does all edge scanning in the shader
instead of using a bilateral distance texture. It is slower in this case (but memory consumption is also lower)

To test with another image, please replace datas/mlaa_test.tga

Hardware requirement
====================
A GPU supporting pixel shaders 3.0 is required.

Version history
===============
1.0 : Initial release

Contact
=======
Nicolas Vizerie : nicovize@club-internet.fr
www.vizerie3d.net

