This post talks about double precision numbers in OpenGL and WebGL, and how to emulate them if there is no native hardware support.
In GLSL 4.00.9 (which is OpenGL 4.0) and higher, there is a native double precision floating point type. And if your graphics card is able to run OpenGL 4.0, it most likely has native hardware support for doubles (except for a few ATI/AMD cards). There are some caveats, though:
- Not all functions are supported with double precision arguments. For instance, there are no trigonometric and exponential functions. (The available functions may be found here).
- You can not pass double precision ‘varying’ parameters from the vertex shader to the fragment shader, and have the GPU automatically interpolate them. Double precision varying variables must be flat.
- Double precision performance may be artificially limited by the hardware manufacturers. This is the case for Nvidia’s Fermi architecture, where the scientific computing brand, the Tesla series, can execute double precision arithmetics at half the speed of single precision, while the consumer brand, the GeForce series, only can execute double precision arithmetics at 1/8 the speed of single precision. For Nvidia’s brand new Kepler architecture used in the GeForce 600 series, things change again: here the difference between single and double precision will be a whopping factor 24! Notice, that this will also be the case for some cards in the Kepler Tesla branch, such as the Tesla K10.
- In Fragmentarium (and in general, in Qt’s OpenGL wrapper classes) it is not possible to set double precision uniforms. This should be easy to circumvent by using the OpenGL API directly, though.
(Non-related Fragmentarium image)
In order to use double precision, you must either specify a GLSL version 4.00 (or higher) or use the extension:
#extension GL_ARB_gpu_shader_fp64 : enable
Older cards, like the GeForce 310M in my laptop, does not support double precision in hardware. Here it is possible to use emulated double precision instead.
I used the functions by Henry Thasler described here in his posts, to emulate a double precision number stored in two single precision floats. The worst part about doing emulated doubles in GLSL, is that GLSL does not support operator overloading. This means the syntax gets ugly for simple arithmetics, e.g. ‘z = add(mul(z1,z2),z3)’ instead of ‘z = z1*z2+z3’.
On Nvidia cards, it is necessary to turn off optimization to use Thasler’s code – this can be done using the following pragmas:
#pragma optionNV(fastmath off) #pragma optionNV(fastprecision off)
(Non-related Fragmentarium image)
To test performance, I used a Mandelbrot test scene, rendered at 1000×500 with 1000 iterations in Fragmentarium. The numbers show the performance in frames per second. The zoom factor was determined visually, by noticing when pixelation occurred.
|Geforce 570GTX||Tesla 2075||Max Zoom|
- Emulated double precision is slightly less accurate then true hardware doubles, but not much in this particular scenario.
- Emulated doubles are roughly 1/9th the speed of single precision. Amazingly, this suggest that on the Kepler architecture it might make more sense to use emulated double precision than the built-in hardware support!
- Hardware doubles on the 570GTX performs better than expected (they should perform at roughly 1/8 the speed). This is probably because double precision arithmetics isn’t the only bottleneck in the shader.
Notice that the Tesla card was running on Windows in WDDM mode, not TCC mode (since you cannot use GLSL shaders in TCC mode). Not that I think performance would change.
WebGL and double precision
WebGL does not support double precision in its current incarnation. This might change in the future, but currently the only choice is to emulate them. This, however, is problematic since WebGL seems to strip away pragmas! Henry Thasler’s emulation code doesn’t work under the ANGLE layer either. In fact, the only configuration I could get to work, was on a Intel HD 3000 GPU with ANGLE disabled. I did create a sample application to test this which can be tried out here:
Click to run WebGL app. Left side is single-precision, right side is emulated double precision. Here shown on Firefox without ANGLE on a Intel HD 3000 card.
It is not clear why the WebGL version does not work on Nvidia cards. Floating points may run at lower resolution in WebGL, but I’m using the ‘precision highp’ qualifiers. I also tried querying the resolution using glContext.getShaderPrecisionFormat(…), but had no luck – it is only available on Firefox, and on my GPU’s it just returns precision=0.
The most likely explanation is that Nvidia drivers perform some optimizations which spoils the emulation code. This is also the case for desktop OpenGL, but here the pragma’s solve the problem.
The emulation code uses constructs like:
z = a - (a - b);
which I suspect the well-meaning compiler might translate to ‘z=b’, since the rounding errors normally would be insignificant. Judging from some comments on Thasler’s original posts, it might be possible to prevent this using constructs such as: ‘z = a – float(a-b)’, but I have not pursued this.
Fragmentarium and Double Precision
Except that there are no double-precision sliders (uniforms), it is straight-forward to use double precision code in Fragmentarium. The only thing to remember is that you cannot pass doubles from the vertex shader to the fragment shader, which is the standard way of passing camera information to the shader in Fragmentarium.
I’ve also included a small port of Thaslers GLSL code in the distribution (see “Include/EmulatedDouble.frag”). It is quite easy to use (for an example, try the included “Theory/Mandelbrot – Emulated Doubles.frag”).