Category Archives: Fractals

GPU versus CPU for pixel graphics

After having gained a bit of experience with GPU shader programming during my Fragmentarium development, a natural question to ask is: how fast are these GPU’s?

This is not an easy question to answer, and it depends on the specific application. But I will try to give an answer for the kind of systems that I’m interested in: pixel graphics systems, where each pixel can be calculated independently of the others, such as raytraced 3D fractals.

Lets take my desktop computer, a fairly standard desktop machine, as an example. It is equipped with Nvidia Geforce 9800GT GPU @ 1.5 GHz, and a Intel Core 2 Quad Q8200 @ 2.33GHz.

How many processing unit are there?

Number of processing units (CPU): 4 CPU cores
Number of processing units (GPU): 112 Shader units

Based on these numbers, we might expect the GPU to be a factor of 28x times faster than the CPU. Of course, this totally ignores the efficiency and operating speed of the processing units. Lets try looking at the processing power in terms of maximum number of floating-point operations per second instead:

Theoretical Peak Performance

Both Intel and Nvidia list the GFLOPS (billion floating point operations per second) rating for their products. Intel’s list can be found here, and Nvidia’s here. For my system, I found the following numbers:

Performance (CPU): 37.3 GFLOPS
Performance (GPU): 504 GFLOPS

Based on these numbers, we might expect the GPU to be a factor of 14x times faster than the CPU. But what do these numbers really mean, and can they be compared? It turns out that these number are obtained by multiplying the processor frequency by the maximum number of instructions per clock cycle.

For the CPU, we have four cores. Now, when Intel calculate their numbers, they do it based on the special 128-bit SSE registers on every modern Pentium derived CPU. These extensions make it possible to handle two double precision floating point, or four single precision floating point numbers per clock cycle. And in fact there exists a special instruction – the MAD, or Multiply-Add, instruction – which allows for two arithmetic operations per clock cycle on each element in the SSE registers. These means Intel assume 4 (cores) x 2 (double precision floats) x 2 (MAD instructions) = 16 instructions per clock cycle. This gives the theoretical peak performance stated above:

Performance (CPU): 2.33 GHz * 4 * 2 * 2 = 37.3 GFLOPS (double precision floats)

What about the GPU? Here we have 112 independent processing units. On the GPU architecture an even more benchmarking-friendly instruction exists: the MAD+MUL which combines two multiplies and one addition in a single clock cycle. This means Nvidia assumes 112 (cores) * 3 (MAD+MUL instructions) = 336 instructions per clock cycle. Combining this with a stated processing frequency of 1.5 GHz, we arrive at the number stated earlier:

Performance (GPU): 1.5 GHz * 112 * 3 = 504 GFLOPS (single precision floats)

But wait… Nvidia’s number are for single precision floats – the Geforce 8800GT does not even support double precision floats. So for a fair comparison we should double Intel’s number, since the SSE extensions allows four simultaneous single precision numbers to be processed instead of two double precision floats. This way we get:

Performance (CPU): 2.33 GHz * 4 * 4 * 2 = 74.6 GFLOPS (single precision floats)

Now, using this as a guideline, we would expect my GPU to be a factor of 6.8x faster than my CPU. But we have some pretty big assumptions here: for instance, not many CPU programmers would write SSE-optimized code – and is a modern C++ compiler powerful enough to automatically take advantage of them anyway? And how often is the GPU able to use the three operation MUL+MAD instruction?

A real-world experiment

To find out I wrote a simple 2D Mandelbrot system and benchmarked it on the CPU and GPU. This is really the kind of computational tasks that I’m interested in: it is trivial to parallelize and is not memory-intensive, and the majority of executed code will be floating point arithmetics. I did not try to optimize the C++ code, because I wanted to see if the compiler was able to perform some SSE optimization for me. Here are the execution times:

13941 ms – CPU single precision (x87)
13941 ms – CPU double precision (x87)
10535 ms – CPU single precision (SSE)
11367 ms – CPU double precision (SSE)
424 ms – GPU single precision

(These numbers have some caveats – I did perform the tests multiple times and discarded the first few runs, but the CPU code was only single-threaded – so I assumed the numbers would scale perfectly and divided the execution times by four. Also, I verified by checking the generated assembly code, that SSE instructions indeed were used for the core Mandelbrot loop, when they were enabled.).

There are a couple of things to notice here: first, there is no difference between single and double precision on the CPU. This is as could be expected for the x87 compiled code (since the x87 defaults to 80-bit precision anyway), but for the SSE version, we would expect a double up in speed. As can be seen, the SSE code is really not very much more efficient the the x87 code – which strongly suggests that the compiler (here Visual Studio C++ 2008) is not very good at optimizing for SSE.

So for this example we got a factor of 25x speedup by using the GPU instead of the CPU.

“Measured” GFLOPS

Another questions is how this example compares to the theoretical peak performance. By using Nvidia’s Cg SDK I was able to get the GPU assembly code. Since I now could count the number of instruction in the main loop, and I knew how many iterations were performed, I was able to calculate the actual number of floating point operations per second:

GPU: 211 (Mandel)GFLOPS
CPU: 8.4 (Mandel)GFLOPS*

(*The CPU number was obtained by assuming the number of instructions in the core loop was the same as for the GPU: in reality, the CPU disassembly showed that the simple loop was partially unrolled to more than 200 lines of very complex assembly code.)

Compared to the theoretical maximum numbers of 504 GFLOPS and 74.6 GFLOPS respectively, this shows the GPU is much closer to its theoretical limit than the CPU.

GPU Caps Viewer – OpenCL

A second test was performed using the GPU Caps Viewer. This particular application includes a 4D Quaternion Julia fractal demo in OpenCL. This is interesting since OpenCL is a heterogeneous platform – it can be compiled to both CPU and GPU. And since Intel just released an alpha version of their OpenCL SDK, I could compare it to Nvidia’s SDK.

The results were interesting:

Intel OpenCL: ~5 fps
Nvidia OpenCL: ~35 fps

(The FPS did vary through the animation, so these numbers are not very accurate. There were no dedicated benchmark mode.)

This suggest that Intel’s OpenCL compiler is actually able to take advantage of the SSE instructions and provides a highly optimized output. Either that, or Nvidia’s OpenCL implementation is not very efficient (which is not likely).

The OpenCL based benchmark showed my GPU to be approximately 7x times faster than my CPU. Which is exactly the same as predicted by comparing the theoretical GFLOPS values (for single precision).


For normal code written in a high-level language like C or GLSL (multithreaded, single precision, and without explicit SSE instructions) the computational power is roughly equivalent to the number of cores or shader units. For my system this makes the GPU a factor of 25x faster.

Even though the CPU cores have higher operating frequency and in principle could execute more instructions via their SSE registers, this does not seem be fully utilized (and in fact, compiling with and without SSE optimization did not make a significant difference, even for this very simple example).

The OpenCL example tells another story: here the measured performance was proportional to the theoretical GFLOPS ratings. This is interesting since this indicate, that OpenCL could also be interesting for CPU-applications.

One thing to bear in mind is, that the examples tested here (the Mandelbrot and 4D Quaternion Julia) are very well-suited for GPU execution. For more complex code, with conditional branching, double precision floating point operations, and non-coalesced memory access, the CPU is much more efficient than the GPU. So for a desktop computer such as mine, a factor of 25x is probably the best you can hope for (and it is indeed a very impressive speedup for any kind of code).

It is also important to remember that GPU’s are not magical devices. They perform operations with a theoretical peak performance typically 5-15 times larger than a CPU. So whenever you see these 1000x speed up claims (e.g. some of the CUDA showcases), it is probably just an indication of a poor CPU implementation.

But even though the performance of GPU’s may be somewhat exaggerated you can still get a tremendous speedup. And GPU interfaces such as GLSL shaders are really simple to use: you do not need to deal explicitly with threads, you have built-in vectors and matrices, and you can compile GLSL code dynamically, during run-time. All features which makes GPU programming nearly ideal for exploring pixel graphic systems.

Fragmentarium – an IDE for exploring fractals and generative systems on the GPU.

As I mentioned in my previous post, I started experimenting with GLSL raytracing a couple of months ago, and I’m now ready to release the first version of Fragmentarium, an open source, cross-platform IDE for exploring pixel based graphics on the GPU.

It was mainly created for exploring Distance Estimated systems, such as Mandelbulbs or Kaleidoscopic IFS, but it can also be used for 2D systems.

Fragmentarium is inspired by Adobe’s Pixel Bender, but uses pure GLSL, and is specifically created with fractals and generative systems in mind. Besides Pixel Bender, there are also other, more specialized, GPU fractal applications out there, such as Boxplorer and Tacitus, but I wanted something code-centric, where I quickly can modify code and use code in a more modular manner.


  • Multi-tabbed IDE, with GLSL syntax highlighting

  • Modular GLSL programming – include other fragments
  • User widgets to manipulate parameter settings.
  • Different ‘mouse to GLSL’ mapping schemes (2D and 3D)
  • Includes raytracer for distance estimated systems
  • Many examples including Mandelbulb, Mandelbox, Kaleidoscopic IFS, and Julia Quaternion

Fragmentarium can be downloaded from:

There are binaries for Windows, but for now you’ll have to build it yourself for Mac and Linux. You will need a graphics card capable of running GLSL (any reasonably moderne discrete card will do).

Here is a screenshot:

Fragmentarium Screenshot

Fragmentarium screenshot (click to enlarge).

There’s also a gallery at Flickr: Fragmentarium Group

Fragmentarium is not a mature application yet. Especially the camera handling needs some work in the next versions – camera settings are not saved as part of the parameters, no field-of-view control and you often have to compensate for clipping. For future versions I also plan arbitrary resolution renders (tile based rendering) and animations.

There are probably also many small quirks and bugs – I’ve had several problems with ATI drivers, which seems to be much more strict than Nvidias.

Folding Space II: Kaleidoscopic Fractals

Another type of interesting 3D fractal has appeared over at the Kaleidoscopic 3D fractals, introduced in this thread, by Knighty.

Once again these fractals are defined by investing the convergence properties of a simple function. And similar to the Mandelbox, the function is built around the concept of folds. Geometrically, a fold is simply a conditional reflection: you reflect a point in a plane, if it is located on the wrong side of the plane.

It turns out that just by using plane-folds and scaling, it is possible to create classic 3D fractals, such as the Menger cube and the Sierpinsky tetrahedron, and even recursive versions of the rest of the Platonic solids: the octahedron, the dodecahedron, and the icosahedron.

Example of a recursive dodecahedron

The kaleidoscopic fractals introduce an additional 3D rotation before and after the folds. It turns out that these perturbations introduce a rich variety of interesting and complex structures.

I’ve followed the thread and implemented most of the proposed systems by modifying Subblue’s Pixel Bender scripts.

Below are some of my images:

The Menger Sponge

My first attempts. Pixel Bender kept crashing on me, until I realized that there is a GPU timeout in Windows Vista (read this for a solution).

The Sierpinsky

Then I moved on to the Sierpinsky. The sequence below shows something characteristic for these fractals: the first slightly perturbed variations look artificial and synthetic, but when the system is distorted, it becomes organic and alive.

The Icosahedron

I also tried the octahedron and dodecahedron, but my favorite is the icosahedron. Especially knighty’s hollow variant.

Arbitrary Planes

One nice thing about these systems is, that you do not necessarily need to derive a complex distance estimator – you can also just modify the distance estimator code, and see what happens. These last two images were constructed by modifying existing distance estimators.

It will be interesting to see where this is going.

Many fascinating 3D fractals have appeared at over the last few weeks. And GPU processing now makes it is possible to explore these systems in real-time.

The Reality of Fractals

“… no one, not even Benoit Mandelbrot himself […] had any real preconception of the set’s extraordinary richness. The Mandelbrot set was certainly no invention of any human mind. The set is just objectively there in the mathematics itself. If it has meaning to assign an actual existence to the Mandelbrot set, then that existence is not within our mind, for no one can fully comprehend the set’s endless variety and unlimited complication.”

Roger Penrose (from The Road to Reality)

The recent proliferation of 3D fractals, in particular the Mandelbox and Mandelbulb, got me thinking about the reality of these systems. The million dollar question is whether we discover or construct these entities. Surely these systems give the impression of exploring uncharted territory, perhaps even looking into another world. But the same can be said for many traditional man-made works of art.

I started out by citing Roger Penrose. He is a mathematical Platonist, and believes that both the fractals worlds (such as the Mandelbrot set) and the mathematical truths (such as Fermat’s last theorem) are discovered. In his view, the mathematical truths have an eternal, unchanging, objective existence in some kind of Platonic ideal world, independent of human observers.

In Penrose’s model, there are three distinct worlds: the physical world, the mental world (our perception of the physical world), and the cryptic Platonic world. Even Penrose himself admits that the connections and interactions between these worlds are mysterious. And personally I cannot see any kind of evidence pointing in favor of this third, metaphysical world.

Designer World by David Makin

Roger Penrose is a highly renowned mathematician and physicist, and I value his opinions and works highly. In fact, it was one of his earlier books, The Emperors New Mind, that in part motivated me to become a physicist myself. But even though he is probably one of the most talented mathematicians living today, I am not convinced by his Platonist belief.

Personally, I subscribe to the less exotic formalist view: that the mathematical truths are the theorems we can derive by applying a set of deduction rules to a set of mathematical axioms. The axioms are not completely arbitrary, though. For instance, a classic mathematical discipline, such as Euclidean geometry, was clearly motivated by empirical observations of the physical world. The same does not necessarily apply to modern mathematical areas. For instance, Lobachevsky’s non-Euclidean geometry, was conceived by exploring the consequences of modifying one of Euclid’s fundamental postulates (interestingly non-Euclidean geometry later turned out to be useful in describing the physical world through Einstein’s general theory of relativity).

But if modern mathematics has become detached from its empirical roots, what governs the evolution of modern mathematics? Are all formal systems thus equally interesting to study? My guess is that most mathematicians gain some kind of intuition about what directions to pursue, based on a mixture of trends, historical research, and feedback from applied mathematics.

Mandelballs by Krzysztof Marczak [Mandelbox / Juliabulb mix]

Does my formalist position mean that I consider the Mandelbrot set to be a man-made creation, in the same category as a Picasso painting or a Bach concert? Not exactly. Because I do believe in a physical realism (in the sense that I believe in a objective, physical world independent of human existence), and since I do believe some parts of mathematics is inspired by this physical world and tries to model it, I believe some parts of mathematics can be attributed an objective status as well. But it is a weaker kind of objective existence: the mathematical models and structures used to describe reality are not persistent and ever-lasting, instead they may be refined and altered, as we progressively create models with greater predictive power. And I think this is the reason fractals often resemble natural structures and phenomena: because the mathematics used to produce the fractals was inspired by nature in the first place. Let me give another example:

Teeth by Jesse

Would a distant alien civilization come up with the same Mandelbrot images as we see? I think it is very likely. Any advanced civilization studying nature, would most likely have created models such as the natural numbers, the real numbers, and eventually the complex numbers. The complex numbers are extremely useful when modeling many physical phenomena, such as waves or electrodynamics, and complex numbers are essential in the description of quantum mechanics. And if this hypothetical civilization had computational power available, eventually someone would investigate the convergence of a simple, iterated system like z = z2 + c. So there would probably be a lot of overlapping mathematical structures. But there would also be differences: for instance the construction of the slightly more complex Mandelbox set contains several human-made design decisions, making it less likely to be invented by our distant civilization.

I think there is a connection to other areas of generative art as well. In the opening quote Penrose claims that no-one could have any real preconception of the Mandelbrot sets extraordinary richness. And the same applies to many generative systems: they are impossible to predict and often surprisingly complex and detailed. But this does not imply that they have a meta-physical Platonic origin. Many biological and physical systems share the same properties. And many of the most interesting generative systems are inspired by these physical or biological systems (for instance using models such as genetic algorithms, flocking behavior, cellular automata, reaction-diffusion systems, and L-systems).

Another point to consider is, that creating beautiful and interesting fractal images as the ones above, requires much more than a simple formula. It requires aesthetic intuition and skills to choose a proper palette, find an interesting camera location, and it takes many hours of formula parameter tweaking. I know this from my experiments with 3D fractals – I’m very rarely satisfied with my own results.

But to sum it all up: Even though fractals (and generative systems) may posses endless variety and unlimited complication, there is no need to call upon metaphysical worlds in order to explain them.

Folding Space: The Mandelbox Fractal

Over at another interesting 3D fractal has emerged: the Mandelbox.

It originates from this thread, where it was introduced by Tglad (Tom Lowe). Similar to the original Mandelbrot set, an iterative function is applied to points in 3D space, and points which do not diverge are part of the set. The iterated function used for the Mandelbox set has a nice geometric interpretation: it corresponds to repeated folding operations.

In contrast to the organic presence of the Mandelbulbs, the Mandelbox has a very architectural and structural feel to it:

The Mandelbox probably owes it name to the cubic and square patterns that emerge at many levels:

It is also possible to create Julia Set variations:

Juliabox by ‘Jesse’ (click to see the large version of this fantastic image!)

Be sure to check out Tom Lowe’s Mandelbox site for more pictures and some technical background information, including links to a few (Windows only) software implementations.

I tried out the Ultra Fractal extension myself. This was my first encounter with Ultra Fractal, and it took me quite some time to figure out how to setup a Mandelbox render. For others, the following steps may help:

  1. Install Ultra Fractal (there is a free trial version).
  2. Choose ‘Options | Update Public Formulas…’ to get some needed dependencies.
  3. Install David Makin’s MMFwip3D package and install it into the Ultra Fractal formula folder – it is most likely located at “%userprofile%\Documents\Ultra Fractal 5\Formulas”.
  4. In principle, this is all you need. But the MMFwip3D formulas contain a vast number of parameters and settings. To get started try using some existing parameter set: this is a good starting point. In order to use these settings, simply select the text, copy it into the clipboard, and paste it into an Ultra Fractal fractal window.

The CPU-based implementations are somewhat slow, taking minutes to render even small images – but it probably won’t be long before a GPU-accelerated version appear: Subblue has already posted images of a PixelBender implementation in progress.

Mandelbulb Implementations

Several implementations have appeared since the Mandelbulb surfaced a couple of months ago.

The first public GPU implementation I know of was created by ‘cbuchner1′. It is based on a sample from NVIDIAs OptiX SDK, and features anaglyphic 3D, ambient occlusion, phong shading, reflection, and environment maps. It can be downloaded here (Windows only and requires a forum signup).

Example made with cbuchner1’s implementation

Very interestingly this binary runs on my laptops modest GeForce 8400M. I am a bit puzzled about this – NVIDIA state that the OptiX SDK requires a Quadro or a Tesla card, and I am not able to run the Julia OptiX demo, that cbuchner1s app is derived from.

Subblue has also created a Mandelbulb implementation, released as a Pixel Bender script and a Quartz composer plugin. A number of interesting customizations makes this my favorite choice: it is possible to explore negative and fractional powers, switch to Julia sets, and the lightning options can be fine-tuned. The only drawback is that Pixel Bender does not make it possible to directly rotate, zoom, and translate the camera – you have to rely on sliders for that.

Example created by Subblue.

Iñigo Quílez has also created a GPU implementation, but unfortunately he has not released any code yet. A couple of videos are available on Youtube, though: Part 1, Part 2, Part 3.

Quilez also discovered this intimate connection between the Shroud of Turin and the Mandelbulb.

The MathFuncRenderer also has a Mandelbulb implementation. I had a few quirks with this one – I had to install OpenAL, and the UI was quite non-responsive, but this may be due to my graphics card.

Another very interesting implementation is the GigaVoxels Mandelbulb: Whereas most implementations cast rays and use a distance estimator to speed up the ray marching, GigaVoxels use voxels stored into an Octree, which is populated on-the-fly.

For other implementations keep an eye on Fractal Forums Mandelbulb Implementation category.

Quaternion Julia sets and GPU computation.

Subblue has released another impressive Pixel Bender plugin, this time a Quaternion Julia set renderer.

The plugin can be downloaded here.

Quaternions are extensions of the complex numbers with four independent components. Quaternion Julia sets still explore the convergence of the system z ← z2 + c, but this time z and c are allowed to be quaternion-valued numbers. Since quaternions are essentially four-dimensional objects, only a slice (the intersection of the set with a plane) of the quaternion Julia sets is shown.

Quaternion Julia sets would be very time consuming to render if it wasn’t for a very elegant (and surprising) formula, the distance estimator, which for any given point gives you the distance to the closest point on the Julia Set. The distance estimator method was first described in: Ray tracing deterministic 3-D fractals (1989).

My first encounter with Quaternion Julia sets was Inigo Quilez’ amazing Kindernoiser demo which packed a complete renderer with ambient occlusion into a 4K executable. It also used the distance estimator method and GPU based acceleration. If you haven’t visited Quilez’ site be sure to do so. It is filled with impressive demos, and well-written tech articles.

Transfigurations (another Quaternion Julia set demo) from Inigo Quilez on Vimeo.

In the 1989 Quaternion Julia set paper, the authors produced their images on an AT&T Pixel Machine, with 64 CPU’s each running at 10 megaFLOPS. I suspect that this was an insanely expensive machine at the time. For comparison, the relatively modest NVIDIA GeForce 8400M GS in my laptop has a theoretical maximum processing rate of 38 gigaFLOPS, or approximately 60 times that of the Pixel Machine. A one megapixel image took the authors of the 1989 paper 1 hour to generate, whereas Subblues GPU implementation uses ca. 1 second on my laptop (making it much more efficient than what would have been expected from the FLOPS ratio).

GPU Acceleration and the future.

These days there is a lot of talk about using GPUs for general purpose programming. The first attempts to use GPUs to speed up general calculations relied on tricks such as using pixel shaders to perform calculations on data stored in texture memory, but since then several API’s have been introduced to make it easier to program the GPUs.

NVIDIAs CUDA is currently by far the most popular and documented API, but it is for NVIDIA only. Their gallery of applications demonstrates the diversity of how GPU calculations can be used. AMD/ATIs has their competing Stream API (formerly called Close To Metal) but don’t bet on this one – I’m pretty sure it is almost abandoned already. Update: as pointed out in the comments, the new ATI Stream 2.0 SDK will include ATIs OpenCL implemention, which for all I can tell is here to stay. What I meant to say was, that I don’t think ATIs earlier attempts at creating a GPU programming interface (including the Brook+ language) are likely to catch on.

Far more important is the emerging OpenCL standard (which is being promoted in Apples Snow Leopard, and is likely to become a de facto standard). Just as OpenGL, it is managed by the Khronos group. OpenCL was originally developed by Apple, and they still own the trademark, which is probably why Microsoft has chosen to promote their own API, DirectCompute. My guess is that CUDA and Brook+ will slowly fade away, as both OpenCL and DirectCompute will come to co-exist just the same way as OpenGL and Direct3D do.

For cross-platform development OpenCL is therefore the most interesting choice, and I’m hoping to see NVIDIA and AMD/ATI release public drivers for Windows as soon as possible (as of now they are in closed beta versions).

GPU acceleration could be very interesting from a generative art perspective, since it suddenly becomes possible to perform advanced visualization, such as ray-tracing, in real-time.

A final comment: a few days ago I found this quaternion Julia set GPU implementation for the iPhone 3GS using OpenGL ES 2.0 programmable shaders. I think this demonstrates the sophistication of the iPhone hardware and software platform – both that a hand-held device even has a programmable GPU, but also that the SDK is flexible enough to make it possible to access it.

Fractal Explorer Plugin

In July Subblue released another Pixel Blender plugin, called the Fractal Explorer Plugin – for exploring Julia sets and fractal orbit maps. I didn’t get around to try it out until recently, but it is really a great tool for exploring fractals.

Most people have probably seen examples of Julia and Mandelbrot sets – where the convergence properties of the series generated by repeated application of a complex-valued function is investigated.

The most well-known example is the iteration of the function z ← z2+c. The Mandelbrot set is created by plotting the convergence rate for this function while c varies over the complex plane. Likewise, the Julia set is created for a fixed c while varying the initial z-value over the complex plane.

Glynn1 by Subblue (A Julia-set where an exponent of 1.5 is used).

Where ordinary Julia and Mandelbrot sets only take into account whether the series created by the iterated function tends towards infinity (diverges) or not, fractal orbits instead uses another image as input, and checks whether the complex number series generated by the function hits a (non-transparent) pixel in the source image. This allows for some very fascinating ‘fractalization’ of existing images.

A fractal orbit showing a highly non-linear transformation of a Mondrian picture.

Subblue suggests starting out using Ernst Haeckel beautiful illustrations from the book Artforms of Nature, and he has put up a small gallery with some great examples:

An example of an orbit mapped Ernst Haeckel image.

To try out Subblue’s filter, download the Pixel Blender SDK and load his kernel filter and an input image of choice. It is necessary to uncheck the “Build | Turn On Flash Player Warnings and Errors” menu item in order to start the plugin. On my computer I also often experience that the Pixel Blender SDK is unable to detect and initialize my GPU – it sometimes help to close other programs and restart the application. The filter executes extremely fast on the GPU – often with more than 100 frames per second, making it easy to interactively explore and discover the fractals.

As I final note, I implemented a fractal drawings routine myself in Structure Synth (version 1.0) just for fun. It is implemented as a hidden easter egg, and not documented at all, but the code belows shows an example of how to invoke it:

Size: 800x800
MaxIter: 150
Term: (-0.2,0)
Term: 1*Z^1.5
BreakOut: 2
View: (-0.0,0.2) -> (0.7,0.9)

Arguable, this code is not very optimized (it is possible to add an unlimited number of terms, making the function evaluation somewhat slow), but still it takes seconds to calculate an image making it more than a hundred times slower than the Pixel Blender GPU solution.