Over the past few days, we've been working on the bottlenecks that currently slow down the engine.
The first bottleneck was simply a graphics one: how many polygons can be drawn on screen. The standard solution is frustum culling. In our case, this was a fairly easy task, since all our visual data is separated into 'visual blocks'. Each one is simply a cube, 12m on each side, which holds all the visual data for that volume. Frustum culling then simply meant checking whether each corner of a block is visible and, if none of them are, skipping the draw for that block.
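A minimal sketch of that corner test, in Python for illustration only (it is not the engine's code). It assumes the frustum is given as six planes `(nx, ny, nz, d)` whose normals point inward, and that a block is identified by its minimum corner; the names `block_corners`, `point_in_frustum` and `block_visible` are hypothetical.

```python
from itertools import product

BLOCK_SIZE = 12.0  # meters per side, as stated in the post


def block_corners(origin, size=BLOCK_SIZE):
    """Return the 8 corners of an axis-aligned cube whose minimum corner is `origin`."""
    ox, oy, oz = origin
    return [(ox + dx * size, oy + dy * size, oz + dz * size)
            for dx, dy, dz in product((0, 1), repeat=3)]


def point_in_frustum(point, planes):
    """True when the point lies on the inner side of every plane.

    Assumption: each plane is (nx, ny, nz, d) with the normal pointing into
    the frustum, so 'inside' means nx*x + ny*y + nz*z + d >= 0.
    """
    x, y, z = point
    return all(nx * x + ny * y + nz * z + d >= 0.0 for nx, ny, nz, d in planes)


def block_visible(origin, planes):
    """Corner test: the block is drawn if at least one corner is inside."""
    return any(point_in_frustum(corner, planes) for corner in block_corners(origin))
```

One known limitation of a pure corner test is that it can reject a block the frustum passes through without containing any of its corners. A more conservative variant rejects a block only when all eight corners lie behind the same plane.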
Another improvement to graphics was reducing the sheer number of draw calls. As you can imagine, with one visual block being a 12x12x12m cube, we need quite a few of them to get a good view distance. One way to decrease the number of draw calls is to add only the full blocks to a list when rebuilding, and draw only that list. The downside is that, in order to keep that list accurate, a rather large number of blocks have to be rechecked whenever the user moves. This is an acceptable trade-off, for the most part at least.
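To give a feel for how many blocks "a rather large number" can be, here is a rough Python sketch. It assumes the visible set is a cube of 21 blocks per axis centered on the player's block, and that moving by one block means testing every block that newly enters that cube; `window`, `update_draw_list` and the `is_full` callback are hypothetical names, not engine code.

```python
from itertools import product

RADIUS = 10  # blocks on each side of the player's block, giving 21 per axis


def window(center, radius=RADIUS):
    """Set of block coordinates inside the cubic view window around `center`."""
    cx, cy, cz = center
    r = range(-radius, radius + 1)
    return {(cx + dx, cy + dy, cz + dz) for dx, dy, dz in product(r, r, r)}


def update_draw_list(draw_list, old_center, new_center, is_full):
    """Drop blocks that left the window, test and add the ones that entered.

    Returns the new draw list and the number of blocks that had to be rechecked.
    """
    old, new = window(old_center), window(new_center)
    leaving = old - new
    entering = new - old
    kept = [b for b in draw_list if b not in leaving]
    kept.extend(sorted(b for b in entering if is_full(b)))
    return kept, len(entering)
```

Under these assumptions, a single one-block step along one axis brings a 21x21 slab of 441 blocks into view, each of which has to be checked before it can join the list.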
But another bottleneck, an even worse one, is the sheer amount of data that has to be covered. In fact, 21*21*21 = 9261 visual blocks give us a little over 120m of view distance on each side of the player, and cover a huge volume of 16 million cubic meters, which means we need 16 million data samples (one per cubic meter) to construct that volume!
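The arithmetic behind those figures can be checked in a few lines of Python. The only assumption is one data sample per cubic meter, which is what makes the sample count equal to the volume.

```python
BLOCK_SIDE_M = 12       # side of one visual block, in meters
BLOCKS_PER_AXIS = 21    # blocks along each axis of the view volume

block_count = BLOCKS_PER_AXIS ** 3          # 9261 visual blocks
span_m = BLOCKS_PER_AXIS * BLOCK_SIDE_M     # 252 m across the whole volume
reach_m = span_m / 2                        # 126.0 m on each side of the player
volume_m3 = span_m ** 3                     # 16003008 cubic meters
samples = volume_m3                         # assuming one sample per cubic meter

print(block_count, span_m, reach_m, volume_m3, samples)
```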
This presents a generation bottleneck: the terrain generation function must be fast enough to generate that much data in a reasonable time. Of course, once generated, the data is stored to disk and can be read back from there.
For generating data, we were able to obtain decent-looking terrain at approx. 16 datablock generations per second. This means a generation rate of roughly 27,000 data points per second. When loading datablocks instead, we were able to load at speeds ranging between 100 and 200 datablocks/second, which is an average of about 259,000 data points per second.
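To put those rates in perspective, the short Python calculation below converts them to data points per second and estimates how long the full 21x21x21 view volume would take at each rate. It assumes a datablock holds the 12*12*12 = 1728 samples of one visual block, which matches the figures quoted above; the function names are hypothetical.

```python
SAMPLES_PER_BLOCK = 12 ** 3   # 1728, assuming one datablock per 12 m visual block
BLOCKS_IN_VIEW = 21 ** 3      # 9261


def points_per_second(blocks_per_second):
    """Convert a datablock rate into a data point rate."""
    return blocks_per_second * SAMPLES_PER_BLOCK


def seconds_to_fill_view(blocks_per_second):
    """Time needed to produce every block in the view volume at a given rate."""
    return BLOCKS_IN_VIEW / blocks_per_second


generate_rate = points_per_second(16)             # 27648 points/s
load_rate = points_per_second((100 + 200) / 2)    # 259200.0 points/s
generate_time = seconds_to_fill_view(16)          # 578.8125 s, nearly 10 minutes
load_time = seconds_to_fill_view(150)             # about 61.7 s
```

By this estimate, loading an already generated view volume is roughly nine times faster than generating it from scratch.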
This leads to the major problem in this fiasco: the speed at which the density function operates. For our project, we currently use libnoise, with something like 7 independent one-octave functions and two 2-octave functions. Four of these (including one of the 2-octave ones) are combined at different frequencies and amplitudes to generate terrain, while the other 3 are used for picking biomes and determining oceans, seas and mountain areas.
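As an illustration of what combining noise functions at different frequencies and amplitudes looks like, here is a self-contained Python sketch. It does not use libnoise: `value_noise` is a simple hash-based stand-in for a one-octave noise module, and the `LAYERS` values are made up for the example. The biome and ocean selection is not shown.

```python
import math


def lattice_value(ix, iy, iz, seed):
    """Deterministic pseudo-random value in [-1, 1] for an integer lattice point."""
    h = (ix * 374761393 + iy * 668265263 + iz * 2147483647 + seed * 1274126177) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1103515245) & 0xFFFFFFFF
    h ^= h >> 16
    return h / 0xFFFFFFFF * 2.0 - 1.0


def lerp(a, b, t):
    return a + (b - a) * t


def smooth(t):
    """Smoothstep fade so the noise has no visible creases at cell borders."""
    return t * t * (3.0 - 2.0 * t)


def value_noise(x, y, z, seed=0):
    """One octave of 3D value noise: 8 lattice lookups, trilinearly interpolated."""
    ix, iy, iz = math.floor(x), math.floor(y), math.floor(z)
    tx, ty, tz = smooth(x - ix), smooth(y - iy), smooth(z - iz)
    c = [[[lattice_value(ix + dx, iy + dy, iz + dz, seed) for dz in (0, 1)]
          for dy in (0, 1)] for dx in (0, 1)]
    x00 = lerp(c[0][0][0], c[1][0][0], tx)   # 4 interpolations along x
    x10 = lerp(c[0][1][0], c[1][1][0], tx)
    x01 = lerp(c[0][0][1], c[1][0][1], tx)
    x11 = lerp(c[0][1][1], c[1][1][1], tx)
    y0 = lerp(x00, x10, ty)                  # 2 along y
    y1 = lerp(x01, x11, ty)
    return lerp(y0, y1, tz)                  # 1 along z


# Hypothetical (frequency, amplitude, seed) triples, chosen only for the example.
LAYERS = [(0.01, 40.0, 1), (0.05, 8.0, 2), (0.2, 1.5, 3)]


def density(x, y, z):
    """Sum the layers: low frequencies give broad shapes, high ones add detail."""
    return sum(amp * value_noise(x * f, y * f, z * f, seed) for f, amp, seed in LAYERS)
```

Each extra layer adds another full noise evaluation per sample, which is why the number of layers matters so much at 16 million samples.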
For those familiar with how Perlin noise works, this means a huge number of lookups and trilinear interpolations, which makes it slow, slow, slow... Unfortunately, the slowness is largely due to the sheer number of points generated, not only the speed of the noise. A quick switch to an implementation of Simplex noise (also authored by Ken Perlin) showed no significant improvement in speed (the measured speed depends not only on the generation but on a number of other factors). While Simplex noise may be faster in higher dimensions, we only use 3D noise anyway, and while straightforward speed tests may show it to be faster in lower dimensions too, it is still comparable to Perlin noise when used in a real-world application. Of course, speed isn't the only thing it improves over Perlin noise, but for now, it won't help our bottleneck.
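A back-of-the-envelope count in Python shows why the number of points dominates. It assumes every octave of 3D lattice noise touches the 8 corners of a cell and blends them with 7 linear interpolations, and it reads the module counts in the post as 7 one-octave plus 2 two-octave functions, all evaluated for every sample. The real figures depend on how the engine actually calls libnoise.

```python
LOOKUPS_PER_OCTAVE = 8   # corners of the lattice cell around the sample
LERPS_PER_OCTAVE = 7     # 4 along x, 2 along y, 1 along z

# One reading of the module counts in the post: 7 one-octave + 2 two-octave.
octaves_per_sample = 7 * 1 + 2 * 2                              # 11
lookups_per_sample = octaves_per_sample * LOOKUPS_PER_OCTAVE    # 88
lerps_per_sample = octaves_per_sample * LERPS_PER_OCTAVE        # 77

samples = (21 * 12) ** 3                                        # 16003008
total_lookups = samples * lookups_per_sample                    # 1408264704
total_lerps = samples * lerps_per_sample                        # 1232231616
```

Under those assumptions a full view volume costs on the order of 1.4 billion lookups, so even a noise function that is somewhat cheaper per octave leaves most of the total cost in place.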
The only way to help the generation bottleneck is to redesign the density function to use fewer noise modules, and to combine them in more ingenious ways.
The good thing is that, while this is a bottleneck, it only applies when the terrain is first generated. Players will often move around already generated terrain, which, as mentioned, loads much faster. There's also the possibility of pre-generating terrain while nothing else is pending, but this has the drawback of starting to store excessive data, so it too must be balanced.
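One way that balance could be expressed is sketched below in Python. Everything here is hypothetical: `pregenerate_when_idle`, the `stored` dictionary standing in for the on-disk store, the `max_stored` cap and the `per_idle_tick` budget are illustrative names, not part of the engine.

```python
def pregenerate_when_idle(candidates, stored, generate, max_stored, per_idle_tick):
    """Generate missing blocks while idle, without exceeding the storage cap.

    candidates    -- block coordinates in priority order (e.g. nearest first)
    stored        -- dict of already generated blocks, keyed by coordinate
    generate      -- function that builds the data for one block coordinate
    max_stored    -- upper limit on how many blocks may be kept in storage
    per_idle_tick -- how many blocks may be generated in one idle period
    """
    generated = []
    for coord in candidates:
        # Stop once this idle period's budget or the storage cap is reached.
        if len(generated) >= per_idle_tick or len(stored) >= max_stored:
            break
        if coord in stored:
            continue  # already generated earlier, nothing to do
        stored[coord] = generate(coord)
        generated.append(coord)
    return generated
```

The storage cap keeps pre-generation from filling the disk with terrain the player may never visit, while the per-tick budget keeps each idle period short enough to yield to real work.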
In other words, without a more major overhaul, the speed of the density function is not likely to improve significantly just by switching noise generators.