Week 6

Coding

Hurdles with BBB ARGB8888 format

It turns out that BBB has issues with the internal ARGB8888 format used by my library for computing 32-bit floating-point operations. It returns a mysterious error which is hard to decipher:

Creating Window surface..
PVR:(Error): WSEGL_CreateWindowDrawable: Couldn't set CRTC: Invalid argument [0, ]
Unable to create surface
    egl error 'EGL_BAD_ALLOC' (0x3003)

I am right now checking why it does not work as expected - there is an issue opened on Imagination forum and is under investigation. It is quite possible that it is some obsolete limitation, as the platform should be supporting such operations.

16-bit fixed-point arithmetic

In order to keep the project going I will be writing the 16-bit fixed-point arithmetic operations in shaders. If the 32-bit computing is unblocked, then the shaders will need only small tweaking to be working well with the larger format.

On 32-bit compatible platforms we can fit two 16-bit fixed-point numbers in one texel. Then we just transform them into floating-point representation in which they are represented on the GPU, do the computation and transform back. The transformation functions look as follows:

// since no bit shift in gles2.0 we precompute it on CPU
uniform int fraction_divider;

float fixed_to_float(vec2 inp)
{
    float n = inp.y * 256.0 + inp.x;
    return float(n / float(fraction_divider));
}

vec2 float_to_fixed(float inp)
{
    vec2 ret = vec2(0);
    inp = inp * float(fraction_divider);
    ret.x = inp - inp / 256.0;
    ret.y = inp / 256.0;
    return ret;
}

The coordinates are reversed due to our choice of arguments into the function (they can be swapped in the main function).

And after some tweaking, the programs run on the BBB!

Fixed point addition works just fine and I can access all four components of the RGBA texel (and I expected to be able to access them in lower precision or just access the two of four available).

Running the regular floating-point code on BBB

The regular floating-point code also works - the array addition, albeit suffers from numerical inaccuracy for regular numbers. This is surely caused by lower bit resolution in the GPU compared to the host (16 bits on the BBB and 32 bits on my host). Hopefully I will be able to make the BBB run utilizing full precision and have more accurate computations.

I also encountered weird errors when running convolution example:

WARNING: 0:88: Calls to any function that may require a gradient calculation inside a conditional block may return undefined results
WARNING: 0:103: Calls to any function that may require a gradient calculation inside a conditional block may return undefined results
WARNING: 2 compilation warnings.

It seems to be a limitation of some GPU architectures, and the workaround is to use a mix() or step() instead of conditional tests. There is an open issue which is investigated by Imagination.

Benchmarking

I created a separate benchmarking post, which I will be updating for each operation.