Coding

The last two weeks were spent mostly on my Master’s Summer School, where we created a startup prototype in Digital Health (and we won!), so the time was well spent.

I also continued my benchmarking work: noop benchmarks and chain API benchmarks have been added to the benchmarking post. The latter are probably more interesting, as they show how we can use the GPU to perform more convolved (again, no pun intended!) operations while the CPU is otherwise occupied. This opens the door to some future applications of this library, such as:

  • TensorFlow Lite backend
  • Integration in the OpenCL stack for BBB
  • Running application-specific accelerations and allowing users to provide their own shaders without editing the library code (this might be too far-fetched)

I also wrote the Library Innards blog post, which discusses how things are done under the hood in this library. I highly recommend reading it if you are interested in GPGPU and some nitty-gritty details of OpenGL ES and EGL.

Debugging

During testing on the BBB I found that I am unable to run the chain API more than a few times, and this number decreases with algorithm complexity (fewer repetitions for convolutions). It seems to be related to libglslcompiler and memory management issues:

Program received signal SIGABRT, Aborted.
__libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
47      ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S: No such file or directory.
(gdb) bt
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
#1  0xb6e803cc in __libc_signal_restore_set (set=0xbeffe028) at ../sysdeps/unix/sysv/linux/nptl-signals.h:79
#2  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:48
#3  0xb6e810ba in __GI_abort () at abort.c:89
#4  0xb6ea7cda in __libc_message (do_abort=do_abort@entry=2, fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:175
#5  0xb6eac1ca in malloc_printerr (action=<optimized out>, str=0xb6f2a48c "free(): invalid pointer", ptr=<optimized out>,
    ar_ptr=<optimized out>) at malloc.c:5049
#6  0xb6eac8b2 in _int_free (av=0xb6f467a4 <main_arena>, p=0x42717c, have_lock=<optimized out>) at malloc.c:3905
#7  0xb66ace92 in ?? () from /usr/lib/libglslcompiler.so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

This probably needs more debugging, as the behaviour is not consistent: sometimes it fails with a regular segmentation fault, sometimes the shader compiler refuses to accept my unchanged shader code, and at times everything works just fine.

On my host PC these problems do not show up right away, but with enough repetitions I get the same compiler errors. They suggest that the compiler has trouble compiling the shader code (but I have no clue as to why!).
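Since the failure depends on the number of repetitions and does not always end the same way, it can help to sweep the repetition count from outside the process and record how each run ended. Below is a minimal sketch in Python, assuming a test program that takes the repetition count as its only argument; that program and the helper names `classify`, `sweep` and `first_failure` are hypothetical and not part of the library.

```python
import signal
import subprocess


def classify(returncode):
    """Turn a process return code into a short label."""
    if returncode < 0:
        # On POSIX, a negative return code means the process was
        # killed by signal number -returncode (e.g. SIGABRT, SIGSEGV).
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return f"exit {returncode}"


def sweep(make_command, counts):
    """Run one fresh process per repetition count and record how it ended."""
    results = {}
    for n in counts:
        proc = subprocess.run(
            make_command(n),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results[n] = classify(proc.returncode)
    return results


def first_failure(results):
    """Return the smallest repetition count that did not exit cleanly."""
    failing = [n for n, label in sorted(results.items()) if label != "exit 0"]
    return failing[0] if failing else None
```

It could be used as `sweep(lambda n: ["./chain_test", str(n)], range(1, 21))`, where `./chain_test` stands in for whatever binary exercises the chain API. Because the crashes are intermittent, the sweep probably has to be repeated a few times before the numbers mean anything.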

Added scalar and array operations for the chain API

The chain API now supports all four operators (add, subtract, multiply, divide) for both scalar broadcast operations and elementwise array operations.
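To make the difference between the two modes concrete, here is a minimal CPU-side sketch in Python of what they compute. It is only a reference for the semantics, not the library's API: the names `scalar_op`, `array_op` and `run_chain` are made up for this example, and in the library the actual work is done by shaders on the GPU.

```python
import operator

# The four operators named above, mapped to their Python equivalents.
OPS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
}


def scalar_op(name, data, scalar):
    """Scalar broadcast: combine every element of `data` with one scalar."""
    op = OPS[name]
    return [op(x, scalar) for x in data]


def array_op(name, left, right):
    """Elementwise: combine elements at matching positions of two arrays."""
    if len(left) != len(right):
        raise ValueError("arrays must have the same length")
    op = OPS[name]
    return [op(a, b) for a, b in zip(left, right)]


def run_chain(data, steps):
    """Apply (mode, operator name, operand) steps in order, like a chain."""
    for mode, name, operand in steps:
        if mode == "scalar":
            data = scalar_op(name, data, operand)
        elif mode == "array":
            data = array_op(name, data, operand)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return data
```

For example, `run_chain([1.0, 2.0, 3.0], [("scalar", "multiply", 2.0), ("array", "add", [1.0, 1.0, 1.0])])` first doubles every element and then adds the second array position by position, giving `[3.0, 5.0, 7.0]`.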

TODOs and Future ideas

This marks the last week of coding in the GSoC 21 period and I am proud of what has been done so far. Nevertheless, there is always room for improvement and extensibility.

Thus, the great TODO list begins! (It should be turned into GitHub issues soon 😁)

  • Add more operations:
  • Fix paths to shaders to be non-relative and not hardcoded
  • Debug weird glslcompiler memory management issues as described above
  • Render truly headless (without the dummy HDMI plug I am currently using), see the related issue
  • Allow users to provide their own shaders without editing the library code
  • Create Rust language bindings and add some examples in Rust
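
For the shader path item above, one possible approach is to resolve shaders against an install directory and let an environment variable override it. The sketch below shows that lookup order in Python; it is an illustration only, and the variable name `GPGPU_SHADER_DIR`, the `shaders` subdirectory and the function names are assumptions of mine, not something the library defines today.

```python
import os
from pathlib import Path

# Hypothetical environment variable that overrides the shader directory.
ENV_VAR = "GPGPU_SHADER_DIR"


def shader_dir(install_dir, environ=None):
    """Return the absolute shader directory.

    The environment override wins; otherwise fall back to the
    `shaders` subdirectory of the given install directory.
    """
    environ = os.environ if environ is None else environ
    override = environ.get(ENV_VAR)
    if override:
        return Path(override).resolve()
    return (Path(install_dir) / "shaders").resolve()


def shader_path(name, install_dir, environ=None):
    """Return the absolute path of one shader file, or raise if missing."""
    path = shader_dir(install_dir, environ) / name
    if not path.is_file():
        raise FileNotFoundError(f"shader not found: {path}")
    return path
```

Failing early with the full absolute path in the error message makes a wrong working directory much easier to spot than a shader that silently fails to load.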

Acknowledgments

Phew! These 10 weeks really passed fast! All this work could not have been achieved without the help of my mentors: Iain Hunter and Hunyue Yau! Also big thanks to Jason Kridner, who is the shepherd of the BeagleBoard.org organization and who provided me with hardware for this project!

I hope to continue my work on this (and other projects related to BeagleBoard) right after GSoC ends 😄.