Kernels, and a cheeky IEEE-754 proof with somewhat practical debugging value

Published:

Takeaways

  • Designed a kernel that hit a bug; understanding and debugging the issue teaches you a lot about the limitations of floating-point representations
  • Delved deeper into the mathematics of IEEE-754 and its standard formats such as fp32 and fp64, along with related reduced-precision formats such as bfloat16 and fp8 that are not defined by the standard itself
  • Explained why an unsigned 32-bit integer may be more precise than FP32 in workloads such as hashing: FP32 has a 24-bit significand, so it cannot represent every integer above 2^24, while a 32-bit unsigned integer represents every integer up to 2^32 - 1 exactly
  • Concluded with a mathematical proof that characterizes the cases in which a 32-bit integer is more precise than floating-point schemes like FP32
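
To make the integer-versus-FP32 point concrete, here is a minimal Python sketch, not the kernel from the full post. It assumes only the standard library and emulates FP32 by round-tripping a value through `struct`'s 4-byte float format. The helper names `to_fp32` and `fp32_roundtrip_exact` and the constant `HASH_LIKE` are hypothetical, chosen only for illustration.

```python
import struct

def to_fp32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fp32_roundtrip_exact(n: int) -> bool:
    """True if the integer n survives a round trip through FP32 unchanged."""
    return int(to_fp32(float(n))) == n

# FP32 has a 24-bit significand (23 stored bits plus 1 implicit bit),
# so consecutive integers are only guaranteed exact up to 2**24.
LIMIT = 2 ** 24

# Illustrative 32-bit constant of the kind a hash function might use
# (an assumption for this sketch, not taken from the original post).
HASH_LIKE = 0x9E3779B9

if __name__ == "__main__":
    print(fp32_roundtrip_exact(LIMIT))       # 2**24 itself is representable
    print(fp32_roundtrip_exact(LIMIT + 1))   # 2**24 + 1 is not
    print(hex(int(to_fp32(float(HASH_LIKE)))))  # low 8 bits are rounded away
```

Between 2^31 and 2^32 the gap between neighboring FP32 values is 256, so the low 8 bits of a 32-bit hash value are lost once it passes through FP32, whereas an unsigned 32-bit integer keeps all of them.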

Full version on Substack → Curb your memory hierarchy


Summary

Starting out in kernels and facing inexplicable bugs? Yeah, that was once me. This post walks through concepts that are typically seen as obscure in the high-level programming world. It finishes with a proof that has practical debugging value for weeding out integer/floating-point precision bugs.