Kernels, and a cheeky IEEE-754 proof with somewhat practical debugging value
Published:
Takeaways
- Designed a kernel that ran into a bug. Understanding and debugging it teaches you a lot about the limits of floating-point representations
- Delved deeper into the mathematics of IEEE-754 (FP32, FP64) and of related formats that borrow its sign/exponent/mantissa layout, such as bfloat16 and FP8
- Explained why an unsigned 32-bit integer can be more precise than FP32 in workloads such as hashing: FP32 has only a 24-bit significand, so it cannot represent every integer above 2^24 exactly
- Concluded with a mathematical proof that characterizes the cases in which a 32-bit integer is more precise than floating-point formats like FP32
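
To make the integer-versus-FP32 point concrete, here is a minimal sketch in Python. It is not the kernel from the post; the helper names `to_fp32` and `survives_fp32` are made up for illustration, and it assumes the platform's `struct` "f" format is IEEE-754 binary32 with round-to-nearest, which is the usual case.

```python
import struct

# FP32 has a 24-bit significand (23 stored bits plus 1 implicit bit),
# so every integer up to 2**24 is exactly representable.
FP32_EXACT_LIMIT = 2 ** 24


def to_fp32(value: int) -> float:
    """Round an integer to the nearest FP32 value (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


def survives_fp32(value: int) -> bool:
    """True if the integer round-trips through FP32 unchanged (hypothetical helper)."""
    return int(to_fp32(value)) == value


if __name__ == "__main__":
    samples = (
        FP32_EXACT_LIMIT - 1,
        FP32_EXACT_LIMIT,
        FP32_EXACT_LIMIT + 1,  # first integer FP32 cannot hold exactly
        2 ** 32 - 1,           # largest uint32 value
    )
    for n in samples:
        print(n, to_fp32(n), survives_fp32(n))
```

A hash value stored as a uint32 keeps all 32 bits, whereas pushing the same value through FP32 silently drops the low bits once it exceeds 2^24. That is the kind of bug this check is meant to surface.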
Full version on Substack → Curb your memory hierarchy
Summary
Starting out in kernels and facing inexplicable bugs? Yeah, that was once me. Check out this post for a walkthrough of concepts that are usually seen as obscure in the high-level programming world. We finish with a proof that has practical debugging implications: it helps weed out integer/floating-point precision bugs.
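
The full proof lives in the Substack post. As a rough sketch of the standard fact that this kind of argument leans on (one possible framing, not necessarily the post's), assume a normal FP32 value $x$ with a 24-bit significand. The gap between neighbouring FP32 values, written here as $\operatorname{ulp}_{\mathrm{FP32}}$, depends only on the binade $e$ that $x$ falls in:

```latex
\[
  2^{e} \le |x| < 2^{e+1}
  \quad\Longrightarrow\quad
  \operatorname{ulp}_{\mathrm{FP32}}(x) = 2^{\,e-23}
\]
\[
  \operatorname{ulp}_{\mathrm{FP32}}(x) > 1
  \iff e \ge 24
\]
```

Under those assumptions, FP32 holds every integer up to $2^{24}$ exactly. In the range $[2^{24}, 2^{32})$ its neighbouring values are between 2 and 256 apart, while a uint32 keeps a spacing of 1 throughout.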
