Posts

Remark #2: The Adam update

Improving the lower bound for the unit distance problem

May 21, 2026 · 17 min · 3599 words · nor

Remark #1: On RMS matched Muon

A short note on some aspects of long context attention

Simple Rules, Complex Dynamics – Part I: Foundations & Intuition

The modded nanogpt speedrun, but in JAX and on TPUs

Theoretical properties of optimizers on a toy problem, and some intuition

August 2, 2025 · 48 min · 10137 words · nor

Deriving RoPE the proper way

July 28, 2025 · 25 min · 5177 words · nor

Solving the IMO 2025 problems

July 19, 2025 · 17 min · 3514 words · nor

Quantizing LLMs for inference

May 14, 2025 · 32 min · 6685 words · nor