Improving the lower bound for the unit distance problem
A refinement of Sawin's explicit unit-distance lower bound, mostly by GPT 5.5 Pro, for the Erdős problem recently solved by OpenAI's internal model.
A refinement of Sawin's explicit unit-distance lower bound, mostly by GPT 5.5 Pro, for the Erdős problem recently solved by OpenAI's internal model.
A writeup of porting the modded nanoGPT speedrun to pure JAX on TPU v6e, including hardware bottlenecks, bugs, optimizations, and open performance questions.
A theoretical comparison of normalized gradient descent, Muon, and Adam-style updates on a tractable matrix optimization toy problem, showing finite-time convergence and why Muon's guarantees come out nicer than Adam's.
A rigorous derivation showing RoPE is almost optimally expressive under certain natural constraints, characterizing the allowable positional rotations with a free N-dimensional generalization and constructions for the rotation vectors.
A practical overview of LLM inference quantization, from why memory bandwidth dominates local inference to GGUF, EXL, AWQ, GPTQ, KV-cache quants, and hardware tradeoffs.