nor's blog
Posts
Remark #2: The Adam update
Improving the lower bound for the unit distance problem
Remark #1: On RMS matched Muon
A short note on some aspects of long context attention
Simple Rules, Complex Dynamics – Part I: Foundations & Intuition
The modded nanogpt speedrun, but in JAX and on TPUs
Theoretical properties of optimizers on a toy problem, and some intuition
Deriving RoPE the proper way
Solving the IMO 2025 problems
Quantizing LLMs for inference