nor's blog
Posts
On RMS matched Muon
A short note on some aspects of long context attention
Simple Rules, Complex Dynamics – Part I: Foundations & Intuition
The modded nanogpt speedrun, but in JAX and on TPUs
Theoretical properties of optimizers on a toy problem, and some intuition
Deriving RoPE the proper way
Solving the IMO 2025 problems
Quantizing LLMs for inference
A Math Academy review
Calibrating Confidence