nor's blog
machine-learning
Remark #2: The Adam update
Remark #1: On RMS matched Muon
A short note on some aspects of long context attention