Hacker News
The Bayesian Geometry of Transformer Attention (arxiv.org)
2 points by samwillis 1 day ago | 1 comment




A higher-level overview, with links to the other related papers: https://medium.com/@vishalmisra/attention-is-bayesian-infere...


