
> ∑ᵢ yᵢ · K(xᵢ, xₒ) ⁄ (∑ⱼ K(xⱼ, xₒ))

That clarifies things...



It does, for anybody who studied math at the level you need to understand attention (some linear algebra). Please, no low-effort comments; ask if you don't have the math background and people will gladly help. This is summation notation; see https://en.m.wikipedia.org/wiki/Summation


No, it doesn't really clarify things. I had the best linear algebra grades in my year at my university, and if you don't know anything about kernels, this is not helpful (what are xi and yi in the first place?).


It's all described in the referenced link. No need for everyone to get antsy.

> 0. http://bactra.org/notebooks/nn-attention-and-transformers.ht...
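
For anyone still asking what xᵢ and yᵢ are: assuming the quoted formula is the usual kernel smoother (Nadaraya-Watson style), the xᵢ are inputs you have already seen, the yᵢ are the values observed at those inputs, xₒ is the new point you want a prediction for, and K is a similarity function. The prediction is just a weighted average of the yᵢ, with weights K(xᵢ, xₒ). Here is a rough sketch in Python. The Gaussian kernel, the `bandwidth` parameter, and the function names are my own choices for illustration, not something taken from the thread or the linked notebook.

```python
import math


def gaussian_kernel(x, x0, bandwidth=1.0):
    """Similarity between x and x0; an assumed choice of K, larger when they are close."""
    return math.exp(-((x - x0) ** 2) / (2 * bandwidth ** 2))


def kernel_smooth(xs, ys, x0, kernel=gaussian_kernel):
    """Compute sum_i y_i * K(x_i, x0) / sum_j K(x_j, x0)."""
    # One weight per known input: how similar it is to the query point x0.
    weights = [kernel(x, x0) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed to zero, so the average is undefined.
        raise ValueError("all kernel weights are zero; try a larger bandwidth")
    # Weighted average of the observed values.
    return sum(w * y for w, y in zip(weights, ys)) / total


if __name__ == "__main__":
    xs = [-1.0, 1.0]   # known inputs (the x_i)
    ys = [0.0, 2.0]    # observed values at those inputs (the y_i)
    # x0 = 0 is equally far from both inputs, so the two weights are equal
    # and the result is the plain average of the ys.
    print(kernel_smooth(xs, ys, 0.0))
```

If you swap the scalar inputs for vectors and use K(xᵢ, xₒ) = exp(xᵢ · xₒ / √d), the normalized weights become a softmax over dot products, which is the shape of the usual attention formula with xₒ as the query, xᵢ as the keys, and yᵢ as the values.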



