Transformers learn embedding representations of tokens, which map each token to a point in a vector space. Similar tokens are mapped to nearby points in that space. The fully connected layer at the end of each transformer block then defines a transformation from points in that space to other points in the same space, not unlike adding colors together to get a new color.
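A minimal sketch of the idea, using a hypothetical hand-written embedding table (real models learn these vectors during training): similar tokens get high cosine similarity, and combining two points yields a new point in the same space, like mixing colors.

```python
import numpy as np

# Hypothetical toy embedding table: each token maps to a 3-d vector.
# "red" and "crimson" are deliberately placed close together.
embeddings = {
    "red":     np.array([0.9, 0.1, 0.0]),
    "crimson": np.array([0.8, 0.2, 0.1]),
    "banana":  np.array([0.0, 0.9, 0.3]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, 0.0 means orthogonal.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Similar tokens land in similar places in the space.
print(cosine(embeddings["red"], embeddings["crimson"]))  # high (near 1)
print(cosine(embeddings["red"], embeddings["banana"]))   # much lower

# Combining a set of points into a new point in the same space,
# like averaging colors: the mean of two embeddings is itself a point
# that a downstream layer could operate on.
blend = (embeddings["red"] + embeddings["crimson"]) / 2
```

This is only an illustration of the geometry; a real transformer's fully connected layer is a learned affine map plus nonlinearity, not a simple average.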

