This is very much what dspy aims to address: learning the incantations necessary to prompt well can be replaced by an algorithmic optimisation loop and a set of labelled examples.
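Very roughly, the loop looks like this. A minimal sketch using dspy's documented Predict + BootstrapFewShot pattern; exact class names and LM setup vary between dspy versions, and the model string, examples and metric here are placeholders:

```python
# Sketch only: dspy class names / LM setup differ between versions.
import dspy
from dspy.teleprompt import BootstrapFewShot

# Point dspy at whatever LM you actually use (model string is a placeholder).
dspy.settings.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Declare *what* you want, not the prompt wording.
qa = dspy.Predict("question -> answer")

# A handful of labelled examples stands in for hand-tuned incantations.
trainset = [
    dspy.Example(question="What colour is the sky?", answer="blue").with_inputs("question"),
    dspy.Example(question="What is 2 + 2?", answer="4").with_inputs("question"),
]

def exact_match(example, pred, trace=None):
    # The metric the optimiser tries to maximise over the trainset.
    return example.answer.lower() in pred.answer.lower()

# The algorithmic loop: search for demonstrations/prompts that score well on the metric.
compiled_qa = BootstrapFewShot(metric=exact_match).compile(qa, trainset=trainset)
print(compiled_qa(question="What colour is grass?").answer)
```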
This paper is always useful to remember when someone tells you to just use cosine similarity: https://arxiv.org/abs/2403.05440
Sure, it’s useful, but make sure it’s appropriate for your embeddings, and remember to run evals to check that things are ‘similar’ in the way you actually want them to be similar.
E.g., is red similar to green just because they are both colors?
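A tiny eval along those lines (toy vectors only; swap in embeddings from your actual model, and the expectation triples encode what *you* mean by ‘similar’):

```python
# Sanity eval: does cosine similarity rank pairs the way you mean "similar"?
# The vectors below are made up for illustration; substitute real embeddings.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

embed = {  # pretend outputs of some embedding model
    "red":     np.array([0.90, 0.10, 0.30]),
    "crimson": np.array([0.92, 0.08, 0.28]),
    "green":   np.array([0.80, 0.20, 0.35]),
}

# Each case says: for this anchor, X should score higher than Y under our notion of similar.
eval_cases = [
    ("red", "crimson", "green"),  # a near-synonym should beat a merely related colour
]

for anchor, should_win, should_lose in eval_cases:
    win = cosine(embed[anchor], embed[should_win])
    lose = cosine(embed[anchor], embed[should_lose])
    status = "OK  " if win > lose else "FAIL"
    print(f"{status} {anchor}: sim({should_win})={win:.3f} vs sim({should_lose})={lose:.3f}")
```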
This was created by Steve Ruiz, whose timeline is full of interesting little thoughts played out during its development. It's a fascinating insight into the minutiae that users take for granted but that make the difference: https://twitter.com/steveruizok
Transformers suffer from a quadratic bottleneck when calculating attention. Much work has been done investigating where memory can be saved by being more selective about which attention scores to calculate, and this repo implements transformers with a number of these noted improvements.
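For reference, here is the vanilla single-head attention that causes the problem; the (n, n) scores matrix is the quadratic term that the efficient variants try to avoid materialising in full (plain PyTorch sketch, no batching or masking):

```python
# Why attention is quadratic in sequence length: the scores matrix is (n, n).
import torch

def attention(q, k, v):
    # q, k, v: (n, d) for a sequence of n tokens with head dimension d
    d = q.shape[-1]
    scores = q @ k.T / d**0.5          # (n, n)  <- the quadratic bottleneck
    weights = torch.softmax(scores, dim=-1)
    return weights @ v                 # (n, d)

n, d = 4096, 64
q, k, v = (torch.randn(n, d) for _ in range(3))
out = attention(q, k, v)

# The (n, n) scores matrix alone holds n*n float32s: 4096^2 * 4 bytes = 64 MiB per head,
# which is what sparse / memory-efficient attention variants avoid storing all at once.
print(out.shape, n * n * 4 / 2**20, "MiB of scores per head")
```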
For all common models (GloVe, fastText, word2vec) the means across word embeddings are tightly concentrated around zero (relative to their dimensions), thus making the widely used cosine similarity practically equivalent to Pearson correlation: https://www.aclweb.org/anthology/N19-1100/
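The equivalence is easy to check yourself, since Pearson correlation is just cosine similarity on mean-centred vectors (the synthetic vectors below stand in for real GloVe/fastText/word2vec embeddings):

```python
# Pearson correlation == cosine similarity after subtracting each vector's mean,
# so if a vector's component mean is already ~0, the two measures nearly coincide.
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def pearson(a, b):
    return cosine(a - a.mean(), b - b.mean())

d = 300  # typical word-embedding dimensionality
x, y = rng.normal(0, 1, d), rng.normal(0, 1, d)  # component means ~0, as the paper observes

print(f"cosine  = {cosine(x, y):+.4f}")
print(f"pearson = {pearson(x, y):+.4f}")  # nearly identical when the means are ~0
```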
Big fan, my dude! While FastAPI is amazing, the docs for it are a work of art. I know a few people who have used just the FastAPI docs to learn what APIs are and how they work, never mind how to use FastAPI itself.
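For anyone who hasn't seen them, this is roughly the kind of first example the tutorial opens with: declare a route with type hints and you get parsing, validation and the interactive /docs page for free.

```python
# Minimal FastAPI app in the spirit of the tutorial's opening examples.
from fastapi import FastAPI

app = FastAPI()

@app.get("/items/{item_id}")
def read_item(item_id: int, q: str | None = None):
    # Path and query parameters are parsed and validated from the type hints.
    return {"item_id": item_id, "q": q}

# Run with: uvicorn main:app --reload   (assuming this file is main.py)
```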