
I was running one of them, and entering Kaggle competitions throughout 2021 and 2022 using them. Many efforts and uses of sentence-transformers (and new PhD projects) were thrown in the trash with the InstructGPT models and ChatGPT. I mean, it's like developing a much better bicycle (let's say an ebike), but then cars come out. It was like that.

The future looked incredibly creative with cross-encoders, things like semantic paths, using the latent space to classify - everything was exciting. An all-in-one LLM that eclipsed embeddings on everything but speed for these things was a bit of a killjoy.

Companies that changed their existing indexing to use sentence transformers aren't exactly innovating; that kind of migration has happened once or twice a decade for the last few decades. This was the parent's point, I believe, in a way. And tbh, the improvement in results has never been noticeable to me; exact match is already 90% of the solution to retrieval (maybe not search) - we just take it for granted because we are so used to it.

I fully believe that in a world without GPT-3, HN would be full of sentence transformers and other cool technology being used in demos and in creative ways, compared to how rarely you see them now.



Also, people seem to have forgotten that the whole technique behind sentence transformers (pooling token embeddings) works as a form of "medium-term" memory, in between "long-term" (vector DB retrieval) and "short-term" (the prompt).

You can compress a large number N of token embeddings into a smaller number of embeddings, with some loss of information, using pooling techniques like the ones in sentence transformers.
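A minimal sketch of that pooling, assuming the sentence-transformers/all-MiniLM-L6-v2 checkpoint from Hugging Face (any encoder would do): it mean-pools the token embeddings into one fixed-size vector per text; pooling over windows instead of the whole sequence gives you the smaller-N compression I'm describing.

    import torch
    from transformers import AutoTokenizer, AutoModel

    name = "sentence-transformers/all-MiniLM-L6-v2"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)

    def mean_pool(texts):
        enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            tok = model(**enc).last_hidden_state            # (batch, seq, dim)
        mask = enc["attention_mask"].unsqueeze(-1).float()  # ignore padding
        return (tok * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

    vecs = mean_pool(["an ebike is a better bicycle", "then cars came out"])
    print(vecs.shape)  # (2, 384) for this checkpoint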

But I've literally gotten into fights here on HN with people who claimed that "if this was so easy people would be doing it" and other BS. The reality is that LLMs and embedding techniques are still massively undertooled. For another example, why can't I average-pool tokens in ChatGPT, so that I could ask "What is the definition of {apple|orange}"? This is notably easy to do in Stable Diffusion land and it even works in LLMs - yet even "greats" in our field will fight me in the comments when I post this[1] again and again, while I'm desperately trying to get a properly good programmer to implement it for production use cases...

[1] https://gist.github.com/Hellisotherpeople/45c619ee22aac6865c...
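For the "{apple|orange}" idea, here's a hedged sketch assuming gpt2 and a recent transformers release where generate() accepts inputs_embeds: average the two words' input embeddings and splice them into the prompt, the same spirit as prompt blending in Stable Diffusion land.

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    emb = model.get_input_embeddings()

    def embed(text):
        ids = tokenizer(text, return_tensors="pt").input_ids
        return emb(ids)  # (1, seq, dim)

    prefix = embed("What is the definition of")
    # assumes " apple" and " orange" tokenize to the same number of tokens
    # (each is a single GPT-2 token); otherwise pool each span first
    blended = (embed(" apple") + embed(" orange")) / 2

    inputs_embeds = torch.cat([prefix, blended], dim=1)
    out = model.generate(inputs_embeds=inputs_embeds, max_new_tokens=30,
                         pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(out[0], skip_special_tokens=True))

How sensible the blended answer is depends heavily on the model; the point is just that the plumbing for this already exists and almost nobody exposes it.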


Share use cases?


>Many efforts and uses of Sentence-transformers (and new PhD projects) were thrown in the trash with Instruct GPT models and ChatGPT.

There still exists a need for fast and cheap models where LLMs do not make sense.



