Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It's impressive, but I have one problem with all of those models. I wanted them to answer what Mixtral or Llama2 are, but with no luck. It would be great if models could at least describe themselves.


There are two issues with that.

1. To create a model, you have to train it on training data. Mixtral and Llama2 did not exist before they were trained, so their training data did not contain any information about Mixtral or Llama2 (respectively). You could train it on fake data, but that might not work that well because:

2. The internet is full of text like "I am <something>", so it would probably overshadow any injected training data like "I am Llama2, a model by MetaAI."

You could of course inject the information as an invisible system prompt (like OpenAI is doing with ChatGPT), but that is a waste of computation resources.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: