
Exactly. A temperature of 0 means you always pick the highest-probability token (i.e. the "argmax" function), while a temperature of 1 means you sample a token at random according to the probabilities the model assigns. Values in between are also possible; they sharpen the distribution toward the most likely tokens.
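
For anyone who wants to see it concretely, here is a minimal sketch of temperature sampling, assuming the model hands back a plain list of raw scores (logits) per token. The function names `apply_temperature` and `sample_token` are made up for illustration and are not from any particular library.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Turn raw logits into a sampling distribution at the given temperature."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Limit case: put all probability on the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    # Dividing by the temperature sharpens (T < 1) or flattens (T > 1) the scores.
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max before exp() for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng=random):
    """Draw one token index from the temperature-adjusted distribution."""
    probs = apply_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]
```

With temperature 0, `sample_token` returns the same index every time; with temperature 1, it draws from the unmodified softmax of the logits.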

However, it's important to note that the numbers the model returns for the next tokens can only be interpreted as probabilities for foundation models. For fine-tuned models (instruction SL, RLHF), the numbers instead represent how good the model judges the next token to be. This also leads to a phenomenon called mode collapse.



Very interesting to know (about foundation vs. fine-tuned models)! Thanks.



