A language model takes in a sequence of tokens and outputs a probability (between 0 and 1) for each token in its vocabulary (the set of all tokens the model knows); together these probabilities form a distribution that sums to 1. Various sampling strategies can then be applied to this distribution to choose which token to actually show to the user.
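As a minimal sketch of this idea (assuming a made-up five-token vocabulary and hand-picked logits rather than a real model's output), the snippet below converts raw scores into a probability distribution with softmax and then contrasts two of the simplest strategies: greedy decoding, which always takes the most likely token, and plain sampling, which draws a token at random in proportion to its probability.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Convert raw model scores (logits) into a probability distribution."""
    shifted = logits - np.max(logits)  # subtract max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical logits for a toy 5-token vocabulary (not from a real model).
vocab = ["the", "a", "cat", "dog", "<eos>"]
logits = np.array([2.0, 1.5, 0.3, 0.2, -1.0])

probs = softmax(logits)  # one probability (0-1) per token, summing to 1

# Greedy decoding: always pick the single most likely token.
greedy_token = vocab[int(np.argmax(probs))]

# Plain sampling: draw a token at random, weighted by its probability.
rng = np.random.default_rng(seed=0)
sampled_token = vocab[rng.choice(len(vocab), p=probs)]

print(f"probabilities: {dict(zip(vocab, probs.round(3)))}")
print(f"greedy choice:  {greedy_token}")
print(f"sampled choice: {sampled_token}")
```

Greedy decoding is deterministic, so repeated runs always produce the same token; sampling introduces variety, which is what later strategies (temperature, top-k, top-p, and so on) tune and constrain.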