I’m asking because I’m writing a short post about how LLMs work and want to explain the probabilities. They make sense for the words displayed, but they don’t obviously add up to 100%, even if one bucket is “all other possibilities”, which I don’t see. Why does the next line have a probability?
Sorry, I want to be helpful, but I don’t fully understand the question. To get to 100%, you’d need to show the probabilities for every token in the vocabulary (of which there are over 50k). I think they are just showing the logit of the top probability in case that’s how someone wants to see the data represented.
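If it helps for the post, here is a minimal sketch of why a handful of displayed probabilities falls short of 100%. Everything in it is an assumption for illustration: the logits are random numbers rather than output from a real model, the vocabulary size of 50,000 is just a round figure in the range mentioned above, and showing five tokens is an arbitrary choice.

```python
import math
import random


def softmax(logits):
    """Turn raw scores (logits) into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = math.fsum(exps)
    return [e / total for e in exps]


random.seed(0)

VOCAB_SIZE = 50_000  # assumed round number, not any specific model's vocabulary
TOP_K = 5            # assumed number of tokens a UI might display

# Made-up logits standing in for a model's scores over the whole vocabulary.
logits = [random.gauss(0.0, 2.0) for _ in range(VOCAB_SIZE)]
probs = softmax(logits)

# Only the few most likely tokens are shown; the rest are hidden.
top_probs = sorted(probs, reverse=True)[:TOP_K]
shown_mass = math.fsum(top_probs)
hidden_mass = 1.0 - shown_mass  # the "all other possibilities" bucket

print(f"sum over all {VOCAB_SIZE} tokens: {math.fsum(probs):.6f}")
print(f"sum over the top {TOP_K} tokens: {shown_mass:.6f}")
print(f"mass in the tokens not shown: {hidden_mass:.6f}")
```

The full distribution sums to 1, but the displayed tokens cover only part of it. The missing piece is spread across all the tokens that are not listed.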