Which term describes a generation method that uses the smallest set of tokens whose cumulative probability reaches a threshold?

Prepare for the GARP Risk and AI (RAI) Exam with targeted quizzes. Utilize flashcards, multiple-choice questions, and detailed explanations to enhance learning. Ace your exam with our comprehensive quiz!

Multiple Choice

Which term describes a generation method that uses the smallest set of tokens whose cumulative probability reaches a threshold?

Explanation:
The question tests how a generation method selects a subset of candidate tokens by probability mass. Top-P (Nucleus) Sampling sorts tokens by probability, keeps the smallest set whose cumulative probability reaches a chosen threshold p, renormalizes the probabilities within that set, and samples from it. The size of the candidate pool therefore adapts to the model’s current distribution: if the distribution is sharply peaked, only a few tokens are needed; if it is spread out, more tokens are included. The threshold controls diversity: a lower threshold restricts sampling to fewer, higher-probability tokens, while a higher threshold admits more options and produces more varied output. Top-K Sampling, by contrast, always keeps a fixed number of top tokens regardless of their combined probability. Temperature rescales the logits (dividing them by the temperature before the softmax) to make the distribution more or less peaked, without changing how many tokens are considered. Chain of Thought prompting encourages the model to show its reasoning steps and has nothing to do with token selection.
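To make the selection rule concrete, here is a minimal Python sketch of nucleus sampling over a toy probability list. The function names (`softmax`, `nucleus`, `top_k`, `top_p_sample`) and the example distributions are illustrative assumptions, not part of any particular library; real inference stacks typically apply the same idea to tensors of logits.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature only rescales the logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus(probs, top_p):
    """Indices of the smallest set whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


def top_k(probs, k):
    """Indices of the k most probable tokens, regardless of their combined mass."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]


def top_p_sample(probs, top_p, rng=random):
    """Sample one token index from the nucleus."""
    kept = nucleus(probs, top_p)
    # random.choices treats weights as relative, so this renormalizes implicitly.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


# Hypothetical distributions over a four-token vocabulary.
peaked = [0.90, 0.05, 0.03, 0.02]
flat = [0.25, 0.25, 0.25, 0.25]

print(len(nucleus(peaked, 0.8)), len(nucleus(flat, 0.8)))  # pool size adapts
print(len(top_k(peaked, 2)), len(top_k(flat, 2)))          # pool size is fixed
```

With the same threshold, the peaked distribution yields a one-token pool while the flat one keeps all four tokens, whereas `top_k` returns two tokens in both cases. Lowering `temperature` in `softmax` sharpens the probabilities but leaves the vocabulary size untouched.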

