Few-Shot Learning
Also known as: Few-Shot, In-Context Learning, k-Shot Learning
A machine learning approach where models learn to perform tasks from only a small number of examples, typically provided within the prompt or during a brief adaptation phase.
Overview
Few-shot learning enables AI models to adapt to new tasks using only a handful of examples, rather than requiring thousands or millions of labeled training samples. In the context of large language models, few-shot learning typically means providing a small number of input-output examples directly in the prompt to demonstrate the desired task pattern.
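The prompt-based mechanism can be sketched as follows. This is a minimal illustration, not any particular library's API; the sentiment task and the example reviews are hypothetical.

```python
# Few-shot prompting: a handful of input-output examples are written
# directly into the prompt, followed by the new input for the model
# to complete. Task and examples are hypothetical.
EXAMPLES = [
    ("The movie was fantastic!", "positive"),
    ("Terrible service, never again.", "negative"),
    ("It was fine, nothing special.", "neutral"),
]

def build_few_shot_prompt(query: str) -> str:
    parts = ["Classify the sentiment of each review as positive, negative, or neutral."]
    for text, label in EXAMPLES:
        parts.append(f"Review: {text}\nSentiment: {label}")
    parts.append(f"Review: {query}\nSentiment:")  # model fills in the label
    return "\n\n".join(parts)
```

The model infers the task pattern from the demonstrations and continues it for the final, unlabeled input.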
Variants
Zero-Shot
The model performs a task with no examples — only a natural language description of what's needed. This works because LLMs have learned general task-solving patterns during pre-training.
One-Shot
A single example is provided to demonstrate the desired behavior. Even one example can dramatically improve performance on structured output tasks.
Few-Shot (k-Shot)
Multiple examples (typically 2-10) are provided. More examples generally improve performance but consume more of the context window.
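The three variants differ only in how many demonstrations precede the query, which a single parameter can capture. A sketch, with a hypothetical translation task and word list:

```python
# Zero-, one-, and k-shot prompts built from the same template:
# k = 0 gives zero-shot, k = 1 one-shot, k >= 2 few-shot.
DEMOS = [
    ("cat", "chat"),
    ("dog", "chien"),
    ("house", "maison"),
]

def k_shot_prompt(query: str, k: int = 0) -> str:
    parts = ["Translate each English word to French."]
    for en, fr in DEMOS[:k]:  # include the first k demonstrations
        parts.append(f"English: {en}\nFrench: {fr}")
    parts.append(f"English: {query}\nFrench:")
    return "\n".join(parts)
```

With k = 0 the prompt is just the instruction plus the query; each increment of k trades context-window tokens for an additional demonstration.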
Context Management Considerations
Few-shot learning is a direct application of context management. Each example consumes tokens from the context window, so there's a tension between providing more examples (for better task understanding) and leaving room for the actual input data. Effective few-shot implementations carefully select the most representative and informative examples, often using semantic similarity to choose examples that are most relevant to the current query.
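The selection step can be sketched as below. Word-overlap (Jaccard) similarity stands in for the embedding-based semantic similarity a production system would use, and the whitespace token count is a crude estimate; the support-ticket examples are hypothetical.

```python
# Similarity-based example selection under a token budget: rank the
# candidate (input, output) pairs by similarity to the query, then
# greedily keep the best ones that fit.
def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def select_examples(query: str, pool: list, token_budget: int) -> list:
    ranked = sorted(pool, key=lambda ex: jaccard(query, ex[0]), reverse=True)
    chosen, used = [], 0
    for text, label in ranked:
        cost = len(text.split()) + len(label.split())  # crude token estimate
        if used + cost > token_budget:
            break
        chosen.append((text, label))
        used += cost
    return chosen
```

The budget makes the tension explicit: a larger `token_budget` admits more demonstrations but leaves less room for the actual input data.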
Best Practices
- Diverse Examples: Include examples that cover different edge cases and patterns
- Consistent Format: Use identical formatting across all examples
- Relevant Examples: Select examples most similar to the expected input
- Optimal Ordering: Place the most relevant examples closest to the query
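The ordering practice can be sketched as a simple sort. Shared-word count stands in for a real relevance score; the example pairs are hypothetical.

```python
# Ordering heuristic: sort selected examples by ascending relevance so
# the most relevant one lands last, i.e. closest to the query at the
# bottom of the prompt.
def relevance(query: str, example_input: str) -> int:
    return len(set(query.lower().split()) & set(example_input.lower().split()))

def order_for_prompt(query: str, examples: list) -> list:
    return sorted(examples, key=lambda ex: relevance(query, ex[0]))
```

Placing the strongest demonstration adjacent to the query exploits the tendency of models to weight nearby context more heavily.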
Related Terms
Context Window
The maximum amount of text (measured in tokens) that a language model can process in a single interaction, determining how much information the model can consider when generating a response.
Large Language Model
A type of AI model trained on vast amounts of text data that can understand, generate, and manipulate human language, typically based on the transformer architecture with billions of parameters.
Machine Learning
A subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed, using algorithms that identify patterns in data.
Prompt Engineering
The practice of designing, optimizing, and structuring inputs (prompts) to AI language models to elicit desired outputs, including techniques for instruction formatting, context provision, and output specification.