
Attention Mechanism

Common forms: Self-Attention, Scaled Dot-Product Attention, Multi-Head Attention

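A neural network component that lets a model weight each element of an input sequence by its relevance to the element currently being processed. The weights are computed dynamically from the input itself rather than fixed in advance: each query is compared against every key, the resulting scores are normalized with a softmax, and the output is the correspondingly weighted sum of the values.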
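To make the definition concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under stated assumptions, not a reference implementation: the function names `softmax` and `attention` are chosen for this example, the inputs are small lists of floats, and the learned query, key, and value projections used in real transformer layers are omitted, so the same vectors serve as queries, keys, and values.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of equal-length vectors.

    Returns (outputs, weights). With causal=True, position i may only
    attend to positions j <= i (assumes queries and keys are aligned).
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Attention score: dot product of the query with each key, scaled by sqrt(d_k).
        scores = [
            sum(qc * kc for qc, kc in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        if causal:
            # Mask out future positions so they receive zero weight.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Output: weighted sum of the value vectors.
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Self-attention on a toy sequence of three 2-dimensional vectors.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(x, x, x)                 # bidirectional
causal_outputs, causal_weights = attention(x, x, x, causal=True)
```

In the causal call, the first position can only attend to itself, so its weight row is entirely concentrated on the first element. In the bidirectional call, every position distributes weight across all three elements. Multi-head attention would run several such computations in parallel on different learned projections and concatenate the results, and cross-attention would take the queries from one sequence and the keys and values from another.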

Self-attention, multi-head attention, cross-attention, query-key-value, softmax, attention weights, transformer architecture, scaled dot-product attention, causal attention, bidirectional attention, attention score, context-aware processing
