Researchers from Carnegie Mellon University, San Diego State University, and Stanford University have unveiled their latest research paper, titled ‘Gaussian Adaptive Attention is All You Need: Robust Contextual Representations Across Multiple Modalities’, which aims to reshape contextual representations.
The paper’s authors are Georgios Ioannides, Aman Chadha, and Aaron Elkins.
At the core of their research lies the Multi-Head Gaussian Adaptive Attention Mechanism (GAAM) and the Gaussian Adaptive Transformer (GAT), designed to elevate contextual representations across diverse modalities, including speech, text, and vision.
GAAM introduces learnable mean and variance parameters into its attention mechanism, yielding a notable improvement in model performance.
Together, GAAM and GAT form a fully learnable probabilistic attention framework: the learnable mean and variance parameters let the model dynamically recalibrate feature importance, substantially enhancing its representational capacity.
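The paper does not reproduce its implementation here, but the core idea of Gaussian-modulated feature weighting can be sketched as follows. This is a minimal single-head illustration, assuming a functional form in which each feature's weight is its Gaussian density under a learnable mean offset and variance; the authors' exact parameterisation may differ.

```python
import numpy as np

def gaussian_adaptive_attention(x, mu_offset=0.0, sigma=1.0):
    """Single-head sketch of Gaussian adaptive feature weighting.

    x: (seq_len, n_features) array. mu_offset and sigma stand in for the
    learnable mean-offset and variance parameters described in the paper;
    this exact functional form is an assumption, not the authors' code.
    """
    # The Gaussian's mean tracks the feature mean plus a learnable offset.
    mu = x.mean(axis=-1, keepdims=True) + mu_offset
    # Gaussian density of each feature under the learned mean/variance.
    scores = np.exp(-((x - mu) ** 2) / (2.0 * sigma**2))
    # Normalise into a probability-like weighting over features.
    weights = scores / scores.sum(axis=-1, keepdims=True)
    # Reweight (gate) the features by their Gaussian importance.
    return weights * x, weights
```

Because the mean and variance are learned rather than fixed, the model can shift and widen or narrow the Gaussian during training, which is what allows the dynamic recalibration of feature importance described above.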
The researchers also introduce the Importance Factor (IF), a novel learning-based metric that quantitatively evaluates feature significance, improving the explainability and interpretability of GAAM-based models.
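As a rough illustration of what such a metric could look like, the sketch below aggregates a GAAM head's attention weights into one score per feature. The paper's exact IF formulation is not reproduced here; this is only a hypothetical example of a learning-based, quantitative feature-significance score.

```python
import numpy as np

def importance_factor(weights):
    """Hypothetical Importance Factor sketch: collapse GAAM attention
    weights over a batch into one significance score per feature.
    The paper's actual IF definition may differ.
    """
    # weights: (batch, seq_len, n_features) Gaussian attention weights.
    per_feature = weights.mean(axis=(0, 1))
    # Normalise so the factors sum to one and are comparable across runs.
    return per_feature / per_feature.sum()
```

A score of this kind is derived from learned parameters rather than post-hoc probing, which is what makes the metric "learning-based".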
Through rigorous testing across multiple modalities, the study validates the effectiveness of GAAM within GAT. The findings showcase its superiority in handling highly non-stationary data compared to conventional dot-product attention and earlier Gaussian-based attention mechanisms.
The paper demonstrates the seamless integration of GAAM with Grouped Query Attention, highlighting its compatibility with existing pre-trained models (PTMs). The integration improves performance with only a marginal increase in learnable parameters.
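For readers unfamiliar with the Grouped Query Attention side of that integration, a minimal sketch follows: several query heads share each key/value head, which is what keeps the parameter increase marginal. The plain dot-product form is shown for clarity; in the paper's setup, GAAM's Gaussian reweighting would additionally modulate the attention computation.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Minimal Grouped Query Attention (GQA) sketch.

    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d), with
    n_q_heads divisible by n_kv_heads. Each group of query heads
    attends against one shared key/value head.
    """
    n_q_heads, seq, d = q.shape
    group = n_q_heads // k.shape[0]          # query heads per shared KV head
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group                      # the KV head this query head shares
        scores = q[h] @ k[kv].T / np.sqrt(d)  # scaled dot-product scores
        scores -= scores.max(axis=-1, keepdims=True)  # softmax stabilisation
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)   # softmax over keys
        out[h] = w @ v[kv]
    return out
```

Sharing key/value projections across query-head groups shrinks the KV parameter count and cache, which is why attaching a new attention variant on top of GQA can stay cheap in added parameters.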