Transformer

Core concepts Last reviewed 2026-07-01

Definition The neural network architecture behind modern language models, introduced by Google researchers in 2017. Its attention mechanism lets a model weigh how every word in a passage relates to every other, enabling fluent long-form text.

In more depth

Transformers replaced earlier architectures that processed text strictly one word at a time, making it practical to train on vastly larger datasets in parallel. The self-attention mechanism is what lets models track references and context across long documents. Nearly every prominent generative AI system is built on this architecture; the letter T in GPT stands for transformer.

Related terms

About the editor: MHSB Solutions, Research desk. MHSB Solutions is not a law firm. This glossary is educational information, not legal advice.

Educational information, not legal advice. AI terminology and tools change quickly; definitions reflect usage as of the last-updated date. For what bar associations and courts actually require of lawyers using AI, see legalaicompliance.help and consult a licensed attorney in your jurisdiction.