LLM and AI Glossary
All the important Artificial Intelligence (AI) and LLM terms you should know, in a comprehensive glossary curated by GoML experts.
Jump to section:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A
Agents
An AI agent is a computer program or system that can act on its own to complete tasks for you or another system.
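A minimal sketch of the idea in Python: the agent repeatedly perceives its environment, decides on an action, and acts, until nothing is left to do. The task list and helper functions here are hypothetical placeholders; a production agent would typically call an LLM or a planner inside decide().

```python
# Minimal perceive-decide-act loop; tasks and helpers are made up.

def perceive(environment):
    """Read the current state of the world (here, a plain dict)."""
    return environment["pending_tasks"]

def decide(tasks):
    """Pick the next action; a real agent might call an LLM here."""
    return ("complete", tasks[0]) if tasks else ("stop", None)

def act(environment, action, task):
    """Apply the chosen action back to the environment."""
    if action == "complete":
        environment["pending_tasks"].remove(task)
        environment["done"].append(task)

environment = {"pending_tasks": ["book flight", "send report"], "done": []}
while True:
    tasks = perceive(environment)
    action, task = decide(tasks)
    if action == "stop":
        break
    act(environment, action, task)

print(environment["done"])  # ['book flight', 'send report']
```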
AutoML
Automated Machine Learning (AutoML) uses AI to handle the tricky parts of building machine learning models, which means people who aren't ML specialists can still build effective models, and the whole process gets faster for everyone.
Anomaly Detection
Anomaly detection is a smart system that automatically flags odd things in data, whether it's a suspicious credit card transaction or a sensor reading that's way off.
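A minimal NumPy sketch on made-up sensor readings: values more than two standard deviations from the mean get flagged.

```python
import numpy as np

# Flag readings more than 2 standard deviations from the mean.
# The sensor data here is made up for illustration.
readings = np.array([20.1, 19.8, 20.3, 20.0, 35.7, 19.9, 20.2])

z_scores = (readings - readings.mean()) / readings.std()
anomalies = readings[np.abs(z_scores) > 2.0]
print(anomalies)  # [35.7]
```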
Augmented Reality
Augmented reality (AR) is a technology that overlays digital content, like pictures, videos, or 3D objects, on top of what you're seeing in the real world, letting you interact with it.
Attention Mechanism
The attention mechanism allows AI models to intelligently pinpoint and focus on the most relevant parts of the data they receive, leading to more accurate predictions.
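A minimal NumPy sketch of scaled dot-product attention, the standard formulation: each query scores every key, the scores are softmax-normalized, and the values are mixed accordingly. The shapes and data below are arbitrary.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Each output row is a weighted mix of V's rows, with weights
    given by how well the query matches each key."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity of each query to each key
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 4)), rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```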
B
Blockchain
A blockchain is a secure, shared digital ledger where every transaction is permanently recorded and visible to everyone in the network.
Bias Mitigation
Bias mitigation is actively finding and reducing unfair prejudices within organizations or AI systems, ensuring more equitable outcomes.
BiGAN (Generative Model)
A BiGAN (bidirectional GAN) is a type of GAN that learns to generate data while also understanding its underlying representations, useful for unsupervised learning.
Bayesian Networks
A Bayesian network is a model using a graph to show how different things are probabilistically related, helping to reason with uncertainty.
C
Chatbot
A chatbot is a computer program simulating human conversation, often using AI to understand input and generate automated responses.
Context Length
Context length is how much text an AI can read and remember at once to understand and respond accurately.
Continual Learning
Continual learning lets AI keep learning new things over time without forgetting what it already knows, like humans do.
Cognitive Computing
Cognitive computing is AI designed to learn, reason, and understand like humans, processing information and making decisions in complex ways.
Contrastive Learning
Contrastive learning is a way for AI to learn by comparing things, pulling similar representations closer together and pushing different ones apart.
D
Deepfake
Deepfakes are fake images, videos, or audio created or changed using AI or special software to look real but aren’t.
Deep Learning
Deep learning is an AI technique where computers learn to understand data by mimicking how the human brain recognizes patterns and makes predictions.
Data-Centric AI
Data-centric AI focuses on improving training data quality and quantity to make AI perform better, rather than just changing the model.
Data Engineering
Data engineering builds systems to collect, store, and process data so it can be used for analysis, science, or machine learning.
Data Science
Data science uses math, coding, and tools to understand and find useful insights from messy, structured, or unstructured data.
E
Edge Computing
Edge computing processes and stores data near its source, like devices or users, to speed up responses and reduce delays.
Explainable AI (XAI)
AI systems designed to provide clear explanations for their decisions and reasoning processes to users.
Elastic Weight Consolidation
Elastic Weight Consolidation (EWC) is a technique that helps AI models remember their old knowledge while still learning new things, by protecting the weights that mattered most for earlier tasks.
Ensemble Learning
Ensemble learning combines many models to improve prediction accuracy by using their combined results instead of relying on just one.
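A minimal sketch with three hypothetical stub classifiers combined by majority vote, one of the simplest ensembling schemes:

```python
from collections import Counter

# Three toy "models" that each classify an input; the ensemble
# returns the majority vote. The models here are hypothetical stubs.
def model_a(x): return "spam" if "offer" in x else "ham"
def model_b(x): return "spam" if "free" in x else "ham"
def model_c(x): return "spam" if len(x) > 40 else "ham"

def ensemble_predict(x, models):
    votes = [m(x) for m in models]
    return Counter(votes).most_common(1)[0][0]

print(ensemble_predict("free offer, click now", [model_a, model_b, model_c]))  # spam
```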
Evidential Deep Learning
Evidential Deep Learning (EDL) helps AI models admit when they're uncertain about their predictions, making them more honest and trustworthy, like a friend who's upfront about what they do and don't know for sure.
F
Fine Tuning
Further training a pretrained AI model on custom data to adapt it for specialized tasks or domains.
Frontier AI
The most advanced AI systems that push the boundaries of current capabilities and represent cutting-edge technological development.
Foundation Models
Large AI models trained on vast datasets that can be applied across many different use cases.
Federated Learning
Multiple devices train AI models collaboratively while keeping their data local and decentralized, rather than centrally stored.
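A minimal NumPy sketch of federated averaging (FedAvg), one common aggregation scheme: the server averages clients' model weights rather than collecting their data. The client weights below are made up, and real local training is omitted.

```python
import numpy as np

# FedAvg in miniature: only weights travel to the server, never raw data.
def federated_average(client_weights):
    """Element-wise mean of each layer's weights across clients."""
    return [np.mean(layer, axis=0) for layer in zip(*client_weights)]

# Three clients, each holding weights for a hypothetical two-layer model.
client_weights = [
    [np.array([1.0, 2.0]), np.array([0.5])],
    [np.array([3.0, 4.0]), np.array([1.5])],
    [np.array([2.0, 3.0]), np.array([1.0])],
]
global_weights = federated_average(client_weights)
print(global_weights)  # [array([2., 3.]), array([1.])]
```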
Few-shot Learning
AI technique that learns new tasks using only a few examples or demonstrations, rather than large datasets.
G
Generative AI
AI systems that create new content like text, images, music, or code based on training data patterns.
Gaussian Processes
Statistical models that use probability distributions to make predictions with uncertainty estimates for continuous data problems.
Grounded Language Learning
AI learning language by connecting words to real-world experiences, objects, and actions rather than just text.
Graph Neural Networks (GNNs)
Neural networks designed to analyze graph data structures, making predictions based on nodes, edges, and relationships.
Generative Adversarial Networks (GANs)
Two competing neural networks where one generates fake data and another detects it, improving both continuously.
H
Hybrid AI
Combining different AI approaches like symbolic reasoning and neural networks to leverage strengths of multiple methods.
Hallucinations
When AI models generate false or nonsensical information that seems plausible but isn't grounded in their training data or reality.
Hyperparameter Tuning
Adjusting model settings like learning rate and batch size to improve performance through systematic testing and optimization.
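A minimal sketch of grid search, the most basic tuning strategy: try every combination and keep the best. The validation_score function is a made-up stand-in for training and evaluating a real model, which is where the actual cost lies.

```python
from itertools import product

# Made-up proxy for "train a model with these settings and return
# validation accuracy"; it peaks at lr=0.01, batch_size=64.
def validation_score(learning_rate, batch_size):
    return -(learning_rate - 0.01) ** 2 - 0.0001 * abs(batch_size - 64)

grid = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [16, 64, 256]}

# Exhaustively evaluate every combination and keep the best one.
best = max(
    product(grid["learning_rate"], grid["batch_size"]),
    key=lambda params: validation_score(*params),
)
print(best)  # (0.01, 64)
```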
Hallucination Detection (in LLMs)
Methods to identify when language models generate false, misleading, or fabricated information during text generation.
Hierarchical Reinforcement Learning
Breaking complex tasks into smaller subtasks with different learning levels to improve AI decision-making efficiency.
I
Interpretability
Making AI model decisions understandable to humans by explaining how inputs lead to specific outputs or predictions.
Image Segmentation
Dividing images into meaningful regions or objects by identifying boundaries and classifying each pixel into categories.
Instruction Tuning
Fine-tuning language models to follow human instructions better by training on instruction-response pairs for tasks.
Instance-based Learning
Machine learning that makes predictions by comparing new examples to similar stored examples from training data.
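A minimal NumPy sketch of k-nearest neighbours, the classic instance-based learner; the example points and labels are made up.

```python
import numpy as np

# No training phase: prediction just compares against stored examples.
def knn_predict(query, examples, labels, k=3):
    distances = np.linalg.norm(examples - query, axis=1)  # distance to each example
    nearest = np.argsort(distances)[:k]                   # indices of k closest
    values, counts = np.unique(labels[nearest], return_counts=True)
    return values[np.argmax(counts)]                      # majority label

examples = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9]])
labels = np.array(["cat", "cat", "dog", "dog"])
print(knn_predict(np.array([1.1, 0.9]), examples, labels, k=3))  # cat
```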
Implicit Neural Representation
Using neural networks to encode continuous functions that represent shapes, images, or scenes in coordinate space.
J
Joint Probability
The likelihood of two or more events occurring together; when the events are independent, it equals the product of their individual probabilities, e.g. P(A and B) = P(A) × P(B).
Jitter Regularization
Adding small random noise to training data to prevent overfitting and improve model generalization performance.
JAX (Accelerated ML Library)
Google's Python library for high-performance machine learning research with automatic differentiation and GPU/TPU acceleration support.
Joint Embedding Architecture
Neural network design where different data types are mapped into shared representation space for comparison.
K
K-anonymity
Privacy protection method ensuring each person's data is indistinguishable from at least k-1 other individuals' data.
Kernel Trick
Mathematical technique allowing linear algorithms to work with non-linear data by mapping to higher-dimensional spaces.
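A minimal NumPy sketch using the RBF (Gaussian) kernel, a common choice: the kernel value behaves like an inner product in an implicit higher-dimensional feature space, without ever computing that mapping.

```python
import numpy as np

# RBF kernel: similarity decays with squared distance, equivalent to
# an inner product in an implicit infinite-dimensional feature space.
def rbf_kernel(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

x, y = np.array([1.0, 2.0]), np.array([1.5, 1.0])
print(rbf_kernel(x, y))  # similarity in the implicit feature space
print(rbf_kernel(x, x))  # 1.0 — identical points are maximally similar
```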
Knowledge Graphs
Structured databases representing information as interconnected entities and relationships for AI reasoning and question answering.
Knowledge Distillation
Training smaller "student" models to mimic larger "teacher" models while maintaining performance with reduced computational requirements.
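A minimal NumPy sketch of the distillation objective under the common soft-target formulation: the student is penalized for diverging from the teacher's temperature-softened output distribution. The logits below are made up.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -np.sum(teacher_probs * np.log(student_probs + 1e-12))

teacher = np.array([4.0, 1.0, 0.5])  # confident, nuanced teacher output
student = np.array([2.0, 1.5, 1.0])  # student still learning
print(distillation_loss(student, teacher, temperature=2.0))
```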
L
Logistic Regression
Statistical method for binary classification that predicts the probability of outcomes by applying a sigmoid function to linear combinations of features.
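A minimal NumPy sketch of the prediction step, with made-up learned weights: a linear score passed through the sigmoid yields a probability.

```python
import numpy as np

# Logistic regression's entire prediction step: a linear combination
# of features squashed through the sigmoid into a probability.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

weights = np.array([0.8, -0.4])  # learned coefficients (made up here)
bias = -0.1
features = np.array([2.0, 1.0])

probability = sigmoid(features @ weights + bias)
print(probability)  # ≈ 0.75 → classify as the positive class
```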
Latent Diffusion Models
AI models that generate images by learning to reverse a noise-addition process in a compressed representation space.
Large Language Model (LLM)
AI systems trained on vast text data to understand and generate human-like language for various tasks.
LoRA (Low-Rank Adaptation)
Efficient fine-tuning technique that adapts large models by training small additional parameters instead of all weights.
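A minimal NumPy sketch of the core idea, with arbitrary sizes: the pretrained matrix W stays frozen and only the small low-rank factors A and B train, so the number of updated parameters drops sharply.

```python
import numpy as np

# LoRA in one picture: W is frozen; only A and B (rank r) are trained,
# and their product is added to W's output.
d, r = 512, 8
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))         # frozen pretrained weights
A = rng.normal(size=(r, d)) * 0.01  # trainable, small random init
B = np.zeros((d, r))                # trainable, starts at zero

x = rng.normal(size=d)
output = W @ x + B @ (A @ x)        # base output + low-rank adaptation

full_params = d * d                 # 262,144
lora_params = 2 * d * r             # 8,192 — about 3% of the full matrix
print(full_params, lora_params)
```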
LLaMA (Large Language Model)
Meta's family of foundation language models designed for research and various natural language processing applications.
M
Meta-Learning
Teaching AI models to learn new tasks quickly by learning general learning strategies from multiple related tasks.
Model Overfitting
When machine learning models memorize training data too closely, performing poorly on new, unseen data examples.
N
Normalizing Flows
Machine learning models that transform simple probability distributions into complex ones through invertible neural network transformations.
Neural Network
Computing system inspired by biological brains, using interconnected nodes to learn patterns from data through training.
Neuro-Symbolic AI
Combining neural networks' pattern recognition with symbolic reasoning for interpretable and logical AI decision-making systems.
Natural Language Processing (NLP)
AI field focused on enabling computers to understand, interpret, and generate human language effectively.
NeRF (Neural Radiance Fields)
AI technique for creating 3D scene representations from 2D images, enabling novel viewpoint synthesis and rendering.
O
Optimization
Mathematical process of finding best parameters or solutions that minimize error or maximize performance in models.
One-shot Learning
Machine learning approach where models learn to recognize new concepts from just one or a few examples.
Ontology-based AI
AI systems using structured knowledge representations to understand relationships between concepts and enable logical reasoning.
Out-of-Distribution Detection
Identifying when input data differs significantly from training data to prevent unreliable AI model predictions.
Open Weight LLMs (e.g., Mistral, Falcon)
Large language models with publicly available model weights, allowing researchers to study and modify them.
P
Prompt Engineering
Crafting effective input instructions and queries to guide AI language models toward producing desired outputs and responses.
Precision and Recall
Precision measures true positives over all predicted positives; recall measures true positives over all actual positives in classification.
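A worked example in Python with made-up counts:

```python
# Precision and recall from raw counts, using a made-up confusion:
# 8 true positives, 2 false positives, 4 false negatives.
true_positives = 8
false_positives = 2
false_negatives = 4

precision = true_positives / (true_positives + false_positives)  # 0.80
recall = true_positives / (true_positives + false_negatives)     # ~0.67

print(f"precision={precision:.2f}, recall={recall:.2f}")
```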
Principal Component Analysis
Method to simplify large datasets into smaller sets while maintaining significant patterns and trends through dimensionality reduction.
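A minimal NumPy sketch via the singular value decomposition, one standard way to compute PCA; the data here is random.

```python
import numpy as np

# PCA via SVD: centre the data, then project it onto the top-k
# directions of greatest variance.
def pca(X, k):
    X_centered = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:k].T  # data expressed in k components

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))     # 100 samples, 5 features
print(pca(X, k=2).shape)          # (100, 2)
```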
Proximal Policy Optimization (PPO)
Reinforcement learning algorithm designed to optimize policies in a stable and efficient manner, commonly used to align models with human preferences.
Parameter-Efficient Fine-Tuning (PEFT)
Methods that adapt pretrained models to specific tasks by training only a small set of parameters while leaving the rest of the model unchanged.
Q
Q-Learning
Reinforcement learning algorithm where agents learn optimal actions by updating quality values for state-action pairs.
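A minimal NumPy sketch of the core update rule on a single made-up transition: Q(state, action) moves toward the received reward plus the discounted value of the best next action.

```python
import numpy as np

# A Q-table over a tiny hypothetical environment.
n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9  # learning rate, discount factor

# One made-up transition: in state 0, action 1 earned reward 1.0
# and moved the agent to state 2.
state, action, reward, next_state = 0, 1, 1.0, 2

td_target = reward + gamma * Q[next_state].max()
Q[state, action] += alpha * (td_target - Q[state, action])
print(Q[state, action])  # 0.1 after this single update
```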
Quantum Machine Learning
Combining quantum computing principles with machine learning to potentially solve certain problems faster than classical computers.
Quantization Aware Training
Training neural networks with reduced precision arithmetic to create models optimized for efficient deployment.
Query-based Retrieval-Augmented Generation (RAG)
AI system that retrieves relevant information from databases to enhance language model responses with factual knowledge.
R
Reinforcement Learning
Machine learning where agents learn optimal behavior through trial-and-error interactions with an environment, guided by rewards.
Representation Learning
Machine learning focused on automatically discovering useful data representations for downstream tasks and improved performance.
Residual Networks (ResNet)
Deep neural network architecture using skip connections to enable training of very deep networks effectively.
Retrieval-Augmented Generation (RAG)
AI approach combining external information retrieval with text generation to produce more accurate and factual responses.
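A minimal sketch of the pipeline, with made-up two-dimensional "embeddings" and a stubbed generate() standing in for a real embedding model and LLM: retrieve the best-matching document, then prepend it to the prompt.

```python
import numpy as np

# RAG in miniature: embed the query, retrieve the closest document,
# and prepend it to the prompt before generation.
documents = ["Paris is the capital of France.",
             "The Nile is the longest river in Africa."]
doc_vectors = np.array([[0.9, 0.1], [0.1, 0.9]])  # pretend embeddings

def retrieve(query_vector):
    similarities = doc_vectors @ query_vector
    return documents[int(np.argmax(similarities))]

def generate(prompt):  # stand-in for a real LLM call
    return f"[model answers using: {prompt!r}]"

query = "What is the capital of France?"
query_vector = np.array([0.95, 0.05])             # pretend embedding
context = retrieve(query_vector)
print(generate(f"Context: {context}\nQuestion: {query}"))
```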
S
Semantic Search
Information retrieval method that understands meaning and context of queries rather than just matching keywords.
Structured Prediction
Machine learning for predicting complex, interdependent outputs like sequences, trees, or graphs with internal structure.
Self-Supervised Learning
Training AI models using inherent data structure as supervision signal without requiring manually labeled examples.
Support Vector Machine (SVM)
Machine learning algorithm that finds optimal boundary separating different classes by maximizing margin between them.
T
Tokenization
Breaking text into smaller units (words, subwords, characters) that machine learning models can process effectively.
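A minimal Python illustration of the three granularities on one phrase; the subword split is hand-picked purely for illustration, whereas real models learn a vocabulary (e.g. via BPE).

```python
# Three granularities of tokenization on the same phrase.
text = "unbelievable results"

word_tokens = text.split()                            # ['unbelievable', 'results']
char_tokens = list(text.replace(" ", ""))             # ['u', 'n', 'b', ...]
subword_tokens = ["un", "believ", "able", "results"]  # BPE-style split (hand-picked)

print(word_tokens, subword_tokens, len(char_tokens))
```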
Transformers
Neural network architecture using attention mechanisms, forming the foundation for modern language models like GPT.
Transfer Learning
Using knowledge gained from one task to improve performance on related tasks, reducing training time.
Time to First Token (TTFT)
Latency metric measuring time from user input to when AI model starts generating its first response token.
Tree-based Models (e.g., LightGBM, CatBoost)
Machine learning algorithms using decision tree structures, including ensemble methods like Random Forest, XGBoost, LightGBM, and CatBoost.
U
Underfitting
When machine learning models are too simple to capture underlying data patterns, resulting in poor performance.
Unsupervised Pretraining
Training AI models on unlabeled data to learn general representations before fine-tuning on specific tasks.
V
Vibe Coding
Programming approach where developers describe what they want in natural language and let AI generate the code, favoring intuition and rapid experimentation over writing and reviewing every line by hand.
Vanishing Gradient Problem
Issue in deep neural networks where gradients become too small, preventing effective learning in early layers.
Vision Transformers (ViTs)
Applying transformer architecture to computer vision tasks by treating image patches as sequence tokens.
Variational Autoencoders (VAEs)
Neural networks that learn to encode data into a latent space and decode it back, enabling generation of new samples.
W
Weight Sharing
Neural network technique where multiple connections use the same parameter values, reducing model complexity and overfitting.
Word Embeddings
Vector representations of words that capture semantic relationships and enable mathematical operations on language.
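A minimal NumPy sketch with made-up 3-dimensional vectors: related words end up with high cosine similarity, so similarity becomes geometry. Real embeddings have hundreds of dimensions.

```python
import numpy as np

# Made-up 3-d embeddings; related words point in similar directions.
embeddings = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.7, 0.2]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low
```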
Weak Supervision
Training machine learning models using noisy, limited, or programmatically generated labels instead of manual annotation.
Whisper (OpenAI’s Speech Recognition Model)
OpenAI's automatic speech recognition model that converts spoken language into text across multiple languages accurately.
X
Xformers (efficient Transformer implementations)
Efficient implementations of transformer architectures optimized for memory usage and computational speed in deep learning.
XGBoost
Gradient boosting framework that builds an ensemble of decision trees sequentially to achieve high performance efficiently.
XAI Dashboards
Interactive interfaces that visualize and explain AI model behavior, predictions, and decision-making processes for users.
Y
YOLO (You Only Look Once)
Real-time object detection algorithm that identifies and locates multiple objects in images with a single forward pass.
YAML Pipelines for ML workflows
Configuration files defining machine learning workflows, data processing steps, and model training procedures declaratively.
Yield Prediction Models (AgriTech AI)
AI systems in agriculture that forecast crop yields using weather, soil, and historical data.
Yann LeCun’s H-JEPA (Hierarchical Joint Embedding Predictive Architecture)
Yann LeCun's proposed architecture for learning world models through self-supervised prediction of abstract representations rather than raw inputs.
Z
Zero-shot Learning
AI models performing tasks they weren't explicitly trained on by leveraging knowledge from related tasks.
Z-Score Normalization
Statistical technique standardizing data by subtracting the mean and dividing by the standard deviation for consistent scaling.
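A minimal NumPy sketch on made-up values; after normalization the data has mean ~0 and standard deviation 1.

```python
import numpy as np

# Z-score normalization: each value becomes "how many standard
# deviations from the mean", giving every feature the same scale.
values = np.array([50.0, 60.0, 70.0, 80.0, 90.0])

z = (values - values.mean()) / values.std()
print(z)                  # [-1.414 -0.707  0.     0.707  1.414]
print(z.mean(), z.std())  # ~0.0 and 1.0
```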
ZKP (Zero Knowledge Proofs)
Cryptographic methods allowing verification of information without revealing the underlying data or computation details.
Zeno++ (Fault-tolerant ML training)
Fault-tolerant machine learning training system designed to handle failures and continue learning despite hardware or software issues.