Deep Dive into LLMs like ChatGPT by Andrej Karpathy
If you’re diving into LLMs or just curious about how tools like ChatGPT work under the hood, Karpathy’s breakdown is one of the clearest and most accessible resources out there.
The following are highlights, along with timestamps.
Introduction (0:00 - 1:04)
What are LLMs?
Karpathy begins by explaining what large language models (LLMs) are, highlighting their capabilities and limitations. He also sets the goal of the session: to give a general audience a comprehensive understanding of LLMs.
Pre-training Stage (1:04 - 2:52)
Downloading and processing the internet
This part explores how massive amounts of text data are collected from the internet and cleaned for model training.
The FineWeb dataset
Karpathy uses the FineWeb dataset to demonstrate the scale and complexity of preparing web data.
Tokenization
Tokenization is introduced as the process of breaking text into smaller units (tokens) that the model can process.
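For a concrete feel of what tokens look like (my example, not from the video), the `tiktoken` library exposes the byte-pair-encoding vocabularies used by GPT models:

```python
# A minimal tokenization example using the tiktoken library
# (pip install tiktoken). "gpt2" names one published BPE vocabulary;
# other encodings work the same way.
import tiktoken

enc = tiktoken.get_encoding("gpt2")

text = "Hello world, this is tokenization!"
tokens = enc.encode(text)      # text -> list of integer token IDs
print(tokens)
print(enc.decode(tokens))      # round-trips back to the original text

# Each ID maps to a chunk of characters, not necessarily a whole word:
for t in tokens:
    print(t, repr(enc.decode([t])))
```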
Building the Base Model (2:53 - 43:07)
Training on tokens
Karpathy illustrates how the model learns to predict the next token in a sequence, using simple word-prediction examples (a code sketch of this objective follows this subsection).
Neural network architecture
He introduces the transformer architecture and explains how it powers modern LLMs.
The base model
This part highlights how the base model becomes an internet text simulator—able to generate realistic text, but not yet able to respond meaningfully to queries.
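To make the objective concrete, here is a minimal sketch (my own, not Karpathy's code) of next-token prediction and sampling in PyTorch. The stand-in model is just an embedding plus a linear layer; a real LLM uses a transformer, but the loss and the sampling loop have the same shape:

```python
# Minimal sketch of the pre-training objective: predict the next token
# at every position, scored with cross-entropy. The model here is a toy
# stand-in; real LLMs use transformers, but the objective is identical.
import torch
import torch.nn as nn

vocab_size, dim = 50257, 64
model = nn.Sequential(nn.Embedding(vocab_size, dim), nn.Linear(dim, vocab_size))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

tokens = torch.randint(0, vocab_size, (1, 32))   # pretend tokenized web text
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # targets are inputs shifted by one

logits = model(inputs)                           # (1, 31, vocab_size)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()
optimizer.step()

# Inference: the base model is a "text simulator" -- feed tokens in,
# sample the next one, append it, repeat.
context = tokens[:, :8]
with torch.no_grad():
    for _ in range(10):
        next_logits = model(context)[:, -1, :]
        next_token = torch.multinomial(next_logits.softmax(dim=-1), 1)
        context = torch.cat([context, next_token], dim=1)
```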
Supervised Fine-Tuning (SFT) (43:07 - 1:15:00)
Turning the base model into an assistant
Karpathy explains how supervised fine-tuning shapes the model into a helpful assistant by training it on curated conversations.
Human labelers
The role of human annotators in creating high-quality training data is emphasized.
Conversation encoding
Conversations are represented as token sequences, allowing the model to learn from them (see the sketch below).
Programming by example
The model learns by example, emulating the responses provided by human labelers.
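As a rough sketch of conversation encoding (mine, simplified), a conversation can be flattened into one string with special delimiter tokens before being tokenized. The `<|im_start|>`/`<|im_end|>` markers follow the ChatML-style convention; the exact special tokens vary by model:

```python
# Simplified sketch: flatten a chat into one training string with
# role delimiters. Real models use dedicated special tokens that the
# tokenizer maps to single IDs; the exact protocol varies by model.
def encode_conversation(messages):
    """Flatten a list of chat messages into one string for training."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    return "".join(parts)

conversation = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "2 + 2 = 4."},
]
print(encode_conversation(conversation))
# The string is then tokenized like any other text; during SFT the loss
# is typically computed only on the assistant's tokens.
```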
LLM Psychology (1:15:00 - 1:40:01)
Hallucinations
Karpathy dives into hallucinations, cases where LLMs confidently generate incorrect information, and explains their root causes.
The knowledge of self
He discusses the model’s lack of true self-awareness or identity.
Context windows
The concept of a context window is compared to human working memory, helping to explain how LLMs retain relevant information.
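As a toy illustration (mine, not from the video): when a conversation grows past the context window, the simplest strategy is to keep only the most recent tokens, much like working memory discarding older material:

```python
# Toy illustration of a context-window limit: keep only the most recent
# tokens once the history no longer fits. Real systems may instead
# summarize or otherwise compress older context.
def fit_to_context(token_ids, max_context=4096):
    """Return the suffix of token_ids that fits in the context window."""
    return token_ids[-max_context:]

history = list(range(5000))        # pretend token IDs from a long chat
visible = fit_to_context(history)  # the model's "working memory"
print(len(visible))                # 4096
```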
Reinforcement Learning (RL) (1:40:01 - 2:14:20)
Improving LLMs’ abilities
Karpathy introduces reinforcement learning as the third stage of LLM training, used to sharpen model performance (a conceptual sketch follows this subsection).
School analogy
He compares this process to a student working through practice problems at school and being graded.
Emergent properties
He highlights surprising and complex behaviors that can emerge from RL-based training.
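As a conceptual sketch (my simplification, not Karpathy's code): sample many attempts at a problem with a known answer, grade them, and update the model to favor the attempts that succeeded. `sample_solution` and `train_on` are hypothetical stand-ins for model sampling and a gradient step:

```python
# Conceptual sketch of RL on verifiable problems. sample_solution() and
# train_on() are hypothetical stand-ins; a real system would sample from
# an actual LLM and apply a real gradient update.
import random

def sample_solution(prompt):
    # Hypothetical: sample one chain of thought plus a final answer.
    return {"reasoning": "...", "answer": random.choice([3, 4, 5])}

def train_on(prompt, attempt):
    # Hypothetical: one gradient step raising the probability of this
    # attempt's tokens given the prompt.
    pass

def rl_step(prompt, correct_answer, num_samples=16):
    attempts = [sample_solution(prompt) for _ in range(num_samples)]
    good = [a for a in attempts if a["answer"] == correct_answer]
    for a in good:                  # reinforce only the attempts that worked
        train_on(prompt, a)
    return len(good) / num_samples  # fraction graded correct

print(rl_step("What is 2 + 2?", 4))
```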
Summary and Future Capabilities (2:14:20 - 3:15:05)
The three stages of LLM training
A summary of the full LLM training lifecycle: pre-training, supervised fine-tuning, and reinforcement learning.
Multimodality
Karpathy discusses how future LLMs will handle multiple data types (text, audio, images) for richer interaction.
Agents
He introduces the idea of persistent LLM agents capable of working over longer timeframes.
Resources (3:15:05 - 3:21:43)
LLM Leaderboard (Hugging Face)
Karpathy recommends Hugging Face’s LLM leaderboard for tracking model performance.
LM Studio
A local app that lets users explore and experiment with LLMs.
Conclusion (3:21:43 - 3:31:10)
Using LLMs as tools
Karpathy closes by emphasizing responsible use of LLMs—understanding their limits and verifying their outputs.
More detailed overview
Chapter | Timestamp |
---|---|
Introduction | 0:00 - 0:53 |
Building ChatGPT | 0:53 - 1:07 |
Downloading & Processing the Internet | 1:07 - 2:52 |
Common Crawl & Pre-Processing | 2:52 - 3:35 |
URL & Language Filtering | 3:35 - 4:53 |
Tokenization | 4:53 - 5:41 |
Training the Neural Network | 5:41 - 7:51 |
Representing Text | 7:51 - 9:51 |
Byte Pair Encoding | 9:51 - 12:07 |
Tokenization Explained | 12:07 - 14:30 |
Neural Network Training | 14:30 - 15:19 |
Neural Network Internals | 15:19 - 17:44 |
The Transformer | 17:44 - 20:12 |
Training the Transformer | 20:12 - 22:52 |
Inference | 22:52 - 26:02 |
GPT-2 Example | 26:02 - 31:08 |
Training GPT-2 | 31:08 - 33:39 |
GPU Compute | 33:39 - 36:52 |
Model Release | 36:52 - 44:13 |
Post Training & Eliciting Knowledge | 44:13 - 50:34 |
Conversation Encoding | 50:34 - 1:03:01 |
Post Training & Data Sets | 1:03:01 - 1:04:10 |
Supervised Fine-Tuning | 1:04:10 - 1:04:54 |
Human Labelers & Instructions | 1:04:54 - 1:05:05 |
Conversation Tokenization | 1:05:05 - 1:06:14 |
Synthesized Conversations | 1:06:14 - 1:09:14 |
ChatGPT: Imitating Human Labelers | 1:09:14 - 1:17:00 |
The Power of Tools | 1:17:00 - 1:19:31 |
Context Windows & Memory | 1:19:31 - 1:20:49 |
The “Self” of LLMs | 1:20:49 - 1:23:25 |
Computational Capabilities | 1:23:25 - 1:29:49 |
Problem Solving and Accuracy | 1:29:49 - 1:32:26 |
Emergent Properties of LLMs | 1:32:26 - 1:37:23 |
Chains of Thought | 1:37:23 - 1:38:59 |
Multimodality | 1:38:59 - 1:40:07 |
Context Windows & LLMs | 1:40:07 - 1:41:50 |
Knowledge of Self | 1:41:50 - 1:47:02 |
Computational Capabilities & Limitations | 1:47:02 - 1:58:38 |
Reinforcement Learning | 1:58:38 - 2:00:40 |
RL vs. SFT | 2:00:40 - 2:07:23 |
LLM Stages Summary | 2:07:23 - 2:10:06 |
Reinforcement Learning (cont.) | 2:10:06 - 2:14:15 |
Example & Practice | 2:14:15 - 2:16:07 |
Problem Solving & Accuracy (revisited) | 2:16:07 - 2:29:49 |
Emergent Properties: Chains of Thought | 2:29:49 - 2:31:37 |
Reasoning | 2:31:37 - 2:37:23 |
DeepSeek and Distillation Risk | 2:37:23 - 2:38:59 |
Multimodality & Agents | 2:38:59 - 3:02:40 |
Resources & Leaderboards | 3:02:40 - 3:15:07 |
LM Arena LLM Leaderboard | 3:15:07 - 3:15:30 |
DeepSeek | 3:15:30 - 3:15:57 |
OpenAI & Google | 3:15:57 - 3:17:30 |
Tools & Resources Mentioned
Common Crawl (2:52)
A massive archive of internet text used as raw training data.
Tiktokenizer (12:15)
Explore token representations of text.
Hyperbolic (46:57)
An interface for experimenting with Llama 3 models.
Together.ai (2:36:12)
Hosts various state-of-the-art LLMs, including DeepSeek R1.
LM Arena (3:15:15)
Human-preference comparison rankings of LLMs.
LM Studio (3:20:35)
A tool for running and interacting with LLMs locally.