How AI Actually Works
Mental models for understanding neural networks, training, and inference — no math required.
You don't need a math degree to understand how AI works. In this module, we'll use simple mental models and analogies to build genuine intuition for what's happening inside systems like ChatGPT and Claude — without a single equation.
The Pattern Matching Mental Model
The most important thing to understand about modern AI: it's a pattern-matching engine of extraordinary scale. When you type a question into ChatGPT or Claude, the AI isn't "thinking" or "knowing" — it's recognizing patterns it learned from billions of examples during training, and generating a response that fits those patterns.
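This pattern-matching idea can be sketched in a few lines of code. The toy "corpus" below is a hypothetical stand-in for the billions of examples real models train on, and the "model" is nothing more than a table of counts — yet it already predicts the next word the same way an LLM does in spirit:

```python
from collections import Counter, defaultdict

# A tiny stand-in for "billions of examples" of training text.
corpus = "the cat sat on the mat the cat saw the dog".split()

# Count which word tends to follow each word: pure pattern counting.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict(word):
    """Return the word seen most often after `word` in the data."""
    return follows[word].most_common(1)[0][0]

print(predict("the"))  # → "cat", because "the cat" appeared most often
```

The counter has no idea what a cat is — it only knows which patterns appeared most often. Scale that idea up enormously and you have the core of a language model.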
Neural Networks: The Building Blocks
A neural network is a system inspired by how biological brains process information. Think of it as a series of interconnected layers, where each layer transforms the input and passes it along:
Input ("What is AI?") → layers of pattern recognition → generated response
Each layer in a neural network extracts increasingly abstract patterns:
- Early layers detect simple features (in images: edges, colors; in text: word relationships)
- Middle layers combine simple features into concepts (shapes, phrases, grammar patterns)
- Deep layers represent complex ideas (objects, meaning, context, intent)
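The layered idea above can be sketched as plain function composition. The three toy "layers" below are hypothetical illustrations, not real network operations — but they show the shape: each layer takes the previous layer's output and produces something more abstract:

```python
# Each "layer" transforms its input and passes it along.
def early_layer(text):
    # Simple features: individual words.
    return text.lower().rstrip("?").split()

def middle_layer(words):
    # Combined features: adjacent word pairs (crude "phrases").
    return list(zip(words, words[1:]))

def deep_layer(pairs):
    # Abstract feature: the intent of the whole input.
    return "question" if ("what", "is") in pairs else "statement"

def network(text):
    # A network is just layers applied in sequence.
    return deep_layer(middle_layer(early_layer(text)))

print(network("What is AI?"))  # → "question"
```

A real neural network learns its transformations from data instead of having them hand-written, but the flow — raw input in, increasingly abstract representations out — is the same.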
How Models Learn: Training 101
Training an AI model is conceptually simple, even if the math is complex:
Show it examples
Feed the model massive amounts of data — text from the internet, books, code, conversations.
Make a prediction
The model tries to predict what comes next (e.g., the next word in a sentence).
Measure the error
Compare the prediction to the actual answer. How wrong was it?
Adjust and repeat
Slightly tweak the model's internal parameters to be less wrong next time. Repeat billions of times.
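The four steps above can be sketched as a loop. This toy "model" has a single internal parameter instead of billions, and the examples are made up (they follow the hidden pattern y = 2x), but the show → predict → measure → adjust cycle is the same shape real training takes:

```python
# Examples the model is shown: inputs paired with correct answers (y = 2x).
examples = [(1, 2), (2, 4), (3, 6)]
weight = 0.0  # the model's single internal parameter, starting clueless

for step in range(1000):              # "repeat billions of times", scaled down
    for x, answer in examples:
        prediction = weight * x       # 2. make a prediction
        error = prediction - answer   # 3. measure how wrong it was
        weight -= 0.01 * error * x    # 4. nudge the parameter to be less wrong

print(round(weight, 2))  # → 2.0: the pattern hidden in the examples
```

Nobody told the loop that the answer was "multiply by 2" — the repeated small adjustments discovered it. That is training in miniature.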
Training vs. Inference
These are the two phases of an AI model's lifecycle, and understanding the distinction is crucial:
Training
- Happens once (or periodically), before deployment
- Costs millions of dollars in compute
- Takes weeks to months
- Requires massive GPU clusters
- Produces the "model weights" — the learned patterns
Inference
- Happens every time you use the model
- Costs fractions of a cent per request
- Takes milliseconds to seconds
- Can run on smaller hardware
- Uses the trained weights to generate responses
How ChatGPT-Style Models Work
Large language models (LLMs) like ChatGPT, Claude, and Gemini work through a three-stage process:
- Pre-training: The model reads essentially the entire internet — books, websites, code repositories, academic papers. It learns language patterns, facts, reasoning strategies, and much more. This produces the "base model."
- Fine-tuning / RLHF: The base model is then refined using human feedback. Human trainers rate different responses, teaching the model to be helpful, honest, and harmless. This is what makes it conversational rather than just a text predictor.
- Inference (your conversation): When you send a message, the model processes your input through its neural network and generates a response one token at a time, each token influenced by everything before it.
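The token-at-a-time generation in step 3 can be sketched as a simple loop. Here `next_token` is a hypothetical stand-in for a real model's forward pass (it just looks up a canned reply), but the loop structure — predict one token from everything so far, append it, repeat — is how LLM inference actually proceeds:

```python
def next_token(context):
    """Stand-in for a model: picks the next token from the context so far."""
    canned = {"How": "are", "are": "you", "you": "?"}
    return canned.get(context[-1], "<end>")

def generate(prompt, max_tokens=10):
    tokens = prompt.split()
    for _ in range(max_tokens):
        token = next_token(tokens)   # predict one token from the full context
        if token == "<end>":         # a special token signals "stop here"
            break
        tokens.append(token)         # the new token becomes part of the context
    return " ".join(tokens)

print(generate("How"))  # → "How are you ?"
```

Each pass through the loop is one forward pass through the network, which is why longer responses take visibly longer to stream back to you.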
Recommended Resources
- But what is a neural network? | Deep Learning, Chapter 1 (3Blue1Brown) — A beautiful visual explanation of neural networks from first principles. The best introduction available.
- Intro to Large Language Models (Andrej Karpathy) — An accessible one-hour introduction to how LLMs work, from tokenization to inference.
- What Is ChatGPT Doing... and Why Does It Work? (Stephen Wolfram) — A detailed but accessible deep dive into the mechanics of language models, by the creator of Wolfram Alpha.
Key Takeaways
1. AI works through pattern matching at massive scale — it learns statistical patterns from data, not rules.
2. Neural networks process information through layers, each extracting increasingly abstract features.
3. Training (learning from data) is expensive and slow; inference (using the trained model) is fast and cheap.
4. LLMs generate text by predicting the most likely next token, one at a time.
5. Because LLMs are pattern matchers, they can "hallucinate" — generating confident but incorrect outputs.
Test Your Understanding
Module Assessment
5 questions · Score 70% or higher to complete this module
You can retake the quiz as many times as you need. Your best score is saved.