The AI industry has spent the last few years convincing us that training a language model requires a massive server farm and millions of dollars. They want developers hooked on their APIs. They want us to treat these models like magical black boxes that only the big tech companies have the resources to build.
But Andrej Karpathy just completely undermined that narrative with his new open source project. You can actually build your own conversational AI pipeline for the cost of a nice dinner. It is called NanoChat, and it is the most honest look at AI development I have seen in a long time.
The problem with modern AI development
Right now most developers working in the artificial intelligence space are just routing text to an API endpoint. We send a prompt to OpenAI or Anthropic, wait for a response, and parse the JSON. It is useful work, but it is not machine learning. We are effectively renting someone else's brain.
When you only interact with models through an API, you lose all intuition for how they actually work. You do not see the tokenization quirks. You miss the mechanics of the attention layers. You have no idea what the loss curve looked like during training.
This creates a massive knowledge gap. The people building the foundation models understand the physics of AI. Everyone else is just playing with the user interface. We have an entire generation of software engineers who know how to prompt a model but have absolutely no idea what happens under the hood when they hit submit.
What exactly is NanoChat
Karpathy describes NanoChat as the best ChatGPT that $100 can buy. It is a complete pipeline for training a conversational model. He released it in late 2025, and it has already become the default starting point for engineers who want to understand how these systems operate.
Unlike massive corporate repositories that hide the core logic behind layers of enterprise abstractions, NanoChat is painfully simple. It gives you the raw PyTorch code to go from a pile of text documents to a working web interface where you can chat with your creation.
It builds on his previous work with nanoGPT, but it takes things a step further. Instead of just pretraining a base model to predict the next word, NanoChat walks you through the entire process of making that model useful. It includes the steps to make it actually respond to questions instead of just rambling.
Breaking down the pipeline
The genius of NanoChat is how it exposes every single step of the model creation process. Most tutorials stop at the model architecture. Karpathy forces you to look at the entire data lifecycle.
First, you have to deal with the tokenizer. This is the piece of code that chops words up into little numbers the model can digest. You get to see exactly why language models struggle with spelling or math, because you literally watch the code compress the text into awkward chunks.
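The core idea can be sketched with a toy byte-pair-encoding step. This is a simplified illustration of how a BPE tokenizer merges frequent pairs, not NanoChat's actual tokenizer code:

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most common one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(tokens, pair, new_token):
    """Replace every occurrence of `pair` with a single `new_token`."""
    out, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

# Start from individual characters, the way byte-level BPE starts from raw bytes.
tokens = list("low lower lowest")
for step in range(3):
    pair = most_frequent_pair(tokens)
    tokens = merge_pair(tokens, pair, "".join(pair))
print(tokens)
```

After a few merges, "low" becomes a single token while rarer suffixes stay fragmented. That greedy, frequency-driven chunking is exactly why a model can "know" a word without being able to spell it letter by letter.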
Then comes the pretraining phase. This is where the model reads gigabytes of internet text and learns the basic structure of human language. It is a brute force statistical exercise.

After that, the project walks you through the instruction tuning phase. This is the magic step that turns a text predictor into a helpful assistant. You see the exact format of the conversations used to teach the model how to be polite and answer questions directly.
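In spirit, instruction tuning just means rendering conversations into flat strings with role markers and training on them. Here is a minimal sketch; the special tokens below are illustrative placeholders, not NanoChat's actual schema:

```python
def render_chat(messages):
    """Flatten a conversation into one training string with role markers.

    The <|role|> and <|end|> tokens here are made-up placeholders to show
    the idea; real projects define their own special-token vocabulary.
    """
    return "".join(
        f"<|{msg['role']}|>{msg['content']}<|end|>" for msg in messages
    )

conversation = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
]
text = render_chat(conversation)
print(text)
```

During fine-tuning, the loss is typically masked so the model is only graded on predicting the assistant's tokens, which is what teaches it to answer rather than to continue the user's question.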
Finally, you get a simple web interface. You can type a message in your browser and watch your custom model generate a response, token by token.
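The token-by-token effect is just a generator loop on the server side. This sketch uses a canned stand-in for the model's next-token sampler, so it shows the streaming shape without any real inference:

```python
def fake_next_token(prefix):
    """Stand-in for a real model's next-token sampler (illustrative only)."""
    canned = ["Hello", ",", " world", "!", "<eos>"]
    return canned[len(prefix)]

def stream_tokens(max_tokens=16):
    """Yield tokens one at a time, the way a chat UI receives them."""
    generated = []
    for _ in range(max_tokens):
        token = fake_next_token(generated)
        if token == "<eos>":  # stop token ends the reply
            break
        generated.append(token)
        yield token

reply = "".join(stream_tokens())
print(reply)  # → Hello, world!
```

A real web interface does the same thing over HTTP: sample a token, flush it to the browser, repeat until the stop token appears.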
The $100 training run
The most compelling part of this project is the accessibility. You do not need to raise venture capital to run this code.
The entire pipeline is designed to run on a single node with eight NVIDIA H100 GPUs. Renting one of those nodes from a cloud provider costs roughly $100 for the time it takes to complete the training run. Recent updates to the project even show how you can train a GPT-2 level model in just two hours.
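The back-of-envelope math is worth making explicit. The hourly rate below is an assumption based on typical cloud prices for an 8x H100 node, not a quote from the project:

```python
# Assumed figures, not official numbers: ~$24/hour for an 8x H100 node,
# and roughly a 4-hour run for the smallest training tier.
node_rate_per_hour = 24.0
run_hours = 4.0
cost = node_rate_per_hour * run_hours
print(f"${cost:.0f}")  # → $96, right around the $100 figure
```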
I keep thinking about the developers who are currently in university or just starting their careers. For $100, they can get hands-on experience training an entire language model. They get to watch the loss go down in real time. They get to see exactly what happens when you mess up the learning rate. That kind of education used to require getting hired at a massive research lab.
Why building from scratch matters
I know what some people are thinking. Why would I spend $100 to train a weak model when I can use the latest frontier models for pennies?
The point is not to build a production model that replaces Claude or Gemini. The point is to build a mental model. When you write the code that converts words into numbers, you stop thinking of the AI as a thinking machine. You start seeing it as a statistical engine.
This changes how you build applications on top of AI. When you understand the underlying math, you write better prompts. You understand why the model hallucinates certain facts but gets others right. You can guess how a model will fail before you even run the test. You gain an intuition that cannot be learned by reading documentation or watching tech talks.
Official Links
- GitHub Repository: https://github.com/karpathy/nanochat
Conclusion
The era of treating AI like magic is ending. The tools to build these systems are becoming smaller, cheaper, and easier to understand. If you have been relying exclusively on APIs to build your applications, take a weekend to run NanoChat. Spend the $100. It is the best investment you can make in your understanding of the technology that is driving the software industry forward.