
Run autonomous agents locally with DeepSeek (for free)

Cut API costs to zero. Here's how to configure OpenClaw to run autonomous agents on your own hardware using DeepSeek and Ollama.

I checked my OpenAI API bill last month and honestly, I just stared at it for a minute.

When you’re running autonomous agents—loops that think, act, observe, and repeat—those tokens add up fast. A simple task that gets stuck in a loop can cost you $5 before you even realize it’s failed.

This is why local LLMs are becoming such a big deal for agent developers. It’s not just about privacy (though that’s huge); it’s about the freedom to let an agent run for hours without worrying about the meter running.

With the release of efficient models like DeepSeek-R1, we’ve hit a tipping point. You can now run capable autonomous agents on your own hardware, completely offline, for zero dollars.

Here is how to set up OpenClaw to use DeepSeek via Ollama.

Why local models matter for agents

If you are just using a chatbot, the occasional slow or pricey call is tolerable. But agents are different: they make dozens of calls to finish a single objective, so per-call overhead compounds fast.

Running locally changes the math:

  • Cost: $0. You can run millions of tokens.
  • Speed: On a decent Mac or GPU, local models can be faster than API calls that have network latency.
  • Privacy: Your data never leaves your machine. This is non-negotiable for some corporate use cases.

The trade-off has always been intelligence. Local models used to be dumb. They would get confused, loop endlessly, or fail to follow JSON formatting instructions.

DeepSeek changed that.

The DeepSeek difference

I’ve been testing DeepSeek-R1 (mostly the distilled versions, specifically the 8B and 32B parameter variants), and the reasoning capabilities are legitimately surprising.

For agentic workflows, we need a model that can:

  1. Understand a complex goal.
  2. Break it down into steps.
  3. Use tools (call functions) correctly.
  4. Analyze the output of those tools.

DeepSeek handles this remarkably well. It’s not GPT-4o, but for 80% of tasks, it’s close enough. And since it’s free, you can just ask it to "try again" if it fails, or have a second agent critique the first one's work.

How to configure OpenClaw with Ollama

Getting this running is easier than you might think. We are going to use Ollama as the backend because it exposes an OpenAI-compatible API that OpenClaw can talk to easily.

1. Install Ollama and pull the model

If you haven't installed Ollama, do that first. Then, pull the DeepSeek model. I recommend starting with the 8B version if you have 16GB RAM, or 32B if you have a beefier setup (32GB+ RAM).

ollama pull deepseek-r1:8b

Test it quickly in your terminal to make sure it's working:

ollama run deepseek-r1:8b "Plan a 3-day trip to Tokyo."

If it starts spitting out text, you are good.
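You can also confirm the server itself is up and the model tag is registered. Assuming a default install (which listens on port 11434), this hits Ollama's model-listing endpoint:

# Lists the models your local Ollama server has available
curl http://localhost:11434/api/tags

You should see deepseek-r1:8b somewhere in the JSON it returns.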

2. Point OpenClaw to your local server

By default, Ollama runs on port 11434. You just need to tell OpenClaw to look there instead of sending requests to OpenAI's servers.

In your OpenClaw configuration (usually a .env file or the config UI), you need to change three things:

  • Base URL: Change this to http://localhost:11434/v1
  • API Key: Ollama doesn't require one, but some clients freak out if it's empty. Just put ollama or sk-dummy.
  • Model: Set this to deepseek-r1:8b (or whatever tag you pulled).

Here is what it looks like in a typical .env file:

LLM_PROVIDER="ollama"
OPENAI_BASE_URL="http://localhost:11434/v1"
OPENAI_API_KEY="ollama"
LLM_MODEL="deepseek-r1:8b"
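Before launching OpenClaw, it's worth hitting the exact OpenAI-compatible endpoint it will use. A quick sanity check with curl (the dummy key matches the .env above; Ollama ignores it anyway):

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ollama" \
  -d '{"model": "deepseek-r1:8b", "messages": [{"role": "user", "content": "Reply with one word."}]}'

If you get back a JSON response with a choices array, OpenClaw will be able to talk to it too.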

3. Adjust the temperature

Local models can sometimes be a bit repetitive. I’ve found that bumping the temperature up slightly helps with creativity, but for agents that need to follow strict instructions, keep it low.

For DeepSeek-R1 specifically, it tends to be quite verbose because of its "chain of thought" training. You might need to adjust your system prompt to tell it: “Be concise. Do not output your thinking process, just the final tool call.”
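If OpenClaw doesn't expose a temperature or system prompt setting directly, one workaround is to bake both into a custom Ollama model tag with a Modelfile. A sketch (the deepseek-r1-agent name is just my label):

# Build a custom tag with agent-friendly defaults baked in
cat > Modelfile <<'EOF'
# Base model pulled earlier
FROM deepseek-r1:8b
# Low temperature keeps instruction-following tight
PARAMETER temperature 0.2
# System prompt from the advice above: keep output terse
SYSTEM "Be concise. Do not output your thinking process, just the final tool call."
EOF
ollama create deepseek-r1-agent -f Modelfile

Then set LLM_MODEL="deepseek-r1-agent" in your .env and every request picks up those defaults.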

What to expect (and what to watch out for)

When you first fire it up, it feels a bit like magic. You watch the logs, see the agent thinking, executing commands, and reading files—all happening on your silicon.

However, keep an eye on context windows. While DeepSeek supports large contexts, running a local model with a full 128k context window will eat your RAM for breakfast. If your agent gets stuck in a loop reading massive files, your computer might grind to a halt.
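One mitigation is to pin the context window explicitly. Ollama typically defaults to a few thousand tokens rather than the model's maximum, and if you built the custom tag above you can set the limit yourself in the same Modelfile (8192 here is an arbitrary middle ground):

# Caps the context window; larger values cost more RAM per run
PARAMETER num_ctx 8192

Bigger values give your agent more working memory per run at the cost of RAM; tune it to what your machine can actually hold.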

Also, watch tool-use reliability. DeepSeek is good, but it will sometimes try to call a tool that doesn't exist or hallucinate arguments. OpenClaw’s error handling usually catches this and asks the model to fix it, but it can lead to some wasted cycles.

Conclusion

Running autonomous agents locally used to be a fun experiment that didn't really work. Now, it's a viable way to build and test powerful workflows without incurring massive costs.

DeepSeek has lowered the barrier to entry significantly. If you have a decent laptop, you have an army of agents waiting to work for you. Give it a spin.