
Running Opencode with Ollama and Qwen3:8b


If you’ve been following along, in a previous post I covered how to run Ollama locally with the Qwen2.5:1.5b model. This time we’re taking it a step further by pairing Ollama with Opencode, an open-source AI coding assistant that lives entirely in your terminal.

Opencode gives you an AI pair-programming experience similar to tools like GitHub Copilot or Cursor, but running locally on your own machine with no API costs and full privacy.

Now the big question: was it worth it? Can you do real AI-assisted coding with the Qwen3:8b model? Well, I’ll leave that for you to try and figure out on your own. :)

What is Opencode?

Opencode is an open-source AI coding assistant that runs in your terminal. It can help with code generation, refactoring, debugging, and general programming questions, all powered by a local LLM through Ollama. Think of it as having an AI coding buddy right in your terminal session.

One important note: Opencode recommends using a model with at least a 64k token context window, which is why Qwen3:8b is a solid choice here.
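One caveat worth flagging: depending on your Ollama version, the server may default to a much smaller context length (recent releases use 4k tokens) even when the model supports more. If Opencode seems to forget earlier parts of the conversation, you can raise the limit with the OLLAMA_CONTEXT_LENGTH environment variable before starting the server; treat the exact variable name and default as version-dependent:

# Assumes a recent Ollama build that reads OLLAMA_CONTEXT_LENGTH at startup
OLLAMA_CONTEXT_LENGTH=65536 ollama serve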


The Easiest Way: ollama launch

Ollama recently introduced the launch command, which makes running Opencode incredibly simple. You don’t need to install Opencode separately or configure anything manually. Just run:

ollama launch opencode

That’s it. Ollama will prompt you to choose a model. Select Qwen3:8b and you’re up and running.

If you haven’t pulled the model yet, Ollama will download it for you. The Qwen3:8b model is roughly 5GB, so the first launch will take a few minutes depending on your connection.
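If you’d rather download the model ahead of time so the first launch starts instantly, you can pull it explicitly:

ollama pull qwen3:8b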

If you want to go straight to configuration without launching:

ollama launch opencode --config

This opens the configuration file at ~/.config/opencode/opencode.json, where you can tweak settings if needed.


Manual Setup (Alternative)

If you prefer setting things up manually, you can install Opencode directly:

curl -fsSL https://opencode.ai/install | bash

Then configure it to point to your local Ollama instance by editing ~/.config/opencode/opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen3:8b": {
          "name": "Qwen3 8B"
        }
      }
    }
  },
  "model": "ollama/qwen3:8b"
}

Make sure Ollama is running (ollama serve) before starting Opencode.
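Before launching, it can help to confirm the endpoint Opencode will talk to is actually up. Ollama exposes an OpenAI-compatible API under /v1, so a quick sanity check looks like this (the exact output will list whatever models you have pulled):

# The OpenAI-compatible endpoint should respond with your local models
curl http://localhost:11434/v1/models

# If that works, start Opencode from your project directory
opencode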


Why Qwen3:8b?

In my previous post, I used the much smaller Qwen2.5:1.5b model, which worked great for simple queries. However, a coding assistant like Opencode needs to understand larger chunks of code and maintain context across a conversation, so the 8-billion-parameter Qwen3 model is a better fit:

  • Larger context window to handle real code files and project context
  • Better reasoning for complex coding tasks like refactoring and debugging
  • Thinking mode where the model shows its reasoning process before answering (you can see this in the verbose output from the previous post, where Qwen3:8b reasons through its answer)

The trade-off is speed and resource usage. On my old laptop with the NVIDIA GeForce GTX 960M, Qwen3:8b ran at about 3.17 tokens/s, which is quite slow for interactive use. If you have a more modern GPU or an Apple Silicon Mac, the experience will be significantly better.
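If you want to measure throughput on your own hardware before committing to a model, ollama run accepts a --verbose flag that prints timing stats (including the eval rate in tokens/s) after each response:

ollama run qwen3:8b --verbose
# After each reply, Ollama prints stats including a line like:
#   eval rate:    3.17 tokens/s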


Quick Tips

  • Check your hardware: For a usable interactive experience with Qwen3:8b, you’ll want a GPU with at least 8GB VRAM or an Apple Silicon Mac with 16GB+ unified memory
  • Model management: Use ollama list to see your installed models and ollama rm <model> to free up space (see the example after this list)
  • Try different models: If Qwen3:8b is too heavy for your hardware, you could try smaller models, though the coding experience may be less impressive
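For reference, a typical housekeeping session looks like this (the model names are just the ones from this post):

# List everything you have pulled locally
ollama list

# Show which models are currently loaded in memory
ollama ps

# Remove a model you no longer need
ollama rm qwen2.5:1.5b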

The ollama launch command is a great move by the Ollama team. It removes the friction of configuring tools manually and makes it simple to get started with AI-powered development tools running entirely on your own machine.

Cheers,
Rodolfo