Getting started¶
This is a 15-minute walkthrough from a fresh install to a working agent. By the end you'll know the three core entry points (aimu.chat(), aimu.client(), and Agent) and how to swap providers without changing call sites.
1. Install¶
Pick a backend. For this tutorial we'll use Ollama because it's local and free:
Then pull a small tool-capable model:
For cloud providers instead, use pip install aimu[anthropic] (or [openai_compat]) and set the corresponding API key in your environment. Every example below works identically; only the model string changes.
2. Your first chat¶
import aimu
response = aimu.chat("What is the capital of France?", model="ollama:qwen3.5:9b")
print(response)
You should see something like "The capital of France is Paris."
aimu.chat() is a one-shot: it builds a fresh client, sends one message, returns the response, and is done. There's no client object to manage.
Omitting the model¶
You can leave out model= entirely:
When the model is omitted, AIMU resolves a default in this order:
- The
AIMU_LANGUAGE_MODELenv var (a"provider:model_id"string). Set it in your project's.envto pin a default:AIMU_LANGUAGE_MODEL=ollama:qwen3.5:9b. - An already-available local model: a running Ollama server, a model already in your HuggingFace cache, or a running local OpenAI-compatible server (LM Studio, vLLM, llama-server, SGLang). The chosen model is logged.
- Otherwise a
ValueErrorlisting how to fix it.
AIMU never auto-selects a cloud provider (no surprise API bills) and never downloads weights implicitly. Passing model= explicitly (as every example here does) is always the clearest, most reproducible choice.
3. Multi-turn conversation¶
For a conversation, build a reusable client with aimu.client():
import aimu
client = aimu.client("ollama:qwen3.5:9b", system="You are concise.")
client.chat("My favourite colour is blue.")
print(client.chat("What did I just tell you?"))
# 'You told me your favourite colour is blue.'
client.chat() accumulates history in client.messages. The system message is set at construction time and locks once the first chat is sent. Call client.reset() if you need to start over.
4. Streaming¶
For UIs and progress visibility, pass stream=True. You get an iterator of StreamChunk objects:
for chunk in aimu.chat("Tell me a short story", model="ollama:qwen3.5:9b", stream=True):
if chunk.is_text():
print(chunk.content, end="", flush=True)
chunk.is_text() is True for both THINKING and GENERATING phases. For finer control see how-to: stream output.
5. Swap providers¶
The whole point of ModelClient is provider-agnostic code. Switch by changing the model string:
# Local Ollama
aimu.chat("hi", model="ollama:qwen3.5:9b")
# Anthropic (needs ANTHROPIC_API_KEY)
aimu.chat("hi", model="anthropic:claude-sonnet-4-6")
# OpenAI (needs OPENAI_API_KEY)
aimu.chat("hi", model="openai:gpt-4o-mini")
# Google Gemini (needs GOOGLE_API_KEY)
aimu.chat("hi", model="gemini:gemini-2.5-flash")
The rest of your code is unchanged. See how-to: switch providers for the full list.
6. Your first agent¶
So far we've called chat() directly. An Agent adds a tool-using loop on top: it keeps calling chat() until the model stops invoking tools.
First, declare a tool:
import aimu
@aimu.tool
def letter_counter(word: str, letter: str) -> int:
"""Count occurrences of a letter in a word."""
return word.lower().count(letter.lower())
The @aimu.tool decorator inspects the signature and docstring to build a tool spec for the model. The function itself is unchanged.
Then wrap a client in an Agent:
from aimu.agents import Agent
client = aimu.client("ollama:qwen3.5:9b")
agent = Agent(client, "You are a helpful assistant.", tools=[letter_counter])
print(agent.run("How many r's are in 'strawberry'?"))
# The model calls letter_counter(word="strawberry", letter="r"), gets 3,
# and responds with the answer.
The agent's loop: send the user's message → if the model called tools, dispatch them and continue → repeat until the model returns text without calling tools.
What's next¶
You've now used the three load-bearing APIs:
aimu.chat()for one-shotsaimu.client()for conversationsAgentfor autonomous tool-using loops
The next tutorials build on each:
- First agent with tools: deeper on
@tooland built-in tools. - Workflows: code-controlled patterns (chain / router / parallel) when you want the flow fixed.
- Vision and streaming: image input plus the full
StreamChunkAPI.
Or jump into how-to guides for specific tasks.