How-to guides

Task-oriented recipes. Each guide answers a specific question: how do I do this?

If you're new to AIMU, start with the tutorials instead; those build a working mental model. How-to guides assume you already know the basics and want the steps for a particular task.

Working with models

Switch providers: change backends without changing call sites
Add a new model: register a model enum member
Add or update a provider: write a provider client and wire it into the factory
Stream output: stream=True, phase filtering, helpers
Get structured output: schema= on chat() / generate() returns a typed dataclass/Pydantic object; native enforcement with a parse fallback
Use async (aio): embed AIMU in async apps; asyncio.TaskGroup-backed Parallel
Handle vision input: pass images via images=
Generate images: aimu.image_client() / generate_image() with HuggingFace diffusers + Google Nano Banana
Generate audio: aimu.audio_client() / generate_audio() with HuggingFace MusicGen, AudioLDM2, and Stable Audio
Generate speech: aimu.speech_client() / generate_speech() for TTS with HuggingFace MMS-TTS/BARK or OpenAI tts-1/tts-1-hd
Embed text: aimu.embedding_client() / embed() with OpenAI, Ollama, or local HuggingFace sentence-transformers; plug into SemanticMemoryStore
Iterative image refinement: a generate → evaluate → refine loop, built two ways (agent-directed vs code-directed)
Iterative text refinement: the GPU-free text twin of the above; generate → judge → refine across a code loop, an agent, a workflow class, and two search strategies

Tools

Add a custom tool: @tool decorator rules and patterns
Use MCP tools: cross-process tools via FastMCP
Gate tool calls: an approval hook to confirm or block risky tool calls
Fetch HTML and submit web forms: raw HTML, form discovery, GET/POST with a shared session
Cancel a run: stop an in-flight async run with RunHandle and resume from partial state

Agents and workflows

Use skills: SkillAgent and the SKILL.md format
Build a personal assistant: channels, a scheduler, and runtime skill authoring for an always-on assistant
Build an orchestrator: OrchestratorAgent.assemble or subclass
Spawn sub-agents: make_subagent_tool for dynamic, parallel, isolated sub-agents (the runtime complement to OrchestratorAgent)
Connect agents (A2A): consume a remote agent as a Runner, or expose one with serve_a2a
Plan, execute, evaluate, replan: PlanExecuteEvaluator for tasks with measurable success criteria

Memory and persistence

Persist conversations: ConversationManager
Use sessions (multi-user): per-conversation state keyed by channel:sender via SessionStore
Use semantic memory: SemanticMemoryStore
Use document memory: DocumentStore
Retrieval-augmented generation: use aimu.rag to chunk, retrieve, rerank, and ground answers in your documents

Prompts and evaluation

Tune prompts: hill-climbing optimisation against labelled data
Benchmark models: multi-model comparison harness
Integrate DeepEval: use DeepEval metrics as scorers / judges