aimu.models¶
Provider-agnostic model clients.
Factory and base class¶
aimu.models.ModelClient ¶
Bases: BaseModelClient
Public factory for provider-backed model clients.
Accepts either a provider's Model enum member or a "provider:model_id" string::
from aimu.models import ModelClient, OllamaModel
# Enum form
client = ModelClient(OllamaModel.QWEN_3_8B)
# String form (no enum import needed)
client = ModelClient("anthropic:claude-sonnet-4-6")
client = ModelClient("ollama:qwen3.5:9b")
Provider-specific kwargs are forwarded to the concrete client::
ModelClient(LlamaCppModel.QWEN_3_8B, model_path="/path/to/model.gguf")
ModelClient(OllamaModel.LLAMA_3_1_8B, model_keep_alive_seconds=120)
ModelClient(LMStudioOpenAIModel.LLAMA_3_2_3B, base_url="http://myserver:1234/v1")
aimu.models.BaseModelClient ¶
BaseModelClient(model: Model, model_kwargs: Optional[dict] = None, system_message: Optional[str] = None)
Bases: _ChatStateMixin, ABC
Abstract base for all provider clients.
Subclasses implement :meth:generate, :meth:chat, and :meth:_update_generate_kwargs.
Tool calling, message history, vision input, and streaming filters are handled here
once for every provider.
generate ¶
generate(prompt: str, generate_kwargs: Optional[dict[str, Any]] = None, stream: bool = False, include: Optional[Iterable[Union[str, StreamingContentType]]] = None) -> Union[str, Iterator[StreamChunk]]
Single-turn generation. See :meth:chat for the include filter semantics.
chat ¶
chat(user_message: str, generate_kwargs: Optional[dict[str, Any]] = None, use_tools: bool = True, stream: bool = False, images: Optional[list] = None, include: Optional[Iterable[Union[str, StreamingContentType]]] = None) -> Union[str, Iterator[StreamChunk]]
Multi-turn chat with persistent message history.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
user_message
|
str
|
The text the user is sending this turn. |
required |
generate_kwargs
|
Optional[dict[str, Any]]
|
Provider-specific generation parameters. Unknown keys are dropped per-provider; see each client for accepted names. |
None
|
use_tools
|
bool
|
If False, suppress tool calls even when the model supports tools. |
True
|
stream
|
bool
|
If True, return an iterator of :class: |
False
|
images
|
Optional[list]
|
Optional list of images for vision-capable models. Each entry may be a
file path (str or |
None
|
include
|
Optional[Iterable[Union[str, StreamingContentType]]]
|
Optional iterable of stream phases to yield. Defaults to all phases
(THINKING, TOOL_CALLING, GENERATING, DONE). Has no effect when |
None
|
Types¶
aimu.models.ModelSpec
dataclass
¶
ModelSpec(id: str, tools: bool = False, thinking: bool = False, vision: bool = False, generation_kwargs: Optional[dict] = None)
Capability descriptor for a single model.
Holds the provider-side model id plus universal capability flags. Provider-specific
extras (e.g. HuggingFace tool-call format) live on the provider's Model subclass,
not here.
Equality and hash are by id only, so a ModelSpec can be used directly as an
enum value even when generation_kwargs is a dict.
aimu.models.Model ¶
Bases: Enum
Base enum for provider model catalogs.
Each member's value is a ModelSpec; capability flags are mirrored as plain
attributes (supports_tools, supports_thinking, supports_vision,
generation_kwargs) for direct read access. .value returns the provider id
string so code can call e.g. ollama.pull(model.value).
aimu.models.StreamChunk ¶
Bases: NamedTuple
A single chunk yielded by client.chat(stream=True), Agent.run(stream=True),
image_client.generate(stream=True), or any streaming tool / workflow.
Fields
phase: content type of this chunk (THINKING, TOOL_CALLING, GENERATING,
IMAGE_GENERATING, DONE)
content: shape depends on phase:
- str for THINKING / GENERATING (token).
- dict {"name", "arguments", "response"} for TOOL_CALLING
(arguments is the dict the model passed to the tool).
- dict {"step", "total_steps", "image", "final", "result"} for
IMAGE_GENERATING — step is 1-indexed, image is an optional
PIL.Image (None unless preview_every opted in this step),
final=True marks the terminal chunk for one image, and result
carries the encoded output (path / bytes / data-url per format=)
on the final chunk.
- str for DONE (usually empty).
agent: name of the agent that produced this chunk, or None for a plain
client.chat() / client.generate() call. Set automatically by
Agent and workflow runners.
iteration: zero-based iteration index inside the agent loop, or 0 for plain chat.
Use chunk.is_text() / chunk.is_tool_call() / chunk.is_image_progress() to
dispatch on phase without repeating the equality check in user code.
aimu.models.StreamingContentType ¶
Bases: str, Enum
Provider clients¶
aimu.models.OllamaClient ¶
OllamaClient(model: OllamaModel, system_message: Optional[str] = None, model_keep_alive_seconds: int = 60)
Bases: BaseModelClient
aimu.models.AnthropicClient ¶
AnthropicClient(model: AnthropicModel, model_kwargs: Optional[dict] = None, system_message: Optional[str] = None)
Bases: BaseModelClient
Client for Anthropic Claude models using the native anthropic SDK.
Reads ANTHROPIC_API_KEY from the environment (or a .env file). self.messages is always stored in OpenAI format; conversion to the Anthropic API format happens at call time.
aimu.models.HuggingFaceClient ¶
HuggingFaceClient(model: HuggingFaceModel, model_kwargs: Optional[dict] = None, system_message: Optional[str] = None)
Bases: BaseModelClient
aimu.models.LlamaCppClient ¶
LlamaCppClient(model: LlamaCppModel, model_path: str, n_ctx: int = 4096, n_gpu_layers: int = -1, chat_format: Optional[str] = None, chat_handler: Optional[Any] = None, verbose: bool = False, system_message: Optional[str] = None, model_kwargs: Optional[dict] = None)
Bases: BaseModelClient
aimu.models.OpenAIClient ¶
OpenAIClient(model: OpenAIModel, system_message: Optional[str] = None, model_kwargs: Optional[dict] = None)
Bases: OpenAICompatClient
Client for the OpenAI API (GPT and o-series models).
Reads OPENAI_API_KEY from the environment (or a .env file).
aimu.models.GeminiClient ¶
GeminiClient(model: GeminiModel, system_message: Optional[str] = None, model_kwargs: Optional[dict] = None)
Bases: OpenAICompatClient
Client for Google Gemini models via Google's OpenAI-compatible REST API.
Reads GOOGLE_API_KEY from the environment (or a .env file).
aimu.models.LMStudioOpenAIClient ¶
Bases: OpenAICompatClient
aimu.models.OllamaOpenAIClient ¶
Bases: OpenAICompatClient
aimu.models.HFOpenAIClient ¶
Bases: OpenAICompatClient
aimu.models.VLLMOpenAIClient ¶
Bases: OpenAICompatClient
aimu.models.LlamaServerOpenAIClient ¶
LlamaServerOpenAIClient(model: LlamaServerOpenAIModel, base_url: str = LLAMASERVER_BASE_URL, **kwargs)
Bases: OpenAICompatClient
Client for llama.cpp's llama-server OpenAI-compatible REST API.
Start the server with
llama-server -m /path/to/model.gguf --port 8080
aimu.models.SGLangOpenAIClient ¶
Bases: OpenAICompatClient
Client for SGLang's OpenAI-compatible REST API.
Start the server with
python -m sglang.launch_server --model-path
aimu.models.OpenAICompatClient ¶
OpenAICompatClient(model: Model, base_url: str, api_key: str = 'not-needed', system_message: Optional[str] = None, model_kwargs: Optional[dict] = None)
Bases: BaseModelClient