Skip to content

aimu.models

Provider-agnostic model clients.

Factory and base class

aimu.models.ModelClient

ModelClient(model: Union[Model, ModelSpec, str], **kwargs: Any)

Bases: BaseModelClient

Public factory for provider-backed model clients.

Accepts either a provider's Model enum member or a "provider:model_id" string::

from aimu.models import ModelClient, OllamaModel

# Enum form
client = ModelClient(OllamaModel.QWEN_3_8B)

# String form (no enum import needed)
client = ModelClient("anthropic:claude-sonnet-4-6")
client = ModelClient("ollama:qwen3.5:9b")

Provider-specific kwargs are forwarded to the concrete client::

ModelClient(LlamaCppModel.QWEN_3_8B, model_path="/path/to/model.gguf")
ModelClient(OllamaModel.LLAMA_3_1_8B, model_keep_alive_seconds=120)
ModelClient(LMStudioOpenAIModel.LLAMA_3_2_3B, base_url="http://myserver:1234/v1")

aimu.models.BaseModelClient

BaseModelClient(model: Model, model_kwargs: Optional[dict] = None, system_message: Optional[str] = None)

Bases: _ChatStateMixin, ABC

Abstract base for all provider clients.

Subclasses implement :meth:generate, :meth:chat, and :meth:_update_generate_kwargs. Tool calling, message history, vision input, and streaming filters are handled here once for every provider.

generate

generate(prompt: str, generate_kwargs: Optional[dict[str, Any]] = None, stream: bool = False, include: Optional[Iterable[Union[str, StreamingContentType]]] = None) -> Union[str, Iterator[StreamChunk]]

Single-turn generation. See :meth:chat for the include filter semantics.

chat

chat(user_message: str, generate_kwargs: Optional[dict[str, Any]] = None, use_tools: bool = True, stream: bool = False, images: Optional[list] = None, include: Optional[Iterable[Union[str, StreamingContentType]]] = None) -> Union[str, Iterator[StreamChunk]]

Multi-turn chat with persistent message history.

Parameters:

Name Type Description Default
user_message str

The text the user is sending this turn.

required
generate_kwargs Optional[dict[str, Any]]

Provider-specific generation parameters. Unknown keys are dropped per-provider; see each client for accepted names.

None
use_tools bool

If False, suppress tool calls even when the model supports tools.

True
stream bool

If True, return an iterator of :class:StreamChunk instead of a string.

False
images Optional[list]

Optional list of images for vision-capable models. Each entry may be a file path (str or pathlib.Path), raw bytes, an http(s):// URL, or a data:image/...;base64,... URL. Raises ValueError if the model does not support vision. Only used on the initial user turn.

None
include Optional[Iterable[Union[str, StreamingContentType]]]

Optional iterable of stream phases to yield. Defaults to all phases (THINKING, TOOL_CALLING, GENERATING, DONE). Has no effect when stream=False. Values may be :class:StreamingContentType members or their string equivalents ("thinking", "tool_calling", "generating", "done").

None

Types

aimu.models.ModelSpec dataclass

ModelSpec(id: str, tools: bool = False, thinking: bool = False, vision: bool = False, generation_kwargs: Optional[dict] = None)

Capability descriptor for a single model.

Holds the provider-side model id plus universal capability flags. Provider-specific extras (e.g. HuggingFace tool-call format) live on the provider's Model subclass, not here.

Equality and hash are by id only, so a ModelSpec can be used directly as an enum value even when generation_kwargs is a dict.

aimu.models.Model

Model(spec: ModelSpec)

Bases: Enum

Base enum for provider model catalogs.

Each member's value is a ModelSpec; capability flags are mirrored as plain attributes (supports_tools, supports_thinking, supports_vision, generation_kwargs) for direct read access. .value returns the provider id string so code can call e.g. ollama.pull(model.value).

aimu.models.StreamChunk

Bases: NamedTuple

A single chunk yielded by client.chat(stream=True), Agent.run(stream=True), image_client.generate(stream=True), or any streaming tool / workflow.

Fields

phase: content type of this chunk (THINKING, TOOL_CALLING, GENERATING, IMAGE_GENERATING, DONE) content: shape depends on phase: - str for THINKING / GENERATING (token). - dict {"name", "arguments", "response"} for TOOL_CALLING (arguments is the dict the model passed to the tool). - dict {"step", "total_steps", "image", "final", "result"} for IMAGE_GENERATING — step is 1-indexed, image is an optional PIL.Image (None unless preview_every opted in this step), final=True marks the terminal chunk for one image, and result carries the encoded output (path / bytes / data-url per format=) on the final chunk. - str for DONE (usually empty). agent: name of the agent that produced this chunk, or None for a plain client.chat() / client.generate() call. Set automatically by Agent and workflow runners. iteration: zero-based iteration index inside the agent loop, or 0 for plain chat.

Use chunk.is_text() / chunk.is_tool_call() / chunk.is_image_progress() to dispatch on phase without repeating the equality check in user code.

is_text

is_text() -> bool

True if this chunk carries text (THINKING or GENERATING).

is_tool_call

is_tool_call() -> bool

True if this chunk carries a tool-call result.

is_image_progress

is_image_progress() -> bool

True if this chunk carries image-generation progress (IMAGE_GENERATING).

aimu.models.StreamingContentType

Bases: str, Enum

Provider clients

aimu.models.OllamaClient

OllamaClient(model: OllamaModel, system_message: Optional[str] = None, model_keep_alive_seconds: int = 60)

aimu.models.AnthropicClient

AnthropicClient(model: AnthropicModel, model_kwargs: Optional[dict] = None, system_message: Optional[str] = None)

Bases: BaseModelClient

Client for Anthropic Claude models using the native anthropic SDK.

Reads ANTHROPIC_API_KEY from the environment (or a .env file). self.messages is always stored in OpenAI format; conversion to the Anthropic API format happens at call time.

aimu.models.HuggingFaceClient

HuggingFaceClient(model: HuggingFaceModel, model_kwargs: Optional[dict] = None, system_message: Optional[str] = None)

aimu.models.LlamaCppClient

LlamaCppClient(model: LlamaCppModel, model_path: str, n_ctx: int = 4096, n_gpu_layers: int = -1, chat_format: Optional[str] = None, chat_handler: Optional[Any] = None, verbose: bool = False, system_message: Optional[str] = None, model_kwargs: Optional[dict] = None)

aimu.models.OpenAIClient

OpenAIClient(model: OpenAIModel, system_message: Optional[str] = None, model_kwargs: Optional[dict] = None)

Bases: OpenAICompatClient

Client for the OpenAI API (GPT and o-series models).

Reads OPENAI_API_KEY from the environment (or a .env file).

aimu.models.GeminiClient

GeminiClient(model: GeminiModel, system_message: Optional[str] = None, model_kwargs: Optional[dict] = None)

Bases: OpenAICompatClient

Client for Google Gemini models via Google's OpenAI-compatible REST API.

Reads GOOGLE_API_KEY from the environment (or a .env file).

aimu.models.LMStudioOpenAIClient

LMStudioOpenAIClient(model: LMStudioOpenAIModel, base_url: str = LMSTUDIO_BASE_URL, **kwargs)

aimu.models.OllamaOpenAIClient

OllamaOpenAIClient(model: OllamaOpenAIModel, base_url: str = OLLAMA_BASE_URL, **kwargs)

aimu.models.HFOpenAIClient

HFOpenAIClient(model: HFOpenAIModel, base_url: str = HF_OPENAI_BASE_URL, **kwargs)

aimu.models.VLLMOpenAIClient

VLLMOpenAIClient(model: VLLMOpenAIModel, base_url: str = VLLM_BASE_URL, **kwargs)

aimu.models.LlamaServerOpenAIClient

LlamaServerOpenAIClient(model: LlamaServerOpenAIModel, base_url: str = LLAMASERVER_BASE_URL, **kwargs)

Bases: OpenAICompatClient

Client for llama.cpp's llama-server OpenAI-compatible REST API.

Start the server with

llama-server -m /path/to/model.gguf --port 8080

aimu.models.SGLangOpenAIClient

SGLangOpenAIClient(model: SGLangOpenAIModel, base_url: str = SGLANG_BASE_URL, **kwargs)

Bases: OpenAICompatClient

Client for SGLang's OpenAI-compatible REST API.

Start the server with

python -m sglang.launch_server --model-path --port 30000

aimu.models.OpenAICompatClient

OpenAICompatClient(model: Model, base_url: str, api_key: str = 'not-needed', system_message: Optional[str] = None, model_kwargs: Optional[dict] = None)