Add a new model¶
To make a new model usable with ModelClient, add a member to that provider's Model enum. The member's value is a ModelSpec with the model id and capability flags.
AIMU ships curated models only
Every modality requires the model to be a member of its provider enum (or a hand-built spec, see Custom models). Passing an arbitrary "provider:some/unknown-repo" string raises ValueError, rather than silently fabricating a spec with guessed capabilities. Capability flags (tools/thinking/vision, pipeline_class, supports_negative_prompt, voice/step defaults, and so on) can't be inferred reliably from a repo name, and a wrong guess causes hard-to-debug runtime failures. So the catalog is intentional: to use a new model, add it to the enum here.
Basic case¶
For most providers (OllamaModel, AnthropicModel, OpenAIModel, etc.) the enum value is a single ModelSpec:
# aimu/models/providers/anthropic.py
from ..base import Model, ModelSpec
class AnthropicModel(Model):
CLAUDE_SONNET_4_6 = ModelSpec("claude-sonnet-4-6", tools=True, thinking=True, vision=True)
# Add a new model:
CLAUDE_OPUS_5 = ModelSpec("claude-opus-5", tools=True, thinking=True, vision=True)
That's it. The model id ("claude-opus-5") is what gets sent to the provider; the capability flags drive is_tool_using_model, is_thinking_model, is_vision_model, and the derived TOOL_MODELS / THINKING_MODELS / VISION_MODELS classproperties.
A bare ModelSpec defaults the Anthropic reasoning request to ThinkingStyle.ENABLED (budget_tokens). New Claude reasoning models (Opus 4.7+ and Fable 5) require the adaptive request shape and 400 on the enabled form, so define those with a ThinkingStyle.ADAPTIVE extra (see Anthropic provider-specific extras below).
With custom generation kwargs¶
Some models work best with specific sampling parameters:
QWEN_3_8B = ModelSpec(
"qwen3:8b",
tools=True,
thinking=True,
generation_kwargs={"temperature": 0.6, "top_p": 0.95, "top_k": 20},
)
generation_kwargs is merged on top of the client's defaults whenever chat() or generate() is called without an explicit generate_kwargs argument.
HuggingFace provider-specific extras¶
HuggingFaceModel carries extra positional values for the tool-call response format and a thinking-template flag:
# aimu/models/providers/hf/text.py
class HuggingFaceModel(Model):
QWEN_3_8B = (
ModelSpec("Qwen/Qwen3-8B", tools=True, thinking=True),
ToolCallFormat.XML, # tool_call_format
)
QWEN_3_5_9B = (
ModelSpec("Qwen/Qwen3.5-9B", tools=True, thinking=True),
ToolCallFormat.XML,
True, # think_opener_in_prompt
)
Pick ToolCallFormat.XML / JSON_OBJECT / JSON_ARRAY / BRACKETED / NA based on how the base model emits tool calls. See existing entries in aimu/models/providers/hf/text.py for examples per model family.
Anthropic provider-specific extras¶
AnthropicModel carries an optional ThinkingStyle extra that selects how the thinking request is built. Omit it (bare ModelSpec) for ENABLED/budget-style models; pass ThinkingStyle.ADAPTIVE for models that require adaptive thinking:
# aimu/models/providers/anthropic.py
class AnthropicModel(Model):
# Adaptive: thinking={"type": "adaptive", "display": "summarized"}; sampling params dropped.
# Required by Opus 4.7+ and Fable 5 (the enabled form 400s here).
CLAUDE_OPUS_4_8 = (ModelSpec("claude-opus-4-8", tools=True, thinking=True, vision=True), ThinkingStyle.ADAPTIVE)
# Enabled (default): thinking={"type": "enabled", "budget_tokens": N}; the model always thinks.
CLAUDE_HAIKU_4_5 = ModelSpec("claude-haiku-4-5", tools=True, thinking=True, vision=True)
Rule of thumb: Opus 4.7 and later, and Fable 5, are adaptive-only; Opus 4.6, Sonnet 4.6, and Haiku 4.5 use the budget form. Confirm a new model's support with the Models API (client.models.retrieve(id).capabilities["thinking"]["types"]); adaptive.supported / enabled.supported tell you which shape to pick.
Image, audio, and speech models¶
The non-text modalities follow the same pattern with their own spec type and enum. Add a member to the provider enum in the relevant client module:
# aimu/models/providers/hf/image.py: diffusers text-to-image
class HuggingFaceImageModel(ImageModel):
SD_1_5 = HuggingFaceImageSpec("runwayml/stable-diffusion-v1-5", max_prompt_tokens=77)
FLUX_2_KLEIN_4B = HuggingFaceImageSpec(
"black-forest-labs/FLUX.2-klein-4B",
pipeline_class="Flux2KleinPipeline",
img2img_pipeline_class="Flux2KleinPipeline",
img2img_uses_strength=False, # unified pipeline conditions on the image directly
supports_negative_prompt=False, # Flux2KleinPipeline.__call__ has no negative_prompt param
default_steps=4,
max_prompt_tokens=512, # T5-XXL encoder
)
# aimu/models/providers/gemini/image.py: Gemini Nano Banana (cloud)
class GeminiImageModel(ImageModel):
NANO_BANANA = GeminiImageSpec("gemini-2.5-flash-image") # supports_negative_prompt defaults False
# aimu/models/providers/hf/audio.py: music / sound generation
class HuggingFaceAudioModel(AudioModel):
MUSICGEN_SMALL = HuggingFaceAudioSpec("facebook/musicgen-small", pipeline_type="musicgen")
AUDIOLDM2 = HuggingFaceAudioSpec("cvssp/audioldm2", pipeline_type="audioldm2", default_steps=200)
# aimu/models/providers/hf/speech.py: text-to-speech
class HuggingFaceSpeechModel(SpeechModel):
BARK = HuggingFaceSpeechSpec("suno/bark", pipeline_type="bark", default_voice="v2/en_speaker_6")
Capability fields differ per modality; set the ones that matter for the model:
HuggingFaceImageSpec:pipeline_class,img2img_pipeline_class,img2img_uses_strength,supports_negative_prompt,default_steps/default_guidance/default_width/default_height,default_negative_prompt,max_prompt_tokens.GeminiImageSpec:supports_negative_prompt(defaultsFalse),default_aspect_ratio,default_image_size,image_config_kwargs.HuggingFaceAudioSpec:pipeline_type("musicgen"/"audioldm2"/"stable_audio"),default_duration_s,default_steps.HuggingFaceSpeechSpec/OpenAISpeechSpec:pipeline_type(HF:"tts_pipeline"/"speecht5"/"bark"),default_voice,default_speed.
Because these specs carry behaviour the runtime depends on (e.g. pipeline_class, supports_negative_prompt), the enum is the single source of truth: the string form ("hf:<repo>", "gemini:<id>") resolves to the same spec object as the enum member, and an unknown id raises.
Custom models¶
For a one-off model not worth adding to the catalog, construct the spec yourself and pass the object (not a string) to the client. This is an explicit, deliberate escape hatch where you supply every capability flag:
from aimu.models import ImageClient
from aimu.models.base import HuggingFaceImageSpec
spec = HuggingFaceImageSpec("my-org/my-diffusion-model", pipeline_class="StableDiffusionPipeline")
client = ImageClient(spec) # accepted: you've stated the capabilities explicitly
If the model is genuinely best-of-class, prefer adding it to the enum so everyone benefits.
Verify¶
from aimu.models import AnthropicModel
assert AnthropicModel.CLAUDE_OPUS_5.value == "claude-opus-5"
assert AnthropicModel.CLAUDE_OPUS_5.supports_tools
assert AnthropicModel.CLAUDE_OPUS_5 in AnthropicModel.TOOL_MODELS
Then run the model-client tests against the new entry:
The model client is auto-pulled (Ollama, HuggingFace) or hits the cloud API on first use.
See also¶
aimu.models.ModelSpec: the dataclass fields- Model matrix: every shipped model with capability flags