Model matrix
Every model enum member shipped with AIMU, with its capability flags. This page is maintained by hand and kept up to date with the enums in aimu/models/.
Legend: ✅ = supported, — = not supported.
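As a rough illustration of how one row in the tables below could map to code, the following sketch models an enum member carrying the three capability flags. `ModelSpec`, `ExampleModel`, and `supports` are hypothetical names used only on this page; the real definitions live in aimu/models/ and may be structured differently.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical spec: one row of a capability table."""
    model_id: str
    tools: bool = False
    thinking: bool = False
    vision: bool = False


class ExampleModel(Enum):
    # Values transcribed from the Anthropic table on this page.
    CLAUDE_SONNET_4_6 = ModelSpec("claude-sonnet-4-6", tools=True, thinking=True, vision=True)
    CLAUDE_HAIKU_4_5 = ModelSpec("claude-haiku-4-5", tools=True, thinking=False, vision=True)


def supports(model: ExampleModel, flag: str) -> bool:
    """Return True when the named capability flag is set for the model."""
    return bool(getattr(model.value, flag))
```

A ✅ in a table corresponds to a flag set to `True` here, and a — to `False`.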
Anthropic (AnthropicModel)
| Enum member | Model id | Tools | Thinking | Vision |
|---|---|---|---|---|
| CLAUDE_SONNET_4_6 | claude-sonnet-4-6 | ✅ | ✅ | ✅ |
| CLAUDE_OPUS_4_6 | claude-opus-4-6 | ✅ | ✅ | ✅ |
| CLAUDE_HAIKU_4_5 | claude-haiku-4-5 | ✅ | — | ✅ |
OpenAI (OpenAIModel)
| Enum member | Model id | Tools | Thinking | Vision |
|---|---|---|---|---|
| GPT_4O_MINI | gpt-4o-mini | ✅ | — | ✅ |
| GPT_4O | gpt-4o | ✅ | — | ✅ |
| GPT_4_1 | gpt-4.1 | ✅ | — | ✅ |
| GPT_4_1_MINI | gpt-4.1-mini | ✅ | — | ✅ |
| GPT_4_1_NANO | gpt-4.1-nano | ✅ | — | ✅ |
| O4_MINI | o4-mini | ✅ | — | ✅ |
| O3 | o3 | ✅ | — | ✅ |
| O3_MINI | o3-mini | ✅ | — | — |
The o-series models reason internally, but their reasoning tokens aren't exposed via the API, so they are marked thinking=False. Pass reasoning_effort via generate_kwargs if needed.
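The note above can be sketched as a small helper that adds `reasoning_effort` only for the o-series ids listed in the table. The helper name `with_reasoning_effort` and the default of `"medium"` are assumptions for illustration; how the resulting dict is handed to an AIMU client is not shown on this page and is left out here.

```python
from typing import Any, Dict, Optional

# o-series model ids transcribed from the OpenAI table above.
O_SERIES_IDS = {"o4-mini", "o3", "o3-mini"}


def with_reasoning_effort(
    model_id: str,
    generate_kwargs: Optional[Dict[str, Any]] = None,
    effort: str = "medium",
) -> Dict[str, Any]:
    """Return a copy of generate_kwargs, adding reasoning_effort for o-series ids.

    Hypothetical helper: an explicit reasoning_effort already present in
    generate_kwargs is kept as-is.
    """
    merged = dict(generate_kwargs or {})
    if model_id in O_SERIES_IDS:
        merged.setdefault("reasoning_effort", effort)
    return merged
```

Non-reasoning models such as gpt-4o get their kwargs back unchanged, so the same call site can serve every row in the table.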
Google Gemini (GeminiModel)
| Enum member | Model id | Tools | Thinking | Vision |
|---|---|---|---|---|
| GEMINI_2_0_FLASH | gemini-2.0-flash | ✅ | — | ✅ |
| GEMINI_2_0_FLASH_LITE | gemini-2.0-flash-lite | ✅ | — | ✅ |
| GEMINI_1_5_PRO | gemini-1.5-pro | ✅ | — | ✅ |
| GEMINI_1_5_FLASH | gemini-1.5-flash | ✅ | — | ✅ |
| GEMINI_2_5_PRO | gemini-2.5-pro | ✅ | ✅ | ✅ |
| GEMINI_2_5_FLASH | gemini-2.5-flash | ✅ | ✅ | ✅ |
Gemini 2.5 thinking models emit <think> tags on Google's OpenAI-compatible endpoint.
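When thinking arrives inline as `<think>` tags, a caller may want to separate it from the final answer. The sketch below is one way to do that with a regular expression; `split_thinking` is a hypothetical helper, and AIMU may already perform this separation internally.

```python
import re
from typing import Tuple

# Non-greedy match so that several <think> blocks are captured separately.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> Tuple[str, str]:
    """Split raw model output into (thinking, answer).

    All <think>...</think> blocks are joined with newlines; the answer is
    whatever remains once those blocks are removed.
    """
    thoughts = "\n".join(block.strip() for block in _THINK_RE.findall(text))
    answer = _THINK_RE.sub("", text).strip()
    return thoughts, answer
```

Output without any tags comes back with an empty thinking string and the text unchanged apart from surrounding whitespace.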
Ollama native (OllamaModel)
| Enum member | Model id | Tools | Thinking | Vision |
|---|---|---|---|---|
| QWEN_3_6_35B | qwen3.6:35b | ✅ | ✅ | — |
| QWEN_3_6_27B | qwen3.6:27b | ✅ | ✅ | — |
| QWEN_3_5_9B | qwen3.5:9b | ✅ | ✅ | — |
| QWEN_3_32B | qwen3:32b | ✅ | ✅ | — |
| QWEN_3_8B | qwen3:8b | ✅ | ✅ | — |
| GEMMA_4_E4B | gemma4:e4b | ✅ | ✅ | ✅ |
| GEMMA_4_26B | gemma4:26b | ✅ | ✅ | ✅ |
| GEMMA_4_31B | gemma4:31b | ✅ | ✅ | ✅ |
| GEMMA_3_12B | gemma3:12b | — | — | ✅ |
| NEMOTRON_CASCADE_2_30B | nemotron-cascade-2:30b | ✅ | ✅ | — |
| NEMOTRON_3_NANO_30B | nemotron-3-nano:30b | ✅ | ✅ | — |
| GLM_4_7_FLASH_31B_Q4 | glm-4.7-flash:q4_K_M | — | ✅ | — |
| GPT_OSS_20B | gpt-oss:20b | ✅ | ✅ | — |
| MAGISTRAL_SMALL_24B | magistral:24b | ✅ | ✅ | — |
| MINISTRAL_3_14B | ministral-3:14b | ✅ | — | — |
| PHI_4_MINI_3_8B | phi4-mini:3.8b | — | — | — |
| PHI_4_14B | phi4:14b | — | — | — |
| DEEPSEEK_R1_8B | deepseek-r1:8b | — | ✅ | — |
| SMOLLM2_1_7B | smollm2:1.7b | — | — | — |
| LLAMA_3_2_3B | llama3.2:3b | — | — | — |
| LLAMA_3_1_8B | llama3.1:8b | — | — | — |
Some Ollama models can technically be asked for tools but produce unreliable tool calls; those are marked tools=False and documented in the enum source.
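One way a caller could respect the tools flag is to check it before sending tool definitions. The sketch below is an assumption about usage, not AIMU's actual behaviour: `OLLAMA_TOOLS` is a hand-copied subset of the table above and `check_tools_allowed` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Subset of the Ollama table above: model id -> tools flag.
OLLAMA_TOOLS: Dict[str, bool] = {
    "qwen3:8b": True,
    "gpt-oss:20b": True,
    "gemma3:12b": False,
    "deepseek-r1:8b": False,
}


def check_tools_allowed(model_id: str, tools: List[Dict[str, Any]]) -> None:
    """Raise ValueError if tools are requested for a model marked tools=False.

    Model ids missing from the mapping are treated as not supporting tools.
    """
    if tools and not OLLAMA_TOOLS.get(model_id, False):
        raise ValueError(f"{model_id} is marked tools=False; refusing to send tools")
```

Failing early like this avoids silently depending on tool calls from a model that is known to produce them unreliably.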
HuggingFace (HuggingFaceModel)
| Enum member | Repo id | Tools | Thinking | Vision |
|---|---|---|---|---|
| QWEN_3_6_27B | Qwen/Qwen3.6-27B-FP8 | ✅ | ✅ | — |
| QWEN_3_6_27B_VL | Qwen/Qwen3.6-27B-FP8 | ✅ | ✅ | ✅ |
| QWEN_3_5_9B | Qwen/Qwen3.5-9B | ✅ | ✅ | — |
| QWEN_3_5_9B_VL | Qwen/Qwen3.5-9B | ✅ | ✅ | ✅ |
| QWEN_3_8B | Qwen/Qwen3-8B | ✅ | ✅ | — |
| GEMMA_4_E4B | google/gemma-4-E4B-it | ✅ | — | ✅ |
| GEMMA_3_12B | google/gemma-3-12b-it | — | — | ✅ |
| GPT_OSS_20B | openai/gpt-oss-20b | ✅ | ✅ | — |
| MAGISTRAL_SMALL | mistralai/Magistral-Small-2509 | ✅ | — | — |
| MISTRAL_NEMO_12B | mistralai/Mistral-Nemo-Instruct-2407 | ✅ | — | — |
| MISTRAL_7B | mistralai/Mistral-7B-Instruct-v0.3 | ✅ | — | — |
| PHI_4_MINI_3_8B | microsoft/Phi-4-mini-instruct | — | — | — |
| PHI_4_14B | microsoft/phi-4 | — | — | — |
| DEEPSEEK_R1_8B | deepseek-ai/DeepSeek-R1-Distill-Llama-8B | — | ✅ | — |
| SMOLLM3_3B | HuggingFaceTB/SmolLM3-3B | ✅ | ✅ | — |
| LLAMA_3_2_3B | unsloth/Llama-3.2-3B-Instruct | ✅ | — | — |
| LLAMA_3_1_8B | meta-llama/Meta-Llama-3.1-8B-Instruct | ✅ | — | — |
Variants with the _VL suffix point at the same repo id as their text-only counterpart but load with AutoModelForImageTextToText, so the vision encoder is included.
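The suffix rule can be sketched as a small dispatch on the enum member name. Only the `AutoModelForImageTextToText` half comes from this page; the use of `AutoModelForCausalLM` for text-only members is an assumption, and `loader_class_name` and `load_model` are hypothetical helpers rather than AIMU functions.

```python
def loader_class_name(enum_member_name: str) -> str:
    """Pick the transformers auto-class name from the enum member name."""
    if enum_member_name.endswith("_VL"):
        return "AutoModelForImageTextToText"
    # Assumption: text-only members use the causal LM auto-class.
    return "AutoModelForCausalLM"


def load_model(enum_member_name: str, repo_id: str):
    """Load the model with the selected auto-class.

    transformers is imported lazily so that choosing a loader does not
    require the library to be installed.
    """
    import transformers

    loader = getattr(transformers, loader_class_name(enum_member_name))
    return loader.from_pretrained(repo_id)
```

Under this rule, QWEN_3_5_9B and QWEN_3_5_9B_VL differ only in the class used to load Qwen/Qwen3.5-9B.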
llama-cpp (LlamaCppModel)
| Enum member | Hint id | Tools | Thinking | Vision |
|---|---|---|---|---|
| LLAMA_3_1_8B | llama-3.1-8b | — | — | — |
| LLAMA_3_2_3B | llama-3.2-3b | — | — | — |
| MISTRAL_7B | mistral-7b | ✅ | — | — |
| QWEN_3_4B | qwen3-4b | ✅ | ✅ | — |
| QWEN_3_8B | qwen3-8b | ✅ | ✅ | — |
| DEEPSEEK_R1_7B | deepseek-r1-7b | — | ✅ | — |
| PHI_4_MINI | phi-4-mini | ✅ | — | — |
llama-cpp model ids are hints; the actual model is loaded from model_path= regardless. Capability flags are honoured by the client.
OpenAI-compatible local servers
LMStudioOpenAIModel, OllamaOpenAIModel, HFOpenAIModel, VLLMOpenAIModel, LlamaServerOpenAIModel, and SGLangOpenAIModel all enumerate the same set of common open models (Llama 3.x, Mistral 7B, Phi-4 Mini, Qwen 3.x, DeepSeek R1, Gemma 3). The model id format differs per server (LM Studio uses loaded model keys, Ollama uses name:tag, vLLM/SGLang/HF Serve use HuggingFace repo paths, llama-server uses GGUF filenames). See the enum source for each.
Image generation
Image clients use a different spec class than text clients (HuggingFaceImageSpec / GeminiImageSpec). The capability flags don't apply, so the matrices below show model-specific defaults instead.
HuggingFace diffusers (HuggingFaceImageModel)
| Enum member | Repo id | Pipeline class | Default steps | Default size |
|---|---|---|---|---|
| SD_1_5 | runwayml/stable-diffusion-v1-5 | StableDiffusionPipeline | 25 | 512×512 |
| SDXL_BASE | stabilityai/stable-diffusion-xl-base-1.0 | StableDiffusionXLPipeline | 30 | 1024×1024 |
| SD_3_5_MEDIUM | stabilityai/stable-diffusion-3.5-medium | StableDiffusion3Pipeline | 28 | 1024×1024 |
| FLUX_DEV | black-forest-labs/FLUX.1-dev | FluxPipeline | 28 | 1024×1024 |
| FLUX_SCHNELL | black-forest-labs/FLUX.1-schnell | FluxPipeline | 4 | 1024×1024 |
Spec defaults are starting points: pass num_inference_steps=, guidance_scale=, width=, height=, or seed= to override them per call. Power users can bypass the enum with a "hf:<repo_id>" string for any HuggingFace diffusers model; in that case the loader defaults to DiffusionPipeline auto-detection.
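The override behaviour and the "hf:" prefix can be sketched as follows. `ImageDefaults`, `resolve_call_params`, and `parse_hf_model_string` are hypothetical names for illustration, and the defaults are copied from two rows of the table above; the real spec class is HuggingFaceImageSpec, whose fields are not shown on this page.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ImageDefaults:
    """Hypothetical stand-in for the per-model defaults in the table."""
    num_inference_steps: int
    width: int
    height: int


# Two rows transcribed from the table above.
SPEC_DEFAULTS = {
    "SD_1_5": ImageDefaults(num_inference_steps=25, width=512, height=512),
    "FLUX_SCHNELL": ImageDefaults(num_inference_steps=4, width=1024, height=1024),
}


def resolve_call_params(member: str, **overrides: int) -> ImageDefaults:
    """Start from the spec defaults and apply per-call overrides.

    dataclasses.replace raises TypeError for an unknown field name.
    """
    return replace(SPEC_DEFAULTS[member], **overrides)


def parse_hf_model_string(model: str) -> Optional[str]:
    """Return the repo id from an "hf:<repo_id>" string, or None otherwise."""
    prefix = "hf:"
    if model.startswith(prefix) and len(model) > len(prefix):
        return model[len(prefix):]
    return None
```

Keeping the defaults immutable and building a new object per call means one request's overrides cannot leak into the next.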
Google Gemini (GeminiImageModel)
| Enum member | Model id | Notes |
|---|---|---|
| NANO_BANANA | gemini-2.5-flash-image | GA channel. Aspect ratio via aspect_ratio= (e.g. "1:1", "16:9"). |
| NANO_BANANA_PREVIEW | gemini-2.5-flash-image-preview | Preview channel; kept for users who pinned it. |
Short-name aliases like "gemini:nano-banana" resolve to the full model id at construction. Nano Banana's generate_content API returns one image per call; num_images > 1 issues N requests.
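The alias lookup and the one-image-per-request fan-out can be sketched as below. The `ALIASES` mapping assumes that "gemini:nano-banana" points at the GA model id, which this page implies but does not state outright; `resolve_model_id` and `generate_images` are hypothetical helpers, and `generate_one` stands in for whatever performs a single generate_content request.

```python
from typing import Callable, Dict, List

# Assumed mapping: the short alias resolves to the GA model id from the table.
ALIASES: Dict[str, str] = {"gemini:nano-banana": "gemini-2.5-flash-image"}


def resolve_model_id(name: str) -> str:
    """Map a short-name alias to its full model id; pass other names through."""
    return ALIASES.get(name, name)


def generate_images(
    generate_one: Callable[[str], bytes],
    prompt: str,
    num_images: int = 1,
) -> List[bytes]:
    """Issue one request per image, since each call returns a single image."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    return [generate_one(prompt) for _ in range(num_images)]
```

Because each image is a separate request, cost and latency grow linearly with num_images in a sequential loop like this one.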
See also
- Provider matrix — provider × extra × API key
- How-to: add a new model — extending these enums
- How-to: generate images — using the image surface