Model matrix

This page lists every model enum member shipped with AIMU, along with its capability flags. It is maintained by hand and kept in sync with the enums in aimu/models/.

Legend: ✅ = supported, — = not supported.
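As a rough illustration of what the capability flags represent, the sketch below models an enum whose members carry a small spec object. The names ModelSpec, ExampleModel, and models_with are hypothetical, and the member ids and flag values are placeholders; the real definitions live in aimu/models/ and may be shaped differently.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical spec object; the field names mirror the table columns."""
    model_id: str
    tools: bool = False
    thinking: bool = False
    vision: bool = False


class ExampleModel(Enum):
    # Placeholder members and flag values, not taken from AIMU.
    EXAMPLE_CHAT = ModelSpec("example-chat", tools=True)
    EXAMPLE_REASONER = ModelSpec("example-reasoner", tools=True, thinking=True)


def models_with(flag: str) -> list[str]:
    """Return the names of the members whose spec has the given flag set."""
    return [m.name for m in ExampleModel if getattr(m.value, flag)]


print(models_with("thinking"))
```

Keeping the flags on the enum value lets calling code check a capability before it sends tools or images to a model that cannot handle them.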

Anthropic (AnthropicModel)

Enum member Model id Tools Thinking Vision
CLAUDE_SONNET_4_6 claude-sonnet-4-6
CLAUDE_OPUS_4_6 claude-opus-4-6
CLAUDE_HAIKU_4_5 claude-haiku-4-5

OpenAI (OpenAIModel)

Enum member Model id Tools Thinking Vision
GPT_4O_MINI gpt-4o-mini
GPT_4O gpt-4o
GPT_4_1 gpt-4.1
GPT_4_1_MINI gpt-4.1-mini
GPT_4_1_NANO gpt-4.1-nano
O4_MINI o4-mini
O3 o3
O3_MINI o3-mini

The o-series models reason internally, but their reasoning tokens aren't exposed via the API, so they are marked thinking=False. Pass reasoning_effort via generate_kwargs if needed.
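A minimal sketch of building such a generate_kwargs dict follows. The helper build_generate_kwargs is hypothetical, and how the dict is handed to the client is not shown on this page, so check the client documentation for the exact call. The model ids come from the table above, and "low", "medium", and "high" are assumed here as the accepted effort values.

```python
from typing import Optional

# Model ids of the o-series entries in the table above.
O_SERIES_IDS = {"o3", "o3-mini", "o4-mini"}


def build_generate_kwargs(model_id: str, effort: Optional[str] = None) -> dict:
    """Hypothetical helper: add reasoning_effort only for o-series models."""
    kwargs = {}
    if effort is not None and model_id in O_SERIES_IDS:
        kwargs["reasoning_effort"] = effort
    return kwargs


generate_kwargs = build_generate_kwargs("o4-mini", effort="low")
print(generate_kwargs)
```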

Google Gemini (GeminiModel)

Enum member Model id Tools Thinking Vision
GEMINI_2_0_FLASH gemini-2.0-flash
GEMINI_2_0_FLASH_LITE gemini-2.0-flash-lite
GEMINI_1_5_PRO gemini-1.5-pro
GEMINI_1_5_FLASH gemini-1.5-flash
GEMINI_2_5_PRO gemini-2.5-pro
GEMINI_2_5_FLASH gemini-2.5-flash

Gemini 2.5 thinking models emit <think> tags on Google's OpenAI-compatible endpoint.
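If you handle raw responses yourself, the tagged reasoning can be separated from the answer with a small helper. This is a sketch that assumes the reasoning arrives inline as `<think>...</think>` blocks in the response text; split_thinking is a hypothetical name, not an AIMU function.

```python
import re

# Non-greedy match so multiple <think> blocks are captured separately.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> tuple[str, str]:
    """Return (thinking, answer) from a response containing <think> blocks."""
    thoughts = [t.strip() for t in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return "\n".join(thoughts), answer


thinking, answer = split_thinking("<think>2+2 is 4</think>The answer is 4.")
print(thinking)
print(answer)
```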

Ollama native (OllamaModel)

Enum member Model id Tools Thinking Vision
QWEN_3_6_35B qwen3.6:35b
QWEN_3_6_27B qwen3.6:27b
QWEN_3_5_9B qwen3.5:9b
QWEN_3_32B qwen3:32b
QWEN_3_8B qwen3:8b
GEMMA_4_E4B gemma4:e4b
GEMMA_4_26B gemma4:26b
GEMMA_4_31B gemma4:31b
GEMMA_3_12B gemma3:12b
NEMOTRON_CASCADE_2_30B nemotron-cascade-2:30b
NEMOTRON_3_NANO_30B nemotron-3-nano:30b
GLM_4_7_FLASH_31B_Q4 glm-4.7-flash:q4_K_M
GPT_OSS_20B gpt-oss:20b
MAGISTRAL_SMALL_24B magistral:24b
MINISTRAL_3_14B ministral-3:14b
PHI_4_MINI_3_8B phi4-mini:3.8b
PHI_4_14B phi4:14b
DEEPSEEK_R1_8B deepseek-r1:8b
SMOLLM2_1_7B smollm2:1.7b
LLAMA_3_2_3B llama3.2:3b
LLAMA_3_1_8B llama3.1:8b

Some Ollama models can technically be asked for tools but produce unreliable tool calls; those are marked tools=False and documented in the enum source.

HuggingFace (HuggingFaceModel)

Enum member Repo id Tools Thinking Vision
QWEN_3_6_27B Qwen/Qwen3.6-27B-FP8
QWEN_3_6_27B_VL Qwen/Qwen3.6-27B-FP8
QWEN_3_5_9B Qwen/Qwen3.5-9B
QWEN_3_5_9B_VL Qwen/Qwen3.5-9B
QWEN_3_8B Qwen/Qwen3-8B
GEMMA_4_E4B google/gemma-4-E4B-it
GEMMA_3_12B google/gemma-3-12b-it
GPT_OSS_20B openai/gpt-oss-20b
MAGISTRAL_SMALL mistralai/Magistral-Small-2509
MISTRAL_NEMO_12B mistralai/Mistral-Nemo-Instruct-2407
MISTRAL_7B mistralai/Mistral-7B-Instruct-v0.3
PHI_4_MINI_3_8B microsoft/Phi-4-mini-instruct
PHI_4_14B microsoft/phi-4
DEEPSEEK_R1_8B deepseek-ai/DeepSeek-R1-Distill-Llama-8B
SMOLLM3_3B HuggingFaceTB/SmolLM3-3B
LLAMA_3_2_3B unsloth/Llama-3.2-3B-Instruct
LLAMA_3_1_8B meta-llama/Meta-Llama-3.1-8B-Instruct

Variants with the _VL suffix load through AutoModelForImageTextToText so that the vision encoder is included.

llama-cpp (LlamaCppModel)

Enum member Hint id Tools Thinking Vision
LLAMA_3_1_8B llama-3.1-8b
LLAMA_3_2_3B llama-3.2-3b
MISTRAL_7B mistral-7b
QWEN_3_4B qwen3-4b
QWEN_3_8B qwen3-8b
DEEPSEEK_R1_7B deepseek-r1-7b
PHI_4_MINI phi-4-mini

llama-cpp model ids are hints only; the actual model is always loaded from model_path=. The client still honours the capability flags.

OpenAI-compatible local servers

LMStudioOpenAIModel, OllamaOpenAIModel, HFOpenAIModel, VLLMOpenAIModel, LlamaServerOpenAIModel, and SGLangOpenAIModel all enumerate the same set of common open models (Llama 3.x, Mistral 7B, Phi-4 Mini, Qwen 3.x, DeepSeek R1, Gemma 3). The model id format differs per server (LM Studio uses loaded model keys, Ollama uses name:tag, vLLM/SGLang/HF Serve use HuggingFace repo paths, llama-server uses GGUF filenames). See the enum source for each.
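The per-server id conventions described above can be summarised as a lookup table. This is only an illustrative sketch: the mapping restates the paragraph, assuming that HFOpenAIModel corresponds to HF Serve and LlamaServerOpenAIModel to llama-server. The helper looks_like_ollama_tag is hypothetical and performs a shape check only.

```python
# Id format per enum class, restating the paragraph above.
ID_FORMATS = {
    "LMStudioOpenAIModel": "loaded model key",
    "OllamaOpenAIModel": "name:tag",
    "VLLMOpenAIModel": "HuggingFace repo path",
    "SGLangOpenAIModel": "HuggingFace repo path",
    "HFOpenAIModel": "HuggingFace repo path",
    "LlamaServerOpenAIModel": "GGUF filename",
}


def looks_like_ollama_tag(model_id: str) -> bool:
    """Hypothetical check: True if the id has the name:tag shape."""
    name, _, tag = model_id.partition(":")
    return bool(name) and bool(tag)


print(ID_FORMATS["OllamaOpenAIModel"])
print(looks_like_ollama_tag("llama3.1:8b"))
```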

Image generation

Image clients use different spec classes from the text clients (HuggingFaceImageSpec and GeminiImageSpec). The capability flags don't apply to them, so the tables below show model-specific defaults instead.

HuggingFace diffusers (HuggingFaceImageModel)

| Enum member | Repo id | Pipeline class | Default steps | Default size |
|---|---|---|---|---|
| SD_1_5 | runwayml/stable-diffusion-v1-5 | StableDiffusionPipeline | 25 | 512×512 |
| SDXL_BASE | stabilityai/stable-diffusion-xl-base-1.0 | StableDiffusionXLPipeline | 30 | 1024×1024 |
| SD_3_5_MEDIUM | stabilityai/stable-diffusion-3.5-medium | StableDiffusion3Pipeline | 28 | 1024×1024 |
| FLUX_DEV | black-forest-labs/FLUX.1-dev | FluxPipeline | 28 | 1024×1024 |
| FLUX_SCHNELL | black-forest-labs/FLUX.1-schnell | FluxPipeline | 4 | 1024×1024 |

Spec defaults are starting points; pass num_inference_steps=, guidance_scale=, width=, height=, or seed= to override them per call. To use any other HuggingFace diffusers model, bypass the enum with a "hf:<repo_id>" string, which falls back to the auto-detecting DiffusionPipeline loader.
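The sketch below shows one way per-call overrides and the "hf:" prefix could be handled. It is not AIMU's implementation: ImageDefaults, resolve_call_params, and parse_model_ref are hypothetical names. The SD_1_5 values are copied from the table above, and "org/some-model" is a placeholder repo id.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImageDefaults:
    """Hypothetical stand-in for a spec row in the table above."""
    repo_id: str
    pipeline_class: str
    num_inference_steps: int
    width: int
    height: int


SD_1_5 = ImageDefaults(
    "runwayml/stable-diffusion-v1-5", "StableDiffusionPipeline", 25, 512, 512
)


def resolve_call_params(spec: ImageDefaults, **overrides) -> dict:
    """Start from the spec defaults, then apply any per-call overrides."""
    params = {
        "num_inference_steps": spec.num_inference_steps,
        "width": spec.width,
        "height": spec.height,
    }
    params.update({k: v for k, v in overrides.items() if v is not None})
    return params


def parse_model_ref(ref: str) -> Optional[str]:
    """Return the repo id of an "hf:<repo_id>" string, or None otherwise."""
    prefix = "hf:"
    if ref.startswith(prefix) and len(ref) > len(prefix):
        return ref[len(prefix):]
    return None


print(resolve_call_params(SD_1_5, num_inference_steps=40, seed=7))
print(parse_model_ref("hf:org/some-model"))
```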

Google Gemini (GeminiImageModel)

| Enum member | Model id | Notes |
|---|---|---|
| NANO_BANANA | gemini-2.5-flash-image | GA channel. Aspect ratio via aspect_ratio= (e.g. "1:1", "16:9"). |
| NANO_BANANA_PREVIEW | gemini-2.5-flash-image-preview | Preview channel; kept for users who pinned it. |

Short-name aliases like "gemini:nano-banana" resolve to the full model id at construction. Nano Banana's generate_content API returns one image per call, so requesting num_images=N with N > 1 issues N separate requests.
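The two behaviours above can be sketched as follows. The names ALIASES, resolve_model_id, and generate_images are hypothetical. The alias target is assumed to be the GA model id from the table, and request_one stands in for whatever function performs a single API call.

```python
from typing import Callable

# Assumed mapping: the alias from the text to the GA model id in the table.
ALIASES = {"gemini:nano-banana": "gemini-2.5-flash-image"}


def resolve_model_id(name: str) -> str:
    """Map a short alias to its full model id; pass other names through."""
    return ALIASES.get(name, name)


def generate_images(
    request_one: Callable[[str], bytes], prompt: str, num_images: int = 1
) -> list[bytes]:
    """Issue one request per image, since each call returns a single image."""
    return [request_one(prompt) for _ in range(num_images)]


calls = []


def fake_request(prompt: str) -> bytes:
    # Test double that records each call instead of hitting a real API.
    calls.append(prompt)
    return b"image-bytes"


images = generate_images(fake_request, "a red bicycle", num_images=3)
print(resolve_model_id("gemini:nano-banana"), len(images), len(calls))
```

Because each image costs a full request, latency and quota usage scale linearly with num_images.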

See also