Add or update a provider¶
A provider is a backend AIMU talks to (Ollama, Anthropic, an OpenAI-compatible server, …). Adding one means writing a client class and wiring it into the factory. This is different from adding a new model, which is just a new member on an existing provider's Model enum.
Provider clients live under aimu/models/providers/, one module per provider. The public surface is unchanged by where the file sits: users still reach your client through aimu.client("yourprovider:model-id"), ModelClient(YourModel.X), or from aimu.models import YourClient.
1. Decide the file layout¶
- Flat module:
providers/<name>.py. The default, used byanthropic.py,ollama.py,llamacpp.py. - Subpackage:
providers/<name>/<modality>.py, only when the provider ships more than one standalone modality client (the rule that giveshf/,openai/,gemini/theirtext.py/image.py/ …). A single-client provider stays flat.
2. Write the client¶
There are two paths depending on the backend's API.
Path A: OpenAI-compatible endpoint (easiest)¶
If the backend speaks the OpenAI REST API, you don't reimplement chat/streaming/tools. Subclass OpenAICompatClient and supply a base_url + a Model enum. For a local inference server, add it alongside the others in providers/openai_compat.py:
# aimu/models/providers/openai_compat.py
class MyServerOpenAIModel(Model):
QWEN_3_8B = ModelSpec("qwen3-8b", tools=True, thinking=True)
class MyServerOpenAIClient(OpenAICompatClient):
MODELS = MyServerOpenAIModel
def __init__(self, model: MyServerOpenAIModel, base_url: str = "http://localhost:9000/v1", **kwargs):
super().__init__(model, base_url=base_url, **kwargs)
A cloud provider with its own identity and ≥2 modalities gets a subpackage instead. See how OpenAIClient lives in providers/openai/text.py and subclasses OpenAICompatClient via from ..openai_compat import OpenAICompatClient.
Path B: native SDK¶
For a backend with its own SDK/wire format, subclass BaseModelClient and implement the three abstract methods plus the capability classproperties. self.messages always stays in OpenAI format; adapt to the provider's format at request time (never mutate self.messages). Reuse the shared helpers in aimu/models/_internal/, such as streaming and image_input:
# aimu/models/providers/myprovider.py
from ..base import BaseModelClient, Model, ModelSpec, StreamChunk, StreamingContentType, classproperty
class MyProviderModel(Model):
BIG = ModelSpec("big-v1", tools=True, thinking=True, vision=True)
class MyProviderClient(BaseModelClient):
MODELS = MyProviderModel
@classproperty
def TOOL_MODELS(cls): # noqa: N805
return [m for m in cls.MODELS if m.supports_tools]
@classproperty
def THINKING_MODELS(cls): # noqa: N805
return [m for m in cls.MODELS if m.supports_thinking]
@classproperty
def VISION_MODELS(cls): # noqa: N805
return [m for m in cls.MODELS if m.supports_vision]
def __init__(self, model, model_kwargs=None, system_message=None):
super().__init__(model, model_kwargs, system_message)
# ... construct the backend SDK client ...
def _update_generate_kwargs(self, generate_kwargs=None): ...
def _generate(self, prompt, generate_kwargs=None, stream=False, images=None): ...
def _chat(self, user_message, generate_kwargs=None, use_tools=True, stream=False, images=None): ...
chat() / generate() (and the include= stream filter) are concrete on the base; you only implement _chat / _generate. Use self._chat_setup(...) to build the request and self._handle_tool_calls(...) / self._handle_tool_calls_streamed(...) to dispatch tools uniformly. See providers/anthropic.py for a full native example (including the OpenAI↔Anthropic format adapters).
Provider-local helpers
A helper used by one provider family lives with it (e.g. providers/hf/_device.py, providers/_thinking.py), not in _internal/. Put anything only your provider needs next to your provider.
3. Wire the factory¶
In aimu/models/model_client.py, three small edits make ModelClient and "provider:id" strings work:
# 1. Guarded import (so a missing optional dep degrades gracefully)
try:
from .providers.myprovider import MyProviderClient, MyProviderModel
_HAS_MYPROVIDER = True
except Exception:
_HAS_MYPROVIDER = False
MyProviderClient = MyProviderModel = None # type: ignore[assignment,misc]
# 2. Registry entry in _provider_registry() -> enables "myprovider:big-v1" strings
if _HAS_MYPROVIDER:
registry["myprovider"] = (MyProviderModel, MyProviderClient)
# 3. Dispatch in ModelClient.__init__ -> enables ModelClient(MyProviderModel.BIG)
elif _HAS_MYPROVIDER and isinstance(model, MyProviderModel):
self._client = MyProviderClient(model, **kwargs)
4. Export it (with graceful degradation)¶
In aimu/models/__init__.py, re-export the client + enum under a try/except that sets a HAS_MYPROVIDER flag and None fallbacks, then add the names to __all__ when the flag is set. This keeps import aimu working on a minimal install. (Path A local-server subclasses are exported the same way, alongside the other openai_compat names.)
5. Mirror the async surface¶
Add aimu/aio/providers/myprovider.py:
- Native providers subclass
AsyncBaseModelClient(async_chat/_generate;asyncio.TaskGroupfor concurrent tool calls). - In-process providers (those that load weights, like HF/LlamaCpp) instead wrap a sync client so they don't load weights twice (Decision 7, see async design).
Then wire aimu/aio/_model_client.py (same three edits as step 3) and export from aimu/aio/__init__.py.
6. Tests¶
- Mock coverage:
tests/test_models_api.py(no backend needed). This is the canary that catches wiring/import breaks. - Live coverage flows through
tests/test_models.py; add your provider to the dispatch intests/conftest.py/tests/helpers.pyso--client=myproviderresolves. Live tests are opt-in; barepytestskips them.
Verify¶
from aimu.models import ModelClient, resolve_model_string
assert resolve_model_string("myprovider:big-v1").supports_tools
client = ModelClient("myprovider:big-v1")
print(client.chat("hello"))
pytest tests/test_models_api.py # mock wiring (always)
pytest tests/test_models.py --client=myprovider # live (needs the backend)
See also¶
- Add a new model: register a model on an existing provider
- Architecture: the
BaseModelClientcontract and factory pattern - Use async (
aio): the async surface your mirror plugs into