Stream phases¶

StreamingContentType is a string enum with five values, attached to every StreamChunk via chunk.phase.

Phase	Value	Emitted when	`chunk.content` type
`THINKING`	`"thinking"`	Reasoning model emits internal-monologue tokens	`str` (token)
`TOOL_CALLING`	`"tool_calling"`	A tool call has just been dispatched and its result is available	`dict {"name": str, "arguments": dict, "response": str}`
`GENERATING`	`"generating"`	The final response stream	`str` (token)
`IMAGE_GENERATING`	`"image_generating"`	Per-step image-generation progress (HF callback / Gemini start-done)	`dict {"step": int, "total_steps": int, "image": PIL.Image \\| None, "final": bool, "result": str \\| bytes \\| None}`
`AUDIO_GENERATING`	`"audio_generating"`	Per-step audio-generation progress (one final chunk for MusicGen; N progress + 1 final for diffusers)	`dict {"step": int, "total_steps": int, "final": bool, "result": str \\| bytes \\| tuple \\| None, "duration_s": float}`
`SPEECH_GENERATING`	`"speech_generating"`	Per-chunk speech-generation progress (one final chunk for HuggingFace single-pass; N chunks for OpenAI HTTP stream)	`dict {"chunk_index": int, "total_chunks": int \\| None, "final": bool, "result": str \\| bytes \\| ... \\| None}`
`DONE`	`"done"`	Terminal marker (rarely yielded; reserved)	`str` (usually empty)

The enum inherits from str, so members compare equal to their string values, so chunk.phase == "thinking" is fine.

Helpers on `StreamChunk`¶

chunk.is_text()             # True for THINKING and GENERATING
chunk.is_tool_call()        # True for TOOL_CALLING
chunk.is_image_progress()   # True for IMAGE_GENERATING
chunk.is_audio_progress()   # True for AUDIO_GENERATING
chunk.is_speech_progress()  # True for SPEECH_GENERATING

These avoid sprinkling phase comparisons through your code.

Typical sequences¶

Model behaviour	Phase sequence
Plain text generation	`GENERATING*`
Thinking model, no tools	`THINKING`, then `GENERATING`
Tool-using model	`THINKING` (if thinking), `TOOL_CALLING` per call, `GENERATING`
Tool-using model, multi-round	`TOOL_CALLING* → THINKING? → GENERATING` repeated; `chunk.iteration` increments per round (agent context)

* = "one or more chunks".

Async surface¶

aimu.aio yields the same StreamChunk type; phase, content, agent, iteration are unchanged. The consumption protocol differs: Iterator[StreamChunk] on the sync surface vs AsyncIterator[StreamChunk] on the async surface. Phase semantics and filtering are identical on both. See how-to: use async.

Filtering¶

Pass include=[...] to chat() or generate() to drop phases:

client.chat("hi", stream=True, include=["generating"])
client.chat("hi", stream=True, include=["thinking", "generating"])
client.chat("hi", stream=True, include=[StreamingContentType.GENERATING])  # enum form

Accepts strings or enum members. Omitting include= yields all phases.

Stream phases¶

Helpers on StreamChunk¶

Typical sequences¶

Async surface¶

Filtering¶

See also¶

Helpers on `StreamChunk`¶