Scene illustration

Optionally, the engine can generate one illustration per turn: it asks the LLM to distil the narrative, the location and the characters present into a visual prompt, then sends that prompt to an image backend. Images are saved under the data root (assets/<save_id>/turn_<n>.png) and follow the save through rewind, fork, export and import.
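The path layout above can be sketched as a small helper. This is only an illustration: the function name and the data_root parameter are hypothetical, and whether the engine zero-pads the turn number is not stated here, so the sketch does not pad.

```python
from pathlib import Path


def turn_image_path(data_root: Path, save_id: str, turn: int) -> Path:
    """Build <data_root>/assets/<save_id>/turn_<n>.png for one turn.

    Hypothetical helper: it mirrors the documented layout, not the engine's code.
    """
    return Path(data_root) / "assets" / save_id / f"turn_{turn}.png"


print(turn_image_path(Path("data"), "my_save", 12).as_posix())
```

Keeping every image of a save under a single directory is presumably what lets the images travel with the save on fork, export and import.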

Image generation never blocks the game: any failure logs a warning and the turn completes without an image.
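A minimal sketch of this "never block the turn" behaviour, assuming the image step is wrapped in a single callable. The names try_illustrate and illustration_sketch are hypothetical and do not come from the engine.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("illustration_sketch")


def try_illustrate(generate: Callable[[], Optional[str]]) -> Optional[str]:
    """Run an image step; on any error, log a warning and return None."""
    try:
        return generate()
    except Exception as exc:  # deliberately broad: the image is optional
        logger.warning("image generation failed, continuing without image: %s", exc)
        return None
```

The caller treats None as "no image this turn" and carries on, so a backend outage degrades the experience without interrupting play.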

Configuration

In ~/.config/AxiomAI/settings.json:

{
  "image_generation_enabled": true,
  "image_backend": "gemini",

  "image_api_url": "http://127.0.0.1:7860",
  "image_width": 512,
  "image_height": 512,
  "image_steps": 20,
  "image_cfg_scale": 7.0,
  "image_timeout": 180,
  "image_comfyui_workflow": "",
  "image_gemini_model": "gemini-2.5-flash-image"
}
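Before launching, a quick sanity check of these keys can catch typos. The sketch below is a hypothetical stand-alone validator, not part of the engine: it only knows the three backend names documented in this section and assumes the numeric settings must be positive.

```python
import json
from pathlib import Path

# Backend names documented in this section.
KNOWN_BACKENDS = {"gemini", "stable_diffusion", "comfyui"}
LOCAL_BACKENDS = {"stable_diffusion", "comfyui"}
NUMERIC_KEYS = ("image_width", "image_height", "image_steps",
                "image_cfg_scale", "image_timeout")


def check_image_settings(settings: dict) -> list[str]:
    """Return a list of human-readable problems (empty when none are found)."""
    problems = []
    backend = settings.get("image_backend")
    if backend not in KNOWN_BACKENDS:
        problems.append(f"unknown image_backend: {backend!r}")
    for key in NUMERIC_KEYS:
        value = settings.get(key)
        # Missing keys are skipped; present keys must be positive numbers.
        if value is not None and (not isinstance(value, (int, float)) or value <= 0):
            problems.append(f"{key} must be a positive number, got {value!r}")
    if backend in LOCAL_BACKENDS and not settings.get("image_api_url"):
        problems.append("image_api_url is required for local backends")
    return problems


def load_settings(path: Path = Path.home() / ".config" / "AxiomAI" / "settings.json") -> dict:
    """Read the settings file from the location given above."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

Typical use would be `print(check_image_settings(load_settings()))`, which prints an empty list when nothing looks wrong.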

Backends

gemini — Google Gemini (cloud)

Uses the same API key as the text backend; no local install needed. The model is set by image_gemini_model, and the aspect ratio is derived from image_width and image_height. Image models are typically not included in the Gemini free tier.
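One plausible way to derive an aspect ratio from the two pixel settings is sketched below. How the engine actually maps them is not documented here, and the candidate list is purely illustrative, so treat both functions as assumptions.

```python
from math import gcd

# Illustrative candidates only; not taken from the engine.
CANDIDATE_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "9:16", "16:9"]


def reduced_ratio(width: int, height: int) -> str:
    """Reduce width:height to lowest terms, e.g. 1024x576 -> '16:9'."""
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


def nearest_ratio(width: int, height: int, candidates=CANDIDATE_RATIOS) -> str:
    """Pick the candidate whose numeric ratio is closest to width/height."""
    target = width / height

    def distance(label: str) -> float:
        w, h = label.split(":")
        return abs(int(w) / int(h) - target)

    return min(candidates, key=distance)
```

Snapping to a fixed list matters when a backend only accepts a closed set of ratios; an odd size such as 520x512 would then fall back to the nearest supported shape.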

stable_diffusion — Stable Diffusion WebUI (local)

Works with any AUTOMATIC1111-compatible server (A1111, reForge, Forge…) reachable at image_api_url. The server must be launched with the --api flag; without it the engine gets a 404 and tells you so. image_width, image_height, image_steps and image_cfg_scale map straight to the generation request, and image_timeout caps the wait per image (default 180 s; raise it on slow machines).
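For reference, the request described above can be sketched against the WebUI's /sdapi/v1/txt2img endpoint. This is a stand-alone illustration using only the standard library, not the engine's own client; the function names are hypothetical.

```python
import base64
import json
import urllib.error
import urllib.request


def build_txt2img_payload(prompt: str, settings: dict) -> dict:
    """Map the image_* settings onto the fields of a txt2img request."""
    return {
        "prompt": prompt,
        "width": settings["image_width"],
        "height": settings["image_height"],
        "steps": settings["image_steps"],
        "cfg_scale": settings["image_cfg_scale"],
    }


def txt2img(prompt: str, settings: dict) -> bytes:
    """POST the payload and return the first image as raw PNG bytes."""
    url = settings["image_api_url"].rstrip("/") + "/sdapi/v1/txt2img"
    request = urllib.request.Request(
        url,
        data=json.dumps(build_txt2img_payload(prompt, settings)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        timeout = settings.get("image_timeout", 180)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = json.load(response)
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            # The API routes only exist when the server runs with --api.
            raise RuntimeError("txt2img endpoint not found; was the server started with --api?") from exc
        raise
    # The response carries base64-encoded images in the "images" list.
    return base64.b64decode(body["images"][0])
```

Calling `txt2img("a foggy harbour at dawn", settings)` needs a running server; build_txt2img_payload can be inspected on its own without one.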

comfyui — ComfyUI (local)

Point image_api_url at the ComfyUI server. By default the engine submits a minimal text-to-image workflow using the first installed checkpoint; for full control, set image_comfyui_workflow to a workflow JSON file path (or the JSON itself) exported from ComfyUI. Polling is bounded by image_timeout.
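Two pieces of this behaviour lend themselves to a short sketch: accepting either a file path or inline JSON for image_comfyui_workflow, and polling with a hard deadline. Both helpers are hypothetical and make assumptions (an inline value starts with "{", an empty value means "use the default workflow"); the engine may decide differently.

```python
import json
import time
from pathlib import Path
from typing import Callable, Optional


def resolve_workflow(setting: str) -> Optional[dict]:
    """Return the parsed workflow, or None when the setting is empty."""
    text = setting.strip()
    if not text:
        return None  # caller falls back to its default workflow
    if text.startswith("{"):
        return json.loads(text)  # the JSON itself
    return json.loads(Path(text).read_text(encoding="utf-8"))  # a file path


def poll_until(check: Callable[[], Optional[object]], timeout: float,
               interval: float = 1.0) -> Optional[object]:
    """Call check() until it returns a non-None result or the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None  # timed out: treat as "no image"
        time.sleep(interval)
```

In real use, check would query the ComfyUI server (for example its history endpoint for the submitted prompt id) and return the finished output once it appears, with timeout set to image_timeout.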

From Python

axiom.image_generator.ImageGenerator exposes the two steps separately: generate_prompt(...) (LLM → visual prompt) and generate_image(prompt, assets_dir, filename) (backend → PNG file, or None on failure). axiom.session.Session calls both automatically at the end of each turn when image_generation_enabled is true.
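The two-step flow can be sketched with a duck-typed generator. The arguments of generate_prompt are not spelled out here, so the sketch passes them through untouched; illustrate_turn, FakeGenerator and the "location" keyword are all hypothetical stand-ins, and constructing a real ImageGenerator is left out.

```python
from typing import Any, Optional


def illustrate_turn(generator: Any, prompt_inputs: dict,
                    assets_dir: str, turn: int) -> Optional[Any]:
    """Run both steps: build the visual prompt, then render it to a file."""
    # Step 1: LLM -> visual prompt (keyword names are up to the real API).
    prompt = generator.generate_prompt(**prompt_inputs)
    # Step 2: backend -> PNG file, or None on failure.
    return generator.generate_image(prompt, assets_dir, f"turn_{turn}.png")


class FakeGenerator:
    """Stand-in used only for this example; not the real ImageGenerator."""

    def generate_prompt(self, **inputs: Any) -> str:
        return "a view of " + inputs["location"]

    def generate_image(self, prompt: str, assets_dir: str, filename: str) -> Optional[str]:
        return f"{assets_dir}/{filename}" if prompt else None


print(illustrate_turn(FakeGenerator(), {"location": "the harbour"}, "assets/my_save", 2))
```

Splitting the steps this way makes it possible to inspect or edit the prompt before spending time on the image backend.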