UNIMATRIx - A Sandbox for Emergent Behavior in LLM-Driven Agent Societies
May 4, 2026 • 682 words
Multi-agent LLM systems have spent the last two years drifting from research curiosity to load-bearing infrastructure. Most of these systems are teleological: agents are pointed at a goal and the framework grinds toward it. UNIMATRIx takes the opposite stance. It is a Python project that wires up a small population of LLM-backed agents, gives each one a personality, a role, and a social class, and then steps back to see what happens. No win condition. No reward signal. Just conversations, votes, and the slow self-organization of a synthetic society.
[UNIMATRIx on GitHub](https://github.com/gslf/UNIMATRIx)
What the System Actually Does
The simulation populates a world with N agents (the shipped configurations use 30 and 50). Each agent carries a personality, a role drawn from a small ontology, and a social class; the social class is mutable during the run.
The orchestrator runs a tick loop. On each tick, agents engage in one-to-one conversations, group conversations, or broadcasts. They form impressions of one another. Periodically, voluntary or mandatory votes occur on proposals that affect class membership. Coalitions form, opinions update, mobility happens, polarization happens.
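The tick loop described above can be sketched roughly like this. Everything here is illustrative: the agent fields, the impression-update rule, and the majority-vote threshold are guesses at the shape of the mechanism, not the repository's actual code.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Agent:
    # Hypothetical minimal agent state; the real runtime carries more.
    name: str
    social_class: str
    impressions: dict = field(default_factory=dict)  # other agent -> score

def tick(agents, rng, step, vote_every=10):
    """One orchestrator tick: a one-to-one conversation updates mutual
    impressions, and every vote_every ticks a vote can change class."""
    a, b = rng.sample(agents, 2)
    # Placeholder impression drift after each exchange.
    a.impressions[b.name] = a.impressions.get(b.name, 0.0) + rng.uniform(-1, 1)
    b.impressions[a.name] = b.impressions.get(a.name, 0.0) + rng.uniform(-1, 1)
    if step % vote_every == 0:
        target = rng.choice(agents)
        ayes = sum(1 for x in agents if x.impressions.get(target.name, 0.0) > 0)
        if ayes > len(agents) / 2:       # simple-majority proposal passes
            target.social_class = "upper"  # class membership is mutable

rng = random.Random(42)
agents = [Agent(f"a{i}", "middle") for i in range(6)]
for step in range(30):
    tick(agents, rng, step)
```

Even this toy version exhibits the article's point: there is no objective function anywhere in the loop, only local interactions whose aggregate produces mobility.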
Architectural Overview
The repository's src/unimatrix/ directory is divided into nine subsystems, each with a single responsibility.
src/unimatrix/
config/ pydantic schema + JSON loader
persistence/ SQLite stores + run registry
memory/ short / medium / long-term + per-person impressions
inference/ HTTP client (vLLM / llama.cpp / stub)
agents/ agent runtime, system prompts
conversations/ 1-to-1, group, broadcast
voting/ proposals, mandatory votes, tally
orchestrator/ main loop, social-need decay, anti-silence trigger
graphs/ matplotlib renderers
web/ FastAPI control panel + HTML UI
A few high-level observations before drilling down. First, there is no agent framework — no LangGraph, no CrewAI, no AutoGen. The author wrote the orchestration loop by hand, which keeps the moving parts inspectable. Second, the persistence and inference layers are deliberately mundane (SQLite, HTTP), which makes the system reproducible and trivially deployable. Third, the stub inference backend is a first-class citizen, not an afterthought — and that single decision underpins the entire test strategy.
The Inference Layer: Three Backends Behind One Seam
inference/ exposes a single client interface, with every backend hidden behind that one seam. Two of them deserve comment:
- stub — produces deterministic fake replies. No GPU, no network, no LLM. This is what the test suite runs against, and what you use the first time you clone the repo to verify the wiring.
- vllm — any OpenAI-compatible HTTP endpoint. The naming is historical; in practice this drives vLLM, LM Studio, llama.cpp's OpenAI-compatible mode, or a remote OpenAI-API-compatible cloud, all with the same client.
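A minimal sketch of what such a seam can look like. The `complete()` method, class names, and stub reply format are assumptions for illustration; the repository's real interface will differ.

```python
import json
import urllib.request
from typing import Protocol

class InferenceClient(Protocol):
    # Hypothetical seam: one method, many backends.
    def complete(self, system: str, user: str) -> str: ...

class StubClient:
    """Deterministic fake replies: no GPU, no network, no LLM."""
    def complete(self, system: str, user: str) -> str:
        return f"[stub reply to: {user[:24]}]"

class OpenAICompatClient:
    """Drives any OpenAI-compatible endpoint (vLLM, LM Studio, llama.cpp)."""
    def __init__(self, endpoint: str, model: str):
        self.endpoint, self.model = endpoint, model

    def complete(self, system: str, user: str) -> str:
        payload = json.dumps({
            "model": self.model,
            "messages": [{"role": "system", "content": system},
                         {"role": "user", "content": user}],
        }).encode()
        req = urllib.request.Request(
            f"{self.endpoint}/v1/chat/completions", data=payload,
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["choices"][0]["message"]["content"]

def make_client(backend: str, endpoint: str = "", model: str = "") -> InferenceClient:
    return StubClient() if backend == "stub" else OpenAICompatClient(endpoint, model)
```

The payoff of the seam is that the orchestrator only ever sees `InferenceClient`; swapping LM Studio for a cloud endpoint is a constructor argument, not a code change.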
The recommended local development path is LM Studio. LM Studio handles GGUF model downloads from HuggingFace, GPU offload, quantization, and chat-template wiring, then exposes the model on http://localhost:1234 with an OpenAI-compatible API. UNIMATRIx points at that endpoint and stays out of the model-serving business entirely. This is the right separation of concerns: the orchestrator does not need to know whether it is talking to Qwen2.5-3B locally or GPT-4o in the cloud.
The CLI flags --backend, --endpoint, and --model apply as overrides on top of whichever JSON config is selected at start time, which means you can keep one config file and swap inference targets per invocation.
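The override behavior can be approximated like so. Only the three flag names come from the article; the merge logic, function name, and config shape are assumptions.

```python
import argparse

def resolve_run_config(config: dict, argv: list) -> dict:
    """Apply CLI flags on top of a loaded JSON config.
    Flags left unset fall through to the config's values."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--backend")
    parser.add_argument("--endpoint")
    parser.add_argument("--model")
    args = parser.parse_args(argv)
    merged = dict(config)
    for key in ("backend", "endpoint", "model"):
        value = getattr(args, key)
        if value is not None:  # only explicitly passed flags win
            merged[key] = value
    return merged

base = {"backend": "vllm",
        "endpoint": "http://localhost:1234",
        "model": "phi-4-reasoning-plus"}
```

With this shape, `resolve_run_config(base, ["--backend", "stub"])` swaps the inference target for one invocation while the JSON file stays untouched.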
Configuration and Deployment
Run definitions live in config/*.json and are validated by Pydantic schemas. Drop any new *.json into the directory and it appears in the UI dropdown on the next refresh.
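That load-and-validate flow might look like the dependency-free sketch below. The real schemas use Pydantic; the field names here are hypothetical, and a dataclass stands in so the example runs without third-party packages.

```python
import json
from dataclasses import dataclass
from pathlib import Path

@dataclass
class RunConfig:
    # Hypothetical fields; the real Pydantic schema has many more.
    name: str
    agents: int
    backend: str = "stub"
    endpoint: str = "http://localhost:1234"

def load_configs(config_dir: Path) -> dict:
    """Every *.json dropped into the directory becomes a selectable run."""
    configs = {}
    for path in sorted(config_dir.glob("*.json")):
        data = json.loads(path.read_text())
        configs[path.stem] = RunConfig(**data)  # raises on unknown/missing keys
    return configs
```

The validation step is what makes "drop a JSON in the directory" safe: a malformed file fails loudly at load time instead of corrupting a run halfway through.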
Bootstrap is straightforward:
py -3 -m venv .venv
.venv/Scripts/activate # Windows; on macOS/Linux: source .venv/bin/activate
pip install -e .
python -m unimatrix.main --backend stub
python -m unimatrix.main \
--backend vllm \
--endpoint http://localhost:1234 \
--model phi-4-reasoning-plus
The shipped standard.json already points at LM Studio's default endpoint, so once a model is loaded you can drop the flags entirely.
Optional extras follow Python packaging conventions: pip install -e ".[embed]" for real embeddings, pip install -e ".[dev]" for the pytest suite. The test suite runs against the stub backend, which is why that backend exists in the first place — every test is fast, deterministic, and free.
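A toy illustration of why a deterministic stub makes every test fast and free: the same input always yields the same reply, so assertions never touch a network or a GPU. This is an analogy for the strategy, not the project's actual stub.

```python
import hashlib

def stub_reply(prompt: str) -> str:
    # Hypothetical: derive a stable pseudo-reply from a hash of the prompt.
    digest = hashlib.sha256(prompt.encode()).hexdigest()[:8]
    return f"stub-{digest}"

def test_stub_is_deterministic():
    # Identical prompts reproduce exactly; distinct prompts diverge.
    assert stub_reply("hello") == stub_reply("hello")
    assert stub_reply("hello") != stub_reply("world")

test_stub_is_deterministic()
```

Determinism is the whole trick: any conversation or voting test can pin its expected transcript once and replay it forever.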
[UNIMATRIx on GitHub](https://github.com/gslf/UNIMATRIx)