At the end of July 2025, Consilium debuted at the Gradio Agents & MCP hackathon as a platform where four large language models sit around a virtual poker table and discuss a query in real time. Instead of each model delivering its own answer like a solo act, they hold a mini‑discussion and then select a winning answer by vote or weighted rating.

In practice, a request goes to an MCP server, which fans it out to several models (OpenAI, Claude and others). Each model generates a candidate answer, "discusses" it in a visual Gradio component, and the system produces a final output according to a pre‑defined rule. The result is essentially a virtual expert panel with no salaries or coffee breaks.
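The fan-out-then-vote flow described above can be sketched in a few lines. This is a minimal illustration, not Consilium's actual code: the "experts" are stub callables standing in for remote LLM calls over MCP, each returning an answer together with a hypothetical confidence weight, and the pre-defined rule here is a weighted vote.

```python
from collections import Counter
from typing import Callable

# Hypothetical sketch of the panel pattern: each expert is a callable
# that returns (answer, confidence). In the real system these would be
# remote LLM calls routed through an MCP server; here they are stubs
# so the control flow is runnable.

def run_panel(query: str,
              experts: dict[str, Callable[[str], tuple[str, float]]]) -> str:
    votes: Counter = Counter()
    for name, expert in experts.items():
        answer, confidence = expert(query)  # each model proposes an answer
        votes[answer] += confidence         # weighted vote per answer
    # the answer with the highest cumulative confidence wins
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    experts = {
        "model_a": lambda q: ("42", 0.9),
        "model_b": lambda q: ("42", 0.7),
        "model_c": lambda q: ("41", 0.8),
    }
    # "42" accumulates 1.6 vs 0.8 for "41"
    print(run_panel("What is 6 x 7?", experts))
```

Swapping the voting rule for, say, a judge model that reads all candidate answers would not change this structure; only the aggregation step differs.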

Tests showed accuracy reaching 85.5% on medical diagnostic tasks, while practicing physicians achieved only about 20% on the same benchmarks. For businesses this translates into fewer erroneous recommendations and faster decision making. Integration through MCP and Gradio does not require rewriting code: you simply connect the server to an existing pipeline, and the visual interface helps debug the process on the fly.

Why it matters now: the cost of extra model calls grows linearly, but after three to four months of operation the savings from fewer errors and faster workflows offset those expenses. Orchestrating LLMs becomes a competitive edge without massive infrastructure investment.

#LLMOrchestration #AI #BusinessProcessAutomation #Consilium