On the epistemics of model uncertainty in consequential decisions
The confidence scores displayed by AI (artificial intelligence: systems that perform tasks requiring human cognition, typically via statistical models) rarely correspond to epistemic reliability. This mismatch creates measurable risk in domains where decisions compound.
Large language models produce outputs with probabilistic fluency that masks fundamental uncertainty. When a model generates text with apparent conviction, it offers no meaningful signal about whether the underlying claim corresponds to external reality. This distinction—between linguistic confidence and epistemic confidence—remains poorly understood by most institutional adopters.
The practical consequence: professionals who integrate AI outputs into high-stakes workflows must develop calibration frameworks that their tools cannot provide. A model that states "the regulatory deadline is 31 March" with identical fluency to "the regulatory deadline is likely in Q1" provides no mechanism for distinguishing fact from inference. Both statements emerge from the same generative process, weighted by training distributions rather than verified knowledge.
Swiss financial institutions navigating this terrain have begun implementing structured verification protocols: AI-generated intelligence marked as provisional until cross-referenced against authoritative sources, confidence indicators derived from source plurality rather than model self-assessment, and clear delineation between synthesis (where models excel) and factual assertion (where they remain unreliable).
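The core of such a protocol can be sketched in a few lines. The following is a minimal illustration, not any institution's actual system: the `Claim` class, its field names, and the saturating confidence formula are all hypothetical, chosen only to show confidence derived from source plurality rather than model self-assessment, with verification as a separate, explicit step.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    """An AI-generated assertion, treated as provisional until cross-referenced."""
    text: str
    # Independent corroborating sources found during cross-referencing.
    sources: list[str] = field(default_factory=list)
    # Set only after a human checks the claim against an authoritative source;
    # the model's own fluency never flips this flag.
    verified: bool = False

    def confidence(self) -> float:
        # Confidence comes from source plurality, not model self-assessment:
        # 0 sources -> 0.0, then saturating toward 1.0 as corroboration
        # accumulates (1 source -> 0.5, 2 -> 0.667, 3 -> 0.75, ...).
        n = len(self.sources)
        return n / (n + 1)
```

In this sketch a claim starts at zero confidence regardless of how assertively the model phrased it, and the `verified` flag is deliberately independent of the score: high corroboration narrows attention, but only the explicit cross-reference step removes the provisional status.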
Implications
Institutional AI adoption without epistemic scaffolding creates liability exposure. The question is not whether AI-generated errors will occur in regulated contexts, but whether organizations have demonstrable processes for catching them before they compound into decisions.