Mathematically proven safety
FR-OS is a formally verified evaluation engine. It checks any structured input against mathematically proven rules and returns a definitive verdict. The input could be an AI response, a document, a contract clause, a medical record, or a transaction. The same guarantees hold at every scale. Shellfinity's first application: LLM governance. Every AI response verified against your policies before it reaches your users.
See it in action
A patient describes crushing chest pain, sweating, and nausea. FR-OS identifies the clinical findings, verifies 30,981 diagnoses against formally proven invariants, and returns a ranked differential with exclusion certificates. No physician input required.
Benchmark results
FR-OS resolves which meaning a word carries in context. Tested on the standard Raganato ALL benchmark (7,253 instances across 5 datasets), the engine exceeds published state of the art with zero learned parameters, zero training data, and fully deterministic evaluation.
Full WSD Report
| System | Avg F1 | Parameters | Training |
|---|---|---|---|
| FR-OS | 88.4% | 0 | None |
| DeBERTa (fine-tuned) | ~82% | 350M | SemCor |
| BEM (BERT) | ~80% | 340M | SemCor |
| GPT-4 (few-shot) | ~80% | ~1.8T | Pretraining |
| Most Frequent Sense | ~65% | 0 | Frequency counts |

| Dataset | Instances | F1 |
|---|---|---|
| Senseval-2 | 2,282 | 88.9% |
| Senseval-3 | 1,850 | 85.5% |
| SemEval-2007 | 455 | 90.3% |
| SemEval-2013 | 1,644 | 89.2% |
| SemEval-2015 | 1,022 | 90.7% |
| Average | 7,253 | 88.4% |
Across Senseval-2, Senseval-3, SemEval-2007, SemEval-2013, SemEval-2015
When the engine commits to a verdict rather than abstaining, that verdict is correct more than 95% of the time
No neural network. No training. Deterministic evaluation with self-correcting data.
The problem
Most AI guardrails use another AI model to judge the output. That second model has its own blind spots, its own failure modes, and returns vague confidence scores instead of clear answers. "The filter probably caught it" isn't good enough.
FR-OS checks AI output against your rules using mathematically proven logic. You get a clear yes/no verdict, plus a detailed report showing exactly what violated your policy and how to fix it.
How it works
Define policies in plain English: "block harmful content", "limit sensitive topics to 3", "require safety disclaimers". FR-OS compiles them into formally verified rules that evaluate the same way every time.
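FR-OS's real compiled-rule format is not public; as a purely illustrative sketch, the three example policies above might compile down to structured, machine-checkable rules along these lines (all names and fields here are hypothetical):

```python
# Hypothetical illustration only: FR-OS's actual rule format is not shown here.
# Each plain-English policy becomes a structured rule a checker can enforce.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CompiledRule:
    policy: str                         # original plain-English policy text
    category: str                       # content category the rule constrains
    min_mentions: int = 0               # 1 = "require" (e.g. a disclaimer)
    max_mentions: Optional[int] = None  # 0 = block outright; None = no upper bound

POLICIES = [
    CompiledRule("block harmful content", "harmful", max_mentions=0),
    CompiledRule("limit sensitive topics to 3", "sensitive", max_mentions=3),
    CompiledRule("require safety disclaimers", "disclaimer", min_mentions=1),
]
```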
Your AI model produces output without restriction: no prompt-engineering workarounds, no quality trade-offs, no interference with what the model does best.
FR-OS evaluates the output against your rules, then returns "pass" or "fail" with a report naming exactly what was flagged and what to fix. Deterministic, consistent, and final.
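The pass/fail-plus-report interface can be sketched with a toy checker. The real FR-OS engine is formally verified; this sketch only mirrors the shape of the result, using a hypothetical rule representation of `(policy_name, category, max_allowed)` tuples:

```python
# Toy sketch of the evaluate step: deterministic, rule-by-rule, returning a
# verdict plus a report naming every violation. Not the real FR-OS engine.
def evaluate(output_tags, rules):
    """
    output_tags: content-category labels detected in the AI response
    rules: list of (policy_name, category, max_allowed) tuples
    Returns ("pass" | "fail", list of violation messages).
    """
    violations = []
    for name, category, max_allowed in rules:
        count = output_tags.count(category)
        if count > max_allowed:
            violations.append(
                f"{name}: {count} '{category}' mention(s), limit {max_allowed}"
            )
    return ("pass" if not violations else "fail", violations)

rules = [
    ("block harmful content", "harmful", 0),
    ("limit sensitive topics to 3", "sensitive", 3),
]
verdict, report = evaluate(["sensitive", "sensitive", "harmful"], rules)
# verdict == "fail"; report contains one violation, for "block harmful content"
```

Because the checker is a pure function of the output and the rules, the same input always yields the same verdict, which is the property the surrounding text emphasizes.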
Why Shellfinity
Other tools return confidence percentages you have to interpret. FR-OS returns a definitive yes or no, with a detailed report you can audit and act on.
Keyword lists are brittle and miss context. FR-OS policies understand categories and relationships: block one term and related terms are covered automatically.
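How category coverage can subsume keyword lists is easy to see in miniature. This sketch assumes a tiny hand-written taxonomy (FR-OS's actual ontology is not public): blocking one category transitively covers everything beneath it.

```python
# Hypothetical sketch of category-based coverage. The taxonomy below is
# invented for illustration; FR-OS's real ontology is not shown here.
TAXONOMY = {
    "weapon": {"firearm", "explosive"},
    "firearm": {"pistol", "rifle"},
}

def covered_terms(category):
    """All terms reachable from a blocked category, transitively."""
    seen, stack = set(), [category]
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(TAXONOMY.get(term, ()))
    return seen

# Blocking "weapon" automatically covers pistols and rifles too,
# with no per-term keyword list to maintain.
```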
System prompts are instructions the AI can ignore or be tricked into bypassing. FR-OS verifies output after generation, not during, so there is nothing to jailbreak.
FR-OS is built on mathematical proofs verified by machine. No matter how you run it, you get the same verdict. A proof, every time.
Early access
Be the first to know when FR-OS launches. We'll notify you when API access is available.