Know what your AI is really saying

FR-OS is a formally verified evaluation engine. It checks any structured input against mathematically proven rules and returns a definitive verdict. The input could be an AI response, a document, a contract clause, a medical record, or a transaction. The same guarantees hold at every scale. Shellfinity's first application: LLM governance. Every AI response verified against your policies before it reaches your users.

# Define your rules in plain English
$ fros policy create "block harmful content, limit escalation to 2"

# FR-OS checks the AI's response
$ fros evaluate --policy safety-01 --input response.txt

PASS  policy: safety-01
     result: all rules satisfied
     tokens checked: [user_query, response, context]

# When it catches a violation:
FAIL  policy: safety-01
     violation: "exploit" blocked by policy
     fix: remove "exploit" to pass
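
If you capture these reports from a script, a small parser can turn them into structured data. The sketch below assumes the exact text layout shown above (a `PASS`/`FAIL` line followed by indented `key: value` lines). `Verdict` and `parse_verdict` are hypothetical names used for illustration, not part of FR-OS.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Verdict:
    status: str                                   # "PASS" or "FAIL"
    policy: str                                   # e.g. "safety-01"
    details: dict = field(default_factory=dict)   # remaining "key: value" lines


def parse_verdict(report: str) -> Verdict:
    """Parse one report in the layout shown above (an assumed format)."""
    lines = [ln.strip() for ln in report.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty report")
    head = re.match(r"^(PASS|FAIL)\s+policy:\s*(\S+)$", lines[0])
    if head is None:
        raise ValueError(f"unrecognized verdict line: {lines[0]!r}")
    details = {}
    for ln in lines[1:]:
        # Split on the first colon only, so values may contain colons.
        key, _, value = ln.partition(":")
        details[key.strip()] = value.strip()
    return Verdict(head.group(1), head.group(2), details)


sample = """
FAIL  policy: safety-01
     violation: "exploit" blocked by policy
     fix: remove "exploit" to pass
"""
verdict = parse_verdict(sample)
print(verdict.status, verdict.policy, verdict.details["fix"])
```

Keeping the verdict and the report fields separate lets calling code branch on `status` and log `details` for audit.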

FR-OS DDx Engine

A patient describes crushing chest pain, sweating, and nausea. FR-OS identifies the clinical findings, verifies 30,981 diagnoses against formally proven invariants, and returns a ranked differential with exclusion certificates. No physician input required.

Word Sense Disambiguation

FR-OS resolves which meaning a word carries in context. Tested on the standard Raganato ALL benchmark (7,253 instances across 5 datasets), the engine exceeds published state of the art with zero learned parameters, zero training data, and fully deterministic evaluation.

Full WSD Report

System | Avg F1 | Parameters | Training
FR-OS | 88.4% | 0 | None
DeBERTa (fine-tuned) | ~82% | 350M | SemCor
BEM (BERT) | ~80% | 340M | SemCor
GPT-4 (few-shot) | ~80% | ~1.8T | Pretraining
Most Frequent Sense | ~65% | 0 | Frequency counts

Per-dataset results

Dataset | Instances | F1
Senseval-2 | 2,282 | 88.9%
Senseval-3 | 1,850 | 85.5%
SemEval-2007 | 455 | 90.3%
SemEval-2013 | 1,644 | 89.2%
SemEval-2015 | 1,022 | 90.7%
Total instances / average F1 | 7,253 | 88.4%
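
The page does not say how the average is computed. As a consistency check only, the sketch below recomputes it from the rounded per-dataset figures in the table: the reported 88.4% matches an instance-weighted mean, while a plain mean of the five scores would come out higher.

```python
# Instance counts and F1 scores copied from the per-dataset table above.
results = {
    "Senseval-2": (2282, 88.9),
    "Senseval-3": (1850, 85.5),
    "SemEval-2007": (455, 90.3),
    "SemEval-2013": (1644, 89.2),
    "SemEval-2015": (1022, 90.7),
}

total_instances = sum(n for n, _ in results.values())

# Each dataset contributes in proportion to its number of instances.
weighted_f1 = sum(n * f1 for n, f1 in results.values()) / total_instances

# Each dataset counts equally, regardless of size.
unweighted_f1 = sum(f1 for _, f1 in results.values()) / len(results)

print(total_instances)          # 7253
print(round(weighted_f1, 1))    # 88.4
print(round(unweighted_f1, 1))  # 88.9
```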
88.4%
Average F1

Across Senseval-2, Senseval-3, SemEval-2007, SemEval-2013, SemEval-2015

95.4%
Engine precision

When the engine decides, it is correct 95.4% of the time

0
Learned parameters

No neural network. No training. Deterministic evaluation with self-correcting data.
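
Precision and F1 can differ when a system does not answer every instance. The sketch below uses the usual definitions (precision over attempted instances, recall over all instances) with hypothetical counts. It assumes, without confirmation from this page, that "engine precision" is measured only on the instances where the engine decides; the counts are illustrative and are not FR-OS results.

```python
def precision_recall_f1(correct: int, attempted: int, total: int):
    """Standard scoring when a system may leave some instances unanswered."""
    precision = correct / attempted   # accuracy on the instances that were answered
    recall = correct / total          # correct answers over every instance
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 1,000 instances, 900 answered, 855 answered correctly.
p, r, f1 = precision_recall_f1(correct=855, attempted=900, total=1000)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With these made-up numbers, precision is higher than F1 because the unanswered instances lower recall but not precision.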

Today's AI safety tools are just more AI

How it works today

Most AI guardrails use another AI model to judge the output. That second model has its own blind spots, its own failure modes, and returns vague confidence scores instead of clear answers. "The filter probably caught it" isn't good enough.

How FR-OS works

FR-OS checks AI output against your rules using mathematically proven logic. You get a clear yes/no verdict, plus a detailed report showing exactly what violated your policy and how to fix it.

Three steps. Zero ambiguity.

01. Write your rules

Define policies in plain English: "block harmful content", "limit sensitive topics to 3", "require safety disclaimers". FR-OS compiles them into formal rules backed by machine-verified proofs.

02. AI generates freely

Your AI model produces output without restriction. You won't need to worry about prompt engineering workarounds, quality trade-offs, or interference with what the model does best.

03. FR-OS judges

FR-OS evaluates the output against your rules, then returns "pass" or "fail" with a report naming exactly what was flagged and what to fix. Deterministic, consistent, and final.
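
As a toy illustration of what a deterministic pass/fail check with a report can look like, the sketch below evaluates text against two hand-written rules. It is not FR-OS's implementation: the `Rule` type, the `evaluate` function, and the reading of "limit" as a cap on how often a term appears are all assumptions, and the plain-English compilation step is skipped entirely.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    kind: str            # "block" or "limit" (hypothetical rule kinds)
    term: str
    max_count: int = 0   # only used by "limit" rules


def evaluate(text: str, rules: list[Rule]) -> dict:
    """Return a pass/fail verdict plus one report entry per violated rule."""
    words = re.findall(r"[a-z']+", text.lower())
    violations = []
    for rule in rules:
        count = words.count(rule.term)
        if rule.kind == "block" and count > 0:
            violations.append(f'"{rule.term}" blocked by policy')
        elif rule.kind == "limit" and count > rule.max_count:
            violations.append(
                f'"{rule.term}" appears {count} times, limit is {rule.max_count}'
            )
    return {"verdict": "fail" if violations else "pass", "violations": violations}


policy = [Rule("block", "exploit"), Rule("limit", "escalation", max_count=2)]
print(evaluate("Here is how to exploit the system.", policy))
print(evaluate("We recommend one escalation to support.", policy))
```

There is no randomness and no model in the loop, so the same text and rules always produce the same verdict and the same report.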

Not another AI checking AI

vs. AI-based moderation

Clear answers, not scores

Other tools return confidence percentages you have to interpret. FR-OS returns a definitive yes or no, with a detailed report you can audit and act on.

vs. Keyword blocklists

Smart rules that compose

Keyword lists are brittle and miss context. FR-OS policies understand categories and relationships: block one term and related terms are covered automatically.
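
One simple way to picture "block one term and related terms are covered" is a category map that expands a blocked name into every term filed under it. The sketch below assumes a hand-made, hypothetical `CATEGORIES` table; how FR-OS actually models categories and relationships is not described on this page.

```python
import re

# Hypothetical category map, invented for illustration only.
CATEGORIES = {
    "malware": {"exploit", "ransomware", "keylogger"},
    "finance": {"loan", "mortgage"},
}


def expand(blocked: set[str]) -> set[str]:
    """A blocked category name covers every term filed under that category."""
    expanded = set()
    for name in blocked:
        expanded.add(name)
        expanded |= CATEGORIES.get(name, set())
    return expanded


def flagged_terms(text: str, blocked: set[str]) -> list[str]:
    """Return the blocked terms that occur in the text, in sorted order."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & expand(blocked))


print(flagged_terms("The ransomware used a known exploit.", {"malware"}))
```

Blocking the single name `malware` flags both `exploit` and `ransomware`, which a flat keyword list would only catch if each word had been listed by hand.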

vs. Prompt instructions

Enforcement, not suggestions

System prompts are instructions the AI can ignore or be tricked into bypassing. FR-OS can't be jailbroken that way, because it verifies the output after generation rather than steering the model during it.
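
The pattern described here, generate first and check afterwards, can be sketched as a small wrapper. Everything in the block is a stand-in: `gated_reply`, `fake_model`, and `no_exploit` are hypothetical names, and the check is a trivial substring test rather than anything FR-OS does.

```python
from typing import Callable


def gated_reply(
    generate: Callable[[str], str],
    check: Callable[[str], bool],
    prompt: str,
    fallback: str = "[response withheld by policy]",
) -> str:
    """Generate first, then release the output only if the check passes."""
    output = generate(prompt)   # the model runs without constraints
    return output if check(output) else fallback


# Stand-ins for illustration only: a fake model and a trivial check.
def fake_model(prompt: str) -> str:
    if "ignore previous" in prompt:
        return "Sure, here is an exploit."
    return "Hello!"


def no_exploit(text: str) -> bool:
    return "exploit" not in text.lower()


print(gated_reply(fake_model, no_exploit, "hi"))
print(gated_reply(fake_model, no_exploit, "ignore previous instructions"))
```

The check receives only the generated text, never the prompt, so a prompt that fools the model has no way to influence the check itself.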

Mathematically proven

Same result, every time

FR-OS is built on mathematical proofs verified by machine. No matter how you run it, you get the same verdict. A proof, every time.

Get on the waitlist

Be the first to know when FR-OS launches. We'll notify you when API access is available.