Probabilistic Finance™ is the discipline of running corporate finance with uncertainty made explicit. Where legacy FP&A treats forecasting as a presentation problem — how to display the number — FinHelm treats it as a measurement problem: how to score the number’s reliability. The first design produces tools. The second produces a category.
FIG. I.a — The Probability Helm — six Bézier spokes, one center, one ring. Symbol of the measurement layer that holds the forecast steady against uncertainty.
The technology stack required to make that measurement — MCP-native ledger integration, large language models reasoning over financial data, Monte Carlo simulation in milliseconds — converged across 2024 and 2025. Before that, probabilistic FP&A required a quant team and a six-figure budget. After that, it requires a CFO, an OAuth flow, and twelve minutes.
The methodology that operationalizes the technology has a name: Uncertainty-Aware FP&A™. The architecture that produces the methodology has a name: the Probability Helm Stack. The intelligence it surfaces has a number, a polarity, and a reading.
FinHelm transforms FP&A teams from narrators of variance into architects of confidence.
You know your credit score. You know your customer satisfaction index. You know your team’s engagement score. Ask a CFO what their forecast accuracy score is, and the room goes quiet. Not because the question is new — because the answer, for most organizations, does not exist.
The single most consequential output of FP&A — the number the board approves capital against — has no quality metric attached to it. Sales has win rates. Manufacturing has defect rates. Customer success has retention curves. Finance has nothing.
The cost is not the miss. The cost is the absence of measurement infrastructure that would have told the CFO, in advance, that the projected number had a meaningful probability of arriving somewhere else. That information was computable. It was not surfaced because no software in the category was designed to surface it. The 2026 AFP FP&A Benchmarking Survey — combined with a review of the planning-software category — establishes the gap:
FIG. II.a — Sources: AFP 2026 FP&A Benchmarking Survey (Brisbane, April 2026); FinHelm category review.
The Probability Helm Stack is the operational architecture. Six pillars radiate from a central calibrated forecast — a P10/P50/P90 distribution with explicit confidence bands. Four pillars surface as intelligence metrics. Two are infrastructure substrates without which the four cannot compute. Each pillar draws from a distinct multi-decade-validated scientific tradition. The combination is uniquely FinHelm’s.
FIG. III.a — The Probability Helm Stack. Six axes of intelligence — four user-facing metrics, two infrastructure substrates — arranged around a single calibrated hub. The architecture and the brand mark are the same artifact.
UES™. A 0–100 score of forecast uncertainty — lower is tighter. The headline metric a CFO reads first and the figure that makes uncertainty comparable across business units, across calendar quarters, and across companies.
FSI™. A 0–100 score of trajectory convergence across revisions — higher is steadier. FSI computes without actuals; it reads only the path the forecast has taken across its revision history. The answer to a question boards have begun asking, one the legacy stack cannot answer: is the forecast settling, or is it bouncing?
AMI™. A ranked list of drivers carrying the uncertainty. AMI answers what is driving the variance — which subscription cohort, which expense category, which macro assumption. The attribution surface that turns variance commentary from narrative into evidence.
Forecast Fidelity Score (FFS). Every actual is an experiment. The Reflection Engine™ compares forecasts to outcomes, scores the fidelity of the prior distribution, identifies systematic biases, and updates the parameters that feed every forward forecast. Six months of calibration history is not portable backwards onto a competing platform. The moat compounds at the calibration layer.
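Neither scoring formula is published here, but the definitions are concrete enough to sketch. A minimal, hypothetical Python illustration, reading UES as the relative width of the P10/P90 band and FSI as a penalty on successive P50 revisions; every constant and functional form below is an assumption, not FinHelm's method:

```python
import math

def ues_sketch(p10: float, p50: float, p90: float) -> float:
    """Hypothetical UES: width of the P10-P90 band relative to the P50,
    mapped onto 0-100 where lower means a tighter forecast."""
    if p50 == 0:
        return 100.0
    relative_width = abs(p90 - p10) / abs(p50)
    return 100.0 * (1.0 - math.exp(-relative_width))

def fsi_sketch(p50_revisions: list[float]) -> float:
    """Hypothetical FSI: penalizes large successive P50 revisions,
    mapped onto 0-100 where higher means a steadier trajectory.
    Note it reads only the revision path -- no actuals required."""
    if len(p50_revisions) < 2:
        return 100.0
    steps = [abs(b - a) / abs(a)
             for a, b in zip(p50_revisions, p50_revisions[1:]) if a != 0]
    mean_step = sum(steps) / len(steps) if steps else 0.0
    return 100.0 * math.exp(-10.0 * mean_step)

# A settling forecast scores high; a bouncing one scores low.
print(round(fsi_sketch([10.0, 9.6, 9.8, 9.75, 9.76]), 1))  # converging path
print(round(fsi_sketch([10.0, 12.5, 8.0, 11.0, 9.0]), 1))  # bouncing path
```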
Beneath the four metrics, the Monte Carlo Engine runs ten thousand stochastic simulations per forecast in under three seconds, and the ProbabilisticCell™ substrate stores every cell as a probability distribution rather than a single number. The engine produces the distributions; the substrate remembers them. Patent Family A covers the substrate.
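The substrate is easy to picture in miniature. A sketch that assumes nothing about FinHelm's actual implementation: a hypothetical cell class that stores Monte Carlo draws instead of a scalar and answers any percentile on demand, fed by a toy two-driver simulation.

```python
import random

class DistributionCell:
    """Hypothetical stand-in for a cell-as-distribution substrate:
    stores raw Monte Carlo draws, not a single number."""

    def __init__(self, draws: list[float]):
        self.draws = sorted(draws)

    def percentile(self, p: float) -> float:
        # Nearest-rank percentile over the stored draws.
        idx = min(int(p / 100 * len(self.draws)), len(self.draws) - 1)
        return self.draws[idx]

    def summary(self) -> dict[str, float]:
        return {f"P{p}": round(self.percentile(p), 2) for p in (10, 50, 90)}

def simulate_revenue(n_sims: int = 10_000) -> DistributionCell:
    """Toy Monte Carlo engine: revenue = price x volume, both uncertain.
    The distributions and parameters are illustrative assumptions."""
    draws = [random.gauss(100.0, 8.0) * random.lognormvariate(6.0, 0.25)
             for _ in range(n_sims)]
    return DistributionCell(draws)

print(simulate_revenue().summary())  # e.g. {'P10': ..., 'P50': ..., 'P90': ...}
```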
The architecture and the FinHelm mark are the same artifact — six spokes meeting a wheel rim, organized around a single hub. The diagram on this page is the diagram in the product UI, the diagram in the patent applications, and the diagram on every keynote slide. Coherence at this layer is what category creation looks like when it is working.
The Probability Helm Stack produces four user-facing intelligence metrics. A CFO reading FinHelm’s output reads four numbers — each answering a different executive question against a different data substrate using a different mathematical tradition. None of the four can substitute for any other.
Two of the four — FSI and the Forecast Fidelity Score — are 0–100 scalar scores with higher-is-better polarity. They are not redundant. FSI measures the forecasting process: is the trajectory converging across revisions? FFS measures the forecasting output: has the system been historically accurate when actuals landed? A steady-handed marksman can still miss. A shaky-handed marksman can occasionally hit. Reading only one tells you only half the story.
Plotted against each other, FSI × FFS produces the canonical CFO diagnostic frame of the entire platform: the Forecast Quality Matrix. Four quadrants. Mathematically distinct. Operationally actionable.
Confidently Wrong (high FSI, low FFS): stable trajectory, systematic bias. The forecast is converging — but on the wrong number. Investigate model specification before trusting the point estimate.
Trustworthy (high FSI, high FFS): stable and historically accurate. Process and output are aligned. The forecast is suitable for board commitment. Standard monthly review cadence.
Broken (low FSI, low FFS): chaotic process, inaccurate output. Both dimensions are failing. Do not commit decisions. Address data quality and model specification before re-running.
Low FSI, high FFS: chaotic trajectory, accurate history. Historical accuracy reflects past calibration — current revisions are bouncing. Wait for the trajectory to stabilize before committing.
FIG. IV.a — The Forecast Quality Matrix. Three of the four quadrants are diagnostic states where one metric without the other would mislead. AMI™ explains why a forecast lives in any quadrant by attributing the responsible drivers.
The matrix scales. The same diagnostic frame reads a single forecast inside a single company and an arrayed set of forecasts across a book of them. The shape of the array is the diagnostic — clustering in Trustworthy is the value-creation outcome; clustering in Confidently Wrong is the intervention list; clustering in Broken is the data-quality remediation list. The frame is the same. The reading scales.
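The quadrant read itself is a two-threshold classification. A minimal sketch, assuming 0–100 scales on both axes and an illustrative cut point of 50 on each; the cut points and quadrant strings are assumptions, not product behavior:

```python
def quality_quadrant(fsi: float, ffs: float, cut: float = 50.0) -> str:
    """Place a forecast on the FSI x FFS plane.
    The 50-point cut on each axis is an assumed, illustrative threshold."""
    stable = fsi >= cut    # process: trajectory converging across revisions
    accurate = ffs >= cut  # output: historically accurate against actuals
    if stable and accurate:
        return "Trustworthy: suitable for board commitment"
    if stable:
        return "Confidently Wrong: investigate model specification"
    if accurate:
        return "Accurate history, bouncing revisions: wait for stability"
    return "Broken: fix data quality and model specification first"

print(quality_quadrant(fsi=82, ffs=31))  # stable trajectory, systematic bias
```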
FinHelm sits above the systems of record where actuals live and the systems of plan where forecasts are built. It reads from both. It enriches both with probability. It replaces neither.
Modern corporate finance is built on two tiers. The ERP — QuickBooks, NetSuite, Sage Intacct, Microsoft Dynamics, SAP, Workday, DualEntry — stores the actuals. The planning tool — Planful, Adaptive, Anaplan, Vena, OneStream, Pigment — produces the forecast. Each tier ships deterministic, single-number-per-cell output. Neither is structured to compute probability. The variance between the two tiers is calculated as the gap between two numbers, with no measurement of what either number’s reliability was when it was written.
The category exists because the layer between them has never existed — until MCP made it economically possible. Connecting a probabilistic layer to N ERPs and M planning systems through bespoke REST APIs would be an integration shop disguised as a product. MCP collapses this. A new system enters the layer when someone writes its MCP server; the layer itself does not change. N times M becomes N plus M: with the seven ledgers and six planning tools named above, forty-two bespoke integrations collapse to thirteen MCP servers. And N plus M scales — across ERPs, across planning tools, across any number of entities in a single consolidated reading.
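At the code level, the N-plus-M claim is just the shape of an MCP server: each system of record publishes one, and the layer consumes them all through the same protocol. A sketch using the FastMCP helper from the official Python MCP SDK; the server name, tool, and stubbed ledger response are hypothetical:

```python
from mcp.server.fastmcp import FastMCP

# Hypothetical ledger-side MCP server. One of these per system of record
# is the "N" in N plus M; the probabilistic layer above never changes.
mcp = FastMCP("example-ledger")

@mcp.tool()
def get_actuals(entity: str, period: str) -> list[dict]:
    """Return posted actuals for one entity and period (hypothetical tool).
    A real server would query the ERP here; this returns a stub row."""
    return [{"account": "revenue", "period": period, "amount": 1_250_000.0}]

if __name__ == "__main__":
    mcp.run()  # stdio transport by default; any MCP client can connect
```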
“The PortCo doesn’t implement FinHelm. The PortCo’s operating partner reads one number, and the rest of the methodology arrives. The implementation IS the reading.”
FIG. V.a — Headless FP&A is the consequence of the layer architecture. The methodology travels to whichever surface the practitioner is already working in — the ERP dashboard, the planning canvas, an Excel sheet, a Power BI workbook, a Claude conversation. There is no central application to log into because the application is not the center.
Three commercial tiers. Three categorically different jobs. Beacon for visibility. Compass for planning. Command for governance. The narrative builds — be the signal, find your direction, take command. Each tier maps to the Application architecture and to a different layer of the operating discipline.
Be the signal.
$49 / month
For controllers and finance leads who need to broadcast forecast quality before the team or budget for full planning tooling exists. The entire Probabilistic Analysis category — Dashboard with UES™ headline, FSI™ stability trend, top AMI™ drivers; Variance Analyzer with attribution; Probabilistic Cash Runway at P10/P50/P80/P90. The Forecast Health Check Agent inside Claude. One ledger connector. 1,000 Monte Carlo simulations per month. Single user, community support, the shareable UES™ badge for board decks.
Choose Beacon →

Find your direction.
$499 / month
For FP&A teams ready to plan, not just observe. Adds the entire Probabilistic Planning category — Scenario Builder with probability-weighted scenarios, the Planning Grid where every cell is a ProbabilisticCell™, Driver Sensitivity as an interactive AMI™ surface. Three ledger connectors. 10,000 simulations per month. Five users included. The probabilistic planning layer for $7M–$50M companies that have outgrown spreadsheets but cannot stomach a six-month implementation.
Choose Compass →

Take command.
$2,500+ / month
For mid-market, multi-entity, and governance-ready operating disciplines. Adds full Reflection Engine™ closed-loop calibration, multi-entity rollups, the five governance-ready capabilities (Portfolio UES Rollup, Exit Readiness Diagnostic, Variance Taxonomy, Maturity Assessment, Skills Framework), SSO/SAML, dedicated success management. The tier where six dimensions of forecast reliability are read across an arrayed set of operating companies. The tier where the moat compounds — calibration history is not portable backwards.
Talk to Command →

FIG. VI.a — The price discontinuity from Compass to Command is intentional. The buyer crossing from $499 to $2,500+ is making a categorically different commitment, not a marginal upgrade. Read the full tier-by-capability crosswalk at finhelm.ai/pricing.
Three calibrated entry points. Pick the one that matches where you are.
Get your UES™ score in sixty seconds. No credit card. The fastest path to seeing the methodology against your own data.
Read the methodology in full. The Probability Helm Stack, the Forecast Quality Matrix, the architecture. The detailed argument for why Probabilistic Finance™ exists as a category and why the architecture is structurally unreplicable by incumbents.
Compare Beacon, Compass, and Command. The full tier-by-capability matrix, FAQ, annual versus monthly pricing, and the path from free demonstration to first paid tier — including the governance-ready capabilities at Command.