Skip to content

Booster implies: discharge leftover consequent obligations under the antecedent via SMT#4156

Merged
ehildenb merged 9 commits into
masterfrom
implies-reasoning
Jun 18, 2026
Merged

Booster implies: discharge leftover consequent obligations under the antecedent via SMT#4156
ehildenb merged 9 commits into
masterfrom
implies-reasoning

Conversation

@ehildenb

Copy link
Copy Markdown
Member

The booster implies endpoint previously gave up — returning not-implied and forcing a kore fallback — whenever a matched implication left consequent-side obligations that were not already present in the antecedent. This is a significant source of kore fallbacks on the implies endpoint. This change lets booster discharge those leftover obligations itself: after a successful match, each remaining consequent obligation is simplified with the antecedent in scope, and any residue is SMT-closed against the antecedent (the same path the rewrite engine uses to discharge rule requires-clauses). Implications that previously fell back can now be answered directly.

The change also introduces a third verdict on the wire. A booster valid: false response was previously ambiguous: it could mean "decisively not implied" or "could not determine," and the two were byte-identical (valid: false, condition: null). A new optional indeterminate field disambiguates them so a recover-mode client knows when to escalate to kore rather than trust valid: false.

Changes:

  • Discharge leftover consequent obligations under the antecedent: simplify each with the antecedent predicates in knownPredicates, then SMT-close the non-trivial residue against the antecedent constraints/substitution. Implied → valid: true; refuted → decisively not implied; unknown → indeterminate.
  • Add an optional indeterminate :: Maybe Bool field to ImpliesResult. Emitted as true only on non-decisive valid: false results (SMT returns unknown, or the equation engine errors); omitted via OmitNothingFields on every decisive result and on all kore-side results. It encodes a real third state not inferrable from valid/condition.
  • Consolidate every ImpliesResult construction into a single mkResult builder.
  • Add test-implies-smt RPC integration tests (bound-weakening, disjoint, vacuous-antecedent, address-bound) and update the test-implies2 goldens to carry the new field where applicable.

Validation:

  • Booster unit tests pass.
  • New test-implies-smt RPC tests and updated test-implies2 goldens ship with the PR and exercise the implied / decisively-not-implied / indeterminate paths (run in the integration suite, not the unit suite).

Downstream note: the new indeterminate field is additive and optional, but it does appear as "indeterminate": true on the affected responses — clients with a strict/closed-schema decoder must tolerate it.

ehildenb and others added 4 commits June 16, 2026 20:35
…ate field on ImpliesResult

Plumbing-only change: introduces the new optional 'indeterminate ::
Maybe Bool' field on 'ImpliesResult' and sets it to 'Nothing' at every
existing construction site.

  * 'kore-rpc-types': field added with doc comment; 'OmitNothingFields'
    keeps the JSON output unchanged when absent.
  * 'Booster.Pattern.Implies': 'doesNotImply'' and 'implies'' record
    constructions extended with 'indeterminate = Nothing'.
  * 'Kore.JsonRpc': positional 'ImpliesResult' constructions extended
    with trailing 'Nothing' (kore checks are always decisive).

No call site sets the field to a non-'Nothing' value yet; that wiring
lands in subsequent commits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…, scripts/booster-integration-tests: add failing RPC tests for implies SMT discharge

Pins booster's current `implies` behaviour on four cases where the
antecedent and consequent share an identical term skeleton (so the
matcher returns `MatchSuccess`) but differ in their attached
path-condition predicates. The leftover obligations are direct
consequences of the antecedent under linear-integer reasoning, so an
SMT-aware discharge step under the antecedent would close them as
`valid : true` (or `valid : false` for the doesNotImply case).

Cases:
  - 001-bound-weakening:   X<100 ⇒ X<1000
  - 002-does-not-imply:    X<100 ⇒ X<10        (counterexample exists)
  - 003-vacuous-antecedent: X<10 ∧ X>20 ⇒ anything
  - 004-address-bound:     0<=X<pow160 ⇒ X<pow256  (KEVM-faithful)

`response-N.booster-dev` documents today's `Aborted: unknown constraints`
on cases 001/002/004 (the FIXME on `Implies.hs:161`'s `MatchSuccess`
branch — antecedent constraints are not assumed when discharging the
leftover consequent predicates). Case 003 already passes on booster-dev
via earlier bottom-detection. `response-N.json` shows the
`kore-rpc-booster` ground truth that the fix should match.

The `.kore` definition is built on demand from `implies-smt.k` via the
shipped `.kompile` script (mirrors the `escalate-partial-match` pattern).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…implies2}: discharge consequent obligations under antecedent via SMT

Resolves the FIXME in 'checkImpliesMatchTerms''s 'MatchSuccess' branch.
Previously the leftover consequent predicates were passed to
'evaluateConstraints' with no known context — i.e. SMT was asked
whether each obligation is valid 'under the assumption of itself',
which trivially holds for self-discharging predicates but cannot
close anything that depends on the antecedent's path condition (the
57 'kore-implies' handoffs observed in KEVM recover-mode).

The new flow mirrors the rewrite path's two-stage discharge
('Booster.Pattern.Rewrite.checkRequires'):

  1. Simplify each leftover obligation under the antecedent context
     ('patL.constraints' + 'asEquations patL.substitution') via
     'simplifyConstraint' — this unfolds K-level vocabulary
     ('#rangeAddress', boolean structure, ...) so SMT sees only
     SMT-tractable predicates.
  2. SMT-close any non-'TrueBool' residue with
     'SMT.checkPredicates solver antecedentPreds patL.substitution
     residue':

     - 'IsValid'                            → 'implies'
     - 'IsInvalid'                          → 'doesNotImply'
     - 'IsUnknown InconsistentGroundTruth'  → 'implies' (vacuous antecedent)
     - 'IsUnknown _'                        → 'doesNotImplyIndeterminate'

The previous hard 'Aborted "unknown constraints"' path is gone:
unresolvable obligations now return the 'Unsure' verdict
('valid: false, indeterminate: Just True'), so a recover-mode client
escalates to kore rather than seeing an RPC error.

Test response updates (booster-dev expected outputs, flipped from
the prior 'Aborted' verdict):

  - test-implies-smt/response-{001-bound-weakening,004-address-bound}.booster-dev:
      'Aborted' → 'valid: true' (SMT closes under antecedent).
  - test-implies-smt/response-002-does-not-imply.booster-dev:
      'Aborted' → 'valid: false, indeterminate: true' (SMT
      'ImplicationIndeterminate' — both polarities sat).
  - test-implies2/response-{consequent-constraint,refutation-1,
      refutation-3,refutation-4}.booster-dev:
      'Aborted' → 'valid: false, indeterminate: true' (same
      mechanism on the existing implies2 cases that triggered the
      bug pre-fix).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…a single mkResult builder

The four result builders ('implies'', 'doesNotImply'',
'doesNotImplyIndeterminate', and the implied 'implies'/'doesNotImply'
wrappers) each spelled out the same 'pure . Right . Implies . ImpliesResult'
record, differing only in 'valid', 'condition', and 'indeterminate'. Collapse
the shared skeleton into one 'mkResult' helper so the record lives in a single
place — a future field addition (as 'haskellLogEntries' was upstream) becomes a
one-line change rather than four. No behavioural change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ehildenb

ehildenb commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

Tested on KEVM, measured fallbacks to Kore and performance numbers. Claude generated:

Final results: implies-reasoning fallbacks on vs hb-master baseline vs this branch escalate:

  ┌──────────────────────────────┬──────────┬──────────┬──────┬────────────────────────┐
  │            suite             │ baseline │ escalate │  Δ   │        flavour         │
  ├──────────────────────────────┼──────────┼──────────┼──────┼────────────────────────┤ 
  │ functional                   │ 21       │ 18       │ −14% │ implies                │
  ├──────────────────────────────┼──────────┼──────────┼──────┼────────────────────────┤
  │ rules                        │ 60       │ 47       │ −25% │ implies (execute flat) │
  ├──────────────────────────────┼──────────┼──────────┼──────┼────────────────────────┤
  │ DSS (partial, 19 vat claims) │ 218      │ 74       │ −66% │ execute −68%           │
  └──────────────────────────────┴──────────┴──────────┴──────┴────────────────────────┘
  • Zero regressions, identical pass sets, neutral performance (warm-vs-warm 0.0%).
  • The headline surprise: the biggest win is on DSS — a fix entirely in Implies.hs cut execute-flavour fallback by 68% on the two heaviest MakerDAO vat proofs (fold −74%, frob −73%), by settling the in_keys/keccak conditions that gate execution. That's the production-critical suite, so it's the most impactful result. I was initially wrong about DSS — escalate's number alone (74) looked execute-dominated-and-unmoved; only the baseline (218) revealed the −68%.

I'm not exactly sure why execute related fallbacks were cut here actually, it doesn't make a lot of sense. But overall it seems performance neutral, and handles a decent amount of implications directly in booster rather than falling back to Kore.

@ehildenb ehildenb marked this pull request as ready for review June 17, 2026 12:44
@ehildenb ehildenb requested a review from jberthold June 17, 2026 12:45

@jberthold jberthold left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clear improvement, but I would think again about the indeterminate field.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to add a README.md with an overview table of the four tests.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: dc4ece6 — added test-implies-smt/README.md with a table covering each test's antecedent⟹consequent, the expected booster vs kore status, and which discharge arm it exercises.

Comment on lines +172 to +183
, indeterminate :: Maybe Bool
-- ^ Booster-only signal: 'Just True' on a 'valid = False' result
-- where booster's match check was indeterminate (e.g. an
-- unevaluated function blocking unification) and the
-- simplify-LHS / simplify-RHS retry ladder did not make
-- progress. Absent ('Nothing') on every decisive outcome
-- ('valid = True', or 'valid = False' from a 'MatchFailed{}' /
-- bottom-consequent path), all error paths, and every kore-side
-- result. Clients that want to distinguish "couldn't tell" from
-- "definitely not implied" should escalate on
-- @indeterminate = Just True@ and trust @valid = False@ in every
-- other case.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this extension... I checked and did not find a case of indeterminate = Just False, so it seems it is either true or absent. The meaning is a bit unclear: What does valid = True && indeterminate = Just True mean? Does it exist?
Maybe you could just make the valid field optional? More work downstream but the response has an easier semantics.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than make valid optional (which under our OmitNothingFields deriving would serialize "indeterminate" as an absent valid, indistinguishable from an old server), I replaced both fields with a single status enum: valid | invalid | indeterminate. Illegal states are now unrepresentable and the wire value is self-describing. Kore checks are always decisive (valid/invalid). Done in 792eebf; all implies goldens migrated.

ehildenb and others added 4 commits June 17, 2026 18:30
…booster/Proxy: replace implies valid/indeterminate bools with a status tri-state

The implication-check verdict is genuinely three-valued (implied /
decisively-not-implied / indeterminate), but it was encoded as a
'valid :: Bool' plus an 'indeterminate :: Maybe Bool' that was only ever
'Just True'-or-absent. That pair admits illegal states (valid && indeterminate)
that the code never produces, and forces clients to special-case a degenerate
flag.

Replace both fields with a single 'status :: ImpliesStatus' enum
(valid | invalid | indeterminate), serialised via CamelToKebab. Illegal states
are now unrepresentable and the wire value is self-describing. Kore-side checks
are always decisive, mapping to valid/invalid. Existing implies goldens are
migrated to the new field.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rview table of the four tests

Documents the antecedent/consequent of each test, the expected booster and
kore status, and which arm of the SMT-discharge logic it exercises — including
why 002 diverges (booster indeterminate vs kore invalid).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…erdict

Booster's implies endpoint can now return status=indeterminate (match
indeterminate / SMT-unknown obligation) as a successful response rather than a
hard abort. The proxy previously only fell back to kore on a booster *error*
(Left), so an indeterminate verdict was handed straight back to the caller and
never escalated — a regression versus the old Aborted path, which the proxy did
fall back on.

Add an explicit arm: on a Right implies result with status=indeterminate,
escalate to kore so the check still gets a decisive answer, emitting the usual
kore-implies-fallback event. Skipped under booster-only, where the indeterminate
result is returned as-is for the client to act on. The shared kore-fallback
block (now used by both the indeterminate and the booster-abort paths) is
factored into a local koreImpliesFallback helper.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…r indeterminate implies (005)

Pins the proxy's escalate-on-indeterminate behaviour. Same query as 002
(X <Int 100 implies X <Int 10), but with assume-defined: true so the proxy runs
booster first. The booster-dev server returns status=indeterminate; the full
proxy escalates that verdict to kore and returns status=invalid. The two goldens
are derived from 002 (kore's implies handler ignores assume-defined, so it
returns the identical decisive answer). README table updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ehildenb

ehildenb commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

@jberthold I (with Claude) added some more commits which:

  1. Make the return value a tri-state valid|indeterminate|invalid, rather than two bools.
  2. Escalates indeterminate cases to kore by default, and with boosterOnly set does not escalate to Kore.

Please have another look, the new commits have been added on top.

The rpc-server suite compares implies responses byte-for-byte (not
structurally), so its goldens still carried the old valid field and failed CI
under the status tri-state. Regenerated the four affected goldens
(implied-trivial, not-implied, implied-substitution, not-implied-stuck) against
kore-rpc: valid is gone and status (valid/invalid) is appended.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@jberthold jberthold left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good, thanks.

@ehildenb ehildenb merged commit 163997f into master Jun 18, 2026
6 checks passed
@ehildenb ehildenb deleted the implies-reasoning branch June 18, 2026 04:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants