Skip to content

Stop flagging snake_case is_error results as tool errors in OTel span#2971

Merged
Kludex merged 2 commits into
mainfrom
otel-drop-snakecase-tool-error
Jun 25, 2026
Merged

Stop flagging snake_case is_error results as tool errors in OTel span#2971
Kludex merged 2 commits into
mainfrom
otel-drop-snakecase-tool-error

Conversation

@Kludex

@Kludex Kludex commented Jun 25, 2026

Copy link
Copy Markdown
Member

Follow-up to #2970.

OpenTelemetryMiddleware matched three result shapes as a tool error: CallToolResult(is_error=True), {"isError": True}, and {"is_error": True}. The snake_case dict arm is wrong: serialize_server_result validates the handler result with by_name=False (alias-only), so a snake_case is_error key is silently dropped and the client receives a success response with no isError field. The span then reported error.type=tool_error / StatusCode.ERROR for a request whose wire response was a success - false-positive error spans.

Verified against the repo's pydantic:

{'content': [], 'is_error': True}  ->  {'content': []}                   # success on the wire
{'content': [], 'isError': True}   ->  {'content': [], 'isError': True}  # real error

Only CallToolResult and the camelCase {"isError": True} dict survive serialization as wire errors, so the match now covers just those two. The previous test is repurposed to assert the snake_case shape is not flagged and that the result serializes to {"content": []}.

Thanks to the claude[bot] review on #2970 for catching this.

AI Disclaimer

This PR was developed with the assistance of either Claude or Codex. I've reviewed and verified the changes.

Review in cubic

The tools/call result match arm treated {"is_error": True} as a tool error,
but serialize_server_result validates with by_name=False (alias-only), so the
snake_case key is dropped and the client receives a success. The span then
contradicted the wire response. Only CallToolResult and the camelCase
{"isError": True} dict survive serialization as errors; drop the snake_case
arm and assert the result stays a success.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 2 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/mcp/server/_otel.py">

<violation number="1" location="src/mcp/server/_otel.py:64">
P2: Tool-error span detection now misses coercible `isError` inputs that serialize to `true`, so telemetry can still diverge from the actual wire error result.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic

Comment thread src/mcp/server/_otel.py
# Only shapes that survive wire serialization (alias-only) are real tool errors.
match result:
case CallToolResult(is_error=True) | {"isError": True} | {"is_error": True}:
case CallToolResult(is_error=True) | {"isError": True}:

@cubic-dev-ai cubic-dev-ai Bot Jun 25, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Tool-error span detection now misses coercible isError inputs that serialize to true, so telemetry can still diverge from the actual wire error result.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp/server/_otel.py, line 64:

<comment>Tool-error span detection now misses coercible `isError` inputs that serialize to `true`, so telemetry can still diverge from the actual wire error result.</comment>

<file context>
@@ -59,8 +59,9 @@ async def __call__(self, ctx: ServerRequestContext[Any, Any], call_next: CallNex
+                # Only shapes that survive wire serialization (alias-only) are real tool errors.
                 match result:
-                    case CallToolResult(is_error=True) | {"isError": True} | {"is_error": True}:
+                    case CallToolResult(is_error=True) | {"isError": True}:
                         span.set_attribute("error.type", "tool_error")
                         span.set_status(StatusCode.ERROR)
</file context>
Fix with cubic

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — small, correct fix that aligns the OTel tool-error detection with what actually serializes onto the wire.

Extended reasoning...

Overview

This PR is a two-line follow-up to #2970: it removes the {"is_error": True} arm from the tool-error match in OpenTelemetryMiddleware (src/mcp/server/_otel.py) and repurposes the corresponding test to assert that a snake_case dict result is not flagged as a tool error and serializes to a success on the wire.

Correctness

I verified the PR's central claim against the codebase: serialize_server_result (src/mcp/types/methods.py:615-617) validates handler results with by_name=False (alias-only) on a surface model with extra="ignore", so a snake_case is_error key is silently dropped and the client receives a success response. Flagging that case as error.type=tool_error / StatusCode.ERROR would have produced false-positive error spans, so removing the arm is correct. The remaining arms (CallToolResult(is_error=True) and {"isError": True}) are exactly the shapes that survive serialization as wire errors.

Security risks

None. The change only affects which OpenTelemetry span attributes/status are set; it does not touch request handling, auth, or wire behavior.

Level of scrutiny

Low. Observability-only logic, two changed files, narrow and well-understood semantics, with a test asserting both the wire result and the span state. The bug-hunting system found no issues, and the test changes directly cover the modified branch.

Other factors

The PR description includes a verification of the serialization behavior that matches what I confirmed in the repo, and the test asserts both result == {"content": []} and StatusCode.UNSET, which pins the intended behavior.

maxisbey
maxisbey previously approved these changes Jun 25, 2026
Expand the comment to note raw-dict isError is matched as a literal bool;
non-bool coercible values that serialize to an error are left undetected,
since no hand predicate matches pydantic's wire coercion.
@Kludex Kludex enabled auto-merge (squash) June 25, 2026 15:22
@Kludex Kludex merged commit 96bf22e into main Jun 25, 2026
33 checks passed
@Kludex Kludex deleted the otel-drop-snakecase-tool-error branch June 25, 2026 15:24

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — small, well-scoped telemetry fix that removes a false-positive tool-error span, with the test updated to verify the actual wire behavior.

Extended reasoning...

Overview

This PR is a small follow-up to #2970 touching only src/mcp/server/_otel.py and its test file. It removes the {"is_error": True} arm from the tool-error match in OpenTelemetryMiddleware and repurposes the corresponding test to assert that a snake_case is_error dict result is not flagged as a tool error and serializes to a success ({"content": []}) on the wire.

Correctness

The PR's premise checks out against the codebase: serialize_server_result (src/mcp/types/methods.py) validates handler results with by_name=False and dumps with by_alias=True, so a raw-dict is_error key is dropped and the client receives a success response. Flagging that as error.type=tool_error / StatusCode.ERROR was a false positive; removing the arm aligns the span with the actual wire result. The updated test exercises the full serialization path and asserts both the wire payload and the span status, which gives good end-to-end coverage of the new behavior. The cubic bot's P2 note about coercible truthy isError values (e.g. 1, "true") is acknowledged and documented as a deliberate limitation in the new code comment.

Security risks

None — the change only affects OpenTelemetry span attributes/status for observability; no auth, crypto, input handling, or wire-protocol behavior is modified.

Level of scrutiny

Low: a two-line logic change in telemetry middleware plus a test update, with the behavior verified by the included test. The remaining edge case is a minor telemetry-only gap that is explicitly documented, not a correctness or security concern.

@github-actions

Copy link
Copy Markdown
Contributor

This pull request is included in pre-release v2.0.0a3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants