Skip to content

Fix _hashlib.compare_digest to reject non-ASCII strings#7280

Merged
youknowone merged 1 commit intoRustPython:mainfrom
ever0de:fix/hashlib-compare-digest-ascii-check
Mar 1, 2026
Merged

Fix _hashlib.compare_digest to reject non-ASCII strings#7280
youknowone merged 1 commit intoRustPython:mainfrom
ever0de:fix/hashlib-compare-digest-ascii-check

Conversation

@ever0de
Copy link
Contributor

@ever0de ever0de commented Feb 28, 2026

Add non-ASCII string check to _hashlib.compare_digest, matching the behavior of _operator._compare_digest. When both arguments are strings, non-ASCII characters now correctly raise TypeError.

Also replace the non-constant-time == comparison with constant_time_eq for proper timing-attack resistance, and return PyResult instead of PyResult.

Summary by CodeRabbit

  • Chores
    • Added a dependency to support constant-time cryptographic comparisons.
  • Bug Fixes
    • Digest comparison now yields a direct boolean result and uses constant-time comparison under the hood.
    • Stricter input validation with clearer errors for non-ASCII strings and incompatible/bytes-like inputs.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 959cb04 and c100b7f.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/test/test_hmac.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • crates/stdlib/Cargo.toml
  • crates/stdlib/src/hashlib.rs

📝 Walkthrough

Walkthrough

Adds the constant_time_eq dependency and refactors compare_digest in hashlib to return PyResult<bool> and perform constant-time comparisons for ASCII strings and byte buffers, with updated error handling for type and encoding mismatches.

Changes

Cohort / File(s) Summary
Dependency Management
crates/stdlib/Cargo.toml
Added constant_time_eq dependency (workspace = true).
Hash Comparison Refactor
crates/stdlib/src/hashlib.rs
Changed compare_digest signature to return PyResult<bool>; replaced previous comparison logic with constant_time_eq for constant-time comparisons of ASCII strings and byte buffers; updated error handling/messages and removed previous PyObject construction paths.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 I hopped through bytes in moonlit code,
I learned to time my steps just so,
Now secrets hide where they belong,
Constant and steady — safe and slow. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main change: rejecting non-ASCII strings in _hashlib.compare_digest, which is the primary objective of the PR.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/stdlib/src/hashlib.rs`:
- Around line 738-745: The mixed-type error message currently uses
b.as_object().class().name() which reports the wrong type; update the match arms
handling ArgStrOrBytesLike variants (e.g., the (ArgStrOrBytesLike::Buf(a),
ArgStrOrBytesLike::Buf(b)) branch and the fallback) so that when one operand is
Str and the other is Buf you return vm.new_type_error with the incompatible type
set to "str" (or dynamically detect which of the two is ArgStrOrBytesLike::Str
and use "str") instead of using b.as_object().class().name(); keep the
constant_time_eq branch unchanged and only change the error construction for
mixed Str/Buf cases.

ℹ️ Review info

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2833cd8 and 959cb04.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/test/test_hmac.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • crates/stdlib/Cargo.toml
  • crates/stdlib/src/hashlib.rs

Comment on lines +738 to 745
(ArgStrOrBytesLike::Buf(a), ArgStrOrBytesLike::Buf(b)) => {
Ok(a.with_ref(|a| b.with_ref(|b| constant_time_eq(a, b))))
}
_ => Err(vm.new_type_error(format!(
"a bytes-like object is required, not '{}'",
b.as_object().class().name()
)));
))),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, find and read the hashlib.rs file to understand the context
find . -type f -name "hashlib.rs" | head -5

Repository: RustPython/RustPython

Length of output: 95


🏁 Script executed:

# Get the file and examine lines around 738-745
if [ -f "crates/stdlib/src/hashlib.rs" ]; then
  wc -l "crates/stdlib/src/hashlib.rs"
fi

Repository: RustPython/RustPython

Length of output: 97


🏁 Script executed:

# Read the relevant section with context
sed -n '700,760p' crates/stdlib/src/hashlib.rs | cat -n

Repository: RustPython/RustPython

Length of output: 2418


🏁 Script executed:

# Search for the function containing this code and understand ArgStrOrBytesLike
rg -n "ArgStrOrBytesLike" crates/stdlib/src/hashlib.rs -B5 -A5 | head -60

Repository: RustPython/RustPython

Length of output: 1651


🏁 Script executed:

# Search for similar error messages or patterns with mixed types in RustPython
rg -n "bytes-like object is required" . --type rust -B5 -A2

Repository: RustPython/RustPython

Length of output: 2034


🏁 Script executed:

# Also check what ArgStrOrBytesLike::Buf actually is
rg -n "enum ArgStrOrBytesLike\|struct.*Buf" . --type rust -A10 | head -40

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there are tests for compare_digest with mixed types
rg -n "compare_digest" . --type python -B3 -A3 | head -80

Repository: RustPython/RustPython

Length of output: 95


🏁 Script executed:

# Let me check if there are Python test files for compare_digest
find . -type f -name "*.py" | xargs grep -l "compare_digest" 2>/dev/null | head -5

Repository: RustPython/RustPython

Length of output: 197


🏁 Script executed:

# Look at the definition of ArgStrOrBytesLike to understand what Buf is
fd -e rs | xargs grep -l "ArgStrOrBytesLike" | head -3

Repository: RustPython/RustPython

Length of output: 149


🏁 Script executed:

# Check the function.rs file to understand ArgStrOrBytesLike definition
if [ -f "vm/src/function.rs" ]; then
  rg -n "enum ArgStrOrBytesLike\|impl.*ArgStrOrBytesLike" vm/src/function.rs -A15 | head -60
elif [ -f "crates/vm/src/function.rs" ]; then
  rg -n "enum ArgStrOrBytesLike\|impl.*ArgStrOrBytesLike" crates/vm/src/function.rs -A15 | head -60
fi

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Search more broadly for ArgStrOrBytesLike definition
rg -n "enum ArgStrOrBytesLike" --type rust -A10

Repository: RustPython/RustPython

Length of output: 730


🏁 Script executed:

# Also check the test file for compare_digest
if [ -f "Lib/test/test_hmac.py" ]; then
  rg -n "compare_digest" Lib/test/test_hmac.py -B2 -A5
fi

Repository: RustPython/RustPython

Length of output: 2968


🏁 Script executed:

# Check the hashlib tests or docstrings for compare_digest behavior
find . -type f -name "*.py" -exec grep -l "compare_digest" {} \; | xargs grep -l "bytes\|str" | head -3

Repository: RustPython/RustPython

Length of output: 131


🏁 Script executed:

# Let's check the actual Python test to understand what error CPython produces
grep -n "str.*bytes\|bytes.*str" Lib/test/test_hmac.py -B2 -A2 | head -30

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Let's write a Python script to test CPython's behavior
python3 << 'EOF'
import hmac
import sys

# Test compare_digest with mixed types
test_cases = [
    ("string", b"bytes"),
    (b"bytes", "string"),
]

for a, b in test_cases:
    try:
        result = hmac.compare_digest(a, b)
    except TypeError as e:
        print(f"compare_digest({type(a).__name__}, {type(b).__name__}): {e}")
EOF

Repository: RustPython/RustPython

Length of output: 206


🏁 Script executed:

# Also check what the current RustPython implementation actually produces
# by looking at how the error case is structured in more detail
sed -n '721,746p' crates/stdlib/src/hashlib.rs | cat -n

Repository: RustPython/RustPython

Length of output: 1205


🏁 Script executed:

# Let me verify the exact behavior by looking at what happens in each case
# The key insight: ArgStrOrBytesLike::Str means string, Buf means bytes-like
# Looking at the match logic:
# Case 1: (Str, Str) -> OK (both strings)
# Case 2: (Buf, Buf) -> OK (both bytes-like)
# Case 3: everything else -> ERROR using b's type

# So if a=Str and b=Buf:
#   - Goes to case 3 (neither case 1 nor case 2 matches)
#   - Error uses b.as_object().class().name() which is bytes
#   - Message: "a bytes-like object is required, not 'bytes'" <- WRONG

# If a=Buf and b=Str:
#   - Goes to case 3
#   - Error uses b.as_object().class().name() which is str
#   - Message: "a bytes-like object is required, not 'str'" <- matches CPython

python3 << 'EOF'
# Verify CPython's behavior more thoroughly
import hmac

test_cases = [
    ("string", b"bytes", "str first"),
    (b"bytes", "string", "bytes first"),
    ("s1", "s2", "both str"),
    (b"b1", b"b2", "both bytes"),
]

for a, b, desc in test_cases:
    try:
        result = hmac.compare_digest(a, b)
        print(f"{desc}: OK - {result}")
    except TypeError as e:
        print(f"{desc}: {e}")
EOF

Repository: RustPython/RustPython

Length of output: 218


Fix error message to report 'str' instead of the second argument's type.

The error message in the mixed-type case uses b.as_object().class().name(), which is incorrect. When a is a string and b is bytes, this produces "a bytes-like object is required, not 'bytes'" which is misleading. CPython always reports "not 'str'" for mixed-type comparisons (both str+bytes and bytes+str cases), because str is the incompatible type. Change the error to report 'str' in all mixed-type cases, or dynamically determine which argument is Str and report that type.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/stdlib/src/hashlib.rs` around lines 738 - 745, The mixed-type error
message currently uses b.as_object().class().name() which reports the wrong
type; update the match arms handling ArgStrOrBytesLike variants (e.g., the
(ArgStrOrBytesLike::Buf(a), ArgStrOrBytesLike::Buf(b)) branch and the fallback)
so that when one operand is Str and the other is Buf you return
vm.new_type_error with the incompatible type set to "str" (or dynamically detect
which of the two is ArgStrOrBytesLike::Str and use "str") instead of using
b.as_object().class().name(); keep the constant_time_eq branch unchanged and
only change the error construction for mixed Str/Buf cases.


@unittest.expectedFailure # TODO: RUSTPYTHON; AssertionError: TypeError not raised by compare_digest
def test_exceptions(self):
return super().test_exceptions()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please remove the entire test?


@unittest.expectedFailure # TODO: RUSTPYTHON; AssertionError: TypeError not raised by compare_digest
def test_exceptions(self):
return super().test_exceptions()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same:)

Add non-ASCII string check to _hashlib.compare_digest, matching the
behavior of _operator._compare_digest. When both arguments are strings,
non-ASCII characters now correctly raise TypeError.

Also replace the non-constant-time == comparison with constant_time_eq
for proper timing-attack resistance, and return PyResult<bool> instead
of PyResult<PyObjectRef>.
@ever0de ever0de force-pushed the fix/hashlib-compare-digest-ascii-check branch from 959cb04 to c100b7f Compare February 28, 2026 16:57
@github-actions
Copy link
Contributor

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

Error: 'frozenset' object has no attribute 'discard'

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@youknowone youknowone merged commit be29462 into RustPython:main Mar 1, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants