Fix allow_threads and EINTR handling #7457
Conversation
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough

This PR modifies blocking syscall handling in two functions across the multiprocessing and time modules. Windows semaphore acquisition now executes the blocking …

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 3 passed
Force-pushed from 22f581b to a854bfa.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/stdlib/time.rs`:
- Around line 122-123: Replace the unchecked addition Instant::now() + dur when
computing deadline with Instant::now().checked_add(dur) and handle the None case
by returning/raising an OverflowError; specifically, change the code that
assigns deadline (the Instant::now() + dur expression) to use
Instant::now().checked_add(dur).ok_or_else(|| OverflowError::new("timeout
overflow")) (or the crate's equivalent VM error constructor) before entering the
loop so that overflow is detected and an OverflowError is returned instead of
panicking.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 09351446-0678-43bb-8c3f-b13062527278
⛔ Files ignored due to path filters (1)
Lib/test/_test_eintr.py is excluded by !Lib/**
📒 Files selected for processing (2)
crates/stdlib/src/multiprocessing.rs
crates/vm/src/stdlib/time.rs
```rust
let deadline = Instant::now() + dur;
loop {
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
set -euo pipefail
echo "== sleep deadline computation =="
rg -nP --type=rust 'Instant::now\(\)\s*\+\s*dur|checked_add\(' crates/vm/src/stdlib/time.rs -C2
echo
echo "== Duration conversion path used by time.sleep =="
rg -nP --type=rust 'try_into_value::<Duration>|from_secs_f64|try_from_secs_f64|Duration' crates/vm/src/stdlib/time.rs crates/vm/src -C2
echo
echo "== Existing oversized sleep tests/messages =="
rg -nP --type=rust --type=py 'sleep.*(Overflow|too large|overflow|huge)' -C2
```

Repository: RustPython/RustPython
Length of output: 29666
Use `Instant::checked_add()` to prevent a panic on overflow.

Line 122 uses the unchecked `Instant::now() + dur`, which will panic if the duration is large enough to overflow the `Instant`. The `Duration` conversion at line 101 does not validate that the resulting value is within `Instant`'s safe range. Replace it with `checked_add()` and raise `OverflowError`.
Proposed fix

```diff
 use std::time::Instant;
-let deadline = Instant::now() + dur;
+let deadline = Instant::now()
+    .checked_add(dur)
+    .ok_or_else(|| vm.new_overflow_error("sleep length is too large"))?;
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
let deadline = Instant::now()
    .checked_add(dur)
    .ok_or_else(|| vm.new_overflow_error("sleep length is too large"))?;
loop {
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/stdlib/time.rs` around lines 122 - 123, Replace the unchecked
addition Instant::now() + dur when computing deadline with
Instant::now().checked_add(dur) and handle the None case by returning/raising an
OverflowError; specifically, change the code that assigns deadline (the
Instant::now() + dur expression) to use
Instant::now().checked_add(dur).ok_or_else(|| OverflowError::new("timeout
overflow")) (or the crate's equivalent VM error constructor) before entering the
loop so that overflow is detected and an OverflowError is returned instead of
panicking.
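As a standalone illustration of the pattern the fix recommends (the durations here are illustrative, not the crate's code): `Instant::checked_add` returns `None` instead of panicking when the sum would not fit the platform's time representation, which is what lets the VM surface an `OverflowError`.

```rust
use std::time::{Duration, Instant};

fn main() {
    let now = Instant::now();
    // A modest duration adds without issue.
    assert!(now.checked_add(Duration::from_secs(60)).is_some());
    // An absurdly large duration overflows the platform's Instant
    // representation, so checked_add yields None rather than panicking,
    // letting the caller raise an error instead.
    assert!(now.checked_add(Duration::from_secs(u64::MAX)).is_none());
}
```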
📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

- [ ] test: cpython/Lib/test/test_class.py (TODO: 15)
  - dependencies:
  - dependent tests: (no tests depend on class)
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
crates/derive-impl/src/pyclass.rs (1)
914-1000: ⚠️ Potential issue | 🟠 Major

`__del__` is still missing from the forbidden slot-method list.

`slot_defs.rs` exposes `__del__` via `TpDel`, so this validation still allows a plain `#[pymethod] fn __del__` even though the rest of the slot-backed dunders are forced onto traits. That leaves finalizers easy to wire as normal methods instead of the destructor slot.

Suggested fix

```diff
 // Constructor/Initializer traits
 ("__new__", "Constructor"),
 ("__init__", "Initializer"),
+("__del__", "Destructor"),
 // Representable trait
 ("__repr__", "Representable"),
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/derive-impl/src/pyclass.rs` around lines 914 - 1000, The FORBIDDEN_SLOT_METHODS list (used when !args.context.is_trait) is missing the destructor slot __del__; add an entry like ("__del__", "Destructor/TpDel") to the const FORBIDDEN_SLOT_METHODS so plain #[pymethod] fn __del__ is rejected just like the other slot-backed dunders (see FORBIDDEN_SLOT_METHODS, args.context.is_trait and the slot_defs::TpDel relation).

crates/vm/src/builtins/float.rs (1)
121-154: ⚠️ Potential issue | 🟡 Minor

Keep the float-specific zero-division messages to match Python's exception contract.

Using the same "division by zero" text for `/`, `%`, `//`, and `divmod()` diverges from Python's standard behavior. Python uses operation-specific messages:

- `1.0 / 0.0` → `ZeroDivisionError: float division by zero`
- `1.0 % 0.0` → `ZeroDivisionError: float modulo`
- `1.0 // 0.0` → `ZeroDivisionError: float floor division by zero`
- `divmod(1.0, 0.0)` → `ZeroDivisionError: float divmod()`

Update the messages in `inner_div`, `inner_mod`, `inner_floordiv`, and `inner_divmod` to match these expected values.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/float.rs` around lines 121 - 154, The zero-division error messages for float ops are generic; update inner_div, inner_mod, inner_floordiv, and inner_divmod to raise vm.new_zero_division_error with Python-specific messages: inner_div should use "float division by zero", inner_mod "float modulo", inner_floordiv "float floor division by zero", and inner_divmod "float divmod()". Locate these functions (inner_div, inner_mod, inner_floordiv, inner_divmod) and replace the current "division by zero" strings with the operation-specific messages so exceptions match CPython behavior.

scripts/generate_opcode_metadata.py (1)
4-5: ⚠️ Potential issue | 🟡 Minor

Inconsistent CPython version in comments.

Line 4 states "compatible with CPython 3.13" but line 136 states "CPython 3.14 compatible opcode numbers". Please update line 4 to reflect the correct version.

✏️ Proposed fix

```diff
-This file generates opcode metadata that is compatible with CPython 3.13.
+This file generates opcode metadata that is compatible with CPython 3.14.
```

Also applies to: 136-136

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/generate_opcode_metadata.py` around lines 4 - 5, The module docstring at the top currently reads "compatible with CPython 3.13" but the module later declares "CPython 3.14 compatible opcode numbers"; update the top docstring to "compatible with CPython 3.14" so the file consistently references CPython 3.14 (search for the top-level module docstring and the literal "CPython 3.14 compatible opcode numbers" to locate the lines to change).

crates/vm/src/builtins/complex.rs (1)
97-118: ⚠️ Potential issue | 🟠 Major

Unreachable code path after line 109.

The `if let Some(ret)` block at line 97 returns at line 109, so the `else` block starting at line 110 will only execute when `result.downcast_ref::<PyComplex>()` returns `None`. However, at line 111, you attempt the same `downcast_ref::<PyComplex>()` again, which will also return `None`, causing the code to always fall into the error branch.

This appears to be leftover logic from a refactor. The warning should only be issued for subclasses of complex, not for exact complex instances.

🐛 Proposed fix

```diff
 let ret_class = result.class().to_owned();
-if let Some(ret) = result.downcast_ref::<PyComplex>() {
+if !result.class().is(vm.ctx.types.complex_type) {
+    if let Some(ret) = result.downcast_ref::<PyComplex>() {
         _warnings::warn(
             vm.ctx.exceptions.deprecation_warning,
             format!(
                 "__complex__ returned non-complex (type {ret_class}). \
                 The ability to return an instance of a strict subclass of complex \
                 is deprecated, and may be removed in a future version of Python."
             ),
             1,
             vm,
         )?;
         return Ok(Some((ret.value, true)));
+    }
+}
+if let Some(complex_obj) = result.downcast_ref::<PyComplex>() {
+    return Ok(Some((complex_obj.value, true)));
 } else {
-    return match result.downcast_ref::<PyComplex>() {
-        Some(complex_obj) => Ok(Some((complex_obj.value, true))),
-        None => Err(vm.new_type_error(format!(
-            "__complex__ returned non-complex (type '{}')",
-            result.class().name()
-        ))),
-    };
+    return Err(vm.new_type_error(format!(
+        "__complex__ returned non-complex (type '{}')",
+        result.class().name()
+    )));
 }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/complex.rs` around lines 97 - 118, The current branch is unreachable because it redundantly re-checks result.downcast_ref::<PyComplex>(); instead, first attempt downcast_ref::<PyComplex>() once and if it yields Some(py_complex) then check whether the object's class is exactly the built-in complex type (compare result.class() or py_complex.class() to vm.ctx.types.complex_type); if it is the exact complex class return Ok((py_complex.value, true)), otherwise call _warnings::warn(...) and then return Ok((py_complex.value, true)); if the initial downcast_ref::<PyComplex>() returns None then return the vm.new_type_error(...) about __complex__ returning non-complex. Ensure you update the logic around result.downcast_ref::<PyComplex>(), the _warnings::warn call, and the vm.new_type_error path accordingly.

crates/vm/src/vm/mod.rs (1)
1566-1619: ⚠️ Potential issue | 🟠 Major

`with_frame_untraced()` suppresses profiling too, not just tracing.

`dispatch_traced_frame()` fires both `TraceEvent::Call` and `TraceEvent::Return`, both of which route through `trace_event()` to handle profile callbacks. The `traced == false` branch skips `dispatch_traced_frame()` entirely, making callers invisible to `sys.setprofile()` in addition to `sys.settrace()`. The function is used once (internal cleanup frame in frame.rs:1294), where this behavior appears intentional, but the name `with_frame_untraced()` will mislead future callers. Rename to reflect that both tracing and profiling are suppressed, e.g. `with_frame_uninstrumented()` or `with_frame_no_trace()`, and add a doc comment clarifying what is suppressed.
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/vm/mod.rs` around lines 1566 - 1619, The function with_frame_untraced actually suppresses both tracing and profiling because the traced == false branch bypasses dispatch_traced_frame (which emits TraceEvent::Call/Return and triggers trace_event/profile callbacks); rename with_frame_untraced to a clearer name (e.g., with_frame_uninstrumented or with_frame_no_trace) and update its declaration and all call sites (including the internal cleanup usage in frame.rs) to the new name, and add a concise doc comment on the renamed function explaining that it suppresses both sys.settrace() and sys.setprofile() instrumentation (mentioning with_frame_impl and dispatch_traced_frame/trace_event for context).
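The naming concern can be illustrated with a minimal sketch (illustrative names only, not the crate's actual types): when Call/Return events funnel through a single dispatch point that serves both trace-style and profile-style hooks, skipping that dispatch silences both kinds of instrumentation at once.

```rust
enum TraceEvent {
    Call,
    Return,
}

#[derive(Default)]
struct Vm {
    trace_log: Vec<&'static str>,
    profile_log: Vec<&'static str>,
}

impl Vm {
    fn trace_event(&mut self, ev: TraceEvent) {
        // A single dispatch point feeds both hook kinds.
        self.trace_log.push(match ev {
            TraceEvent::Call => "call",
            TraceEvent::Return => "return",
        });
        self.profile_log.push("event");
    }

    fn run_frame(&mut self, instrumented: bool) {
        if instrumented {
            self.trace_event(TraceEvent::Call);
            self.trace_event(TraceEvent::Return);
        }
        // instrumented == false emits nothing at all: profiling is
        // suppressed along with tracing, which is why the name should say so.
    }
}

fn main() {
    let mut vm = Vm::default();
    vm.run_frame(false);
    assert!(vm.trace_log.is_empty() && vm.profile_log.is_empty());
    vm.run_frame(true);
    assert_eq!(vm.trace_log, vec!["call", "return"]);
    assert_eq!(vm.profile_log.len(), 2);
}
```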
🧹 Nitpick comments (14)
crates/stdlib/src/_asyncio.rs (1)
1305-1320: Consider reusing `PyFuture::make_cancelled_error_impl` to reduce duplication.

This method is nearly identical to `PyFuture::make_cancelled_error_impl`. Since `PyTask` contains `base: PyFuture`, you could delegate to the base implementation:

♻️ Suggested refactor

```diff
 fn make_cancelled_error_impl(&self, vm: &VirtualMachine) -> PyBaseExceptionRef {
-    // If a saved CancelledError exists, take it (clearing the stored reference)
-    if let Some(exc) = self.base.fut_cancelled_exc.write().take()
-        && let Ok(exc) = exc.downcast::<PyBaseException>()
-    {
-        return exc;
-    }
-
-    let msg = self.base.fut_cancel_msg.read().clone();
-    let args = if let Some(m) = msg { vec![m] } else { vec![] };
-
-    match get_cancelled_error_type(vm) {
-        Ok(cancelled_error) => vm.new_exception(cancelled_error, args),
-        Err(_) => vm.new_runtime_error("cancelled"),
-    }
+    self.base.make_cancelled_error_impl(vm)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/stdlib/src/_asyncio.rs` around lines 1305 - 1320, The PyTask::make_cancelled_error_impl implementation duplicates PyFuture::make_cancelled_error_impl; replace the body to delegate to the base PyFuture by returning self.base.make_cancelled_error_impl(vm) (adjust visibility if needed so PyFuture::make_cancelled_error_impl is callable from PyTask), keeping the same signature (fn make_cancelled_error_impl(&self, vm: &VirtualMachine) -> PyBaseExceptionRef) and preserving behavior with VirtualMachine/get_cancelled_error_type.

crates/vm/src/gc_state.rs (1)
426-431: Consider extracting duplicated reset logic into a helper method.

The generation count reset pattern appears 4 times in `collect_inner` (lines 426-431, 539-542, 560-562, 707-710). Per the coding guidelines about avoiding duplicate code, consider extracting this into a helper method.

♻️ Suggested refactor

Add a helper method to `GcState`:

```rust
/// Reset generation counts after collection.
/// For gen2 (oldest), survivors stay in-place so don't reset gen2 count.
fn reset_generation_counts(&self, generation: usize) {
    let reset_end = if generation >= 2 { 2 } else { generation + 1 };
    for i in 0..reset_end {
        self.generations[i].count.store(0, Ordering::SeqCst);
    }
}
```

Then replace each occurrence with:

```rust
self.reset_generation_counts(generation);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/gc_state.rs` around lines 426 - 431, Extract the duplicated generation-reset logic in collect_inner into a GcState helper method (e.g., fn reset_generation_counts(&self, generation: usize)) that computes reset_end = if generation >= 2 { 2 } else { generation + 1 } and loops over self.generations[i].count.store(0, Ordering::SeqCst); then replace each repeated block in collect_inner with a call to self.reset_generation_counts(generation) so all four occurrences use the single helper.

crates/vm/src/stdlib/gc.rs (1)
202-210: Verify safety of frame pointer dereference.

The `unsafe { fp.as_ref() }` assumes `fp` is a valid `NonNull<PyObject>` (or similar) that is safe to dereference. While the frames are borrowed from `vm.frames` and should be valid during this scope, consider adding a brief safety comment explaining why this is sound.

🔧 Suggested safety comment

```diff
 let stack_frames: HashSet<usize> = vm
     .frames
     .borrow()
     .iter()
     .map(|fp| {
+        // SAFETY: Frame pointers in vm.frames are valid for the duration
+        // of the borrow, and we only read the pointer value.
         let frame: &crate::PyObject = unsafe { fp.as_ref() }.as_ref();
         frame as *const crate::PyObject as usize
     })
     .collect();
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/gc.rs` around lines 202 - 210, The unsafe dereference unsafe { fp.as_ref() } in the stack_frames construction must be annotated with a short safety comment: explain that vm.frames is borrowed for the duration of this scope, that it stores NonNull<crate::PyObject> (or equivalent raw pointers) which are owned/kept alive by the VM/GC root set, and thus each fp is valid and not concurrently mutated while borrowed—so calling fp.as_ref() and converting to a usize is sound; add this comment immediately above the map closure referencing vm.frames, fp.as_ref(), and stack_frames.

crates/vm/src/builtins/weakproxy.rs (1)
243-290: Please add regression coverage for the new numeric slot surface.

This now changes unary, binary, in-place, and ternary dispatch. A few focused cases for lhs/rhs weak proxies, dead proxies, and `pow(proxy, x, mod)` would make regressions much easier to catch.
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/weakproxy.rs` around lines 243 - 290, Add regression tests that exercise the new numeric slot surface introduced in AS_NUMBER: cover unary (proxy_unary_slot! handlers like _neg/_pos/_abs), binary (proxy_binary_slot! handlers like _add/_sub/_mul), in-place (inplace_* handlers such as _iadd/_imul) and ternary (proxy_ternary_slot! handlers like _pow/_ipow) dispatch; specifically include tests where the weak proxy is the lhs and rhs of operations, where the underlying referent has been dropped (dead proxy), and a pow(proxy, x, mod) case to validate three-argument pow dispatch. Use the proxy creation/upgrade utilities (proxy_upgrade and the weak proxy type in weakproxy.rs) to construct proxies and drop targets to assert correct results or TypeError/ReferenceError behavior so regressions in AS_NUMBER numeric slots are detected.

crates/vm/src/protocol/object.rs (1)
373-379: Refactor `ascii()` to avoid duplicated branch wrapping.

The two branches only differ in value; return `Ok(...)` once after selecting the value.

♻️ Proposed refactor

```diff
 pub fn ascii(&self, vm: &VirtualMachine) -> PyResult<PyRef<PyStr>> {
     let repr = self.repr(vm)?;
-    if repr.as_wtf8().is_ascii() {
-        Ok(repr)
-    } else {
-        Ok(vm.ctx.new_str(to_ascii(repr.as_wtf8())))
-    }
+    let ascii_repr = if repr.as_wtf8().is_ascii() {
+        repr
+    } else {
+        vm.ctx.new_str(to_ascii(repr.as_wtf8()))
+    };
+    Ok(ascii_repr)
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/protocol/object.rs` around lines 373 - 379, The ascii() method duplicates the Ok(...) wrapper across two branches; refactor by first obtaining repr via self.repr(vm) and then compute a single PyRef<PyStr> result value: if repr.as_wtf8().is_ascii() use repr, otherwise create vm.ctx.new_str(to_ascii(repr.as_wtf8())); finally return Ok(result) once. Update the function body around ascii(), repr(), to_ascii() and vm.ctx.new_str(...) usage so only the differing value is chosen in the conditional and the Ok(...) is returned a single time.

extra_tests/snippets/builtin_ascii.py (1)
24-27: Consider covering the `%a` formatting path too.

These assertions lock in `ascii()` itself, but the same PR also changed the `%a`/bytes conversion path in `crates/vm/src/cformat.rs`. A small check here would guard that cross-file behavior against formatting regressions.
Verify each finding against the current code and only fix it if needed. In `@extra_tests/snippets/builtin_ascii.py` around lines 24 - 27, Add tests that exercise the "%a" formatting path in addition to direct ascii() calls: for the custom types Foo and Bar, assert that "%a" % Foo() produces the same MyStr-typed result as ascii(Foo()) and that "%a" % Bar() yields the same byte-escaped string as ascii(Bar()); also add the analogous bytes-format checks (e.g. b"%a" % Foo()/Bar()) to cover the bytes conversion path touched in cformat.rs so the PR protects cross-file behavior changes.

scripts/generate_opcode_metadata.py (1)
75-118: Consider adding error handling for a missing `deopt` function.

The `build_deopts` function uses `.group(1)` on the regex search result without checking if the match succeeded. If the `deopt` function is not found in the source, this will raise an `AttributeError`.

♻️ Proposed fix

```diff
 def build_deopts(contents: str) -> dict[str, list[str]]:
-    raw_body = re.search(
+    match = re.search(
         r"fn deopt\(self\) -> Option<Self>(.*)", contents, re.DOTALL
-    ).group(1)
+    )
+    if not match:
+        return {}
+    raw_body = match.group(1)
```
Verify each finding against the current code and only fix it if needed. In `@scripts/generate_opcode_metadata.py` around lines 75 - 118, The build_deopts function blindly calls .group(1) on the re.search result which will raise if the deopt regex doesn't match; change it to capture the search result (e.g., m = re.search(...)) and check if m is None before using m.group(1), and if missing either return an empty dict or raise a clear ValueError/RuntimeError mentioning the missing deopt function and include enough context (e.g., a snippet or filename) to aid debugging; update build_deopts to use that guarded match variable instead of calling .group(1) directly.

crates/vm/src/builtins/code.rs (2)
474-489: Simplify constant comparison with an iterator.

The loop-based comparison can be replaced with a more idiomatic approach.

♻️ Proposed simplification

```diff
 && {
     let a_consts: Vec<_> = a.constants.iter().map(|c| c.0.clone()).collect();
     let b_consts: Vec<_> = b.constants.iter().map(|c| c.0.clone()).collect();
-    if a_consts.len() != b_consts.len() {
-        false
-    } else {
-        let mut eq = true;
-        for (ac, bc) in a_consts.iter().zip(b_consts.iter()) {
-            if !vm.bool_eq(ac, bc)? {
-                eq = false;
-                break;
-            }
-        }
-        eq
-    }
+    a_consts.len() == b_consts.len()
+        && a_consts
+            .iter()
+            .zip(b_consts.iter())
+            .try_fold(true, |acc, (ac, bc)| {
+                vm.bool_eq(ac, bc).map(|eq| acc && eq)
+            })?
 };
```
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/code.rs` around lines 474 - 489, Replace the manual loop that compares a_consts and b_consts with an iterator-based check: after collecting a.constants and b.constants (used in a_consts and b_consts) and verifying lengths match, use an iterator zip over a_consts.iter().zip(b_consts.iter()) combined with a fallible iterator adapter such as try_for_each or try_fold to call vm.bool_eq(ac, bc)? for each pair and aggregate to a single boolean (true only if all pairs are equal), returning early on error; update the block that currently sets and returns eq to this iterator-based approach while keeping the existing length check and using the same vm.bool_eq call.
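The fallible "all pairs equal" pattern suggested above can be sketched standalone; `bool_eq` below is a plain `Result`-returning stand-in for `vm.bool_eq`, and the integer slices stand in for the constant lists.

```rust
// Stand-in for a comparison that can itself fail (like vm.bool_eq).
fn bool_eq(a: &i32, b: &i32) -> Result<bool, String> {
    Ok(a == b)
}

// True only when both slices have the same length and every pair compares
// equal; a comparison error propagates via `?`.
fn all_equal(xs: &[i32], ys: &[i32]) -> Result<bool, String> {
    Ok(xs.len() == ys.len()
        && xs
            .iter()
            .zip(ys.iter())
            .try_fold(true, |acc, (a, b)| bool_eq(a, b).map(|eq| acc && eq))?)
}

fn main() {
    assert_eq!(all_equal(&[1, 2], &[1, 2]), Ok(true));
    assert_eq!(all_equal(&[1, 2], &[1, 3]), Ok(false));
    assert_eq!(all_equal(&[1], &[1, 2]), Ok(false));
}
```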
495-517: Hash implementation creates temporary objects on every call.

The `hash` implementation allocates a new tuple containing multiple new Python objects (strings, ints, bytes, nested tuple) for every hash computation. This could be expensive for repeated hashing operations (e.g., using code objects as dict keys).

Consider caching the hash value in `PyCode` (lazy computation with atomic storage), similar to how Python strings cache their hash. Alternatively, compute the hash directly from the fields without creating Python objects.

♻️ Example of direct hash computation

```rust
impl Hashable for PyCode {
    fn hash(zelf: &Py<Self>, _vm: &VirtualMachine) -> PyResult<crate::common::hash::PyHash> {
        use crate::common::hash;
        use std::hash::{Hash, Hasher};

        let code = &zelf.code;
        let mut hasher = hash::HashSecret::new();
        code.obj_name.as_str().hash(&mut hasher);
        code.arg_count.hash(&mut hasher);
        code.posonlyarg_count.hash(&mut hasher);
        code.kwonlyarg_count.hash(&mut hasher);
        code.varnames.len().hash(&mut hasher);
        code.flags.bits().hash(&mut hasher);
        code.first_line_number.map_or(0, |n| n.get()).hash(&mut hasher);
        code.instructions.original_bytes().hash(&mut hasher);
        // Constants need special handling for Python equality semantics
        Ok(hash::fix_sentinel(hasher.finish() as i64))
    }
}
```
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/code.rs` around lines 495 - 517, The current Hashable::hash impl for PyCode allocates many temporary Python objects (vm.ctx.new_* and a tuple) on every call; change it to avoid allocations by either (A) adding a cached, lazily-computed atomic hash field to the PyCode struct and computing/storing it once in PyCode::hash, or (B) computing the hash directly from the Rust fields without creating Python objects: use crate::common::hash::HashSecret (or the project's hashing utilities) and feed code.obj_name.as_str(), code.arg_count, code.posonlyarg_count, code.kwonlyarg_count, code.varnames.len(), code.flags.bits(), code.first_line_number.map_or(0, …), and code.instructions.original_bytes() into the hasher, handle code.constants with the same semantics as equality (or include their precomputed hashes), then return crate::common::hash::fix_sentinel(hasher.finish() as i64) as the PyHash; implement atomic lazy init if choosing caching to be thread-safe.

crates/compiler-core/src/bytecode.rs (1)
321-335: Blanket `Index` impl on `[T]` may cause conflicts.

Implementing `Index<oparg::VarNum>` for `[T]` is a blanket implementation on a standard library type. While this works within this crate, it could lead to coherence issues or unexpected behavior if other crates also implement similar traits. The TODO comments correctly note that a newtype wrapper for `CodeObject.varnames` would be the proper solution.

Consider prioritizing the newtype approach to avoid potential issues as the codebase grows.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/compiler-core/src/bytecode.rs` around lines 321 - 335, Remove the blanket impls of Index/IndexMut for [T] that take oparg::VarNum and instead introduce a newtype wrapper for the CodeObject.varnames field (e.g., a Varnames<T> or VarNames(Vec<T>/Box<[T]>) type) and move the Index/IndexMut implementations onto that newtype so only the newtype can be indexed by oparg::VarNum; implement Index/IndexMut for Varnames<T> using var_num.as_usize() to access the underlying storage and update CodeObject.varnames to use the new type.

crates/stdlib/src/_tokenize.rs (3)
640-644: Consider using a named struct instead of a tuple for pending token parts.

The 6-element tuple `(u8, String, usize, usize, usize, usize)` is hard to reason about. A named struct would improve readability and maintainability:

```rust
struct PendingTokenPart {
    token_type: u8,
    text: String,
    start_line: usize,
    start_col: usize,
    end_line: usize,
    end_col: usize,
}
```
Verify each finding against the current code and only fix it if needed. In `@crates/stdlib/src/_tokenize.rs` around lines 640 - 644, Replace the opaque 6-tuple type used in pending_fstring_parts with a named struct to improve clarity: define a struct (e.g., PendingTokenPart) with fields token_type: u8, text: String, start_line: usize, start_col: usize, end_line: usize, end_col: usize and update the pending_fstring_parts field to Vec<PendingTokenPart>; update any places that construct, deconstruct, or index into the tuple to use the struct field names (also keep pending_empty_fstring_middle as-is unless you want a parallel refactor) so callers use PendingTokenPart.token_type, .text, .start_line, etc., instead of tuple positions.
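A runnable sketch of the suggested struct in use; the field values below are illustrative, not taken from the tokenizer.

```rust
#[derive(Debug, Clone)]
struct PendingTokenPart {
    token_type: u8,
    text: String,
    start_line: usize,
    start_col: usize,
    end_line: usize,
    end_col: usize,
}

fn main() {
    let part = PendingTokenPart {
        token_type: 3,
        text: "f\"".to_string(),
        start_line: 1,
        start_col: 0,
        end_line: 1,
        end_col: 2,
    };
    // Named fields document intent; with the 6-tuple these reads would be
    // the far more opaque part.2 == part.4 and part.5 - part.3.
    assert_eq!(part.start_line, part.end_line);
    assert_eq!(part.end_col - part.start_col, part.text.len());
    assert_eq!(part.token_type, 3);
}
```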
107-147: Potential race condition in state update pattern.

The state is cloned on line 109, modified locally, then written back on line 138. If the iterator is accessed concurrently (e.g., via threading with a released GIL), modifications between the read and write could be lost. Consider using a single write lock throughout, or document that concurrent access is unsupported.

♻️ Alternative using write lock throughout

```diff
 impl IterNext for PyTokenizerIter {
     fn next(zelf: &Py<Self>, vm: &VirtualMachine) -> PyResult<PyIterReturn> {
-        let mut state = zelf.state.read().clone();
+        let mut state_guard = zelf.state.write();
         loop {
-            match &mut state.phase {
+            match &mut state_guard.phase {
                 TokenizerPhase::Reading { source } => {
                     let line = zelf.readline(vm)?;
                     if line.is_empty() {
                         let accumulated = core::mem::take(source);
                         // ... rest of logic
-                        state.phase = TokenizerPhase::Yielding { ... };
+                        state_guard.phase = TokenizerPhase::Yielding { ... };
                     } else {
                         source.push_str(&line);
                     }
                 }
                 TokenizerPhase::Yielding { .. } => {
-                    let result = emit_next_token(&mut state, zelf.extra_tokens, vm)?;
-                    *zelf.state.write() = state;
+                    let result = emit_next_token(&mut state_guard, zelf.extra_tokens, vm)?;
                     return Ok(result);
                 }
                 // ...
             }
         }
     }
 }
```
Verify each finding against the current code and only fix it if needed. In `@crates/stdlib/src/_tokenize.rs` around lines 107 - 147, The next method on IterNext for PyTokenizerIter clones the state via zelf.state.read().clone() and later writes it back, which can lose concurrent updates; change to acquire a write lock once (use zelf.state.write()) and mutate the state in place while calling readline, updating TokenizerPhase variants, and invoking emit_next_token so modifications are atomic; alternatively, document that concurrent iterator access is unsupported if you prefer not to change locking, but do not retain the read-clone-write pattern used with zelf.state.read()/clone() and write().
248-256: Repeated `source.lines().count()` calls could be cached.

`source.lines().count()` is called multiple times throughout the function (lines 248, 375, 386). Since `source` doesn't change during the Yielding phase, this value could be computed once and stored in the state.
Verify each finding against the current code and only fix it if needed. In `@crates/stdlib/src/_tokenize.rs` around lines 248 - 256, Compute source.lines().count() once and reuse it instead of calling it multiple times: introduce a local variable (e.g., total_lines or last_line) near the start of the function or before the Yielding phase and replace all occurrences of source.lines().count() (used to compute last_line/default_pos at the block with extra_tokens and the later uses around lines referenced) with that cached variable; ensure existing logic that builds default_pos and the call to next_non_dedent_info(tokens, *index, source, line_index, default_pos) uses the cached value so behavior is unchanged.crates/compiler-core/src/bytecode/instruction.rs (1)
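The caching idea in miniature; `source` here is a stand-in literal, not the tokenizer's actual buffer.

```rust
fn main() {
    let source = "line one\nline two\nline three";
    // Scan the source a single time and reuse the result, instead of
    // re-running lines().count() at every use site.
    let total_lines = source.lines().count();
    assert_eq!(total_lines, 3);
    // Later position defaults can then reuse the cached value.
    let default_pos = (total_lines, 0usize);
    assert_eq!(default_pos, (3, 0));
}
```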
1231-1247: Extract a shared formatter for the paired fast-local disassembly paths.

These arms now unpack the same `VarNums` shape in three places, and `STORE_FAST_LOAD_FAST` has already drifted by printing raw indexes instead of resolved var names. A tiny helper that takes the opcode label would keep the output consistent and avoid future divergence.
Also applies to: 1354-1369
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 1231 - 1247, The two arms (Self::LoadFastLoadFast and Self::LoadFastBorrowLoadFastBorrow) (and the similar STORE_FAST_LOAD_FAST case) duplicate the same logic of fetching var_nums.get(arg), calling oparg.indexes(), resolving varname(idx) for both indexes and writing the formatted string; extract a small helper function (e.g., format_paired_vars or fmt_paired_fast_locals) that takes the opcode label string, the formatter f, &var_nums and arg, performs var_nums.get(arg) -> oparg.indexes() -> varname(idx1/idx2) and does the write! call, then replace the duplicated blocks to call that helper to keep output consistent and avoid divergence.
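A hypothetical shape for such a helper (names are illustrative, not the crate's actual API): the opcode label is the only thing that differs between the arms, so it becomes a parameter and both indexes are always resolved through the var-name table.

```rust
use std::fmt::Write;

// Shared formatter for paired fast-local opcodes: resolve both indexes to
// var names so every paired opcode prints the same shape, including
// STORE_FAST_LOAD_FAST.
fn fmt_paired_fast_locals(
    out: &mut String,
    label: &str,
    varnames: &[&str],
    idx1: usize,
    idx2: usize,
) {
    let _ = write!(out, "{} ({}, {})", label, varnames[idx1], varnames[idx2]);
}

fn main() {
    let varnames = ["x", "y", "z"];

    let mut a = String::new();
    fmt_paired_fast_locals(&mut a, "LOAD_FAST_LOAD_FAST", &varnames, 0, 2);
    assert_eq!(a, "LOAD_FAST_LOAD_FAST (x, z)");

    let mut b = String::new();
    fmt_paired_fast_locals(&mut b, "STORE_FAST_LOAD_FAST", &varnames, 1, 1);
    assert_eq!(b, "STORE_FAST_LOAD_FAST (y, y)");
}
```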
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/ci.yaml:
- Around line 142-145: The workflow step named "Install dependencies" uses the
composite action ./.github/actions/install-linux-deps but incorrectly sets with:
to the scalar expression `${{ matrix.dependencies || fromJSON('{}') }}`; change
the step so that with: is a YAML mapping of the action's input keys (the inputs
defined by the composite action) and pass values from matrix.dependencies
explicitly (e.g., map each expected input key to `${{ matrix.dependencies.<key>
}}` or a default), ensuring the with: block is a proper mapping rather than a
single expression.
In @.github/workflows/pr-format.yaml:
- Around line 58-63: The formatting check uses astral-sh/ruff-action@v3.6.1 but
pins Ruff to 0.15.4 which conflicts with ci.yaml's 0.15.5; update the Ruff
version string used in the job that runs "Check for formatting changes" (the
block using astral-sh/ruff-action) to match the CI workflow (change '0.15.4' to
'0.15.5' or whatever version ci.yaml uses) so both formatting and lint workflows
use the same Ruff patch release.
In `@Cargo.toml`:
- Around line 103-104: Replace the moving branch override for parking_lot_core
under [patch.crates-io] by pinning it to a specific commit: locate the
parking_lot_core entry that currently uses git =
"https://github.com/youknowone/parking_lot" and branch = "rustpython" and change
it to use rev = "<commit-sha>" (a specific commit hash) instead of branch to
ensure deterministic resolution and reproducible lockfile generation.
In `@crates/compiler-core/src/bytecode/oparg.rs`:
- Around line 372-377: The TryFrom<u32> impl for MakeFunctionFlag currently
narrows with value as u8 which silently truncates out-of-range operands; change
it to validate before narrowing by using u8::try_from(value).map_err(|_|
MarshalError::InvalidBytecode) and then call Self::try_from on the validated u8,
so malformed extended-arg bytecode yields MarshalError::InvalidBytecode (follow
the same pattern used by the oparg_enum! helpers).
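The guarded-narrowing pattern the comment asks for can be sketched independently of the real `MakeFunctionFlag` type; the error enum and function name below are stand-ins.

```rust
#[derive(Debug, PartialEq)]
enum MarshalError {
    InvalidBytecode,
}

// `value as u8` would silently truncate 0x105 to 0x05; `u8::try_from`
// surfaces the out-of-range extended-arg operand as an error instead.
fn narrow_oparg(value: u32) -> Result<u8, MarshalError> {
    u8::try_from(value).map_err(|_| MarshalError::InvalidBytecode)
}
```

The validated `u8` can then be handed to the flag type's own `try_from`, matching the pattern the `oparg_enum!` helpers reportedly use.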
In `@crates/stdlib/src/_tokenize.rs`:
- Around line 519-531: The end_pos closure is miscounting column positions by
adding byte length per char; update the closure (end_pos with parameters
current, start_line, start_col) to increment ec by 1 for each non-newline
character instead of ec += ch.len_utf8() so columns are counted as characters
(matching Python tokenize behavior) while preserving newline handling and
returning (el, ec).
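The fixed closure can be sketched as a standalone function (the real code is a closure over tokenizer state; this version just takes the text and start position):

```rust
// Column positions counted in characters, not UTF-8 bytes: "é" advances the
// column by 1 even though it is two bytes on the wire.
fn end_pos(text: &str, start_line: usize, start_col: usize) -> (usize, usize) {
    let (mut el, mut ec) = (start_line, start_col);
    for ch in text.chars() {
        if ch == '\n' {
            el += 1;
            ec = 0;
        } else {
            ec += 1; // was `ec += ch.len_utf8()`, which over-counted non-ASCII
        }
    }
    (el, ec)
}
```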
In `@crates/stdlib/src/ssl/compat.rs`:
- Around line 1182-1241: The helper currently consumes a partial TLS record
fragment (using peeked_bytes.len()) which causes subsequent peeks to start
mid-record and corrupt framing; update the logic in the helper that uses
socket.sock_peek / socket.sock_recv (and the types ArgBytesLike,
TLS_RECORD_HEADER_SIZE, SSL3_RT_MAX_PLAIN_LENGTH) so you do not consume a
partial record: either (A) request and consume the entire record by calling
socket.sock_recv(total_record_size) when peek shows peeked_bytes.len() <
total_record_size (ensuring you allow for TLS_RECORD_HEADER_SIZE +
record_body_len and that SSL3_RT_MAX_PLAIN_LENGTH accounts for the 5-byte
header), or (B) implement a per-connection buffer/remaining-byte state to store
the partial fragment and return only when the full total_record_size has been
accumulated; also adjust the fixed peek cap so it includes the 5-byte header
when computing max wire size.
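The record-size computation option (A) relies on can be sketched from standard TLS framing (this is generic TLS record-layer math, not the actual `compat.rs` code): byte 0 of the header is the content type, bytes 1-2 the protocol version, and bytes 3-4 the big-endian body length.

```rust
const TLS_RECORD_HEADER_SIZE: usize = 5;

// Given peeked bytes, compute how many bytes recv() must consume so the
// stream is never left mid-record. Consuming less than header + body means
// the next peek starts inside a record and framing is corrupted.
fn total_record_size(peeked: &[u8]) -> Option<usize> {
    if peeked.len() < TLS_RECORD_HEADER_SIZE {
        return None; // not enough bytes peeked to read the length field
    }
    let body_len = u16::from_be_bytes([peeked[3], peeked[4]]) as usize;
    Some(TLS_RECORD_HEADER_SIZE + body_len)
}
```

When the peek shows fewer than `total_record_size` bytes available, the caller either requests the full size or buffers the fragment per-connection, as the comment describes.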
In `@crates/vm/src/builtins/map.rs`:
- Around line 77-80: In __setstate__, don't swallow conversion failures from
ArgIntoBool::try_from_object; instead propagate the error so bad pickle state
surfaces. Replace the current if-let that ignores Err with a fallible binding
(e.g. let obj = ArgIntoBool::try_from_object(vm, state)?;) or explicitly match
and return the Err, then call zelf.strict.store(obj.into(),
atomic::Ordering::Release) and return Ok(()). This ensures failures in
ArgIntoBool::try_from_object are returned from __setstate__ rather than silently
ignored.
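The swallow-vs-propagate distinction is easy to see in isolation. This sketch uses a toy conversion in place of `ArgIntoBool::try_from_object`; only the shape of the control flow is the point.

```rust
#[derive(Debug, PartialEq)]
struct TypeError;

fn to_bool(state: &str) -> Result<bool, TypeError> {
    match state {
        "True" => Ok(true),
        "False" => Ok(false),
        _ => Err(TypeError),
    }
}

// Swallowing: bad state is silently ignored and the flag keeps its old value.
fn setstate_swallow(flag: &mut bool, state: &str) {
    if let Ok(b) = to_bool(state) {
        *flag = b;
    }
}

// Propagating: `?` returns the conversion error to the caller, so a bad
// pickle payload surfaces as an exception instead of vanishing.
fn setstate_propagate(flag: &mut bool, state: &str) -> Result<(), TypeError> {
    *flag = to_bool(state)?;
    Ok(())
}
```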
In `@crates/vm/src/builtins/type.rs`:
- Around line 491-494: The race occurs because modified() invalidates
specialization_cache then releases the lock before zeroing tp_version_tag,
letting concurrent
cache_init_for_specialization/cache_getitem_for_specialization (which re-acquire
write_lock) repopulate the cache with the old tp_version; fix by making the
cache invalidation and the tp_version_tag update a single critical section:
acquire the same write_lock that protects specialization_cache and
tp_version_tag inside modified() (and the other affected sites around lines
903-923 and 945-970), perform
ext.specialization_cache.invalidate_for_type_modified() and then write
tp_version_tag = 0 (or the new value) while holding the lock, then release the
lock so no concurrent initializer can insert stale entries under the old
version.
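The single-critical-section shape can be sketched with a plain `RwLock`; the struct and field names below are illustrative stand-ins for RustPython's actual type extension, which uses its own locking.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Keeping the cache and the version tag behind ONE lock means no concurrent
// initializer can observe "cache invalidated" while the old tag is still set.
struct SpecState {
    specialization_cache: HashMap<String, usize>,
    tp_version_tag: u32,
}

struct TypeExt {
    inner: RwLock<SpecState>,
}

impl TypeExt {
    fn modified(&self) {
        // Invalidation and tag update happen under the same write guard, so
        // a racing cache_init cannot repopulate against the stale version.
        let mut st = self.inner.write().unwrap();
        st.specialization_cache.clear();
        st.tp_version_tag = 0;
    } // lock released here, after both updates are visible together
}
```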
In `@crates/vm/src/builtins/weakproxy.rs`:
- Around line 138-140: The TypeError message thrown when a weakref proxy
references a non-iterator omits the referent's type; modify the error creation
in the block that checks obj.class().slots.iternext (the code using
obj.class().slots.iternext.load() and vm.new_type_error) to include the class
name by formatting the message with obj.class().name() (e.g., use format! to
produce "Weakref proxy referenced a non-iterator '<type>' object") so the
exception matches CPython style and other files like tuple.rs/template.rs.
In `@crates/vm/src/stdlib/atexit.rs`:
- Around line 25-63: The current unregister loop can skip or miss callbacks when
vm.bool_eq() mutates vm.state.atexit_funcs; fix by snapshotting the candidates
while holding the lock and then iterating over that snapshot after releasing the
lock. Specifically, in unregister() capture a Vec of (cloned callback
PyObjectRef and its entry pointer, e.g., (&**entry as *const (PyObjectRef,
FuncArgs))) from vm.state.atexit_funcs.lock(), release the lock, iterate the
snapshot calling vm.bool_eq(&func, &cb) for each candidate, and when matched
search the full live vm.state.atexit_funcs list (no min clipping) by pointer
(core::ptr::eq) to remove the entry; do not rely on decrementing a shared index
tied to the live list length. This ensures insertions at index 0 (register())
during __eq__ won’t cause skipping and finds moved entries by identity.
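The snapshot-then-iterate pattern can be shown in miniature. This sketch drops the pointer-identity detail and uses plain strings in place of callback objects; the stand-in comparison plays the role of `vm.bool_eq`, which may call back into Python and mutate the live list.

```rust
use std::sync::Mutex;

// Snapshot the list under the lock, then run the (possibly re-entrant)
// comparison outside it: insertions made during comparison can no longer
// shift the indexes being iterated over.
fn unregister(funcs: &Mutex<Vec<String>>, target: &str) {
    let snapshot: Vec<String> = funcs.lock().unwrap().clone();
    for cb in snapshot {
        // Stand-in for vm.bool_eq(&func, &cb).
        if cb == target {
            // Re-acquire the lock and remove matches from the *live* list.
            funcs.lock().unwrap().retain(|f| f != target);
        }
    }
}
```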
In `@crates/vm/src/types/slot_defs.rs`:
- Around line 454-469: The derived type can end up with mismatched tp_call and
vectorcall handlers because inherit_main!(call) picks the first MRO entry with
call while the current find_map may pick a later MRO entry for vectorcall; fix
by selecting vectorcall from the same MRO entry that inherit_main!(call) chose.
Concretely, locate where inherit_main!(call) determines the chosen base (the mro
iteration that provides cls.slots.call.load()) and change the vectorcall lookup
to use that exact cls (or its vectorcall value) instead of independently
scanning mro; update the code that sets typ.slots.vectorcall.store(...) to use
the vectorcall from the same base whose call was inherited (referencing
inherit_main!, call, mro, cls.slots.call.load, cls.slots.vectorcall.load, and
typ.slots.vectorcall.store).
In `@scripts/generate_opcode_metadata.py`:
- Around line 79-91: Rename the ambiguous lambda parameter `l` to a descriptive
name (e.g., `line`) in the comprehension that builds `body` to satisfy PEP8
E741: replace both occurrences of `lambda l: not l.startswith("_ =>")` and
`lambda l: (not l.startswith(("//", "Some(match"))) )` with `lambda line: ...`
(or a similar descriptive name) so the `map(str.strip, raw_body.splitlines())`,
`filter(...)`, and `itertools.takewhile(...)` calls remain functionally
identical but use `line` for clarity.
---
Outside diff comments:
In `@crates/derive-impl/src/pyclass.rs`:
- Around line 914-1000: The FORBIDDEN_SLOT_METHODS list (used when
!args.context.is_trait) is missing the destructor slot __del__; add an entry
like ("__del__", "Destructor/TpDel") to the const FORBIDDEN_SLOT_METHODS so
plain #[pymethod] fn __del__ is rejected just like the other slot-backed dunders
(see FORBIDDEN_SLOT_METHODS, args.context.is_trait and the slot_defs::TpDel
relation).
In `@crates/vm/src/builtins/complex.rs`:
- Around line 97-118: The current branch is unreachable because it redundantly
re-checks result.downcast_ref::<PyComplex>(); instead, first attempt
downcast_ref::<PyComplex>() once and if it yields Some(py_complex) then check
whether the object's class is exactly the built-in complex type (compare
result.class() or py_complex.class() to vm.ctx.types.complex_type); if it is the
exact complex class return Ok((py_complex.value, true)), otherwise call
_warnings::warn(...) and then return Ok((py_complex.value, true)); if the
initial downcast_ref::<PyComplex>() returns None then return the
vm.new_type_error(...) about __complex__ returning non-complex. Ensure you
update the logic around result.downcast_ref::<PyComplex>(), the _warnings::warn
call, and the vm.new_type_error path accordingly.
In `@crates/vm/src/builtins/float.rs`:
- Around line 121-154: The zero-division error messages for float ops are
generic; update inner_div, inner_mod, inner_floordiv, and inner_divmod to raise
vm.new_zero_division_error with Python-specific messages: inner_div should use
"float division by zero", inner_mod "float modulo", inner_floordiv "float floor
division by zero", and inner_divmod "float divmod()". Locate these functions
(inner_div, inner_mod, inner_floordiv, inner_divmod) and replace the current
"division by zero" strings with the operation-specific messages so exceptions
match CPython behavior.
In `@crates/vm/src/vm/mod.rs`:
- Around line 1566-1619: The function with_frame_untraced actually suppresses
both tracing and profiling because the traced == false branch bypasses
dispatch_traced_frame (which emits TraceEvent::Call/Return and triggers
trace_event/profile callbacks); rename with_frame_untraced to a clearer name
(e.g., with_frame_uninstrumented or with_frame_no_trace) and update its
declaration and all call sites (including the internal cleanup usage in
frame.rs) to the new name, and add a concise doc comment on the renamed function
explaining that it suppresses both sys.settrace() and sys.setprofile()
instrumentation (mentioning with_frame_impl and
dispatch_traced_frame/trace_event for context).
In `@scripts/generate_opcode_metadata.py`:
- Around line 4-5: The module docstring at the top currently reads "compatible
with CPython 3.13" but the module later declares "CPython 3.14 compatible opcode
numbers"; update the top docstring to "compatible with CPython 3.14" so the file
consistently references CPython 3.14 (search for the top-level module docstring
and the literal "CPython 3.14 compatible opcode numbers" to locate the lines to
change).
---
Nitpick comments:
In `@crates/compiler-core/src/bytecode.rs`:
- Around line 321-335: Remove the blanket impls of Index/IndexMut for [T] that
take oparg::VarNum and instead introduce a newtype wrapper for the
CodeObject.varnames field (e.g., a Varnames<T> or VarNames(Vec<T>/Box<[T]>)
type) and move the Index/IndexMut implementations onto that newtype so only the
newtype can be indexed by oparg::VarNum; implement Index/IndexMut for
Varnames<T> using var_num.as_usize() to access the underlying storage and update
CodeObject.varnames to use the new type.
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 1231-1247: The two arms (Self::LoadFastLoadFast and
Self::LoadFastBorrowLoadFastBorrow) (and the similar STORE_FAST_LOAD_FAST case)
duplicate the same logic of fetching var_nums.get(arg), calling oparg.indexes(),
resolving varname(idx) for both indexes and writing the formatted string;
extract a small helper function (e.g., format_paired_vars or
fmt_paired_fast_locals) that takes the opcode label string, the formatter f,
&var_nums and arg, performs var_nums.get(arg) -> oparg.indexes() ->
varname(idx1/idx2) and does the write! call, then replace the duplicated blocks
to call that helper to keep output consistent and avoid divergence.
In `@crates/stdlib/src/_asyncio.rs`:
- Around line 1305-1320: The PyTask::make_cancelled_error_impl implementation
duplicates PyFuture::make_cancelled_error_impl; replace the body to delegate to
the base PyFuture by returning self.base.make_cancelled_error_impl(vm) (adjust
visibility if needed so PyFuture::make_cancelled_error_impl is callable from
PyTask), keeping the same signature (fn make_cancelled_error_impl(&self, vm:
&VirtualMachine) -> PyBaseExceptionRef) and preserving behavior with
VirtualMachine/get_cancelled_error_type.
In `@crates/stdlib/src/_tokenize.rs`:
- Around line 640-644: Replace the opaque 6-tuple type used in
pending_fstring_parts with a named struct to improve clarity: define a struct
(e.g., PendingTokenPart) with fields token_type: u8, text: String, start_line:
usize, start_col: usize, end_line: usize, end_col: usize and update the
pending_fstring_parts field to Vec<PendingTokenPart>; update any places that
construct, deconstruct, or index into the tuple to use the struct field names
(also keep pending_empty_fstring_middle as-is unless you want a parallel
refactor) so callers use PendingTokenPart.token_type, .text, .start_line, etc.,
instead of tuple positions.
- Around line 107-147: The next method on IterNext for PyTokenizerIter clones
the state via zelf.state.read().clone() and later writes it back, which can lose
concurrent updates; change to acquire a write lock once (use zelf.state.write())
and mutate the state in place while calling readline, updating TokenizerPhase
variants, and invoking emit_next_token so modifications are atomic;
alternatively, document that concurrent iterator access is unsupported if you
prefer not to change locking, but do not retain the read-clone-write pattern
used with zelf.state.read()/clone() and write().
- Around line 248-256: Compute source.lines().count() once and reuse it instead
of calling it multiple times: introduce a local variable (e.g., total_lines or
last_line) near the start of the function or before the Yielding phase and
replace all occurrences of source.lines().count() (used to compute
last_line/default_pos at the block with extra_tokens and the later uses around
lines referenced) with that cached variable; ensure existing logic that builds
default_pos and the call to next_non_dedent_info(tokens, *index, source,
line_index, default_pos) uses the cached value so behavior is unchanged.
In `@crates/vm/src/builtins/code.rs`:
- Around line 474-489: Replace the manual loop that compares a_consts and
b_consts with an iterator-based check: after collecting a.constants and
b.constants (used in a_consts and b_consts) and verifying lengths match, use an
iterator zip over a_consts.iter().zip(b_consts.iter()) combined with a fallible
iterator adapter such as try_for_each or try_fold to call vm.bool_eq(ac, bc)?
for each pair and aggregate to a single boolean (true only if all pairs are
equal), returning early on error; update the block that currently sets and
returns eq to this iterator-based approach while keeping the existing length
check and using the same vm.bool_eq call.
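The zip-plus-`try_fold` shape can be sketched with a fallible comparator standing in for `vm.bool_eq` (which can raise, hence the `Result`):

```rust
// Folds a fallible element comparison over zipped pairs: the first Err
// short-circuits, otherwise the result is true only if every pair matched.
fn all_eq<E>(
    a: &[i32],
    b: &[i32],
    mut eq: impl FnMut(i32, i32) -> Result<bool, E>,
) -> Result<bool, E> {
    if a.len() != b.len() {
        return Ok(false);
    }
    a.iter()
        .zip(b.iter())
        .try_fold(true, |acc, (&x, &y)| Ok(acc && eq(x, y)?))
}
```

Note the `acc &&` short-circuit skips further comparisons once a mismatch is found, which is slightly stricter than running `eq` on every pair; if equality side effects matter, fold without short-circuiting instead.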
- Around line 495-517: The current Hashable::hash impl for PyCode allocates many
temporary Python objects (vm.ctx.new_* and a tuple) on every call; change it to
avoid allocations by either (A) adding a cached, lazily-computed atomic hash
field to the PyCode struct and computing/storing it once in PyCode::hash, or (B)
computing the hash directly from the Rust fields without creating Python
objects: use crate::common::hash::HashSecret (or the project's hashing
utilities) and feed code.obj_name.as_str(), code.arg_count,
code.posonlyarg_count, code.kwonlyarg_count, code.varnames.len(),
code.flags.bits(), code.first_line_number.map_or(0, …), and
code.instructions.original_bytes() into the hasher, handle code.constants with
the same semantics as equality (or include their precomputed hashes), then
return crate::common::hash::fix_sentinel(hasher.finish() as i64) as the PyHash;
implement atomic lazy init if choosing caching to be thread-safe.
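Option (B) — hashing the plain Rust fields without materializing Python objects — looks roughly like this. The struct is a stand-in for `PyCode` and `DefaultHasher` stands in for the project's `HashSecret`-seeded hasher; only the "feed fields, finish once" shape is the point.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Minimal stand-in for the fields equality looks at.
struct CodeLike<'a> {
    obj_name: &'a str,
    arg_count: u32,
    flags_bits: u16,
    bytes: &'a [u8],
}

// Hash the fields directly: no temporary Python strings, ints, or tuples.
fn code_hash(c: &CodeLike) -> u64 {
    let mut h = DefaultHasher::new();
    c.obj_name.hash(&mut h);
    c.arg_count.hash(&mut h);
    c.flags_bits.hash(&mut h);
    c.bytes.hash(&mut h);
    h.finish()
}
```

Caching the result in an atomic field (option A) would additionally make repeated hashing free, at the cost of lazy-init synchronization.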
In `@crates/vm/src/builtins/weakproxy.rs`:
- Around line 243-290: Add regression tests that exercise the new numeric slot
surface introduced in AS_NUMBER: cover unary (proxy_unary_slot! handlers like
_neg/_pos/_abs), binary (proxy_binary_slot! handlers like _add/_sub/_mul),
in-place (inplace_* handlers such as _iadd/_imul) and ternary
(proxy_ternary_slot! handlers like _pow/_ipow) dispatch; specifically include
tests where the weak proxy is the lhs and rhs of operations, where the
underlying referent has been dropped (dead proxy), and a pow(proxy, x, mod) case
to validate three-argument pow dispatch. Use the proxy creation/upgrade
utilities (proxy_upgrade and the weak proxy type in weakproxy.rs) to construct
proxies and drop targets to assert correct results or TypeError/ReferenceError
behavior so regressions in AS_NUMBER numeric slots are detected.
In `@crates/vm/src/gc_state.rs`:
- Around line 426-431: Extract the duplicated generation-reset logic in
collect_inner into a GcState helper method (e.g., fn
reset_generation_counts(&self, generation: usize)) that computes reset_end = if
generation >= 2 { 2 } else { generation + 1 } and loops over
self.generations[i].count.store(0, Ordering::SeqCst); then replace each repeated
block in collect_inner with a call to self.reset_generation_counts(generation)
so all four occurrences use the single helper.
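The helper extraction can be sketched as follows; the structs are illustrative stand-ins for the real `GcState`, and the loop bound mirrors the `reset_end` computation quoted above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

struct Generation {
    count: AtomicUsize,
}

struct GcState {
    generations: [Generation; 3],
}

impl GcState {
    // The four duplicated blocks collapse into one helper: generations
    // 0..reset_end get their counters reset, using the same cap-at-2 rule
    // the review describes.
    fn reset_generation_counts(&self, generation: usize) {
        let reset_end = if generation >= 2 { 2 } else { generation + 1 };
        for g in &self.generations[..reset_end] {
            g.count.store(0, Ordering::SeqCst);
        }
    }
}
```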
In `@crates/vm/src/protocol/object.rs`:
- Around line 373-379: The ascii() method duplicates the Ok(...) wrapper across
two branches; refactor by first obtaining repr via self.repr(vm) and then
compute a single PyRef<PyStr> result value: if repr.as_wtf8().is_ascii() use
repr, otherwise create vm.ctx.new_str(to_ascii(repr.as_wtf8())); finally return
Ok(result) once. Update the function body around ascii(), repr(), to_ascii() and
vm.ctx.new_str(...) usage so only the differing value is chosen in the
conditional and the Ok(...) is returned a single time.
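The "choose the value in the conditional, wrap in `Ok` once" shape looks like this in miniature. The escape routine below is a stand-in (Rust's `escape_unicode` produces `\u{..}`, not Python's `\x..` escapes), so only the single-return structure matches the review's suggestion.

```rust
// Pick the differing value first, then return Ok(...) a single time.
fn ascii_of(repr: &str) -> Result<String, ()> {
    let result = if repr.is_ascii() {
        repr.to_owned()
    } else {
        to_ascii(repr)
    };
    Ok(result)
}

// Stand-in escape: not Python's ascii() escaping rules.
fn to_ascii(s: &str) -> String {
    s.chars()
        .map(|c| {
            if c.is_ascii() {
                c.to_string()
            } else {
                c.escape_unicode().to_string()
            }
        })
        .collect()
}
```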
In `@crates/vm/src/stdlib/gc.rs`:
- Around line 202-210: The unsafe dereference unsafe { fp.as_ref() } in the
stack_frames construction must be annotated with a short safety comment: explain
that vm.frames is borrowed for the duration of this scope, that it stores
NonNull<crate::PyObject> (or equivalent raw pointers) which are owned/kept alive
by the VM/GC root set, and thus each fp is valid and not concurrently mutated
while borrowed—so calling fp.as_ref() and converting to a usize is sound; add
this comment immediately above the map closure referencing vm.frames,
fp.as_ref(), and stack_frames.
In `@extra_tests/snippets/builtin_ascii.py`:
- Around line 24-27: Add tests that exercise the "%a" formatting path in
addition to direct ascii() calls: for the custom types Foo and Bar, assert that
"%a" % Foo() produces the same MyStr-typed result as ascii(Foo()) and that "%a"
% Bar() yields the same byte-escaped string as ascii(Bar()); also add the
analogous bytes-format checks (e.g. b"%a" % Foo()/Bar()) to cover the bytes
conversion path touched in cformat.rs so the PR protects cross-file behavior
changes.
In `@scripts/generate_opcode_metadata.py`:
- Around line 75-118: The build_deopts function blindly calls .group(1) on the
re.search result which will raise if the deopt regex doesn't match; change it to
capture the search result (e.g., m = re.search(...)) and check if m is None
before using m.group(1), and if missing either return an empty dict or raise a
clear ValueError/RuntimeError mentioning the missing deopt function and include
enough context (e.g., a snippet or filename) to aid debugging; update
build_deopts to use that guarded match variable instead of calling .group(1)
directly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: d1d61b06-e7c2-4008-995a-914ba671db48
⛔ Files ignored due to path filters (44)
- `Cargo.lock` is excluded by `!**/*.lock`

The remaining 43 files, all under `Lib/`, are excluded by `!Lib/**`:

- `Lib/_opcode_metadata.py`
- `Lib/modulefinder.py`
- `Lib/poplib.py`
- `Lib/test/_code_definitions.py`
- `Lib/test/_test_atexit.py`
- `Lib/test/_test_eintr.py`
- `Lib/test/_test_gc_fast_cycles.py`
- `Lib/test/_test_multiprocessing.py`
- `Lib/test/test__opcode.py`
- `Lib/test/test_asyncio/test_unix_events.py`
- `Lib/test/test_builtin.py`
- `Lib/test/test_code.py`
- `Lib/test/test_codeop.py`
- `Lib/test/test_collections.py`
- `Lib/test/test_complex.py`
- `Lib/test/test_concurrent_futures/test_process_pool.py`
- `Lib/test/test_concurrent_futures/test_wait.py`
- `Lib/test/test_decorators.py`
- `Lib/test/test_descr.py`
- `Lib/test/test_dis.py`
- `Lib/test/test_format.py`
- `Lib/test/test_ftplib.py`
- `Lib/test/test_gc.py`
- `Lib/test/test_generator_stop.py`
- `Lib/test/test_generators.py`
- `Lib/test/test_inspect/test_inspect.py`
- `Lib/test/test_marshal.py`
- `Lib/test/test_math.py`
- `Lib/test/test_modulefinder.py`
- `Lib/test/test_multiprocessing_fork/test_manager.py`
- `Lib/test/test_multiprocessing_fork/test_misc.py`
- `Lib/test/test_multiprocessing_fork/test_threads.py`
- `Lib/test/test_poplib.py`
- `Lib/test/test_sqlite3/test_dbapi.py`
- `Lib/test/test_str.py`
- `Lib/test/test_symtable.py`
- `Lib/test/test_tabnanny.py`
- `Lib/test/test_tokenize.py`
- `Lib/test/test_types.py`
- `Lib/test/test_unpack_ex.py`
- `Lib/test/test_weakref.py`
- `Lib/test/test_yield_from.py`
- `Lib/tokenize.py`
📒 Files selected for processing (140)
- `.cspell.dict/rust-more.txt`
- `.cspell.dict/rustpython.txt`
- `.cspell.json`
- `.github/actions/install-linux-deps/action.yml`
- `.github/workflows/ci.yaml`
- `.github/workflows/cron-ci.yaml`
- `.github/workflows/lib-deps-check.yaml`
- `.github/workflows/pr-format.yaml`
- `.github/workflows/release.yml`
- `.github/workflows/update-doc-db.yml`
- `.github/workflows/update-libs-status.yaml`
- `.github/workflows/upgrade-pylib.lock.yml`
- `Cargo.toml`
- `crates/codegen/src/compile.rs`
- `crates/codegen/src/ir.rs`
- `crates/codegen/src/symboltable.rs`
- `crates/compiler-core/Cargo.toml`
- `crates/compiler-core/src/bytecode.rs`
- `crates/compiler-core/src/bytecode/instruction.rs`
- `crates/compiler-core/src/bytecode/oparg.rs`
- `crates/derive-impl/src/pyclass.rs`
- `crates/jit/src/instructions.rs`
- `crates/jit/tests/common.rs`
- `crates/stdlib/Cargo.toml`
- `crates/stdlib/src/_asyncio.rs`
- `crates/stdlib/src/_opcode.rs`
- `crates/stdlib/src/_sqlite3.rs`
- `crates/stdlib/src/_tokenize.rs`
- `crates/stdlib/src/array.rs`
- `crates/stdlib/src/faulthandler.rs`
- `crates/stdlib/src/fcntl.rs`
- `crates/stdlib/src/lib.rs`
- `crates/stdlib/src/math.rs`
- `crates/stdlib/src/multiprocessing.rs`
- `crates/stdlib/src/openssl.rs`
- `crates/stdlib/src/select.rs`
- `crates/stdlib/src/socket.rs`
- `crates/stdlib/src/ssl.rs`
- `crates/stdlib/src/ssl/compat.rs`
- `crates/vm/Cargo.toml`
- `crates/vm/src/builtins/bool.rs`
- `crates/vm/src/builtins/classmethod.rs`
- `crates/vm/src/builtins/code.rs`
- `crates/vm/src/builtins/complex.rs`
- `crates/vm/src/builtins/dict.rs`
- `crates/vm/src/builtins/float.rs`
- `crates/vm/src/builtins/function.rs`
- `crates/vm/src/builtins/genericalias.rs`
- `crates/vm/src/builtins/int.rs`
- `crates/vm/src/builtins/list.rs`
- `crates/vm/src/builtins/map.rs`
- `crates/vm/src/builtins/set.rs`
- `crates/vm/src/builtins/singletons.rs`
- `crates/vm/src/builtins/slice.rs`
- `crates/vm/src/builtins/str.rs`
- `crates/vm/src/builtins/tuple.rs`
- `crates/vm/src/builtins/type.rs`
- `crates/vm/src/builtins/union.rs`
- `crates/vm/src/builtins/weakproxy.rs`
- `crates/vm/src/cformat.rs`
- `crates/vm/src/format.rs`
- `crates/vm/src/frame.rs`
- `crates/vm/src/gc_state.rs`
- `crates/vm/src/object/ext.rs`
- `crates/vm/src/ospath.rs`
- `crates/vm/src/protocol/callable.rs`
- `crates/vm/src/protocol/number.rs`
- `crates/vm/src/protocol/object.rs`
- `crates/vm/src/signal.rs`
- `crates/vm/src/stdlib/_ast.rs`
- `crates/vm/src/stdlib/_ast/argument.rs`
- `crates/vm/src/stdlib/_ast/basic.rs`
- `crates/vm/src/stdlib/_ast/constant.rs`
- `crates/vm/src/stdlib/_ast/elif_else_clause.rs`
- `crates/vm/src/stdlib/_ast/exception.rs`
- `crates/vm/src/stdlib/_ast/expression.rs`
- `crates/vm/src/stdlib/_ast/module.rs`
- `crates/vm/src/stdlib/_ast/node.rs`
- `crates/vm/src/stdlib/_ast/operator.rs`
- `crates/vm/src/stdlib/_ast/other.rs`
- `crates/vm/src/stdlib/_ast/parameter.rs`
- `crates/vm/src/stdlib/_ast/pattern.rs`
- `crates/vm/src/stdlib/_ast/pyast.rs`
- `crates/vm/src/stdlib/_ast/python.rs`
- `crates/vm/src/stdlib/_ast/repr.rs`
- `crates/vm/src/stdlib/_ast/statement.rs`
- `crates/vm/src/stdlib/_ast/string.rs`
- `crates/vm/src/stdlib/_ast/type_ignore.rs`
- `crates/vm/src/stdlib/_ast/type_parameters.rs`
- `crates/vm/src/stdlib/_ast/validate.rs`
- `crates/vm/src/stdlib/_codecs.rs`
- `crates/vm/src/stdlib/_collections.rs`
- `crates/vm/src/stdlib/_ctypes.rs`
- `crates/vm/src/stdlib/_ctypes/array.rs`
- `crates/vm/src/stdlib/_ctypes/base.rs`
- `crates/vm/src/stdlib/_ctypes/function.rs`
- `crates/vm/src/stdlib/_ctypes/library.rs`
- `crates/vm/src/stdlib/_ctypes/pointer.rs`
- `crates/vm/src/stdlib/_ctypes/simple.rs`
- `crates/vm/src/stdlib/_ctypes/structure.rs`
- `crates/vm/src/stdlib/_ctypes/union.rs`
- `crates/vm/src/stdlib/_functools.rs`
- `crates/vm/src/stdlib/_imp.rs`
- `crates/vm/src/stdlib/_io.rs`
- `crates/vm/src/stdlib/_operator.rs`
- `crates/vm/src/stdlib/_signal.rs`
- `crates/vm/src/stdlib/_sre.rs`
- `crates/vm/src/stdlib/_stat.rs`
- `crates/vm/src/stdlib/_string.rs`
- `crates/vm/src/stdlib/_symtable.rs`
- `crates/vm/src/stdlib/_sysconfig.rs`
- `crates/vm/src/stdlib/_sysconfigdata.rs`
- `crates/vm/src/stdlib/_thread.rs`
- `crates/vm/src/stdlib/_typing.rs`
- `crates/vm/src/stdlib/_warnings.rs`
- `crates/vm/src/stdlib/_weakref.rs`
- `crates/vm/src/stdlib/_winapi.rs`
- `crates/vm/src/stdlib/atexit.rs`
- `crates/vm/src/stdlib/builtins.rs`
- `crates/vm/src/stdlib/gc.rs`
- `crates/vm/src/stdlib/mod.rs`
- `crates/vm/src/stdlib/os.rs`
- `crates/vm/src/stdlib/posix.rs`
- `crates/vm/src/stdlib/sys.rs`
- `crates/vm/src/stdlib/time.rs`
- `crates/vm/src/stdlib/typevar.rs`
- `crates/vm/src/types/slot.rs`
- `crates/vm/src/types/slot_defs.rs`
- `crates/vm/src/types/structseq.rs`
- `crates/vm/src/types/zoo.rs`
- `crates/vm/src/vm/context.rs`
- `crates/vm/src/vm/mod.rs`
- `crates/vm/src/vm/thread.rs`
- `crates/vm/src/vm/vm_new.rs`
- `crates/vm/src/vm/vm_ops.rs`
- `extra_tests/snippets/builtin_ascii.py`
- `extra_tests/snippets/builtin_none.py`
- `extra_tests/test_manager_fork_debug.py`
- `scripts/generate_opcode_metadata.py`
- `wapm.toml`
💤 Files with no reviewable changes (1)
- crates/vm/src/types/structseq.rs
✅ Files skipped from review due to trivial changes (17)
- .cspell.dict/rustpython.txt
- .cspell.dict/rust-more.txt
- crates/compiler-core/Cargo.toml
- crates/vm/src/stdlib/_ast/expression.rs
- crates/vm/src/stdlib/_ast.rs
- crates/vm/src/stdlib/_ctypes/structure.rs
- crates/vm/src/stdlib/_ast/repr.rs
- crates/vm/src/stdlib/_imp.rs
- crates/vm/src/stdlib/_ast/python.rs
- crates/stdlib/Cargo.toml
- crates/vm/src/stdlib/_ast/module.rs
- crates/vm/src/stdlib/typevar.rs
- crates/vm/src/stdlib/_ctypes/union.rs
- wapm.toml
- crates/vm/src/signal.rs
- crates/vm/src/vm/vm_new.rs
- .github/workflows/upgrade-pylib.lock.yml
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/vm/src/stdlib/time.rs
- crates/stdlib/src/multiprocessing.rs
👮 Files not reviewed due to content moderation or server errors (2)
- crates/codegen/src/compile.rs
- crates/vm/src/frame.rs
```yaml
- name: Install dependencies
  uses: ./.github/actions/install-linux-deps
  with: ${{ matrix.dependencies || fromJSON('{}') }}
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -n -C2 '^\s+with:\s+\$\{\{' .github/workflows/ci.yaml
Repository: RustPython/RustPython
Length of output: 274
🏁 Script executed:
cat .github/actions/install-linux-deps/action.yaml
Repository: RustPython/RustPython
Length of output: 143
🏁 Script executed:
fd -t f "action\.(yaml|yml)" .github/
Repository: RustPython/RustPython
Length of output: 156
🏁 Script executed:
cat .github/actions/install-linux-deps/action.yml
Repository: RustPython/RustPython
Length of output: 1730
with: must be a mapping, not an expression.
GitHub Actions requires with: to be a mapping of key-value pairs, not a scalar expression. Line 144's with: ${{ matrix.dependencies || fromJSON('{}') }} violates YAML schema and will prevent the workflow from loading.
Pass the action's defined inputs explicitly instead:
Corrected step:

```yaml
- name: Install dependencies
  uses: ./.github/actions/install-linux-deps
  with:
    gcc-multilib: ${{ matrix.dependencies['gcc-multilib'] || 'false' }}
    musl-tools: ${{ matrix.dependencies['musl-tools'] || 'false' }}
    gcc-aarch64-linux-gnu: ${{ matrix.dependencies['gcc-aarch64-linux-gnu'] || 'false' }}
    clang: ${{ matrix.dependencies['clang'] || 'false' }}
```

🧰 Tools
🪛 actionlint (1.7.11)
[error] 144-144: "with" section is scalar node but mapping node is expected
(syntax-check)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/ci.yaml around lines 142 - 145, The workflow step named
"Install dependencies" uses the composite action
./.github/actions/install-linux-deps but incorrectly sets with: to the scalar
expression `${{ matrix.dependencies || fromJSON('{}') }}`; change the step so
that with: is a YAML mapping of the action's input keys (the inputs defined by
the composite action) and pass values from matrix.dependencies explicitly (e.g.,
map each expected input key to `${{ matrix.dependencies.<key> }}` or a default),
ensuring the with: block is a proper mapping rather than a single expression.
```yaml
- name: Check for formatting changes
  run: |
    if ! git diff --exit-code; then
      echo "::error::Formatting changes detected. Please run 'cargo fmt --all', 'ruff format', and 'ruff check --select I --fix' locally."
      exit 1
    fi
```
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -n -C2 'astral-sh/ruff-action|version:\s*"0\.15\.[0-9]+"' .github/workflows/pr-format.yaml .github/workflows/ci.yaml
Repository: RustPython/RustPython
Length of output: 899
Align Ruff versions across formatting and lint workflows.
This job now fails the PR on formatter output using astral-sh/ruff-action@v3.6.1, but specifies Ruff 0.15.4 while .github/workflows/ci.yaml specifies 0.15.5. A patch-level formatter drift can cause the two checks to disagree on the same code, leading to contradictory CI results.
Suggested change:

```diff
- version: "0.15.4"
+ version: "0.15.5"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/pr-format.yaml around lines 58 - 63, The formatting check
uses astral-sh/ruff-action@v3.6.1 but pins Ruff to 0.15.4 which conflicts with
ci.yaml's 0.15.5; update the Ruff version string used in the job that runs
"Check for formatting changes" (the block using astral-sh/ruff-action) to match
the CI workflow (change '0.15.4' to '0.15.5' or whatever version ci.yaml uses)
so both formatting and lint workflows use the same Ruff patch release.
```toml
[patch.crates-io]
parking_lot_core = { git = "https://github.com/youknowone/parking_lot", branch = "rustpython" }
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n '^\[patch\.crates-io\]|parking_lot_core\s*=\s*\{[^}]*branch\s*=' Cargo.toml
if [[ -f Cargo.lock ]]; then
rg -n -A2 -B2 '^name = "parking_lot_core"$|^source = "git\+https://github.com/youknowone/parking_lot' Cargo.lock
fi
Repository: RustPython/RustPython
Length of output: 423
Pin the parking_lot_core override to a commit.
A workspace-wide branch = "rustpython" patch makes fresh lockfile generation and cargo update runs resolve to a moving target. While the current Cargo.lock commits a specific revision, pinning to a fixed rev in Cargo.toml is the correct practice and improves reproducibility and source auditability.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Cargo.toml` around lines 103 - 104, Replace the moving branch override for
parking_lot_core under [patch.crates-io] by pinning it to a specific commit:
locate the parking_lot_core entry that currently uses git =
"https://github.com/youknowone/parking_lot" and branch = "rustpython" and change
it to use rev = "<commit-sha>" (a specific commit hash) instead of branch to
ensure deterministic resolution and reproducible lockfile generation.
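For reference, the pinned form would look like the sketch below; the `rev` value is a placeholder for whatever revision is currently recorded in `Cargo.lock`, not a real commit hash.

```toml
[patch.crates-io]
# Pin to an exact commit instead of a moving branch so fresh lockfile
# generation and `cargo update` resolve deterministically.
parking_lot_core = { git = "https://github.com/youknowone/parking_lot", rev = "<commit-sha>" }
```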
```diff
 impl TryFrom<u32> for MakeFunctionFlag {
     type Error = MarshalError;

     fn try_from(value: u32) -> Result<Self, Self::Error> {
-        Self::from_bits(value as u8).ok_or(Self::Error::InvalidBytecode)
+        Self::try_from(value as u8).map_err(|_| MarshalError::InvalidBytecode)
     }
 }
```
Reject out-of-range MakeFunctionFlag operands before narrowing them.
value as u8 truncates anything above 0xff, so malformed extended-arg bytecode can be decoded as a different flag instead of raising InvalidBytecode. This is inconsistent with the safe pattern used in oparg_enum! elsewhere in this file, which validates the bounds first using u8::try_from(value).
Suggested fix

```diff
 impl TryFrom<u32> for MakeFunctionFlag {
     type Error = MarshalError;
     fn try_from(value: u32) -> Result<Self, Self::Error> {
-        Self::try_from(value as u8).map_err(|_| MarshalError::InvalidBytecode)
+        let value = u8::try_from(value).map_err(|_| MarshalError::InvalidBytecode)?;
+        Self::try_from(value).map_err(|_| MarshalError::InvalidBytecode)
     }
 }
```

🤖 Prompt for AI Agents
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/compiler-core/src/bytecode/oparg.rs` around lines 372 - 377, The
TryFrom<u32> impl for MakeFunctionFlag currently narrows with value as u8 which
silently truncates out-of-range operands; change it to validate before narrowing
by using u8::try_from(value).map_err(|_| MarshalError::InvalidBytecode) and then
call Self::try_from on the validated u8, so malformed extended-arg bytecode
yields MarshalError::InvalidBytecode (follow the same pattern used by the
oparg_enum! helpers).
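The difference between the unchecked cast and the validated narrowing can be shown in isolation. In this standalone sketch, `InvalidBytecode` is a stand-in for the crate's `MarshalError::InvalidBytecode` variant:

```rust
// `value as u8` wraps, so an out-of-range operand like 0x101 silently
// decodes as 0x01; `u8::try_from` rejects it instead.
#[derive(Debug, PartialEq)]
struct InvalidBytecode;

fn narrow_unchecked(value: u32) -> u8 {
    value as u8 // truncating cast: 0x101 -> 0x01
}

fn narrow_checked(value: u32) -> Result<u8, InvalidBytecode> {
    u8::try_from(value).map_err(|_| InvalidBytecode)
}

fn main() {
    assert_eq!(narrow_unchecked(0x101), 0x01); // wrong flag, no error
    assert_eq!(narrow_checked(0x01), Ok(0x01)); // in-range value passes
    assert_eq!(narrow_checked(0x101), Err(InvalidBytecode)); // rejected
    println!("ok");
}
```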
```rust
let end_pos = |current: &str, start_line: usize, start_col: usize| -> (usize, usize) {
    let mut el = start_line;
    let mut ec = start_col;
    for ch in current.chars() {
        if ch == '\n' {
            el += 1;
            ec = 0;
        } else {
            ec += ch.len_utf8();
        }
    }
    (el, ec)
};
```
Fix column tracking to count characters instead of bytes.
The end_pos closure increments ec by ch.len_utf8() (byte length), but Python's tokenize module counts columns by character position. For multi-byte UTF-8 characters (e.g., emoji), this produces incorrect column values. Change ec += ch.len_utf8() to ec += 1 to count characters.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/stdlib/src/_tokenize.rs` around lines 519 - 531, The end_pos closure
is miscounting column positions by adding byte length per char; update the
closure (end_pos with parameters current, start_line, start_col) to increment ec
by 1 for each non-newline character instead of ec += ch.len_utf8() so columns
are counted as characters (matching Python tokenize behavior) while preserving
newline handling and returning (el, ec).
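A minimal standalone comparison of the two counting strategies, using simplified helpers rather than the actual `end_pos` closure:

```rust
// Byte-based vs character-based column advancement for a token that
// contains a multi-byte UTF-8 character.
fn end_col_bytes(s: &str, start_col: usize) -> usize {
    start_col + s.chars().map(|ch| ch.len_utf8()).sum::<usize>()
}

fn end_col_chars(s: &str, start_col: usize) -> usize {
    start_col + s.chars().count()
}

fn main() {
    let tok = "héllo"; // 'é' is 2 bytes but 1 character
    assert_eq!(end_col_bytes(tok, 0), 6); // byte count: over-advances
    assert_eq!(end_col_chars(tok, 0), 5); // matches CPython tokenize columns
    println!("ok");
}
```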
```rust
if obj.class().slots.iternext.load().is_none() {
    return Err(vm.new_type_error("Weakref proxy referenced a non-iterator".to_owned()));
}
```
Include the referent type name in this TypeError.
The error message omits the type of the referenced object. CPython includes the type in the message format: "Weakref proxy referenced a non-iterator '<type>' object" (e.g., "Weakref proxy referenced a non-iterator 'int' object"). Use format!() with obj.class().name() to match this format, as done in similar error messages elsewhere in the codebase (e.g., tuple.rs, template.rs).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/builtins/weakproxy.rs` around lines 138 - 140, The TypeError
message thrown when a weakref proxy references a non-iterator omits the
referent's type; modify the error creation in the block that checks
obj.class().slots.iternext (the code using obj.class().slots.iternext.load() and
vm.new_type_error) to include the class name by formatting the message with
obj.class().name() (e.g., use format! to produce "Weakref proxy referenced a
non-iterator '<type>' object") so the exception matches CPython style and other
files like tuple.rs/template.rs.
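A small sketch of the expected message shape; here `class_name` stands in for the value `obj.class().name()` would return in the real code:

```rust
// Builds the CPython-style TypeError message including the referent's type.
fn non_iterator_message(class_name: &str) -> String {
    format!("Weakref proxy referenced a non-iterator '{class_name}' object")
}

fn main() {
    assert_eq!(
        non_iterator_message("int"),
        "Weakref proxy referenced a non-iterator 'int' object"
    );
    println!("ok");
}
```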
```rust
// Iterate backward (oldest to newest in LIFO list).
// Release the lock during comparison so __eq__ can call atexit functions.
let mut i = {
    let funcs = vm.state.atexit_funcs.lock();
    funcs.len() as isize - 1
};
while i >= 0 {
    let (cb, entry_ptr) = {
        let funcs = vm.state.atexit_funcs.lock();
        if i as usize >= funcs.len() {
            i = funcs.len() as isize;
            i -= 1;
            continue;
        }
        let entry = &funcs[i as usize];
        (entry.0.clone(), &**entry as *const (PyObjectRef, FuncArgs))
    };
    // Lock released: __eq__ can safely call atexit functions
    let eq = vm.bool_eq(&func, &cb)?;
    if eq {
        // The entry may have moved during __eq__. Search backward by identity.
        let mut funcs = vm.state.atexit_funcs.lock();
        let mut j = (funcs.len() as isize - 1).min(i);
        while j >= 0 {
            if core::ptr::eq(&**funcs.get(j as usize).unwrap(), entry_ptr) {
                funcs.remove(j as usize);
                i = j;
                break;
            }
            j -= 1;
        }
    }
    {
        let funcs = vm.state.atexit_funcs.lock();
        if i as usize >= funcs.len() {
            i = funcs.len() as isize;
        }
    }
    i -= 1;
```
unregister() can still skip or miss callbacks when __eq__ mutates the list.
register() now inserts at index 0, so any atexit.register() call triggered from vm.bool_eq() shifts every still-unvisited entry to the right. The i -= 1 walk can then skip original callbacks, and the removal search capped at min(len - 1, i) can miss the current entry after it moves past i. Snapshot the candidates before releasing the lock, then search the full live list by pointer when removing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/stdlib/atexit.rs` around lines 25 - 63, The current unregister
loop can skip or miss callbacks when vm.bool_eq() mutates vm.state.atexit_funcs;
fix by snapshotting the candidates while holding the lock and then iterating
over that snapshot after releasing the lock. Specifically, in unregister()
capture a Vec of (cloned callback PyObjectRef and its entry pointer, e.g.,
(&**entry as *const (PyObjectRef, FuncArgs))) from vm.state.atexit_funcs.lock(),
release the lock, iterate the snapshot calling vm.bool_eq(&func, &cb) for each
candidate, and when matched search the full live vm.state.atexit_funcs list (no
min clipping) by pointer (core::ptr::eq) to remove the entry; do not rely on
decrementing a shared index tied to the live list length. This ensures
insertions at index 0 (register()) during __eq__ won’t cause skipping and finds
moved entries by identity.
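The snapshot-then-remove-by-identity pattern can be sketched in miniature. This is a simplified model, not the actual fix: `i32` stands in for the real `(PyObjectRef, FuncArgs)` entries, plain `==` replaces the re-entrant `vm.bool_eq` call, and entries are `Arc`'d so identity survives reordering:

```rust
use std::sync::{Arc, Mutex};

type Entry = Arc<i32>;

fn unregister(funcs: &Mutex<Vec<Entry>>, target: i32) {
    // 1. Snapshot the candidates while holding the lock.
    let snapshot: Vec<Entry> = funcs.lock().unwrap().clone();
    // 2. Compare with the lock released (where __eq__ could mutate the list).
    for entry in snapshot {
        if *entry == target {
            // 3. Re-acquire and search the full live list by pointer identity,
            //    since the entry may have moved during comparison.
            let mut live = funcs.lock().unwrap();
            if let Some(pos) = live.iter().position(|e| Arc::ptr_eq(e, &entry)) {
                live.remove(pos);
            }
        }
    }
}

fn main() {
    let funcs = Mutex::new(vec![Arc::new(1), Arc::new(2), Arc::new(2), Arc::new(3)]);
    unregister(&funcs, 2);
    let left: Vec<i32> = funcs.lock().unwrap().iter().map(|e| **e).collect();
    assert_eq!(left, vec![1, 3]); // both matching entries removed, others kept
    println!("ok");
}
```

Because removal searches the whole live list by identity rather than a clipped index range, insertions at index 0 during comparison cannot cause an entry to be skipped or missed.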
```python
body = "\n".join(
    itertools.takewhile(
        lambda l: not l.startswith("_ =>"),  # Take until reaching fallback
        filter(
            lambda l: (
                not l.startswith(
                    ("//", "Some(match")
                )  # Skip comments or start of match
            ),
            map(str.strip, raw_body.splitlines()),
        ),
    )
).removeprefix("{")
```
Rename ambiguous lambda parameter l to a descriptive name.
The lambda parameter l on lines 81 and 83 violates PEP 8 E741 (ambiguous variable name - easily confused with 1 or I). Use a descriptive name like line instead.
✏️ Proposed fix

```diff
 body = "\n".join(
     itertools.takewhile(
-        lambda l: not l.startswith("_ =>"),  # Take until reaching fallback
+        lambda line: not line.startswith("_ =>"),  # Take until reaching fallback
         filter(
-            lambda l: (
-                not l.startswith(
+            lambda line: (
+                not line.startswith(
                     ("//", "Some(match")
                 )  # Skip comments or start of match
             ),
             map(str.strip, raw_body.splitlines()),
         ),
     )
 ).removeprefix("{")
```
🧰 Tools
🪛 Flake8 (7.3.0)
[error] 81-81: ambiguous variable name 'l'
(E741)
[error] 83-83: ambiguous variable name 'l'
(E741)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/generate_opcode_metadata.py` around lines 79 - 91, Rename the
ambiguous lambda parameter `l` to a descriptive name (e.g., `line`) in the
comprehension that builds `body` to satisfy PEP8 E741: replace both occurrences
of `lambda l: not l.startswith("_ =>")` and `lambda l: (not l.startswith(("//",
"Some(match"))) )` with `lambda line: ...` (or a similar descriptive name) so
the `map(str.strip, raw_body.splitlines())`, `filter(...)`, and
`itertools.takewhile(...)` calls remain functionally identical but use `line`
for clarity.
- Wrap Windows SemLock acquire wait with allow_threads
- Retry nanosleep on EINTR with remaining time instead of returning early
- Remove expectedFailure for test_sleep in _test_eintr.py
Summary by CodeRabbit
- Improved `time.sleep()` reliability by properly handling signal interrupts and recalculating remaining sleep time.
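The retry-with-remaining-time pattern behind the `time.sleep()` fix can be sketched as follows. This is a simplified model, not the actual `nanosleep` wrapper: `interruptible_sleep` stands in for a blocking syscall that may wake early (as one interrupted by a signal does with EINTR), and the overflow check mirrors the `checked_add`/OverflowError handling:

```rust
use std::thread;
use std::time::{Duration, Instant};

// Sleeps for `total` even if the underlying sleep primitive wakes early,
// by recomputing the remaining time against an absolute deadline.
fn sleep_full(total: Duration, mut interruptible_sleep: impl FnMut(Duration)) {
    let deadline = Instant::now()
        .checked_add(total)
        .expect("timeout overflow"); // the VM raises OverflowError here
    loop {
        // Recompute remaining time on every wakeup instead of returning
        // early or re-sleeping the full duration.
        let remaining = match deadline.checked_duration_since(Instant::now()) {
            Some(d) if !d.is_zero() => d,
            _ => break, // deadline reached
        };
        interruptible_sleep(remaining); // may wake early, like EINTR
    }
}

fn main() {
    let total = Duration::from_millis(30);
    let start = Instant::now();
    // Simulated interruptions: never sleep more than 10 ms per call.
    sleep_full(total, |d| thread::sleep(d.min(Duration::from_millis(10))));
    assert!(start.elapsed() >= total); // full duration elapsed despite wakeups
    println!("ok");
}
```

Anchoring to an absolute deadline rather than subtracting elapsed chunks avoids drift when the loop is interrupted many times.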