try to use AOT for kernels #4
Closed
shssf wants to merge 1 commit into
abagusetty pushed a commit to abagusetty/dpnp that referenced this pull request on May 27, 2026
… __del__ against shutdown races

Closes audit items IntelPython#5 and IntelPython#29 from the prior solver review. Item IntelPython#4 (_matmat default uses a per-column matvec loop) is closed as wontfix: SciPy's scipy.sparse.linalg.LinearOperator and cupyx's analogue both ship the same hstack-of-matvecs default, so dpnp matches the reference exactly and there is no portable improvement to make without subclass-level _matmat overrides (which _CustomLinearOperator already exposes via its matmat= constructor argument).

scipy/sparse/linalg/_interface.py
- Set __array_ufunc__ = None on the LinearOperator base class. This is the SciPy contract: a host numpy.ndarray on the left of np_array * linop or np_array @ linop previously triggered NumPy's ufunc dispatch first, which would attempt to broadcast the operator element-wise before falling back to its reflected operator method -- producing either an opaque error or a wrong-typed result. With __array_ufunc__ = None NumPy returns NotImplemented from the ufunc protocol and Python's operator dispatch falls through cleanly to LinearOperator.__rmul__ / __rmatmul__. dpnp.ndarray itself sets __array_ufunc__ = None (see dpnp/dpnp_array.py:222) for the same reason, so the two dispatch systems now agree.

scipy/sparse/_csr.py, scipy/sparse/linalg/_iterative.py
- Harden __del__ in csr_matrix and in _CachedSpMV against the interpreter-shutdown race where the compiled _sparse_impl extension is garbage-collected before the matrix instance whose oneMKL handle it owns. Previous code used a single except Exception: pass which silenced two qualitatively different failure modes:
  1. shutdown race -- extension gone, si._sparse_gemv_release evaluates to None or AttributeError; the handle is unrecoverable and leaving the OS to reclaim it at process exit is the only sane option;
  2. genuine backend error while the interpreter is healthy -- a real bug we want to surface eventually, but raising from __del__ produces only an 'Exception ignored in:' warning and the handle is gone either way.
  The new code probes getattr(si, '_sparse_gemv_release', None) explicitly so case (1) takes the fast non-call path, and then splits the except into (AttributeError, TypeError) for case (1)-style residuals (queue / handle attribute access racing the shutdown) versus a final broad except for case (2). Both still return silently from __del__ -- raising is never valid here -- but the intent is now documented and a real backend regression is no longer indistinguishable from the GC race in code review.

tests/third_party/cupyx/scipy_tests/sparse_tests/test_linalg.py
- test_array_ufunc_opt_out: asserts the __array_ufunc__ = None marker is present on LinearOperator. Mirrors SciPy's own test suite test_interface.py::test_array_ufunc_opt_out.
- test_numpy_scalar_times_linop_dispatches_to_rmul: the concrete runtime consequence -- numpy.float64(2.0) * linop must produce a scaled LinearOperator, not raise or yield an array.
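The __del__ hardening described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual dpnp code: `CachedHandle`, the `released` list, and the `(queue, handle)` call signature are hypothetical, and `si` is a plain namespace standing in for the compiled `_sparse_impl` extension. Only the `getattr` probe and the split `except` clauses come from the commit message.

```python
import types

# Hypothetical stand-in for the compiled extension (called `si` in the
# commit message). A plain namespace lets us simulate its teardown.
si = types.SimpleNamespace()

released = []  # records handles that were released, for illustration only


def _release(queue, handle):
    # Assumed signature; the real _sparse_gemv_release may differ.
    released.append(handle)


si._sparse_gemv_release = _release


class CachedHandle:
    """Hypothetical owner of a backend handle (not dpnp's actual class)."""

    def __init__(self, queue, handle):
        self._queue = queue
        self._handle = handle

    def __del__(self):
        # Case (1), fast path: at interpreter shutdown the extension may be
        # gone already, so the release function is missing or None.
        release = getattr(si, "_sparse_gemv_release", None)
        if release is None:
            return
        try:
            release(self._queue, self._handle)
        except (AttributeError, TypeError):
            # Case (1)-style residual: attribute access racing the shutdown.
            return
        except Exception:
            # Case (2): genuine backend error. Still return silently, since
            # raising from __del__ only yields an "Exception ignored" warning.
            return
```

Keeping the two `except` clauses separate does not change behaviour (both return silently); it only makes the two failure modes distinguishable when reading the code, which is the stated goal of the commit.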
It looks like AOT is usable for development only.
AOT reduces the module loading time from 21 s to 0.5 s.
With the CPU-only option, it compiles successfully and runs on the CPU, but fails when run on the GPU.
With both options: