gh-115999: Implement thread-local bytecode and enable specialization for BINARY_OP
#123926
Merged
84 commits
776a1e1
Assign threads indices into bytecode copies
mpage 2b40870
Replace most usage of PyCode_CODE
mpage 344d7ad
Get bytecode copying working
mpage f203d00
Refactor remove_tools
mpage 82b456a
Refactor remove_line_tools
mpage b021704
Instrument thread-local bytecode
mpage aea69c5
Use locks for instrumentation
mpage 552277d
Add ifdef guards for each specialization family
mpage 50a6089
Specialize BINARY_OP
mpage 3f1d941
Limit the amount of memory consumed by bytecode copies
mpage 7d2eb27
Make thread-local bytecode limits user configurable
mpage d5476b9
Fix a few data races when (de)instrumenting opcodes
mpage e3b367a
Make branch taken recording thread-safe
mpage b2375bf
Lock thread-local bytecode when specializing
mpage 2707f8e
Load bytecode on RESUME_CHECK
mpage 3fdcb28
Load tlbc on generator.throw()
mpage 4a55ce5
Use tlbc instead of thread_local_bytecode
mpage 8b3ff60
Use tlbc everywhere
mpage 862afa1
Explicitly manage tlbc state
mpage 0b4d952
Refactor API for fetching tlbc
mpage 7795e99
Add unit tests
mpage 693a4cc
Fix initconfig in default build
mpage b43531e
Fix instrumentation in default build
mpage 9025f43
Synchronize bytecode modifications between specialization and instrum…
mpage c44c7d9
Add a high-level comment
mpage e2a6656
Fix unused variable warning in default build
mpage e6513d1
Fix test_config in free-threaded builds
mpage a18396f
Fix formatting
mpage 81fe1a2
Remove comment
mpage 837645e
Fix data race in _PyInstruction_GetLength
mpage f13e132
Fix tier2 optimizer
mpage 942f628
Use __VA_ARGS__ for macros
mpage 66cb24d
Update vcxproj files to include newly added files
mpage ad12bd4
Mark unused params
mpage 1bbbbbc
Keep tier2 and the JIT disabled in free-threaded builds
mpage e63e403
Only allow enabling/disabling tlbc
mpage 8b97771
Update libpython for gdb
mpage d34adeb
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage 6d4fe73
Handle out of memory errors
mpage c2d8693
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage b104782
Fix warnings on windows
mpage deb5216
Fix another warning
mpage 2f11cc7
Ugh actually fix it
mpage 04f1ac3
Add high-level comment about index pools
mpage aa330b1
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage 7dfd1ca
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage 7c9da24
Exclude tlbc from refleak counts
mpage dd144d0
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage ad180d1
Regen files
mpage 95d2264
Move `get_tlbc_blocks` into the sys module
mpage b6380de
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage adb59ef
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage 39c947d
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage 2cc5830
Work around `this_instr` now being const
mpage 96ec126
Make RESUME_CHECK cheaper
mpage 5ecebd9
Pass tstate to _PyCode_GetTLBCFast
mpage 815b2fe
Rename test_tlbc.py to test_thread_local_bytecode.py
mpage fb90d23
Remove per-family defines for specialization
mpage 4e42414
Replace bytecode pointer with tlbc_index
mpage 814e4ca
Add a test verifying that we clean up tlbc when the code object is de…
mpage ba3930a
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage cb8a774
Fix indentation
mpage 0f8a55b
Clarify comment
mpage 70ce0fe
Fix TSAN
mpage f512353
Add test for cleaning up tlbc in correct place, not old emacs buffer
mpage 4be2b1f
Remove test_tlbc.py
mpage 61c7aa9
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage ab6222c
Use int32_t instead of Py_ssize_t for tlbc indices
mpage 6bbb220
Use _PyCode_CODE instead of PyFrame_GetBytecode in super_init_without…
mpage 4580e3c
Update comment
mpage b992f44
Consolidate _PyCode_{Quicken,DisableSpecialization} into _PyCode_Init…
mpage 4c040d3
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage 5b7658c
Fix incorrect types
mpage bec5bce
Add command-line tests for enabling TLBC
mpage c9054b7
Update libpython.py for tlbc_index
mpage 1a48ab2
Avoid special casing in _PyEval_GetExecutableCode
mpage b16ae5f
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage 176b24e
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage c107495
Clear TLBC when other caches are cleared
mpage 07f9140
Remove _get_tlbc_blocks
mpage 4cbe237
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage 38ff315
Rename _PyCode_InitCounters back to _PyCode_Quicken
mpage 338f7e5
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage bcd1bb2
Merge branch 'main' into gh-115999-thread-local-bytecode
mpage
Fix formatting
commit a18396fa60ea54b04e3a375c951fa27e0288f31a
You were storing the bytecode in the frame directly before, IIRC.
This looks more expensive, and is used on at least one fast path:
https://github.com/python/cpython/pull/123926/files#diff-729a985b0cb8b431cb291f1edb561bbbfea22e3f8c262451cd83328a0936a342R4821
Does it make things faster overall, or is it just more compact?
Yep. You suggested storing `tlbc_index` instead since it was smaller. I think this is better for a couple of reasons: it is smaller, and it simplifies `RESUME_CHECK`. Previously, we would have to load the bytecode pointer for the current thread and deopt if it didn't match what was in the frame. Now we only have to compare tlbc indices. This is a cost shift, however, since now the callers of `_PyFrame_GetBytecode` have to do the more expensive load of the bytecode. I think the size reduction + simplification of `RESUME_CHECK` probably outweighs the higher cost of `_PyFrame_GetBytecode`. tier2 is also still disabled in free-threaded builds, so it's a bit hard to evaluate the relative cost of slower trace exits vs faster `RESUME_CHECK`s. Once we get tier2 enabled we can reevaluate.
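
To make the trade-off concrete, here is a rough sketch of the two paths described above. It is not the code from this PR: `code_t`, `thread_t`, `frame_t`, `resume_check_ok`, and `frame_get_bytecode` are hypothetical, simplified stand-ins, and the layout of the per-thread copies is assumed to be a plain array indexed by the thread's tlbc index.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are NOT CPython's real structs. */
typedef uint16_t instr_t;

typedef struct {
    int32_t   tlbc_size;    /* number of slots in tlbc_copies (assumed layout) */
    instr_t **tlbc_copies;  /* tlbc_copies[i] is the copy for thread index i */
} code_t;

typedef struct {
    int32_t tlbc_index;     /* index assigned to this thread */
} thread_t;

typedef struct {
    code_t *code;
    int32_t tlbc_index;     /* stored in the frame instead of a bytecode pointer */
} frame_t;

/* Cheap path, modeled on the RESUME_CHECK idea: compare two small integers.
 * A mismatch would correspond to the "deopt" case in the discussion. */
static int resume_check_ok(const frame_t *frame, const thread_t *ts)
{
    return frame->tlbc_index == ts->tlbc_index;
}

/* Costlier path: callers that need the actual bytecode pay for an extra
 * dependent load through the code object's array of copies. */
static instr_t *frame_get_bytecode(const frame_t *frame)
{
    assert(frame->tlbc_index >= 0 &&
           frame->tlbc_index < frame->code->tlbc_size);
    return frame->code->tlbc_copies[frame->tlbc_index];
}
```

In this sketch the frame shrinks from a pointer to a 32-bit index and the resume check becomes a single integer comparison, while fetching the bytecode goes through one more level of indirection, which is the cost shift mentioned above.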