Pull requests: NVIDIA/Megatron-LM

Pull requests list

chore: nightly sync main into dev (05_05_2026) Run functional tests Run MBridge tests
#4635 opened May 5, 2026 by svcnvidia-nemo-ci Draft
Inference: Cache input + position ID views complexity: low
#4634 opened May 5, 2026 by mathemakitten Contributor
5 tasks
Minimal repro for Mamba's resharding
#4633 opened May 5, 2026 by wujingyue Contributor Draft
Refactor TE fused ops integration into mixin
#4630 opened May 5, 2026 by CarlosGomes98 Contributor Draft
5 tasks
Disable MSC by default; opt in via --enable-msc complexity: low
#4629 opened May 5, 2026 by asolergi-nv Contributor
5 tasks done
Overlap logprob calculation by putting it on separate stream
#4627 opened May 5, 2026 by tdene Contributor Draft
5 tasks
docs: add bump-dependency skill
#4626 opened May 5, 2026 by ko3n1g Contributor Draft
ci: introduce L-tier scope vocabulary via parser
#4625 opened May 5, 2026 by balasaajay Contributor Draft
5 tasks
feat(attention): Add rotary_base_per_layer for Step-3.5-Flash
#4622 opened May 5, 2026 by shifangx Contributor Draft
5 tasks
[training migration] Migrate config validation
#4618 opened May 5, 2026 by maanug-nv Contributor Draft
5 tasks
One single flag that determines if we are in inference
#4617 opened May 4, 2026 by tdene Contributor Draft
5 tasks
Fix gradient corruption with layerwise param all-gather overlap Approved complexity: low
#4609 opened May 4, 2026 by deepakn94 Contributor Queued
1 task done
fix tokenizers in respect to newer transformers complexity: low Expert Review [deprecated] Run functional tests
#4608 opened May 4, 2026 by dimapihtar Contributor
5 tasks
ci: Bump GHA versions Approved complexity: low
#4606 opened May 4, 2026 by chtruong814 Contributor
5 tasks