
chore: upgrade h200 gptoss to latest trtllm #854

Open

cquil11 wants to merge 4 commits into main from chore/update-gptoss-trtll

Conversation

@cquil11
Collaborator

@cquil11 cquil11 commented Mar 2, 2026

No description provided.

@cquil11 cquil11 marked this pull request as ready for review March 2, 2026 18:59
@cquil11 cquil11 requested a review from a team March 2, 2026 18:59
@claude
Contributor

claude bot commented Mar 2, 2026

Claude finished @cquil11's task in 51s.


Review of PR #854

  • Gather context and read changed files
  • Validate master config and perf-changelog
  • Validate container image accessibility
  • Validate model prefix
  • Review benchmark script changes
  • Post review summary

LGTM — no blocking issues found.

This is a clean upgrade of the H200 GPT-OSS TensorRT-LLM config from the dev image (gpt-oss-dev) to an official release candidate (v1.3.0rc5). The sed workaround removal and the enable_block_reuse default change are both reasonable cleanups enabled by the new container version. The perf changelog is updated accordingly.
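
For context, here is a minimal sketch of how `enable_block_reuse` is exposed through TensorRT-LLM's Python LLM API. The model id, prompt, and sampling parameters are placeholders for illustration, not values taken from this PR's config:

```python
# Minimal sketch, not this PR's actual config: setting KV-cache block
# reuse through TensorRT-LLM's Python LLM API. The model id and prompt
# are placeholders.
from tensorrt_llm import LLM, SamplingParams
from tensorrt_llm.llmapi import KvCacheConfig

# enable_block_reuse lets the runtime reuse KV-cache blocks across requests
# that share a common prefix; per the summary above, the PR changes the
# default rather than carrying an explicit override.
kv_cache_config = KvCacheConfig(enable_block_reuse=True)

llm = LLM(
    model="openai/gpt-oss-120b",  # placeholder model id
    kv_cache_config=kv_cache_config,
)

outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```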

@functionstackx
Contributor

hi @jgangani

the performance is wildly worse on the latest trtllm 1.3 RC, can you take a look? nvcr.io/nvidia/tensorrt-llm/release:1.3.0rc5

[benchmark screenshots showing the performance regression]

@jgangani
Collaborator

jgangani commented Mar 3, 2026

@functionstackx thanks for the perf diffs. Looking at them now; will get back to you.

@cquil11
Collaborator Author

cquil11 commented Mar 24, 2026

@jgangani update?

@functionstackx
Contributor

@jgangani we at SemiAnalysis are going to merge this with this tag by Monday

@jgangani
Collaborator

@cquil11 @functionstackx Sorry for dropping this without an update. The reason for the delay is that the PyTorch update in the TRT-LLM rc5 container automatically bumped Triton to 3.5.1, which has a regression in the MoE activation kernel. We believe this has since been fixed in Triton 3.6. We are in the process of releasing a new periodic container with the PyTorch and Triton updates by next week. This is a short work week at NVIDIA (we are off Thursday and Friday). I hope we can hold off on merging this until then.
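
In case it helps others verify their container, here is a quick check of the PyTorch and Triton versions actually shipped; it assumes both packages are importable inside the container, as they are in the TRT-LLM release images:

```python
# Quick sanity check inside the container: print the PyTorch and Triton
# versions and flag the Triton 3.5.x series tied to the MoE activation
# kernel regression described above. Assumes both packages are installed.
import torch
import triton

print(f"torch:  {torch.__version__}")
print(f"triton: {triton.__version__}")

if triton.__version__.startswith("3.5"):
    print("warning: Triton 3.5.x detected; may hit the MoE kernel regression")
```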
