Claude finished @cquil11's task in 51s — Review of PR #854:
LGTM — no blocking issues found. This is a clean upgrade of the H200 GPT-OSS TensorRT-LLM config from the dev image (…)

Hi @jgangani, performance is wildly worse on the latest TRT-LLM 1.3 RC. Can you take a look?

@functionstackx thanks for the perf diff. Looking into it; will get back to you.

@jgangani any update?

@jgangani we at SemiAnalysis are going to merge this with this tag by Monday.

@cquil11 @functionstackx Sorry for dropping this without an update. The delay is because the TRT-LLM rc5 container's PyTorch update automatically bumped Triton to 3.5.1, which has a regression in the MoE activation kernel. We believe this was fixed in Triton 3.6. We are in the process of releasing a new periodic container with the PyTorch and Triton updates by next week. This is a short work week at NVIDIA (we are off Thursday and Friday). I hope we can hold off merging this until then.
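For anyone hitting this before the new container ships, here is a minimal sketch of a runtime guard against the affected Triton version. The version numbers (3.5 regressed, 3.6 believed fixed) come from the comment above; the function name is hypothetical and not part of TRT-LLM or Triton:

```python
# Sketch: warn if the installed Triton falls in the range reported to have
# the MoE activation kernel regression (3.5.x per this thread, fixed in 3.6).
def is_regressed_triton(ver: str) -> bool:
    """Return True if `ver` (e.g. "3.5.1") is in the reported regressed range."""
    major, minor = (int(x) for x in ver.split(".")[:2])
    return (major, minor) == (3, 5)

print(is_regressed_triton("3.5.1"))  # True: affected
print(is_regressed_triton("3.6.0"))  # False: believed fixed
```

In practice you would feed this the installed version (e.g. `importlib.metadata.version("triton")`) at container startup and either warn or refuse to run benchmarks on an affected build.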


