git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
cuda : revert CUDA_SCALE_LAUNCH_QUEUES override until investigated (llama/19227)
author Gaurav Garg <redacted>
Tue, 3 Feb 2026 06:41:02 +0000 (12:11 +0530)
committer Georgi Gerganov <redacted>
Sun, 8 Feb 2026 07:29:10 +0000 (09:29 +0200)
commit 6ec362d2e0d705892d6ded95cafbd7877d332f83
tree 0f810c7efd495e8844ea02ad921a1aa27f78b817
parent 591072fcc8226e7364ef409fd0b3fb5885638668

Hangs were reported on Jetson Orin AGX when CUDA_SCALE_LAUNCH_QUEUES=4x is set. This commit reverts the previous PR (#19042), which applied that override, and updates the documentation to suggest that users consider setting CUDA_SCALE_LAUNCH_QUEUES=4x themselves for faster throughput on multi-GPU systems.
ggml/src/ggml-cuda/ggml-cuda.cu