cuda : revert CUDA_SCALE_LAUNCH_QUEUES override until investigated (#19227)
author Gaurav Garg <redacted>
Tue, 3 Feb 2026 06:41:02 +0000 (12:11 +0530)
committer GitHub <redacted>
Tue, 3 Feb 2026 06:41:02 +0000 (08:41 +0200)
commit 41e3f02647be2976c4a302128680ca5983568ae5
tree 9d0ea28e2ea6d7dd3ff2f996f0b19ea88d14227d
parent 1efb5f7ae120c7cc7a33c4d1d82a05b3c50122f6

Hangs were reported on Jetson Orin AGX when CUDA_SCALE_LAUNCH_QUEUES=4x was set. This reverts the previous PR (#19042) and updates the build documentation (docs/build.md) to suggest setting CUDA_SCALE_LAUNCH_QUEUES=4x manually for higher throughput on multi-GPU systems.
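
With the in-code override reverted, the variable is now opt-in. A minimal sketch of the documented workaround, assuming a multi-GPU setup; the `llama-cli` invocation is illustrative and commented out since model path and flags depend on your system:

```shell
# Export the variable for the current shell before launching llama.cpp,
# instead of relying on the reverted in-code override (#19042).
export CUDA_SCALE_LAUNCH_QUEUES=4x
echo "CUDA_SCALE_LAUNCH_QUEUES=$CUDA_SCALE_LAUNCH_QUEUES"
# ./llama-cli -m model.gguf -p "Hello"   # illustrative; adjust for your build
```

Skip this on Jetson Orin AGX, where the setting was reported to cause hangs.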
docs/build.md
ggml/src/ggml-cuda/ggml-cuda.cu