git.djapps.eu Git - pkg/ggml/sources/ggml/commit
cuda: refactored ssm_scan and use CUB (llama/13291)
author David Zhao <redacted>
Sat, 9 Aug 2025 18:29:43 +0000 (13:29 -0500)
committer Georgi Gerganov <redacted>
Thu, 14 Aug 2025 11:17:28 +0000 (14:17 +0300)
commit 43e6301b01331a7e4c8ee16dc252516dd58cc9e7
tree ff78e95a63ed2dc6897d97c118fe547a0c9788cd
parent 018cc7281ef4ac3f68cfab2721538aa0b28b1cda

* cuda: refactored ssm_scan to use CUB

* fixed compilation error when not using CUB

* assign L to constant and use size_t instead of int

* deduplicated functions

* change min blocks per mp to 1

* Use cub load and store warp transpose

* suppress clang warning
src/ggml-cuda/ssm-scan.cu