git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
cuda: refactored ssm_scan and use CUB (#13291)
author David Zhao <redacted>
Sat, 9 Aug 2025 18:29:43 +0000 (13:29 -0500)
committer GitHub <redacted>
Sat, 9 Aug 2025 18:29:43 +0000 (20:29 +0200)
commit 79c1160b073b8148a404f3dd2584be1606dccc66
tree 1872b0aad7ae549e02dc06e969078ec7d696f369
parent 34c9d765bf173c551398f1e7fa4595019bc53bab

* cuda: refactored ssm_scan to use CUB

* fixed compilation error when not using CUB

* assign L to constant and use size_t instead of int

* deduplicated functions

* change min blocks per mp to 1

* Use cub load and store warp transpose

* suppress clang warning
ggml/src/ggml-cuda/ssm-scan.cu
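
For readers unfamiliar with the CUB primitives named in the commit message, the following is a minimal sketch of a block-wide load and store using the warp-transpose algorithms, combined with a launch bound of one minimum block per multiprocessor. It is not the code from ssm-scan.cu: the kernel `scale_tile`, the helper `run_scale_tile`, and the tile constants `THREADS` and `ITEMS` are hypothetical names chosen for illustration, and the per-element work is a simple scaling rather than a selective scan.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical tile shape, chosen for illustration only.
constexpr int THREADS = 128; // WARP_TRANSPOSE needs a multiple of the warp size
constexpr int ITEMS   = 4;   // elements held in registers by each thread

// __launch_bounds__(THREADS, 1): at most THREADS threads per block and
// a minimum of 1 resident block per multiprocessor.
__global__ void __launch_bounds__(THREADS, 1)
scale_tile(const float * __restrict__ src, float * __restrict__ dst, float scale) {
    using BlockLoad  = cub::BlockLoad<float, THREADS, ITEMS, cub::BLOCK_LOAD_WARP_TRANSPOSE>;
    using BlockStore = cub::BlockStore<float, THREADS, ITEMS, cub::BLOCK_STORE_WARP_TRANSPOSE>;

    // Load and store never run at the same time, so they can share storage.
    __shared__ union {
        BlockLoad::TempStorage  load;
        BlockStore::TempStorage store;
    } smem;

    const size_t tile_offset = (size_t) blockIdx.x * THREADS * ITEMS;

    // Coalesced global read, then a transpose through shared memory so that
    // each thread ends up with ITEMS consecutive elements.
    float items[ITEMS];
    BlockLoad(smem.load).Load(src + tile_offset, items);
    __syncthreads(); // required before the shared storage is reused

    for (int i = 0; i < ITEMS; ++i) {
        items[i] *= scale;
    }

    // Inverse transpose, then a coalesced global write.
    BlockStore(smem.store).Store(dst + tile_offset, items);
}

// Hypothetical host helper: runs the kernel on n_tiles full tiles and
// checks that every output element equals twice its input.
bool run_scale_tile(int n_tiles) {
    const size_t n = (size_t) n_tiles * THREADS * ITEMS;
    std::vector<float> host(n);
    for (size_t i = 0; i < n; ++i) {
        host[i] = (float) i;
    }

    float * d_src = nullptr;
    float * d_dst = nullptr;
    if (cudaMalloc(&d_src, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMalloc(&d_dst, n * sizeof(float)) != cudaSuccess) { cudaFree(d_src); return false; }

    cudaMemcpy(d_src, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    scale_tile<<<n_tiles, THREADS>>>(d_src, d_dst, 2.0f);
    const cudaError_t err = cudaMemcpy(host.data(), d_dst, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_src);
    cudaFree(d_dst);
    if (err != cudaSuccess) return false;

    for (size_t i = 0; i < n; ++i) {
        if (host[i] != 2.0f * (float) i) return false;
    }
    return true;
}
```

The sketch assumes the input length is an exact multiple of `THREADS * ITEMS`; a partial last tile would need the guarded `Load` and `Store` overloads that take a valid-item count.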