git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: batched+noncont MMQ, refactor bs>1 MoE code (llama/13199)
author Johannes Gäßler <redacted>
Wed, 30 Apr 2025 21:12:59 +0000 (23:12 +0200)
committer Georgi Gerganov <redacted>
Thu, 1 May 2025 07:39:34 +0000 (10:39 +0300)
commit b5ed2968f3415ecb2152adce9bdb92fcff5406bb
tree ced1ba298c9db34e462ce2ec992c7f8430b312c2
parent b2b4e0b0e55e588044db4c76623bdb85a7c2446e
src/ggml-cuda/getrows.cu
src/ggml-cuda/getrows.cuh
src/ggml-cuda/ggml-cuda.cu
src/ggml-cuda/mmq.cu
src/ggml-cuda/mmq.cuh
src/ggml-cuda/mmvq.cu
src/ggml-cuda/quantize.cu
src/ggml-cuda/quantize.cuh
tests/test-backend-ops.cpp