git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: fix crash on large batch size for quant. MoE (#13537)
author Johannes Gäßler <redacted>
Wed, 14 May 2025 14:41:02 +0000 (16:41 +0200)
committer GitHub <redacted>
Wed, 14 May 2025 14:41:02 +0000 (16:41 +0200)
commit 4696d5674999dc10a7fb8c27b33406a929f7463a
tree 780299c8c82dd9eef9b26269c5f45c688787542a
parent b7d26720821823e23e2273a99e38398d511242e9
ggml/src/ggml-cuda/mmq.cu
ggml/src/ggml-cuda/quantize.cu