git.djapps.eu Git - pkg/ggml/sources/ggml/commit
metal: SSM kernel improvements (llama/17876)
author Gabe Goodhart <redacted>
Tue, 9 Dec 2025 19:30:02 +0000 (12:30 -0700)
committer Georgi Gerganov <redacted>
Thu, 11 Dec 2025 13:33:00 +0000 (15:33 +0200)
commit 019442facd4dac72a3c2dfe1cb3f625a57d489c1
tree b1ad29816057455f3cbb1cf9344d7631858026ab
parent 9652e04ad1a5391fa3e535d0a668e08e40a11dab

* feat: Add a batched version of ssm_conv

This was done using Claude Code. It found a number of optimizations around
how the threads were organized, resulting in a huge performance boost!

Branch: Mamba2SSD

Signed-off-by: Gabe Goodhart <redacted>
* feat: Optimized SSM_SCAN kernel for metal

This used Claude Code and resulted in a modest performance improvement
while maintaining correctness.

Branch: Mamba2SSD

Signed-off-by: Gabe Goodhart <redacted>
* test: Add test-backend-ops perf tests for SSM_CONV

Branch: SSMKernelImprovements

Signed-off-by: Gabe Goodhart <redacted>
* test: Real representative tests for SSM_CONV

Branch: SSMKernelImprovements

Signed-off-by: Gabe Goodhart <redacted>
* refactor: Use function constant for ssm_conv batch size

Branch: SSMKernelImprovements

Signed-off-by: Gabe Goodhart <redacted>
* test: backend op tests for ssm_scan from granite4 1b-h

Branch: SSMKernelImprovements

Signed-off-by: Gabe Goodhart <redacted>
* style: remove commented out templates

Branch: SSMKernelImprovements

Signed-off-by: Gabe Goodhart <redacted>
* feat: float4 version of ssm_conv_batched

Branch: SSMKernelImprovements

Signed-off-by: Gabe Goodhart <redacted>
* fix: Add missing ggml_metal_cv_free

Signed-off-by: Gabe Goodhart <redacted>
Co-authored-by: Georgi Gerganov <redacted>
---------

Signed-off-by: Gabe Goodhart <redacted>
Co-authored-by: Georgi Gerganov <redacted>
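
The message above does not say how the batched `ssm_conv` kernel organizes its work, so the following is only a CPU-side sketch of the general idea, not the Metal code from this commit. It assumes SSM_CONV is a per-channel (depthwise) 1D convolution over a sliding window of `d_conv` inputs. The names `ssm_conv_ref` and `ssm_conv_tiled`, the memory layout, and the `batch` parameter are all hypothetical. The tiled variant handles tokens in groups of `batch`, loosely mirroring a kernel that processes several tokens per dispatch unit. In the commit the batch size is a Metal function constant rather than a runtime argument.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reference SSM_CONV for one sequence (hypothetical helper, not a ggml API).
// Assumed layout:
//   x: d_inner rows, each of length (d_conv - 1 + n_tokens)
//   w: d_inner rows, each of length d_conv
//   y: returned as y[t * d_inner + c]
std::vector<float> ssm_conv_ref(const std::vector<float>& x,
                                const std::vector<float>& w,
                                std::size_t d_conv,
                                std::size_t d_inner,
                                std::size_t n_tokens) {
    const std::size_t row = d_conv - 1 + n_tokens;
    assert(x.size() == d_inner * row);
    assert(w.size() == d_inner * d_conv);

    std::vector<float> y(n_tokens * d_inner, 0.0f);
    for (std::size_t c = 0; c < d_inner; ++c) {
        for (std::size_t t = 0; t < n_tokens; ++t) {
            float sum = 0.0f;
            // Dot product of the window starting at token t with the channel's weights.
            for (std::size_t k = 0; k < d_conv; ++k) {
                sum += x[c * row + t + k] * w[c * d_conv + k];
            }
            y[t * d_inner + c] = sum;
        }
    }
    return y;
}

// Same computation, but tokens are visited in tiles of `batch` tokens.
// On a GPU each (tile, channel) pair could map to one unit of work;
// here the tiling only changes the loop order (illustrative assumption).
std::vector<float> ssm_conv_tiled(const std::vector<float>& x,
                                  const std::vector<float>& w,
                                  std::size_t d_conv,
                                  std::size_t d_inner,
                                  std::size_t n_tokens,
                                  std::size_t batch) {
    assert(batch > 0);
    const std::size_t row = d_conv - 1 + n_tokens;
    assert(x.size() == d_inner * row);
    assert(w.size() == d_inner * d_conv);

    std::vector<float> y(n_tokens * d_inner, 0.0f);
    for (std::size_t t0 = 0; t0 < n_tokens; t0 += batch) {
        const std::size_t t1 = std::min(t0 + batch, n_tokens);
        for (std::size_t c = 0; c < d_inner; ++c) {
            for (std::size_t t = t0; t < t1; ++t) {
                float sum = 0.0f;
                for (std::size_t k = 0; k < d_conv; ++k) {
                    sum += x[c * row + t + k] * w[c * d_conv + k];
                }
                y[t * d_inner + c] = sum;
            }
        }
    }
    return y;
}
```

Both functions sum each window in the same order, so for any positive `batch` their outputs should match exactly. A comparison like this is the kind of correctness check a backend op test would make between a reference and an optimized kernel.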
src/ggml-metal/ggml-metal-device.cpp
src/ggml-metal/ggml-metal-device.h
src/ggml-metal/ggml-metal-impl.h
src/ggml-metal/ggml-metal-ops.cpp
src/ggml-metal/ggml-metal.metal
tests/test-backend-ops.cpp