git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: Optimize PAD_REFLECT_1D (llama/15957)
author Bowen Han <redacted>
Thu, 18 Sep 2025 18:26:03 +0000 (11:26 -0700)
committer Georgi Gerganov <redacted>
Sat, 20 Sep 2025 10:46:38 +0000 (13:46 +0300)
commit fce6354e0f8ae74a7bb02412588dfb663c57b154
tree 329d726412ba0738626ae4ccce7f53411f840202
parent 05bdfd438045cf905ba990de81e442bd865a910d

* CUDA: Optimize PAD_REFLECT_1D
feat: add more test cases for PAD_REFLECT_1D

* use fast_div to improve performance

* Apply suggestion from JohannesGaessler

Co-authored-by: Johannes Gäßler <redacted>

* Apply suggestion from JohannesGaessler

Co-authored-by: Johannes Gäßler <redacted>

* optimize

* use a concise expression to further speed up the CUDA kernel

---------

Co-authored-by: Johannes Gäßler <redacted>
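
The commit message mentions replacing the index computation with "a concise expression". The exact expression used in the kernel is not shown on this page, so the following is only a host-side C++ sketch of one branch-free way to map an output index to its source index for 1D reflect padding. It assumes the edge element is not repeated and that both paddings are smaller than the row length; the names reflect_index and pad_reflect_1d_ref are hypothetical and do not come from the commit.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper (not from the commit): source index for output index
// i_out when a row of length n is reflect-padded by p0 elements on the left.
// Assumption: p0 < n and the right padding is also < n.
static inline int64_t reflect_index(int64_t i_out, int64_t n, int64_t p0) {
    const long long rel  = (long long) (i_out - p0);   // position relative to the unpadded row
    const long long last = (long long) (n - 1);        // index of the last source element
    // Folds rel < 0 to -rel and rel > last to 2*last - rel without branches.
    return (int64_t) (last - std::llabs(last - std::llabs(rel)));
}

// CPU reference for a single row, useful for checking a kernel's output.
static std::vector<float> pad_reflect_1d_ref(const std::vector<float> & src, int p0, int p1) {
    const int64_t n = (int64_t) src.size();
    std::vector<float> dst((size_t) (n + p0 + p1));
    for (int64_t i = 0; i < (int64_t) dst.size(); ++i) {
        dst[(size_t) i] = src[(size_t) reflect_index(i, n, p0)];
    }
    return dst;
}
```

For example, padding the row {1, 2, 3, 4} with p0 = 2 and p1 = 1 gives {3, 2, 1, 2, 3, 4, 3}. A reference of this kind is also a convenient way to build the additional test cases the first bullet refers to.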
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/pad_reflect_1d.cu
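
The "use fast_div" bullet refers to replacing integer division by a divisor that is constant for the whole kernel launch with a multiply and a shift. The helper actually used lives in the ggml CUDA sources and is not reproduced on this page; the following is a host-side C++ sketch of the general technique (division by an invariant integer), with hypothetical names FastDiv, make_fast_div, fast_div and fast_mod. It assumes d > 0. On the GPU the high 32 bits of the product would typically come from the __umulhi intrinsic; here a 64-bit multiply stands in for it.

```cpp
#include <cstdint>

// Hypothetical names; a sketch of the technique, not the ggml helper itself.
struct FastDiv {
    uint32_t mp;     // precomputed "magic" multiplier
    uint32_t shift;  // ceil(log2(d))
    uint32_t d;      // original divisor, kept for the modulo
};

// Precompute once on the host for a divisor d > 0.
static FastDiv make_fast_div(uint32_t d) {
    uint32_t L = 0;
    while (L < 32 && (uint32_t{1} << L) < d) {
        ++L;
    }
    const uint32_t mp =
        (uint32_t) (((uint64_t{1} << 32) * ((uint64_t{1} << L) - d)) / d + 1);
    return {mp, L, d};
}

// n / d using one multiply-high, one add and one shift.
static inline uint32_t fast_div(uint32_t n, const FastDiv & f) {
    const uint64_t hi = ((uint64_t) n * f.mp) >> 32;   // high 32 bits of n * mp
    return (uint32_t) ((hi + n) >> f.shift);           // 64-bit add avoids overflow
}

// n % d derived from the fast quotient.
static inline uint32_t fast_mod(uint32_t n, const FastDiv & f) {
    return n - fast_div(n, f) * f.d;
}
```

In a padding kernel the typical use would be decomposing a flat thread index into per-dimension coordinates: the divisors are tensor extents known before launch, so the multiplier and shift can be computed once on the host and passed in as kernel arguments.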