llava: Add ACC OP for GPU acceleration to the Vulkan backend in the LLAVA CLIP model. (#8984)
* llava: Add ACC OP for GPU acceleration to the Vulkan backend in the LLAVA CLIP model.
- The CLIP model now prioritizes the Vulkan backend over the CPU when Vulkan is available.
- A GGML_OP_ACC shader has been added.
- CLIP encoding time dropped from 4.2 s on the CPU to 0.9 s on the GPU.
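
For readers unfamiliar with the ACC operation, the following is a minimal CPU sketch of its semantics as commonly described for ggml_acc: the result is a copy of the first tensor, with the second tensor added into a strided view of it. It is not the Vulkan shader added here. The function name acc_f32_ref and its parameter layout are hypothetical, and the sketch assumes contiguous f32 inputs, at most three dimensions for the accumulated block, and byte strides and offsets that are multiples of sizeof(float).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical CPU reference for an ACC-style operation on f32 data.
 * Not the Vulkan shader from this change; all names are illustrative.
 *
 *   dst      : output buffer holding n_a floats
 *   a        : contiguous input, copied into dst first
 *   b        : contiguous ne0 x ne1 x ne2 block to accumulate
 *   nb1, nb2 : byte strides of the destination view (per row, per plane)
 *   offset   : byte offset of the view inside dst
 *
 * The caller must ensure the view lies entirely inside dst.
 */
static void acc_f32_ref(float *dst, const float *a, size_t n_a,
                        const float *b, size_t ne0, size_t ne1, size_t ne2,
                        size_t nb1, size_t nb2, size_t offset)
{
    /* dst starts out as a plain copy of a */
    memcpy(dst, a, n_a * sizeof(float));

    for (size_t i2 = 0; i2 < ne2; i2++) {
        for (size_t i1 = 0; i1 < ne1; i1++) {
            /* locate the current row of the view using the dst strides */
            float *row = (float *)((char *)dst + offset + i1 * nb1 + i2 * nb2);
            /* b is contiguous, so its rows are ne0 floats apart */
            const float *src = b + (i2 * ne1 + i1) * ne0;
            for (size_t i0 = 0; i0 < ne0; i0++) {
                row[i0] += src[i0];
            }
        }
    }
}
```

With a 4 x 2 input a = {1..8}, a 2 x 2 block b = {10, 20, 30, 40}, nb1 = 4 * sizeof(float) and offset = sizeof(float), the block is added into columns 1 and 2 of each row, which is why the strides of the destination rather than those of b determine where each row lands.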
Signed-off-by: Changyeon Kim <redacted>
* Fix up coding style.
Signed-off-by: Changyeon Kim <redacted>
* Fix up the missing initial parameter to resolve the compilation warning.
Signed-off-by: Changyeon Kim <redacted>
* [fix] Add missing parameters.
Signed-off-by: Changyeon Kim <redacted>
* [fix] Use nb1 and nb2 for dst.
Signed-off-by: Changyeon Kim <redacted>
* Fix the ggml_acc call in check results.
---------
Signed-off-by: Changyeon Kim <redacted>
Co-authored-by: 0cc4m <redacted>