git.djapps.eu Git - pkg/ggml/sources/ggml/commit
model : Apertus model implementation (llama/15852)
author Piotr Wilkin (ilintar) <redacted>
Thu, 2 Oct 2025 17:43:22 +0000 (19:43 +0200)
committer Georgi Gerganov <redacted>
Sun, 12 Oct 2025 04:57:25 +0000 (07:57 +0300)
commit 24ab95634d3969e81d2d5f1508d7b38628df0d2e
tree 41a5f69244eae4d624166a34ac992e3cbe1b1095
parent 24ea5e07a2344688cb4bdb686ac3dd6d85869492
* First attempt

* No permute during convert (fixes qk tensors), proper norm application.

* RoPE = NeoX

* Coherence!

* Migrate xielu params from tensors to hyperparameters

* Simple CUDA kernel

* Revert stupid LLM refactorings

* Chat template support

* configchecker / flake8 errors

* Reorder unary.cu

* I do conclude that LLMs are, in fact, stupid.

* Fix after merge

* Final newline

* Make xIELU an UNARY_OP

* Final newline

* Correctly account for parameter shift

* Argh.

* Update ggml/src/ggml-cpu/unary-ops.cpp

Co-authored-by: Georgi Gerganov <redacted>

* Refactor: remove unused methods, inline and factorize softplus, add const modifiers

* Revert CUDA changes, implement xIELU as a separate OP

* Pesky newline

* Add float2half / half2float for F16 inputs/outputs

* CUDA variants, attempt 2

* Actually, attempt 3

* Update ggml/src/ggml-cuda/unary.cu

Co-authored-by: Johannes Gäßler <redacted>

* Missing convert header

* Proper formula and reference for xIELU in the comments.

* Modify unary-ops.cpp to add the functor-based logic besides the template system to retain optimizations

* Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <redacted>

* Add tensor mappings for Apertus to global list instead

* Fix lazy on scalars

* Update ggml/src/ggml-cuda/unary.cu

Co-authored-by: Johannes Gäßler <redacted>

* Add comment about the constraints on positive/negative alpha

* Change `softplus` to `ggml_softplus`

---------

Co-authored-by: Georgi Gerganov <redacted>
Co-authored-by: Johannes Gäßler <redacted>
Co-authored-by: Sigbjørn Skjæret <redacted>
include/ggml.h
src/ggml-cpu/ggml-cpu.c
src/ggml-cpu/ops.cpp
src/ggml-cpu/unary-ops.cpp
src/ggml-cpu/unary-ops.h
src/ggml-cuda/ggml-cuda.cu
src/ggml-cuda/unary.cu
src/ggml-cuda/unary.cuh
src/ggml-impl.h
src/ggml.c
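
The commit message refers to the xIELU formula, to constraints on the positive and negative alpha, and to a factored-out softplus, but the page does not show the code. The following is a minimal scalar sketch of how these pieces might fit together, assuming the commonly cited xIELU formulation: alpha_p * x^2 + beta * x for x > 0, and alpha_n * (expm1(min(x, eps)) - x) + beta * x otherwise. All names here (`softplus_sketch`, `xielu_params`, `xielu_make_params`, `xielu_sketch`) are hypothetical and are not taken from the ggml sources. Deriving the effective alphas through softplus is likewise an assumption about how the constraints (alpha_p > 0, alpha_n > beta) could be enforced.

```cpp
#include <cmath>

// Hypothetical helper: softplus(x) = log(1 + exp(x)).
// For large x the result is x to float precision, so return x directly
// to avoid overflow in exp().
static inline float softplus_sketch(float x) {
    return x > 20.0f ? x : std::log1p(std::exp(x));
}

// Hypothetical parameter bundle for the sketch (not a ggml type).
struct xielu_params {
    float alpha_n; // effective negative-branch alpha, assumed > beta
    float alpha_p; // effective positive-branch alpha, assumed > 0
    float beta;
    float eps;
};

// Assumption: raw (unconstrained) alphas are mapped through softplus so
// that alpha_p > 0 and alpha_n > beta hold by construction.
static inline xielu_params xielu_make_params(float alpha_n_raw, float alpha_p_raw,
                                             float beta, float eps) {
    return { beta + softplus_sketch(alpha_n_raw), softplus_sketch(alpha_p_raw), beta, eps };
}

// Scalar xIELU under the formulation stated in the lead-in.
static inline float xielu_sketch(float x, const xielu_params & p) {
    if (x > 0.0f) {
        // Positive branch: quadratic term plus linear term.
        return p.alpha_p * x * x + p.beta * x;
    }
    // Negative branch: expm1 keeps precision for x near zero.
    return p.alpha_n * (std::expm1(std::fmin(x, p.eps)) - x) + p.beta * x;
}
```

With raw alphas of 0, beta = 0.5 and eps = -1e-6, `xielu_make_params` yields alpha_p = ln 2 and alpha_n = 0.5 + ln 2, so both constraints hold without any clamping at evaluation time. The commit itself also covers F16 inputs and CUDA kernels, which this scalar sketch does not attempt to reproduce.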