git.djapps.eu Git - pkg/ggml/sources/ggml/commit
gpt-j : update inference to match latest llama.cpp insights
author     Georgi Gerganov <redacted>
           Tue, 11 Apr 2023 18:33:17 +0000 (21:33 +0300)
committer  Georgi Gerganov <redacted>
           Tue, 11 Apr 2023 18:33:17 +0000 (21:33 +0300)
commit     0265f0813492602fec0e1159fe61de1bf0ccaf78
tree       a65d2c4c13177efc09461bccb3aa261edfe56ff1
parent     553929cf771634fc29b0700967a0621d56647f09

- Use F16 KV cache
- Store transposed V in the cache
- Avoid unnecessary Q copy
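
The first two points above can be illustrated with a small standalone sketch. It is not the ggml API and not the code from this commit: `TransposedVCache` and `kv_cache_bytes` are hypothetical names, and plain `float` storage is used for simplicity. The assumption is that storing V transposed lets the attention-weighted sum over cached tokens read one contiguous row per output dimension, and that a 2-byte element type halves the cache size relative to a 4-byte one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy cache (not the ggml API). V is stored transposed:
// row d holds dimension d of every cached token, with row stride n_ctx.
struct TransposedVCache {
    std::size_t n_ctx;        // maximum number of cached tokens
    std::size_t n_embd;       // dimensions per token
    std::vector<float> data;  // n_embd rows of n_ctx floats

    TransposedVCache(std::size_t n_ctx_, std::size_t n_embd_)
        : n_ctx(n_ctx_), n_embd(n_embd_), data(n_ctx_ * n_embd_, 0.0f) {}

    // Scatter one token's V vector into column `pos` of every row.
    void store(std::size_t pos, const std::vector<float>& v) {
        assert(pos < n_ctx && v.size() == n_embd);
        for (std::size_t d = 0; d < n_embd; ++d) {
            data[d * n_ctx + pos] = v[d];
        }
    }

    // out[d] = sum over t of w[t] * V[t][d].
    // Each output dimension reads a single contiguous row of the cache.
    std::vector<float> weighted_sum(const std::vector<float>& w) const {
        assert(w.size() <= n_ctx);
        std::vector<float> out(n_embd, 0.0f);
        for (std::size_t d = 0; d < n_embd; ++d) {
            const float* row = &data[d * n_ctx];
            float acc = 0.0f;
            for (std::size_t t = 0; t < w.size(); ++t) {
                acc += w[t] * row[t];
            }
            out[d] = acc;
        }
        return out;
    }
};

// Bytes needed for the K and V caches together at a given element size
// (2 for a 16-bit float, 4 for a 32-bit float).
std::size_t kv_cache_bytes(std::size_t n_layer, std::size_t n_ctx,
                           std::size_t n_embd, std::size_t elem_size) {
    return 2 * n_layer * n_ctx * n_embd * elem_size;
}
```

In this sketch the cost of the transpose is paid once per token in `store`, while `weighted_sum`, which runs over all cached tokens at every step, only touches contiguous memory.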
examples/gpt-j/main.cpp