author    Georgi Gerganov <redacted>
          Thu, 13 Mar 2025 10:35:44 +0000 (12:35 +0200)
committer GitHub <redacted>
          Thu, 13 Mar 2025 10:35:44 +0000 (12:35 +0200)
commit e0dbec0bc6cd4b6230cda7a6ed1e9dac08d1600b
tree   e3ee4e085042df7a76d51f691ae46450f656860b
parent 2048b5913d51beab82dfe29955f9008130b936c0
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

* llama : refactor llama_context, llama_kv_cache, llm_build_context

ggml-ci

* graph : don't mutate the KV cache during defrag

ggml-ci

* context : reduce virtuals + remove test function

ggml-ci

* context : move interface implementation to source file + factory

ggml-ci

* graph : move KV cache build functions to llama_context impl

ggml-ci

* graph : remove model reference from build_pooling

ggml-ci

* graph : remove llama_model reference

ggml-ci

* kv_cache : provide rope factors

ggml-ci

* graph : rework inputs to use only unique_ptr, remove attn input abstraction

ggml-ci

* context : remove llama_context_i abstraction

ggml-ci

* context : clean-up

ggml-ci

* graph : clean-up

ggml-ci

* llama : remove redundant keywords (struct, enum)

ggml-ci

* model : adapt gemma3

ggml-ci

* graph : restore same attention ops as on master

ggml-ci

* llama : remove TODO + fix indent

ggml-ci
46 files changed:
common/common.cpp
common/speculative.cpp
examples/batched-bench/batched-bench.cpp
examples/batched.swift/Sources/main.swift
examples/cvector-generator/cvector-generator.cpp
examples/embedding/embedding.cpp
examples/gritlm/gritlm.cpp
examples/imatrix/imatrix.cpp
examples/infill/infill.cpp
examples/llama-bench/llama-bench.cpp
examples/llama.android/llama/src/main/cpp/llama-android.cpp
examples/llama.swiftui/llama.cpp.swift/LibLlama.swift
examples/llava/gemma3-cli.cpp
examples/lookahead/lookahead.cpp
examples/lookup/lookup.cpp
examples/main/main.cpp
examples/parallel/parallel.cpp
examples/passkey/passkey.cpp
examples/perplexity/perplexity.cpp
examples/quantize-stats/quantize-stats.cpp
examples/retrieval/retrieval.cpp
examples/run/run.cpp
examples/save-load-state/save-load-state.cpp
examples/server/server.cpp
examples/server/tests/utils.py
examples/simple-chat/simple-chat.cpp
examples/speculative-simple/speculative-simple.cpp
examples/speculative/speculative.cpp
include/llama.h
src/CMakeLists.txt
src/llama-adapter.cpp
src/llama-adapter.h
src/llama-batch.h
src/llama-context.cpp
src/llama-context.h
src/llama-graph.cpp [new file with mode: 0644]
src/llama-graph.h [new file with mode: 0644]
src/llama-io.cpp [new file with mode: 0644]
src/llama-io.h [new file with mode: 0644]
src/llama-kv-cache.cpp
src/llama-kv-cache.h
src/llama-memory.cpp [new file with mode: 0644]
src/llama-memory.h [new file with mode: 0644]
src/llama-model.cpp
src/llama-model.h
src/llama.cpp