git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2024-05-22  Justine Tunney  llama : add missing model type names (#7445)
2024-05-22  Georgi Gerganov  cuda : fix compile warning (#7454)
2024-05-22  Johannes Gäßler  CUDA: remove incorrect precision check (#7454)
2024-05-22  Georgi Gerganov  cuda : fix rope + add tests (#7452)
2024-05-21  liuwei-git  llama : add phi3 128K model support (#7225)
2024-05-21  Georgi Gerganov  metal : handle F16 inf values, fix FA partial offload...
2024-05-21  Olivier Chafik  `grammars`: fix resampling logic regression (#7424)
2024-05-21  Johannes Gäßler  CUDA: fix unused warning in mmq.cu (#7442)
2024-05-21  Georgi Gerganov  tests : test-tokenizer-0.sh print more info (#7402)
2024-05-21  Amir  examples: cache hf model when --model not provided...
2024-05-21  Johannes Gäßler  CUDA: deduplicate mmq code (#7397)
2024-05-21  jaime-m-p  Tokenizer SPM fixes for phi-3 and llama-spm (bugfix...
2024-05-20  jaime-m-p  Tokenizer SPM fixes for phi-3 and llama-spm (#7375)
2024-05-20  Georgi Gerganov  llama : remove Persimmon (#7408)
2024-05-20  Johannes Gäßler  perplexity: update README FP16 results [no ci] (#7413)
2024-05-20  Radoslav Gerganov  rpc : track allocated buffers (#7411)
2024-05-20  Georgi Gerganov  server : fix temperature + disable some tests (#7409)
2024-05-20  AidanBeltonS  [SYCL] Update SYCL upscale operation (#7321)
2024-05-20  Bingan  Update README.md (#7410)
2024-05-20  Herman Semenov  ggml-opencl, llama: using reserve() if count already...
2024-05-20  junchao-loongson  ggml : add loongarch lsx and lasx support (#6454)
2024-05-20  Georgi Gerganov  server : tuning tests (#7388)
2024-05-20  Georgi Gerganov  server : return error on too large embedding input...
2024-05-20  Georgi Gerganov  tests : fix --keep_split -> --keep-split (#7374)
2024-05-20  Srihari-mcw  Add provisions for windows support for BF16 code includ...
2024-05-19  slaren  llama : remove MPI backend (#7395)
2024-05-19  Fred Douglas  quantize : fix --keep-split check (#7374)
2024-05-19  0cc4m  Vulkan Embedding Fix (#7360)
2024-05-19  slaren  ggml : fix another case of quants nans (#7387)
2024-05-19  Johannes Gäßler  ggml: implement quantized KV cache for FA (#7372)
2024-05-19  Johannes Gäßler  server: add test for token probs (#7347)
2024-05-19  Johannes Gäßler  server: fix seed being reported back (#7382)
2024-05-19  Anas Ahouzi  Add StableLM2 pre-tokenizer (#7349)
2024-05-19  slaren  cuda : clear error after buffer allocation failure...
2024-05-19  Brian  labeler.yml: Use settings from ggerganov/llama.cpp...
2024-05-19  Georgi Gerganov  cmake : update android comments (#7341)
2024-05-18  fraxy-v  Capture CUDA logging output (#7298)
2024-05-18  Georgi Gerganov  ci : re-enable sanitizer runs (#7358)
2024-05-18  Georgi Gerganov  android : use "ci-android" branch for CI (#7341)
2024-05-18  Johannes Gäßler  CUDA: deduplicate FlashAttention code (#7352)
2024-05-18  Johannes Gäßler  server: correct --threads documentation [no ci] (#7362)
2024-05-18  Engininja2  cuda : add half2 __shfl_xor() for ROCm 5.5 (#7263)
2024-05-18  Steffen Röcker  llama : add support for larger Granite Code Models...
2024-05-18  strawberrymelonpanda  perplexity : ndot progress and show stats with < 100...
2024-05-18  0cc4m  Update and fix Vulkan soft_max and argsort implementati...
2024-05-18  Brian  github-actions-labeler: initial commit (#7330)
2024-05-18  Georgi Gerganov  convert : fix set_vocab_sentencepiece (#6866)
2024-05-18  slaren  ggml : fix quants nans when all the group weights are...
2024-05-18  Engininja2  cmake : fix typo in AMDGPU_TARGETS (#7356)
2024-05-17  jaime-m-p  Unicode codepoint flags for custom regexs (#7245)
2024-05-17  Johannes Gäßler  CUDA: faster large batch FA without tensor cores (...
2024-05-17  Gavin Zhao  ROCm: use native CMake HIP support (#5966)
2024-05-17  Radoslav Gerganov  rpc : set SO_REUSEADDR for the server socket (#7320)
2024-05-17  Brian  Added a single test function script and fix debug-test...
2024-05-17  Aarni Koskela  py : convert-hf-to-gguf-update improvements (#7340)
2024-05-17  fairydreaming  llama : use n_embd_head_v when reshaping kqv (#7327)
2024-05-17  Johannes Gäßler  tokenization: add warning for double BOS (#7332)
2024-05-17  Herman Semenov  ggml-quants, llama : removed excess checks (#7274)
2024-05-17  amd-lalithnc  convert : fix Qwen/Qwen-7b conversion (#7308)
2024-05-17  Radoslav Gerganov  server : add support for the RPC backend (#7305)
2024-05-17  Justine Tunney  ggml : rewrite silu and softmax for cpu (#7154)
2024-05-17  Leon Knauer  [Server] Added --verbose option to README [no ci] ...
2024-05-16  Pierrick Hymbert  Revert "server bench: fix bench not waiting for model...
2024-05-16  Radoslav Gerganov  rpc : get available mem for the CPU backend
2024-05-16  Radoslav Gerganov  rpc : add command line arg for specifying backend memory
2024-05-16  Jared Van Bortel  convert : get general.name from model dir, not its...
2024-05-16  Herman Semenov  grammar, json, llama: replace push on emplace if it...
2024-05-16  Vaibhav Srivastav  doc: add references to hugging face GGUF-my-repo quanti...
2024-05-16  Max Krasnyansky  ci: fix bin/Release path for windows-arm64 builds ...
2024-05-16  Max Krasnyansky  Add support for properly optimized Windows ARM64 builds...
2024-05-15  Daniel Bevenius  readme : remove stray double quote (#7310)
2024-05-15  kunnis  ggml : use dynamic thread scheduling for matrix multipl...
2024-05-15  agray3  Avoid unnecessarily disabling CUDA graphs (#7302)
2024-05-15  slaren  ggml : tag ggml_tensor::backend as deprecated (#7290)
2024-05-15  AidanBeltonS  Add missing " (#7303)
2024-05-15  dm4  embedding : free the batch after execution (#7297)
2024-05-15  Georgi Gerganov  sync : ggml
2024-05-15  John Balis  ggml : add `ggml_upscale_ext` (ggml/814)
2024-05-15  Johannes Gäßler  server bench: fix bench not waiting for model load...
2024-05-14  Georgi Gerganov  script : sync ggml-rpc
2024-05-14  Georgi Gerganov  metal : support FA without mask + add asserts (#7278)
2024-05-14  Georgi Gerganov  sync : ggml
2024-05-14  Georgi Gerganov  metal : tune soft_max number of threads (whisper/0)
2024-05-14  Georgi Gerganov  ggml : try fix ppc64 (whisper/0)
2024-05-14  Przemysław...  ggml : expose SSE3 and SSSE3 for MSVC when AVX is avail...
2024-05-14  Hong Bo PENG  ggml : optimize for ppc64le using VSX intrinsics (ggml...
2024-05-14  Steve Grubb  server: free sampling contexts on exit (#7264)
2024-05-14  Brian  Revert "move ndk code to a new library (#6951)" (#7282)
2024-05-14  Radoslav Gerganov  ggml : add RPC backend (#6829)
2024-05-14  slaren  llama : disable pipeline parallelism with nkvo (#7265)
2024-05-14  Elton Kola  move ndk code to a new library (#6951)
2024-05-14  Haggai Nuchi  Add left recursion check: quit early instead of going...
2024-05-14  Ryuei  docs: Fix typo and update description for --embeddings...
2024-05-13  compilade  convert-hf : support direct Q8_0 conversion (#7234)
2024-05-13  Georgi Gerganov  llama : less KV padding when FA is off (#7257)
2024-05-13  k.h.lai  llava-cli: fix base64 prompt (#7248)
2024-05-13  Johannes Gäßler  perplexity: add BF16 vs. FP16 results (#7150)
2024-05-13  Neo Zhang  [SYCL] rm wait() (#7233)
2024-05-13  Joan Fontanals  llama : rename jina tokenizers to v2 (#7249)
2024-05-13  Brian  convert.py: Outfile default name change and additional...