git.djapps.eu Git - pkg/ggml/sources/ggml/log

Georgi Gerganov [Wed, 24 May 2023 07:41:06 +0000 (10:41 +0300)]

examples : remove prompt pipe-in support

Need cross-platform solution, factored out in common

Georgi Gerganov [Wed, 24 May 2023 07:40:27 +0000 (10:40 +0300)]

common : add missing declarations

klosax [Wed, 24 May 2023 07:27:36 +0000 (09:27 +0200)]

mpt : utf-8 support, perplexity testing, repeat penalty sampling (#184)

* common: utf-8 decoder, reverted gpt_tokenize utf-8 convert

* Update common.h

* main: decode utf-8 tokens on load

* mpt import: bug fix

* common: style fixes

* common: style fix

* Update common.h

* common: revert gpt_tokenize utf-8 convert

* Update common.cpp

* Update common.cpp

* Update common.cpp

* Add perplexity to mpt

* Update CMakeLists: perplexity

* mpt-perplexity: fixes

* Update perplexity.cpp

* common: add sampling with repeat penalty

* mpt-main: add repeat penalty sampling, add commandline parameters

* Update common.h

* mpt-main: style fixes

* Update perplexity.cpp

* Delete perplexity.cpp

* mpt: move perplexity to main

* mpt: move perplexity to main

* common.cpp: Use codecvt utf-8 converter

* main.cpp: Use codecvt utf-8 converter

* mpt : code style changes

---------

Co-authored-by: Georgi Gerganov <redacted>

Georgi Gerganov [Mon, 22 May 2023 14:57:21 +0000 (17:57 +0300)]

readme : update Features

Ravindra Marella [Sun, 21 May 2023 12:32:05 +0000 (18:02 +0530)]

readme : add link to python bindings (#181)

klosax [Sun, 21 May 2023 08:21:51 +0000 (10:21 +0200)]

common : support utf-8 + fix gpt_tokenize + fix MPT model import (#179)

* Update convert-h5-to-ggml.py

* Import tokens correctly

* gpt_tokenize: Convert input to utf-8 + bug fix

* common : minor style fixes

---------

Co-authored-by: Georgi Gerganov <redacted>

Dan Forbes [Sat, 20 May 2023 18:25:25 +0000 (11:25 -0700)]

readme : add link to GGML format docs (#177)

Georgi Gerganov [Sat, 20 May 2023 17:56:35 +0000 (20:56 +0300)]

examples : use scratch buffers to reduce memory usage (#176)

* starcoder : example for using scratch buffers to reduce memory usage

* starcoder : bump scratch buffers to 256 MB

* examples : add scratch buffers to MPT and GPT-NeoX

Georgi Gerganov [Sat, 20 May 2023 17:00:27 +0000 (20:00 +0300)]

ggml : update WASM SIMD

Georgi Gerganov [Sat, 20 May 2023 15:59:04 +0000 (18:59 +0300)]

whisper : fix Hebrew lang id

Georgi Gerganov [Sat, 20 May 2023 15:01:40 +0000 (18:01 +0300)]

examples : add quantize version to MPT and Replit examples (ref #168)

Georgi Gerganov [Sat, 20 May 2023 14:45:49 +0000 (17:45 +0300)]

common : force --top_k to be at least 1

Georgi Gerganov [Sat, 20 May 2023 14:33:07 +0000 (17:33 +0300)]

examples : fix vocab loading (close #163)

Georgi Gerganov [Sat, 20 May 2023 14:22:58 +0000 (17:22 +0300)]

common : fix gpt_tokenize (ref #170)

Michael Verrilli [Sat, 20 May 2023 14:12:24 +0000 (10:12 -0400)]

dolly-v2 : par_res and neox changes (#167)

* dolly-v2 example: par_res and neox changes

* Update examples/dolly-v2/quantize.cpp

---------

Co-authored-by: Georgi Gerganov <redacted>

Georgi Gerganov [Sat, 20 May 2023 14:09:41 +0000 (17:09 +0300)]

examples : call ggml_time_init() (close #166)

Georgi Gerganov [Sat, 20 May 2023 13:48:03 +0000 (16:48 +0300)]

Update README.md

Georgi Gerganov [Sat, 20 May 2023 12:59:34 +0000 (15:59 +0300)]

ggml : sync llama.cpp - CUDA improvements + ggml minor fixes

Georgi Gerganov [Sat, 20 May 2023 11:56:14 +0000 (14:56 +0300)]

ggml : sync llama.cpp - new quantization formats Q4 + Q8

pikalover6 [Thu, 18 May 2023 06:52:22 +0000 (23:52 -0700)]

readme : update roadmap (#164)

+ MPT & Replit

Lukas Möller [Wed, 17 May 2023 19:58:21 +0000 (21:58 +0200)]

examples : sample replit + MPT inference (#145)

* Add replit model

* Add unigram tokenization support

* Remove debug log

* Port alibi attn bias fix

* Remove torch input

* Fix hardcoded path

* Remove unsupported hyperparams

* Add mpt

* Add replit quantization script

* Remove debug print

* Add quantization support to mpt

* Reformat

* Remove trailing return type

* Implement stylistic changes

* use f16 in k/v memory calculations for replit/mpt

* Update context size calculation

* Add clip_qkv and alibi_bias_max support

* fix clamping implementation, remove implicit conversions

* Fix qkv if condition

* Fix replit context size calculation

* Potentially fix gcc compilation error

* Fix warning

* Adjust object overhead

* Remove dead code

jaeminSon [Wed, 17 May 2023 15:49:37 +0000 (00:49 +0900)]

examples : fix a hyperparameter value in gpt-neox (#161) (#162)

Andrei [Wed, 17 May 2023 06:27:11 +0000 (02:27 -0400)]

ggml : fix typo in ggml_diag_mask_zero_inplace() (#159)

Georgi Gerganov [Mon, 15 May 2023 04:50:54 +0000 (07:50 +0300)]

readme : add link to training example

Georgi Gerganov [Sun, 14 May 2023 15:56:18 +0000 (18:56 +0300)]

ggml : add AVX dot products

Georgi Gerganov [Sun, 14 May 2023 15:55:29 +0000 (18:55 +0300)]

whisper : sync whisper.cpp

IGUILIZ Salah-Eddine [Sun, 14 May 2023 15:31:08 +0000 (17:31 +0200)]

starcoder : detect santacoder fix end of text token (#155)

Co-authored-by: IGUILIZ Salah-Eddine <redacted>

Georgi Gerganov [Sun, 14 May 2023 14:26:41 +0000 (17:26 +0300)]

readme : add re-quantization warning

Georgi Gerganov [Sun, 14 May 2023 12:10:32 +0000 (15:10 +0300)]

examples : use inplace calls explicitly

Georgi Gerganov [Sun, 14 May 2023 11:55:28 +0000 (14:55 +0300)]

tests : add tests from llama.cpp

Georgi Gerganov [Sun, 14 May 2023 11:45:13 +0000 (14:45 +0300)]

ggml : fix multi-threaded ggml_compute_forward_diag_mask_f32()

Georgi Gerganov [Sun, 14 May 2023 11:16:47 +0000 (14:16 +0300)]

ggml : fix rope calculation (!inplace + GPT-NeoX mode)

Georgi Gerganov [Sun, 14 May 2023 08:23:02 +0000 (11:23 +0300)]

ggml : new Q4 and Q5 quantization formats + backward ops

sync llama.cpp

- bump GGML_QNT_VERSION -> 1
- increase ggml object overhead size from 256 to 512 in examples
- drop Q4_2 support
- tensor backend support CUDA

Georgi Gerganov [Sun, 14 May 2023 07:07:27 +0000 (10:07 +0300)]

ggml : add GGML_QNT_VERSION for tracking changes to the quantization format

ref #150

Georgi Gerganov [Sun, 14 May 2023 07:06:19 +0000 (10:06 +0300)]

whisper : sync whisper.cpp minor changes

Ravindra Marella [Sat, 13 May 2023 13:47:02 +0000 (19:17 +0530)]

starcoder : update example to follow the naming convention of other examples (#153)

Georgi Gerganov [Sat, 13 May 2023 13:02:49 +0000 (16:02 +0300)]

readme : fix gpt-neox example link

Ravindra Marella [Sat, 13 May 2023 12:24:47 +0000 (17:54 +0530)]

examples : fix warnings (#152)

Nouamane Tazi [Sat, 13 May 2023 10:46:10 +0000 (12:46 +0200)]

readme : add BLOOM example (#151)

Georgi Gerganov [Sat, 13 May 2023 10:08:56 +0000 (13:08 +0300)]

examples : update readme with new quantization usage + remove bug alert

Georgi Gerganov [Sat, 13 May 2023 10:04:57 +0000 (13:04 +0300)]

readme : update example list (#146)

Nouamane Tazi [Sat, 13 May 2023 09:54:03 +0000 (11:54 +0200)]

examples : add StarCoder/SantaCoder sample inference (#146)

* init commit

* fix building starcoder

* gen work

* fix vocab

* santacoder mha

* .

* fix quantize

* offload_state_dict

* endoftext

* rename scripts

* fix main

* scripts

* update README

* quickfixes

Eldar Yusupov [Sat, 13 May 2023 09:41:45 +0000 (12:41 +0300)]

gpt-neox : add non-parallel residual support (#139)

* Add non-parallel residual support

* Rename stablelm to gpt-neox

* Fix stablelm model name

Nevin [Sat, 13 May 2023 08:41:43 +0000 (10:41 +0200)]

common : allow prompts to be loaded from file (#102)

* common: allow prompts to be loaded from file

* common : extra help for -f

---------

Co-authored-by: Georgi Gerganov <redacted>

yangyaofei [Thu, 11 May 2023 21:47:48 +0000 (05:47 +0800)]

ggml : fix bug in alibi (#143)

Georgi Gerganov [Mon, 8 May 2023 15:07:10 +0000 (18:07 +0300)]

dolly-v2 : ggml_cgraph init (#112)

Tanmay Sachan [Mon, 8 May 2023 15:06:36 +0000 (20:36 +0530)]

examples : make struct initialization more portable (#112)

Georgi Gerganov [Mon, 8 May 2023 15:03:47 +0000 (18:03 +0300)]

dolly-v2 : minor formatting

Michael Verrilli [Sat, 6 May 2023 05:51:45 +0000 (01:51 -0400)]

examples : add dolly-v2 sample inference (#132)

* Vocab support for special tokens

* Initial dolly-v2 commit

* update README

Georgi Gerganov [Thu, 4 May 2023 15:45:39 +0000 (18:45 +0300)]

stablelm : update README.md

Georgi Gerganov [Wed, 3 May 2023 20:22:14 +0000 (23:22 +0300)]

ggml : vectorize Q8_0 quantization (#127)

Georgi Gerganov [Tue, 2 May 2023 19:14:27 +0000 (22:14 +0300)]

ggml : fix 32-bit ARM

Georgi Gerganov [Tue, 2 May 2023 18:28:21 +0000 (21:28 +0300)]

whisper : sync with latest

Georgi Gerganov [Tue, 2 May 2023 18:27:02 +0000 (21:27 +0300)]

scripts : update sync scripts

Georgi Gerganov [Tue, 2 May 2023 17:23:16 +0000 (20:23 +0300)]

ggml : sync llama.cpp (clBLAST support + tensor names)

Georgi Gerganov [Mon, 1 May 2023 07:13:59 +0000 (10:13 +0300)]

ggml : temp comment

Georgi Gerganov [Sun, 30 Apr 2023 19:28:14 +0000 (22:28 +0300)]

ggml : fix UB (int << 31)

Georgi Gerganov [Sun, 30 Apr 2023 16:03:35 +0000 (19:03 +0300)]

ggml, whisper : sync whisper.cpp (GGML_FTYPE + Q5 WASM SIMD)

Georgi Gerganov [Sun, 30 Apr 2023 07:25:13 +0000 (10:25 +0300)]

ggml : fix labels for GGML_OP_ALIBI

Georgi Gerganov [Sat, 29 Apr 2023 18:33:59 +0000 (21:33 +0300)]

ggml : fix 32-bit ARM NEON

Georgi Gerganov [Sat, 29 Apr 2023 18:13:40 +0000 (21:13 +0300)]

ggml : use vzip instead of vuzp for consistency

Georgi Gerganov [Sat, 29 Apr 2023 16:13:53 +0000 (19:13 +0300)]

ggml : fix SHARED build

Georgi Gerganov [Sat, 29 Apr 2023 16:07:19 +0000 (19:07 +0300)]

ggml : sync llama.cpp (less memory for mul_mat f16 + asserts)

Georgi Gerganov [Sat, 29 Apr 2023 09:33:57 +0000 (12:33 +0300)]

scripts : add sync-whisper.sh

Georgi Gerganov [Sat, 29 Apr 2023 07:30:56 +0000 (10:30 +0300)]

common : forgot to remove Q4_3 references

Georgi Gerganov [Sat, 29 Apr 2023 07:03:59 +0000 (10:03 +0300)]

ggml : remove Q4_3

Georgi Gerganov [Fri, 28 Apr 2023 17:47:27 +0000 (20:47 +0300)]

ggml : ggml_alibi() fixes (#113)

Dan Forbes [Fri, 28 Apr 2023 17:37:07 +0000 (10:37 -0700)]

ggml : add ggml_alibi (positional embedding) (#113)

Co-authored-by: @hhamud <redacted>

Georgi Gerganov [Fri, 28 Apr 2023 17:34:38 +0000 (20:34 +0300)]

ggml : sync llama.cpp (CLBlast)

Georgi Gerganov [Fri, 28 Apr 2023 17:33:44 +0000 (20:33 +0300)]

gitignore : add python env folders

Santtu Keskinen [Fri, 28 Apr 2023 04:25:11 +0000 (07:25 +0300)]

readme : add bert.cpp link (#114)

Georgi Gerganov [Thu, 27 Apr 2023 16:07:40 +0000 (19:07 +0300)]

stablelm : put warning about bug in the implementation

Georgi Gerganov [Thu, 27 Apr 2023 15:31:53 +0000 (18:31 +0300)]

ggml : sync llama.cpp (Q5_0 + Q5_1) + refactor examples quantization

Georgi Gerganov [Mon, 24 Apr 2023 15:52:25 +0000 (18:52 +0300)]

ggml : sync llama.cpp (fix GCC 8 build, close #99)

Georgi Gerganov [Sun, 23 Apr 2023 17:04:03 +0000 (20:04 +0300)]

ggml : indentation

Georgi Gerganov [Sun, 23 Apr 2023 16:57:37 +0000 (19:57 +0300)]

ggml : add GGML_API for exporting shared symbols

Georgi Gerganov [Sun, 23 Apr 2023 16:45:39 +0000 (19:45 +0300)]

ggml : better PERF prints

le.chang [Sun, 23 Apr 2023 16:12:49 +0000 (00:12 +0800)]

tests : fix compile error (#98)

appvoid [Sun, 23 Apr 2023 16:11:33 +0000 (12:11 -0400)]

gpt-2 : remove GPT-J unnecessary import (#91)

AsukaMinato [Sun, 23 Apr 2023 15:03:52 +0000 (00:03 +0900)]

tests : remove type cast (#100)

Georgi Gerganov [Sun, 23 Apr 2023 13:38:00 +0000 (16:38 +0300)]

ggml : sync llama.cpp (AVX improvements)

Georgi Gerganov [Sat, 22 Apr 2023 13:34:39 +0000 (16:34 +0300)]

ggml : fix Q4_3 cuBLAS + fix quantize_row_q4_2()

Georgi Gerganov [Sat, 22 Apr 2023 12:49:15 +0000 (15:49 +0300)]

examples : refactor quantization tools

Georgi Gerganov [Sat, 22 Apr 2023 11:59:42 +0000 (14:59 +0300)]

examples : utils -> common

Georgi Gerganov [Sat, 22 Apr 2023 10:59:49 +0000 (13:59 +0300)]

ggml : fix ARM build

Georgi Gerganov [Sat, 22 Apr 2023 10:23:20 +0000 (13:23 +0300)]

cmake : add CMake support for cuBLAS (#101)

* cmake : add cuBLAS support

* cmake : fix cuBLAS build

Georgi Gerganov [Sat, 22 Apr 2023 09:52:25 +0000 (12:52 +0300)]

examples : add Q4_2 and Q4_3 quantization support

Georgi Gerganov [Sat, 22 Apr 2023 09:36:42 +0000 (12:36 +0300)]

ggml : sync llama.cpp (Q4_3 + CUDA)

Bart Pelle [Thu, 20 Apr 2023 21:15:45 +0000 (23:15 +0200)]

mnist : add missing header (#95)

Georgi Gerganov [Thu, 20 Apr 2023 20:35:52 +0000 (23:35 +0300)]

stablelm : update README.md

Georgi Gerganov [Thu, 20 Apr 2023 20:23:07 +0000 (23:23 +0300)]

minor : fix GPT-NeoX name

Georgi Gerganov [Thu, 20 Apr 2023 20:21:38 +0000 (23:21 +0300)]

readme : add StableLM reference

Georgi Gerganov [Thu, 20 Apr 2023 20:20:38 +0000 (23:20 +0300)]

examples : add StableLM example (#96)

* ggml : there is a bug in ggml_cpy() F32 -> F32

Cannot see why, but multi-thread does not work

* stablelm : initial implementation, but QKV seems broken

* stablelm : make it work

* stablelm : use original merged QKV matrix

* stablelm : minor

* stablelm : instructions

* stablelm : update README.md

Georgi Gerganov [Thu, 20 Apr 2023 19:00:49 +0000 (22:00 +0300)]

ggml : sync llama.cpp (cuBLAS, Q4_3, bug fix, etc)

Georgi Gerganov [Wed, 19 Apr 2023 17:20:23 +0000 (20:20 +0300)]

ggml : sync llama.cpp

Georgi Gerganov [Sat, 15 Apr 2023 19:23:10 +0000 (22:23 +0300)]

examples : update huggingface links

Georgi Gerganov [Sat, 15 Apr 2023 16:50:54 +0000 (19:50 +0300)]

ggml : sync llama.cpp