git.djapps.eu Git - pkg/ggml/sources/ggml/log

Adam Tazi [Sun, 18 Jun 2023 08:15:58 +0000 (01:15 -0700)]
ci : introduce Github Actions CI workflow (#247)

* Introduce Github Actions CI workflow for the ggml repo

This commit integrates a Github Actions CI workflow that compiles and tests the codebase on both Ubuntu 22.04 and macOS 12 Monterey. The workflow is triggered on pull requests against the main branch and on every push to the main branch.

To accommodate the resource constraints of the Github-hosted runners, a `GGML_NITER` environment variable is introduced, allowing tests to run within a reasonable time frame. `test-grad0.c` is modified to use this variable instead of `GGML_NLOOP`.

The workflow file includes:

- A build strategy for both Ubuntu and macOS.
- An environment setup with variables `GGML_NLOOP` and `GGML_NITER`.
- A step to limit the number of threads used by `test2.c` for efficient execution.
- A typical build process with steps for environment creation, CMake configuration, building, and verbose testing with a timeout.
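As a rough illustration of how a test could honor an iteration-count variable such as `GGML_NITER`, here is a minimal sketch in C. The helper names `parse_iters` and `iters_from_env` and the fallback behaviour are assumptions made for this example; they are not taken from `test-grad0.c`.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a positive iteration count, or return
 * `fallback` when the text is missing, malformed, or out of range. */
static int parse_iters(const char *text, int fallback) {
    assert(fallback > 0);
    if (text == NULL || *text == '\0') {
        return fallback;
    }
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value <= 0 || value > INT_MAX) {
        return fallback;
    }
    return (int) value;
}

/* Hypothetical helper: read the count from an environment variable,
 * so a CI job can lower it without recompiling the test. */
static int iters_from_env(const char *name, int fallback) {
    return parse_iters(getenv(name), fallback);
}
```

A test would call something like `iters_from_env("GGML_NITER", 10)` once at startup and use the result as its loop bound; the default of 10 here is only a placeholder.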

* main to master

Tanmay [Sun, 18 Jun 2023 08:09:48 +0000 (13:39 +0530)]
ggml : convert interleaved addressing to sequential addressing for reduce functions (#117)

* Convert interleaved addressing to sequential addressing for REDUCE

* update addressing on new archs
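The difference between the two addressing schemes can be sketched in scalar C. This is only an illustration of the general technique on a plain array of partial sums; the function names are hypothetical and the code is not the SIMD reduce code this commit touches.

```c
#include <assert.h>
#include <stddef.h>

/* Interleaved addressing: each pass adds elements that are `stride`
 * apart, so the live partial sums end up scattered (0, 2, 4, ... then
 * 0, 4, 8, ...). `n` must be a power of two. */
static float reduce_interleaved(float *x, size_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);
    for (size_t stride = 1; stride < n; stride *= 2) {
        for (size_t i = 0; i + stride < n; i += 2 * stride) {
            x[i] += x[i + stride];
        }
    }
    return x[0];
}

/* Sequential addressing: each pass folds the upper half onto the lower
 * half, so the live partial sums stay contiguous at the front. */
static float reduce_sequential(float *x, size_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);
    for (size_t offset = n / 2; offset > 0; offset /= 2) {
        for (size_t i = 0; i < offset; ++i) {
            x[i] += x[i + offset];
        }
    }
    return x[0];
}
```

Both functions overwrite the input array and leave the total in `x[0]`; they differ only in which elements are paired on each pass.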

Ravindra Marella [Sun, 18 Jun 2023 07:54:59 +0000 (13:24 +0530)]
examples : fix c++ standard errors and pedantic warnings (#239)

Cristiano Calcagno [Sun, 18 Jun 2023 07:45:11 +0000 (09:45 +0200)]
ggml : fix minor resource leak reported by static analysis (#237)

Ravindra Marella [Sun, 18 Jun 2023 07:37:09 +0000 (13:07 +0530)]
starcoder : add support for starchat special tokens (#246)

* starcoder : add support for starchat special tokens

* examples : fix `gpt_tokenize()` for special tokens

LoganDark [Fri, 16 Jun 2023 19:39:09 +0000 (12:39 -0700)]
ggml : return input tensor in ggml_set_name (#262)

this is SO USEFUL for debugging. in order to find any cgraph node,
I can wrap it in ggml_set_name and set a conditional breakpoint.

but I can only wrap existing code if this returns its input.
otherwise the barrier becomes annoyingly high (have to move a
bunch of code around to add name to a tensor)
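The pattern described here, a setter that returns its argument so it can wrap an existing expression, can be sketched with a toy type. Everything below (`toy_tensor`, `toy_set_name`, `toy_scale`) is a hypothetical stand-in and not the ggml API.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for a tensor; not the real ggml struct. */
struct toy_tensor {
    char  name[32];
    float value;
};

/* Names the tensor and returns it, so the call can wrap an expression. */
static struct toy_tensor *toy_set_name(struct toy_tensor *t, const char *name) {
    assert(t != NULL && name != NULL);
    strncpy(t->name, name, sizeof(t->name) - 1);
    t->name[sizeof(t->name) - 1] = '\0';
    return t;
}

/* Hypothetical op that consumes a tensor. */
static float toy_scale(const struct toy_tensor *t, float factor) {
    return t->value * factor;
}
```

With this shape, an existing call `toy_scale(&t, 2.0f)` can become `toy_scale(toy_set_name(&t, "probe"), 2.0f)` without introducing a temporary or moving code around, and a conditional breakpoint can then match on the name.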

LoganDark [Fri, 16 Jun 2023 19:17:30 +0000 (12:17 -0700)]
ggml : fix ggml_clamp (#263)

This unconditionally failed before

M. Yusuf Sarıgöz [Fri, 16 Jun 2023 17:36:46 +0000 (20:36 +0300)]
ggml : add quick GELU (#254)

* Implement Quick GELU

* Revert "Implement Quick GELU"

This reverts commit ff220cc1f91a184f195d19b17ed4c352cc72a6f0.

* Tidy up ggml.h

* Respect the style of ggml

* Fix: Fix minor typo

* Rename `quick_gelu` -> `gelu_quick`

Andrei [Thu, 8 Jun 2023 18:51:39 +0000 (14:51 -0400)]
cmake : export all symbols on windows when building shared library (#234)

Currently, building ggml on Windows as a shared library does not export all symbols by default.

LoganDark [Wed, 7 Jun 2023 16:16:19 +0000 (09:16 -0700)]
ggml : correct off-by-one bounds check in ggml_compute_forward_set_f32 (#229)

without this fix you will be unable to set a zero-length tensor to the end of another tensor

this sounds stupid, but is used in my testing

klosax [Wed, 7 Jun 2023 16:15:50 +0000 (18:15 +0200)]
gpt-neox : fix ctx size calculation (#228)

Georgi Gerganov [Wed, 7 Jun 2023 16:14:50 +0000 (19:14 +0300)]
ggml : fix ggml_clamp thresholds being read as ints instead of floats (#221)
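The class of bug named in this title can be reproduced in isolation. The sketch below is an illustration under stated assumptions, not the actual ggml code: it stores a float threshold in a raw 4-byte slot and reads it back two ways, once as a 32-bit integer and once as a float. All names are hypothetical, and IEEE 754 single-precision floats are assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a float threshold into an untyped 4-byte parameter slot. */
static void store_threshold(unsigned char slot[4], float value) {
    assert(sizeof(float) == 4);
    memcpy(slot, &value, sizeof(value));
}

/* Buggy read: treats the stored bytes as an int32 and converts that. */
static float read_threshold_as_int(const unsigned char slot[4]) {
    int32_t bits;
    memcpy(&bits, slot, sizeof(bits));
    return (float) bits;
}

/* Correct read: treats the stored bytes as the float that was written. */
static float read_threshold_as_float(const unsigned char slot[4]) {
    float value;
    memcpy(&value, slot, sizeof(value));
    return value;
}
```

Under the IEEE 754 assumption, a threshold of 0.5 comes back as 1056964608 through the integer path, which would make a clamp range meaningless.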

Jiahao Li [Wed, 7 Jun 2023 16:14:27 +0000 (00:14 +0800)]
ggml : add inplace ops api in header file (#219)

Georgi Gerganov [Fri, 2 Jun 2023 12:46:59 +0000 (15:46 +0300)]
ggml : add ggml_conv_2d_sk_p0(), ggml_win_part(), ggml_win_unpart()

Georgi Gerganov [Tue, 30 May 2023 10:49:08 +0000 (13:49 +0300)]
ggml : fix ggml op conv_1d enum names

Georgi Gerganov [Tue, 30 May 2023 10:19:55 +0000 (13:19 +0300)]
ggml : better conv_1d naming

Georgi Gerganov [Tue, 30 May 2023 07:18:31 +0000 (10:18 +0300)]
ggml : rename conv_1d ops to reflect half-padding used

Georgi Gerganov [Tue, 30 May 2023 07:03:30 +0000 (10:03 +0300)]
ggml : fix compiler warnings for printf

Georgi Gerganov [Mon, 29 May 2023 18:14:52 +0000 (21:14 +0300)]
mnist : remove redundant stuff + rename ctx0

Eldar Yusupov [Mon, 29 May 2023 16:55:13 +0000 (19:55 +0300)]
mnist : add missing header (#213)

Eldar Yusupov [Mon, 29 May 2023 16:47:57 +0000 (19:47 +0300)]
common : fix compilation on Linux (#212)

Georgi Gerganov [Mon, 29 May 2023 16:28:07 +0000 (19:28 +0300)]
ggml : cgraph export/import/eval example + GPU support (#108)

* ggml : cgraph export brainstorming

* mnist : code style

* mnist : minor

* ggml : initial cgraph export

* ggml : initial graph import (wip)

* ggml : import op args correctly

* ggml : add ggml_get_tensor_by_name()

* mnist : add compute graph evaluation on CPU example

* ggml : add ggml_tensor_overhead()

* ggml : rename new functions to ggml_cgraph_...

* mnist : add Metal inference skeleton (WIP)

* mnist : working on the Metal pipeline (WIP)

* mnist : prepare the Metal encoder (WIP)

* mnist : first Metal kernel for F32 ADD

* mnist : looks like MTLHeap does not work

* mnist : initial full pass of MNIST on the GPU (not verified)

* mnist : minor cleanup

* mnist : full GPU inference works

* mnist : use custom soft_max kernel since MPSMatrixSoftMax is bugged

* mnist : use constant for soft_max instead of hardcoded 10

* mnist : check multiple predictions (Metal)

* mnist : minor

* ggml : move cgraph import / export to ggml

* mnist : remove common dependencies

* mnist : fix soft_max threadgroup size

* mnist : init no_alloc member

* ggml : improve "get tensor" API

Tyé singwa [Sun, 28 May 2023 17:41:11 +0000 (20:41 +0300)]
fix : fix ggml_alibi (#204)

Skyler Celestinian-Sterling [Sun, 28 May 2023 10:45:30 +0000 (03:45 -0700)]
readme : add "development" (#203)

You are welcome lol

apcameron [Sat, 27 May 2023 13:48:33 +0000 (14:48 +0100)]
ggml : add CLBLAST support (#197)

Enable support for the RISCV architecture

This addresses https://github.com/ggerganov/ggml/issues/129

Georgi Gerganov [Sat, 27 May 2023 13:20:24 +0000 (16:20 +0300)]
cuda : sync latest llama.cpp (control DMMV X/Y sizes)

Georgi Gerganov [Sat, 27 May 2023 13:18:28 +0000 (16:18 +0300)]
ggml : add ggml_tensor_overhead() + ggml_get_tensor_by_name()

Georgi Gerganov [Sat, 27 May 2023 08:55:25 +0000 (11:55 +0300)]
ggml : sync llama.cpp (OpenCL support for GPU offload)

Georgi Gerganov [Sat, 27 May 2023 08:51:29 +0000 (11:51 +0300)]
mnist : gitignore stuff

jaeminSon [Sat, 27 May 2023 08:47:34 +0000 (17:47 +0900)]
examples : add tokenization tests and refactor codes (#186)

* examples : [refactor] remove unnecessary lines and segments

* examples : [feature] add tokenization test for gpt-neox

* examples : [feature] handle multibyte character set

* examples : [refactor] find the longest token for word

* examples : [refactor] move test_tokenizer to common.cpp as the function affects other models

* add 'test_tokenizer' function after loading the model

* examples : [feature] add test cases for checking tokenization

* examples : [feature] tokenize with huggingface tokenizers for currently supported models

* examples : add tokenization test cases for each model

* revert conversion from string to utf-8 encoded byte strings

* [refactor] make util functions for testing tokenizers available

* [bug fix] test replit using functions and variables (e.g. tokenizer struct, tokenization method) defined in its main.cpp

* [refactor] modify function name test_tokenizer -> test_gpt_tokenizer

* [refactor] put parenthesis on single line for-loops and if-statements

* [refactor] withdraw <filesystem> and use <iostream> and <dirent.h>

* [refactor] remove 'find_test_file' function and directly set test file path from 'test_gpt_tokenizer' function

* call a function for testing tokenizer with filename specified

* revert test tokenizer in replit (replit uses separate methods for tokenization and decoding)

* compare vector of id to check if two tokenizations are identical.

* write token ids instead of strings.

* [refactor] use --token_test rather than --test for token-test argument

* add english test cases

* update test cases with more english prompts

* examples : tokenizer testing fixes

---------

Co-authored-by: Georgi Gerganov <redacted>

Georgi Gerganov [Sat, 27 May 2023 06:11:10 +0000 (09:11 +0300)]
Update README.md

Georgi Gerganov [Fri, 26 May 2023 09:34:29 +0000 (12:34 +0300)]
minor : fix printf warnings

Radoslav Gerganov [Fri, 26 May 2023 08:53:18 +0000 (11:53 +0300)]
mnist : smooth user input (#199)

Drawing on the canvas is now smooth. The final image which is used for
prediction is obtained by down-scaling the canvas to 28x28 pixels.
A download button is also added for downloading raw image values.

Radoslav Gerganov [Fri, 26 May 2023 06:36:40 +0000 (09:36 +0300)]
examples : add missing header file (#198)

Some of the examples are missing the cstring header which is needed for
memcpy().

Radoslav Gerganov [Thu, 25 May 2023 08:27:15 +0000 (11:27 +0300)]
mnist : add progress indicator on the web page (#194)

Prevent user actions before the model and the data set are loaded

Georgi Gerganov [Wed, 24 May 2023 08:49:53 +0000 (11:49 +0300)]
mnist : add WASM instructions + web-page link

Radoslav Gerganov [Wed, 24 May 2023 08:40:47 +0000 (11:40 +0300)]
mnist : add web page for the MNIST example (#190)

The web page is using WASM for model inference.
Users can draw digits on an HTML canvas and load random digits from the
MNIST dataset.

Georgi Gerganov [Wed, 24 May 2023 08:38:41 +0000 (11:38 +0300)]
mnist : cleanup main.cpp

Ettore Di Giacinto [Wed, 24 May 2023 08:01:31 +0000 (10:01 +0200)]
docs : add golang transformer bindings (#191)

This PR adds golang bindings to transformers in ggml

Georgi Gerganov [Wed, 24 May 2023 07:54:45 +0000 (10:54 +0300)]
mpt : fix n_ctx (close #165)

Georgi Gerganov [Wed, 24 May 2023 07:41:06 +0000 (10:41 +0300)]
examples : remove prompt pipe-in support

Need cross-platform solution, factored out in common

Georgi Gerganov [Wed, 24 May 2023 07:40:27 +0000 (10:40 +0300)]
common : add missing declarations

klosax [Wed, 24 May 2023 07:27:36 +0000 (09:27 +0200)]
mpt : utf-8 support, perplexity testing, repeat penalty sampling (#184)

* common: utf-8 decoder, reverted gpt_tokenize utf-8 convert

* Update common.h

* main: decode utf-8 tokens on load

* mpt import: bug fix

* common: style fixes

* common: style fix

* Update common.h

* common: revert gpt_tokenize utf-8 convert

* Update common.cpp

* Update common.cpp

* Update common.cpp

* Add perplexity to mpt

* Update CMakeLists: perplexity

* mpt-perplexity: fixes

* Update perplexity.cpp

* common: add sampling with repeat penalty

* mpt-main: add repeat penalty sampling, add commandline parameters

* Update common.h

* mpt-main: style fixes

* Update perplexity.cpp

* Delete perplexity.cpp

* mpt: move perplexity to main

* mpt: move perplexity to main

* common.cpp: Use codecvt utf-8 converter

* main.cpp: Use codecvt utf-8 converter

* mpt : code style changes

---------

Co-authored-by: Georgi Gerganov <redacted>

Georgi Gerganov [Mon, 22 May 2023 14:57:21 +0000 (17:57 +0300)]
readme : update Features

Ravindra Marella [Sun, 21 May 2023 12:32:05 +0000 (18:02 +0530)]
readme : add link to python bindings (#181)

klosax [Sun, 21 May 2023 08:21:51 +0000 (10:21 +0200)]
common : support utf-8 + fix gpt_tokenize + fix MPT model import (#179)

* Update convert-h5-to-ggml.py

* Import tokens correctly

* gpt_tokenize: Convert input to utf-8 + bug fix

* common : minor style fixes

---------

Co-authored-by: Georgi Gerganov <redacted>

Dan Forbes [Sat, 20 May 2023 18:25:25 +0000 (11:25 -0700)]
readme : add link to GGML format docs (#177)

Georgi Gerganov [Sat, 20 May 2023 17:56:35 +0000 (20:56 +0300)]
examples : use scratch buffers to reduce memory usage (#176)

* starcoder : example for using scratch buffers to reduce memory usage

* starcoder : bump scratch buffers to 256 MB

* examples : add scratch buffers to MPT and GPT-NeoX

Georgi Gerganov [Sat, 20 May 2023 17:00:27 +0000 (20:00 +0300)]
ggml : update WASM SIMD

Georgi Gerganov [Sat, 20 May 2023 15:59:04 +0000 (18:59 +0300)]
whisper : fix Hebrew lang id

Georgi Gerganov [Sat, 20 May 2023 15:01:40 +0000 (18:01 +0300)]
examples : add quantize version to MPT and Replit examples (ref #168)

Georgi Gerganov [Sat, 20 May 2023 14:45:49 +0000 (17:45 +0300)]
common : force --top_k to be at least 1

Georgi Gerganov [Sat, 20 May 2023 14:33:07 +0000 (17:33 +0300)]
examples : fix vocab loading (close #163)

Georgi Gerganov [Sat, 20 May 2023 14:22:58 +0000 (17:22 +0300)]
common : fix gpt_tokenize (ref #170)

Michael Verrilli [Sat, 20 May 2023 14:12:24 +0000 (10:12 -0400)]
dolly-v2 : par_res and neox changes (#167)

* dolly-v2 example: par_res and neox changes

* Update examples/dolly-v2/quantize.cpp

---------

Co-authored-by: Georgi Gerganov <redacted>

Georgi Gerganov [Sat, 20 May 2023 14:09:41 +0000 (17:09 +0300)]
examples : call ggml_time_init() (close #166)

Georgi Gerganov [Sat, 20 May 2023 13:48:03 +0000 (16:48 +0300)]
Update README.md

Georgi Gerganov [Sat, 20 May 2023 12:59:34 +0000 (15:59 +0300)]
ggml : sync llama.cpp - CUDA improvements + ggml minor fixes

Georgi Gerganov [Sat, 20 May 2023 11:56:14 +0000 (14:56 +0300)]
ggml : sync llama.cpp - new quantization formats Q4 + Q8

pikalover6 [Thu, 18 May 2023 06:52:22 +0000 (23:52 -0700)]
readme : update roadmap (#164)

+ MPT & Replit

Lukas Möller [Wed, 17 May 2023 19:58:21 +0000 (21:58 +0200)]
examples : sample replit + MPT inference (#145)

* Add replit model

* Add unigram tokenization support

* Remove debug log

* Port alibi attn bias fix

* Remove torch input

* Fix hardcoded path

* Remove unsupported hyperparams

* Add mpt

* Add replit quantization script

* Remove debug print

* Add quantization support to mpt

* Reformat

* Remove trailing return type

* Implement stylistic changes

* use f16 in k/v memory calculations for replit/mpt

* Update context size calculation

* Add clip_qkv and alibi_bias_max support

* fix clamping implementation, remove implicit conversions

* Fix qkv if condition

* Fix replit context size calculation

* Potentially fix gcc compilation error

* Fix warning

* Adjust object overhead

* Remove dead code

jaeminSon [Wed, 17 May 2023 15:49:37 +0000 (00:49 +0900)]
examples : fix a hyperparameter value in gpt-neox (#161) (#162)

Andrei [Wed, 17 May 2023 06:27:11 +0000 (02:27 -0400)]
ggml : fix typo in ggml_diag_mask_zero_inplace() (#159)

Georgi Gerganov [Mon, 15 May 2023 04:50:54 +0000 (07:50 +0300)]
readme : add link to training example

Georgi Gerganov [Sun, 14 May 2023 15:56:18 +0000 (18:56 +0300)]
ggml : add AVX dot products

Georgi Gerganov [Sun, 14 May 2023 15:55:29 +0000 (18:55 +0300)]
whisper : sync whisper.cpp

IGUILIZ Salah-Eddine [Sun, 14 May 2023 15:31:08 +0000 (17:31 +0200)]
starcoder : detect santacoder fix end of text token (#155)

Co-authored-by: IGUILIZ Salah-Eddine <redacted>

Georgi Gerganov [Sun, 14 May 2023 14:26:41 +0000 (17:26 +0300)]
readme : add re-quantization warning

Georgi Gerganov [Sun, 14 May 2023 12:10:32 +0000 (15:10 +0300)]
examples : use inplace calls explicitly

Georgi Gerganov [Sun, 14 May 2023 11:55:28 +0000 (14:55 +0300)]
tests : add tests from llama.cpp

Georgi Gerganov [Sun, 14 May 2023 11:45:13 +0000 (14:45 +0300)]
ggml : fix multi-threaded ggml_compute_forward_diag_mask_f32()

Georgi Gerganov [Sun, 14 May 2023 11:16:47 +0000 (14:16 +0300)]
ggml : fix rope calculation (!inplace + GPT-NeoX mode)

Georgi Gerganov [Sun, 14 May 2023 08:23:02 +0000 (11:23 +0300)]
ggml : new Q4 and Q5 quantization formats + backward ops

sync llama.cpp

- bump GGML_QNT_VERSION -> 1
- increase ggml object overhead size from 256 to 512 in examples
- drop Q4_2 support
- tensor backend support CUDA

Georgi Gerganov [Sun, 14 May 2023 07:07:27 +0000 (10:07 +0300)]
ggml : add GGML_QNT_VERSION for tracking changes to the quantization format

ref #150

Georgi Gerganov [Sun, 14 May 2023 07:06:19 +0000 (10:06 +0300)]
whisper : sync whisper.cpp minor changes

Ravindra Marella [Sat, 13 May 2023 13:47:02 +0000 (19:17 +0530)]
starcoder : update example to follow the naming convention of other examples (#153)

Georgi Gerganov [Sat, 13 May 2023 13:02:49 +0000 (16:02 +0300)]
readme : fix gpt-neox example link

Ravindra Marella [Sat, 13 May 2023 12:24:47 +0000 (17:54 +0530)]
examples : fix warnings (#152)

Nouamane Tazi [Sat, 13 May 2023 10:46:10 +0000 (12:46 +0200)]
readme : add BLOOM example (#151)

Georgi Gerganov [Sat, 13 May 2023 10:08:56 +0000 (13:08 +0300)]
examples : update readme with new quantization usage + remove bug alert

Georgi Gerganov [Sat, 13 May 2023 10:04:57 +0000 (13:04 +0300)]
readme : update example list (#146)

Nouamane Tazi [Sat, 13 May 2023 09:54:03 +0000 (11:54 +0200)]
examples : add StarCoder/SantaCoder sample inference (#146)

* init commit

* fix building starcoder

* gen work

* fix vocab

* santacoder mha

* .

* fix quantize

* offload_state_dict

* endoftext

* rename scripts

* fix main

* scripts

* update README

* quickfixes

Eldar Yusupov [Sat, 13 May 2023 09:41:45 +0000 (12:41 +0300)]
gpt-neox : add non-parallel residual support (#139)

* Add non-parallel residual support

* Rename stablelm to gpt-neox

* Fix stablelm model name

Nevin [Sat, 13 May 2023 08:41:43 +0000 (10:41 +0200)]
common : allow prompts to be loaded from file (#102)

* common: allow prompts to be loaded from file

* common : extra help for -f

---------

Co-authored-by: Georgi Gerganov <redacted>

yangyaofei [Thu, 11 May 2023 21:47:48 +0000 (05:47 +0800)]
ggml : fix bug in alibi (#143)

Georgi Gerganov [Mon, 8 May 2023 15:07:10 +0000 (18:07 +0300)]
dolly-v2 : ggml_cgraph init (#112)

Tanmay Sachan [Mon, 8 May 2023 15:06:36 +0000 (20:36 +0530)]
examples : make struct initialization more portable (#112)

Georgi Gerganov [Mon, 8 May 2023 15:03:47 +0000 (18:03 +0300)]
dolly-v2 : minor formatting

Michael Verrilli [Sat, 6 May 2023 05:51:45 +0000 (01:51 -0400)]
examples : add dolly-v2 sample inference (#132)

* Vocab support for special tokens

* Initial dolly-v2 commit

* update README

Georgi Gerganov [Thu, 4 May 2023 15:45:39 +0000 (18:45 +0300)]
stablelm : update README.md

Georgi Gerganov [Wed, 3 May 2023 20:22:14 +0000 (23:22 +0300)]
ggml : vectorize Q8_0 quantization (#127)

Georgi Gerganov [Tue, 2 May 2023 19:14:27 +0000 (22:14 +0300)]
ggml : fix 32-bit ARM

Georgi Gerganov [Tue, 2 May 2023 18:28:21 +0000 (21:28 +0300)]
whisper : sync with latest

Georgi Gerganov [Tue, 2 May 2023 18:27:02 +0000 (21:27 +0300)]
scripts : update sync scripts

Georgi Gerganov [Tue, 2 May 2023 17:23:16 +0000 (20:23 +0300)]
ggml : sync llama.cpp (clBLAST support + tensor names)

Georgi Gerganov [Mon, 1 May 2023 07:13:59 +0000 (10:13 +0300)]
ggml : temp comment

Georgi Gerganov [Sun, 30 Apr 2023 19:28:14 +0000 (22:28 +0300)]
ggml : fix UB (int << 31)
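For context, left-shifting the signed constant `1` by 31 is undefined behaviour in C when `int` is 32 bits wide, because the result is not representable in `int`. The sketch below shows one common way to avoid it by shifting an unsigned value; the helper name `bit_mask` is hypothetical and this is not necessarily how the commit fixed it.

```c
#include <assert.h>
#include <stdint.h>

/* Returns a mask with only bit `i` set. Shifting an unsigned 32-bit
 * constant is well defined for i in [0, 31], unlike `1 << 31` on a
 * 32-bit signed int. */
static uint32_t bit_mask(unsigned i) {
    assert(i < 32);
    return UINT32_C(1) << i;
}
```

For example, `bit_mask(31)` yields `0x80000000` without relying on signed overflow.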

Georgi Gerganov [Sun, 30 Apr 2023 16:03:35 +0000 (19:03 +0300)]
ggml, whisper : sync whisper.cpp (GGML_FTYPE + Q5 WASM SIMD)

Georgi Gerganov [Sun, 30 Apr 2023 07:25:13 +0000 (10:25 +0300)]
ggml : fix labels for GGML_OP_ALIBI

Georgi Gerganov [Sat, 29 Apr 2023 18:33:59 +0000 (21:33 +0300)]
ggml : fix 32-bit ARM NEON