git.djapps.eu Git - pkg/ggml/sources/llama.cpp/log
2 years ago Update hot topics to mention Alpaca support
Georgi Gerganov [Sun, 19 Mar 2023 17:51:55 +0000 (19:51 +0200)]
Update hot topics to mention Alpaca support

2 years ago Fix off-by-one bug (#115)
Georgi Gerganov [Sun, 19 Mar 2023 17:46:32 +0000 (19:46 +0200)]
Fix off-by-one bug (#115)

2 years ago Fix python stuff (#109)
Georgi Gerganov [Sun, 19 Mar 2023 17:33:18 +0000 (19:33 +0200)]
Fix python stuff (#109)

2 years ago Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109)
qunash [Sun, 19 Mar 2023 17:17:39 +0000 (20:17 +0300)]
Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109)

* Refactor get_n_parts function to simplify code and improve readability

* Use f-strings instead of concatenation

* Refactoring: more concise and readable

* modularize

---------

Co-authored-by: Georgi Gerganov <redacted>
2 years ago Drop trailing new line from file prompts (#80)
Georgi Gerganov [Sun, 19 Mar 2023 17:04:44 +0000 (19:04 +0200)]
Drop trailing new line from file prompts (#80)

2 years ago Add instruction for using Alpaca (#240)
Georgi Gerganov [Sun, 19 Mar 2023 16:49:50 +0000 (18:49 +0200)]
Add instruction for using Alpaca (#240)

2 years ago Add "--instruct" argument for usage with Alpaca (#240)
Georgi Gerganov [Sun, 19 Mar 2023 16:37:02 +0000 (18:37 +0200)]
Add "--instruct" argument for usage with Alpaca (#240)

Also start adding prompts in "./prompts"

2 years ago Change RMSNorm eps to 1e-6 (#173)
Georgi Gerganov [Sun, 19 Mar 2023 15:30:00 +0000 (17:30 +0200)]
Change RMSNorm eps to 1e-6 (#173)

I think this is what is used in the Python code
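
For reference, a minimal sketch of where the epsilon enters RMS normalization, assuming the common definition x / sqrt(mean(x^2) + eps). The name `rms_norm` is illustrative, the learned per-channel weight is omitted, and this is plain Python rather than the project's C code.

```python
import math


def rms_norm(x, eps=1e-6):
    """Scale a vector so its root-mean-square is approximately 1.

    eps is added to the mean of squares before the square root, which
    keeps the division well defined for an all-zero input.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]


if __name__ == "__main__":
    print(rms_norm([1.0, 2.0, 3.0]))
```

With a small eps such as 1e-6 the output of a typical activation vector is almost unchanged; eps only matters when the mean of squares is itself tiny.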

2 years ago Warn user if a context size greater than 2048 tokens is specified (#274)
Ronsor [Sun, 19 Mar 2023 00:10:47 +0000 (17:10 -0700)]
Warn user if a context size greater than 2048 tokens is specified (#274)

LLaMA doesn't support context sizes above 2048 tokens, and going above that produces terrible results.
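
A minimal sketch of such a check, in Python rather than the project's C++. The names `MAX_TRAINED_CTX` and `warn_if_ctx_too_large` are hypothetical; only the 2048-token limit comes from the message above.

```python
import sys

# Limit quoted in the commit message; the constant name is illustrative.
MAX_TRAINED_CTX = 2048


def warn_if_ctx_too_large(n_ctx, limit=MAX_TRAINED_CTX, stream=None):
    """Print a warning (to stderr by default) when n_ctx exceeds the limit.

    Returns True if a warning was printed, False otherwise.
    """
    if stream is None:
        stream = sys.stderr
    if n_ctx > limit:
        print(
            f"warning: context size {n_ctx} exceeds {limit} tokens; "
            "expect poor results",
            file=stream,
        )
        return True
    return False


if __name__ == "__main__":
    warn_if_ctx_too_large(4096)
```

The check only warns and does not clamp the value, which matches the wording "Warn user" in the commit title.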

2 years ago Fix typo in readme
Pavol Rusnak [Sat, 18 Mar 2023 21:39:46 +0000 (22:39 +0100)]
Fix typo in readme

2 years ago Add note about Python 3.11 to readme
Pavol Rusnak [Sat, 18 Mar 2023 21:20:04 +0000 (22:20 +0100)]
Add note about Python 3.11 to readme

2 years ago Add memory/disk requirements to readme
Pavol Rusnak [Sat, 18 Mar 2023 20:58:46 +0000 (21:58 +0100)]
Add memory/disk requirements to readme

2 years ago Remove unused code since n_vocab is model.hparams.n_vocab (#262)
Alex Nguyen [Sat, 18 Mar 2023 13:51:49 +0000 (20:51 +0700)]
Remove unused code since n_vocab is model.hparams.n_vocab (#262)

2 years ago fixed warning with std::ignore about unused function result (#151)
Justin Suess [Sat, 18 Mar 2023 11:44:09 +0000 (07:44 -0400)]
fixed warning with std::ignore about unused function result (#151)

2 years ago Fix n^2 loop in tokenization (#254)
Gary Linscott [Sat, 18 Mar 2023 11:17:19 +0000 (04:17 -0700)]
Fix n^2 loop in tokenization (#254)

The n^2 loop causes long prompts to parse very slowly.

2 years ago CI Improvements (#230)
anzz1 [Sat, 18 Mar 2023 07:27:12 +0000 (09:27 +0200)]
CI Improvements (#230)

* CI Improvements

Manual build feature, autoreleases for Windows

* better CI naming convention

use branch name in releases and tags

2 years ago Nix flake (#40)
Niklas Korz [Fri, 17 Mar 2023 22:03:48 +0000 (23:03 +0100)]
Nix flake (#40)

* Nix flake

* Nix: only add Accelerate framework on macOS

* Nix: development shell, direnv and compatibility

* Nix: use python packages supplied by withPackages

* Nix: remove channel compatibility

* Nix: fix ARM neon dotproduct on macOS

---------

Co-authored-by: Pavol Rusnak <redacted>
2 years ago Implement non-greedy tokenizer that tries to maximize token lengths (#242)
thement [Fri, 17 Mar 2023 20:05:58 +0000 (21:05 +0100)]
Implement non-greedy tokenizer that tries to maximize token lengths (#242)

* Implement non-greedy tokenizer that tries to maximize token lengths

* Insert single space in front of the prompt

- this is to match original llama tokenizer behavior

---------

Co-authored-by: Jakub Horak <redacted>
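
The message does not spell out the algorithm. One way to "maximize token lengths" without greedy longest-match is a dynamic program over prefix positions that scores a segmentation by the sum of squared piece lengths. The sketch below assumes that scoring rule and a toy vocabulary; `tokenize_max_len`, `vocab`, and `max_token_len` are hypothetical names, not taken from the repository, and the leading-space insertion is left to the caller.

```python
def tokenize_max_len(text, vocab, max_token_len=None):
    """Split text into vocabulary pieces, preferring long pieces.

    score[i] is the best sum of squared piece lengths for text[:i];
    prev[i] is the length of the last piece in that best split.
    """
    if max_token_len is None:
        max_token_len = max(map(len, vocab), default=0)
    n = len(text)
    neg_inf = float("-inf")
    score = [neg_inf] * (n + 1)
    prev = [0] * (n + 1)
    score[0] = 0

    # Forward pass: extend every reachable position by each matching piece.
    for i in range(n):
        if score[i] == neg_inf:
            continue
        for length in range(1, min(max_token_len, n - i) + 1):
            if text[i:i + length] in vocab:
                candidate = score[i] + length * length
                if candidate > score[i + length]:
                    score[i + length] = candidate
                    prev[i + length] = length

    if score[n] == neg_inf:
        raise ValueError("text cannot be segmented with this vocabulary")

    # Backward pass: follow prev[] from the end to recover the pieces.
    tokens = []
    i = n
    while i > 0:
        length = prev[i]
        tokens.append(text[i - length:i])
        i -= length
    tokens.reverse()
    return tokens


if __name__ == "__main__":
    toy_vocab = {"a", "ab", "b", "bcd", "c", "d"}
    print(tokenize_max_len("abcd", toy_vocab))
```

On the toy vocabulary, a left-to-right longest-match tokenizer would pick "ab", "c", "d", while the dynamic program prefers "a", "bcd" because the squared-length score rewards the longer piece.
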
2 years ago Default to 4 threads (#243)
Georgi Gerganov [Fri, 17 Mar 2023 19:46:46 +0000 (21:46 +0200)]
Default to 4 threads (#243)

2 years ago Update Contributing section
Georgi Gerganov [Fri, 17 Mar 2023 18:30:04 +0000 (20:30 +0200)]
Update Contributing section

2 years ago Don't tell users to use a bad number of threads (#243)
Stephan Walter [Fri, 17 Mar 2023 17:47:35 +0000 (17:47 +0000)]
Don't tell users to use a bad number of threads (#243)

The readme tells people to use the command line option "-t 8", causing 8
threads to be started. On systems with fewer than 8 cores, this causes a
significant slowdown. Remove the option from the example command lines
and use /proc/cpuinfo on Linux to determine a sensible default.
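
A minimal sketch of the idea in Python, not the project's C++: count the `processor` entries in /proc/cpuinfo and fall back when the file is missing. The function names and the fallback value are assumptions (the fallback of 4 only echoes the later "Default to 4 threads" commit).

```python
import os


def count_processors(cpuinfo_text):
    """Count logical CPUs by counting lines that start with 'processor'."""
    return sum(
        1 for line in cpuinfo_text.splitlines() if line.startswith("processor")
    )


def default_thread_count(fallback=4):
    """Pick a thread count from /proc/cpuinfo, else os.cpu_count(), else fallback."""
    try:
        with open("/proc/cpuinfo", encoding="utf-8", errors="replace") as f:
            n = count_processors(f.read())
    except OSError:
        # Not Linux, or the file is unreadable.
        n = 0
    if n == 0:
        n = os.cpu_count() or fallback
    return n


if __name__ == "__main__":
    print(default_thread_count())
```

Keeping the parser separate from the file access makes it easy to check against a pasted cpuinfo sample.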

2 years ago add pthread link to fix cmake build under linux (#114)
mmyjona [Fri, 17 Mar 2023 16:38:24 +0000 (00:38 +0800)]
add pthread link to fix cmake build under linux (#114)

* add pthread link to fix cmake build under linux

* add cmake to linux and macos platform

* separate make and cmake workflow

---------

Co-authored-by: Sebastián A <redacted>
2 years ago 🚀 Dockerize llamacpp (#132)
Bernat Vadell [Fri, 17 Mar 2023 09:47:06 +0000 (10:47 +0100)]
🚀 Dockerize llamacpp (#132)

* feat: dockerize llamacpp

* feat: split build & runtime stages

* split dockerfile into main & tools

* add quantize into tool docker image

* Update .devops/tools.sh

Co-authored-by: Georgi Gerganov <redacted>
* add docker action pipeline

* change CI to publish at github docker registry

* fix name: runs-on macOS-latest should be macos-latest (lowercase)

* include docker versioned images

* fix github action docker

* fix docker.yml

* feat: include all-in-one command tool & update readme.md

---------

Co-authored-by: Georgi Gerganov <redacted>
2 years ago Q4_1 quantization (#193)
Matvey Soloviev [Fri, 17 Mar 2023 04:48:39 +0000 (05:48 +0100)]
Q4_1 quantization (#193)

* Add AVX2 version of ggml_vec_dot_q4_1

* Small optimisations to q4_1 dot product (@Const-me)

* Rearrange Q4_1 quantization to work for multipart models. (Fix #152)

* Fix ggml_vec_mad_q4_1 too

* Fix non-vectorised q4_1 vec mul

2 years ago Update README.md
Georgi Gerganov [Thu, 16 Mar 2023 13:00:09 +0000 (15:00 +0200)]
Update README.md

2 years ago Expand "Contributing" section
Georgi Gerganov [Thu, 16 Mar 2023 06:55:13 +0000 (08:55 +0200)]
Expand "Contributing" section

2 years ago Update hot topics - RMSnorm
Georgi Gerganov [Thu, 16 Mar 2023 05:12:12 +0000 (07:12 +0200)]
Update hot topics - RMSnorm

2 years ago Fix RMS norm in GGML (#191)
Nebula [Wed, 15 Mar 2023 23:29:25 +0000 (19:29 -0400)]
Fix RMS norm in GGML (#191)

2 years ago Add RMS norm and use it (#187)
hoangmit [Wed, 15 Mar 2023 22:41:38 +0000 (18:41 -0400)]
Add RMS norm and use it (#187)

* add ggml_rms_norm

* update op num

2 years ago fixed typo (#178)
moritzbrantner [Wed, 15 Mar 2023 20:35:25 +0000 (21:35 +0100)]
fixed typo (#178)

2 years ago add SIGINT support for _WIN32 environments (#120)
Rickey Bowers Jr [Wed, 15 Mar 2023 19:56:24 +0000 (13:56 -0600)]
add SIGINT support for _WIN32 environments (#120)

* add SIGINT support for _WIN32 environments

* perhaps more consistent

2 years ago added ctx_size parameter (#148)
Justin Suess [Wed, 15 Mar 2023 19:42:40 +0000 (15:42 -0400)]
added ctx_size parameter (#148)

* added ctx_size parameter

* added it in more places

* Apply suggestions from code review

---------

Co-authored-by: Georgi Gerganov <redacted>
2 years ago fixed color reset on exit (#149)
Justin Suess [Wed, 15 Mar 2023 19:39:38 +0000 (15:39 -0400)]
fixed color reset on exit (#149)

* fixed color reset on exit

* added sigint handler for ansi_color_reset

* Update main.cpp

---------

Co-authored-by: Georgi Gerganov <redacted>
2 years ago Fix potential licensing issue (#126)
Musab Gultekin [Wed, 15 Mar 2023 19:39:06 +0000 (22:39 +0300)]
Fix potential licensing issue (#126)

* Update README.md

* Update README.md

remove facebook

2 years ago Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py...
Ronsor [Wed, 15 Mar 2023 19:37:50 +0000 (12:37 -0700)]
Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)

There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.

2 years ago inline -> static inline for "bytesFromNibbles" (#161)
hoangmit [Wed, 15 Mar 2023 19:05:14 +0000 (15:05 -0400)]
inline -> static inline for "bytesFromNibbles" (#161)

Without the "static" prefix, it fails to compile with clang

2 years ago Don't use vdotq_s32 if it's not available (#139)
Ronsor [Tue, 14 Mar 2023 19:34:37 +0000 (12:34 -0700)]
Don't use vdotq_s32 if it's not available (#139)

* Don't use vdotq_s32 if it's not available

`dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available.

Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined.

* Update ggml.c

---------

Co-authored-by: Georgi Gerganov <redacted>
2 years ago Add section to README on how to run the project on Android (#130)
Radoslav Gerganov [Tue, 14 Mar 2023 13:30:08 +0000 (15:30 +0200)]
Add section to README on how to run the project on Android (#130)

2 years ago Add Misc section + update hot topics + minor fixes
Georgi Gerganov [Tue, 14 Mar 2023 07:43:52 +0000 (09:43 +0200)]
Add Misc section + update hot topics + minor fixes

2 years ago Add windows to the CI (#98)
Sebastián A [Mon, 13 Mar 2023 20:29:10 +0000 (17:29 -0300)]
Add windows to the CI (#98)

2 years ago CMake build in Release by default (#75)
Georgi Gerganov [Mon, 13 Mar 2023 19:22:15 +0000 (21:22 +0200)]
CMake build in Release by default (#75)

2 years ago Update contribution section, hot topics, limitations, etc.
Georgi Gerganov [Mon, 13 Mar 2023 17:21:51 +0000 (19:21 +0200)]
Update contribution section, hot topics, limitations, etc.

2 years ago Print system information
Georgi Gerganov [Mon, 13 Mar 2023 17:15:08 +0000 (19:15 +0200)]
Print system information

2 years ago Initial support for CMake (#75)
Sebastián A [Mon, 13 Mar 2023 17:12:33 +0000 (14:12 -0300)]
Initial support for CMake (#75)

2 years ago Add NetBSD support. (#90)
Thomas Klausner [Mon, 13 Mar 2023 16:40:54 +0000 (17:40 +0100)]
Add NetBSD support. (#90)

2 years ago Use fprintf for diagnostic output (#48)
Pavol Rusnak [Mon, 13 Mar 2023 16:39:56 +0000 (17:39 +0100)]
Use fprintf for diagnostic output (#48)

keep printf only for printing model output

one can now use ./main ... 2>/dev/null to suppress any diagnostic output
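
The same split can be sketched in Python (the project itself is C++ and uses fprintf/printf): diagnostics go to stderr, model output goes to stdout, so redirecting stderr hides only the diagnostics. The helper names `log_diag` and `emit_output` are hypothetical.

```python
import sys


def log_diag(message, stream=None):
    """Write one diagnostic line to stderr (or to `stream` if given)."""
    print(message, file=sys.stderr if stream is None else stream)


def emit_output(text, stream=None):
    """Write model output to stdout (or to `stream`) without adding a newline."""
    out = sys.stdout if stream is None else stream
    out.write(text)
    out.flush()


if __name__ == "__main__":
    log_diag("loading model ...")
    emit_output("Hello from the model\n")
```

Running a script like this with `2>/dev/null` appended leaves only the second line on the terminal.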

2 years ago Use vdotq_s32 to improve performance (#67)
Georgi Gerganov [Mon, 13 Mar 2023 16:36:44 +0000 (18:36 +0200)]
Use vdotq_s32 to improve performance (#67)

* 10% performance boost on ARM

* Back to original change

2 years ago Reduce model loading time (#43)
uint256_t [Mon, 13 Mar 2023 16:33:43 +0000 (01:33 +0900)]
Reduce model loading time (#43)

* Use buffering

* Use vector

* Minor

---------

Co-authored-by: Georgi Gerganov <redacted>
2 years ago Fix UTF-8 handling (including colors) (#79)
Val Kharitonov [Mon, 13 Mar 2023 16:24:18 +0000 (12:24 -0400)]
Fix UTF-8 handling (including colors) (#79)

2 years ago Add quantize script for batch quantization (#92)
Pavol Rusnak [Mon, 13 Mar 2023 16:15:20 +0000 (17:15 +0100)]
Add quantize script for batch quantization (#92)

* Add quantize script for batch quantization

* Indentation

* README for new quantize.sh

* Fix script name

* Fix file list on Mac OS

---------

Co-authored-by: Georgi Gerganov <redacted>
2 years ago Add initial contribution guidelines
Georgi Gerganov [Mon, 13 Mar 2023 07:42:26 +0000 (09:42 +0200)]
Add initial contribution guidelines

2 years ago Gate signal support on being on a unixoid system. (#74)
Matvey Soloviev [Mon, 13 Mar 2023 03:08:01 +0000 (04:08 +0100)]
Gate signal support on being on a unixoid system. (#74)

2 years ago Fix token count accounting
Matvey Soloviev [Sun, 12 Mar 2023 23:35:51 +0000 (00:35 +0100)]
Fix token count accounting

2 years ago Revert "10% performance boost on ARM"
Georgi Gerganov [Sun, 12 Mar 2023 23:28:08 +0000 (01:28 +0200)]
Revert "10% performance boost on ARM"

This reverts commit 113a9e83ebc0f788f861394437087bf3ca0e019b.

There are some reports of illegal instruction errors.
Moved this stuff to the vdotq_s32 branch until resolved

2 years ago Check for vdotq_s32 availability
Georgi Gerganov [Sun, 12 Mar 2023 23:21:03 +0000 (01:21 +0200)]
Check for vdotq_s32 availability

2 years ago Amend to previous commit - forgot to update non-QRDMX branch
Georgi Gerganov [Sun, 12 Mar 2023 23:05:24 +0000 (01:05 +0200)]
Amend to previous commit - forgot to update non-QRDMX branch

2 years ago 10% performance boost on ARM
Georgi Gerganov [Sun, 12 Mar 2023 22:56:10 +0000 (00:56 +0200)]
10% performance boost on ARM

2 years ago Fix color getting reset before prompt output done (#65)
Matvey Soloviev [Sun, 12 Mar 2023 22:07:34 +0000 (23:07 +0100)]
Fix color getting reset before prompt output done (#65)

(cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)

2 years ago Update README.md
Georgi Gerganov [Sun, 12 Mar 2023 21:39:01 +0000 (23:39 +0200)]
Update README.md

2 years ago Add interactive mode (#61)
Matvey Soloviev [Sun, 12 Mar 2023 21:13:28 +0000 (22:13 +0100)]
Add interactive mode (#61)

* Initial work on interactive mode.

* Improve interactive mode. Make rev. prompt optional.

* Update README to explain interactive mode.

* Fix OS X build

2 years ago Fix typo in README (#45)
Marc Köhlbrugge [Sun, 12 Mar 2023 20:30:08 +0000 (03:30 +0700)]
Fix typo in README (#45)

2 years ago Allow using prompt files (#59)
Ben Garney [Sun, 12 Mar 2023 20:28:36 +0000 (13:28 -0700)]
Allow using prompt files (#59)

2 years ago Add back top_k (#56)
beiller [Sun, 12 Mar 2023 20:23:15 +0000 (16:23 -0400)]
Add back top_k (#56)

* Add back top_k

* Update utils.cpp

* Update utils.h

---------

Co-authored-by: Bill Hamilton <redacted>
Co-authored-by: Georgi Gerganov <redacted>
2 years ago Windows fixes (#31)
Sebastián A [Sun, 12 Mar 2023 20:15:00 +0000 (17:15 -0300)]
Windows fixes (#31)

* Apply fixes suggested to build on windows

Issue: https://github.com/ggerganov/llama.cpp/issues/22

* Remove unsupported VLAs

* MSVC: Remove features that are only available on MSVC C++20.

* Fix zero initialization of the other fields.

* Change the use of vector for stack allocations.

2 years ago Update README.md
Georgi Gerganov [Sun, 12 Mar 2023 20:09:26 +0000 (22:09 +0200)]
Update README.md

2 years ago Add CI (#60)
Georgi Gerganov [Sun, 12 Mar 2023 20:08:24 +0000 (22:08 +0200)]
Add CI (#60)

2 years ago Revert "weights_only" arg - this is causing more trouble than help
Georgi Gerganov [Sun, 12 Mar 2023 18:59:01 +0000 (20:59 +0200)]
Revert "weights_only" arg - this is causing more trouble than help

2 years ago python/pytorch compat notes (#44)
Oleksandr Nikitin [Sun, 12 Mar 2023 12:16:33 +0000 (14:16 +0200)]
python/pytorch compat notes (#44)

2 years ago Add repetition penalty (#20)
beiller [Sun, 12 Mar 2023 09:27:42 +0000 (05:27 -0400)]
Add repetition penalty (#20)

* Adding repeat penalization

* Update utils.h

* Update utils.cpp

* Numeric fix

Should probably still scale by temp even if penalized

* Update comments, more proper application

I see that the numbers can go negative, so this applies a fix from a referenced commit

* Minor formatting

---------

Co-authored-by: Georgi Gerganov <redacted>
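
One reading of the bullets above, sketched in Python rather than the project's C++: logits of recently generated tokens are penalized, temperature scaling is applied either way, and negative logits are multiplied by the penalty instead of divided so that they also move away from being selected. The function name, parameter names, and default values are hypothetical.

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.3, temp=0.8):
    """Return temperature-scaled logits with a penalty on recent tokens.

    logits        : list of floats indexed by token id
    recent_tokens : iterable of token ids seen recently
    """
    scale = 1.0 / temp
    recent = set(recent_tokens)
    adjusted = []
    for token_id, logit in enumerate(logits):
        if token_id in recent:
            if logit < 0.0:
                # Dividing a negative logit would raise it; multiply instead.
                adjusted.append(logit * scale * penalty)
            else:
                adjusted.append(logit * scale / penalty)
        else:
            adjusted.append(logit * scale)
    return adjusted


if __name__ == "__main__":
    print(apply_repeat_penalty([2.0, -2.0, 1.0], [0, 1], penalty=2.0, temp=1.0))
```

With a penalty of 2.0 and temperature 1.0, a repeated token with logit 2.0 drops to 1.0 and one with logit -2.0 drops to -4.0, so both become less likely.
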
2 years ago Clarify meaning of hacking
Georgi Gerganov [Sun, 12 Mar 2023 07:03:25 +0000 (09:03 +0200)]
Clarify meaning of hacking

2 years ago README: add "Supported platforms" + update hot topics
Georgi Gerganov [Sun, 12 Mar 2023 06:41:54 +0000 (08:41 +0200)]
README: add "Supported platforms" + update hot topics

2 years ago use weights_only in conversion script (#32)
deepdiffuser [Sun, 12 Mar 2023 06:36:35 +0000 (22:36 -0800)]
use weights_only in conversion script (#32)

This prevents malicious weights from executing arbitrary code by restricting the unpickler to loading only tensors, primitive types, and dictionaries.

2 years ago Add LICENSE (#21)
Pavol Rusnak [Sun, 12 Mar 2023 06:36:03 +0000 (07:36 +0100)]
Add LICENSE (#21)

2 years ago Update README.md
Georgi Gerganov [Sat, 11 Mar 2023 23:26:32 +0000 (01:26 +0200)]
Update README.md

2 years ago Fix a typo in model name (#16)
Juraj Bednar [Sat, 11 Mar 2023 17:32:20 +0000 (18:32 +0100)]
Fix a typo in model name (#16)

2 years ago Update README.md
Georgi Gerganov [Sat, 11 Mar 2023 16:10:18 +0000 (18:10 +0200)]
Update README.md

2 years ago Add AVX2 support for x86 architectures thanks to @Const-me !
Georgi Gerganov [Sat, 11 Mar 2023 15:58:18 +0000 (17:58 +0200)]
Add AVX2 support for x86 architectures thanks to @Const-me !

2 years ago Fix un-initialized FP16 tables on x86 (#15, #2)
Georgi Gerganov [Sat, 11 Mar 2023 15:40:14 +0000 (17:40 +0200)]
Fix un-initialized FP16 tables on x86 (#15, #2)

2 years ago Bump memory buffer
Georgi Gerganov [Sat, 11 Mar 2023 10:44:21 +0000 (12:44 +0200)]
Bump memory buffer

2 years ago Update README.md
Georgi Gerganov [Sat, 11 Mar 2023 10:31:21 +0000 (12:31 +0200)]
Update README.md

2 years ago .gitignore models/
Georgi Gerganov [Sat, 11 Mar 2023 10:26:46 +0000 (12:26 +0200)]
.gitignore models/

2 years ago Update Makefile var + add comment
Georgi Gerganov [Sat, 11 Mar 2023 10:26:16 +0000 (12:26 +0200)]
Update Makefile var + add comment

2 years ago Update README.md
Georgi Gerganov [Sat, 11 Mar 2023 09:34:25 +0000 (11:34 +0200)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Sat, 11 Mar 2023 09:34:11 +0000 (11:34 +0200)]
Update README.md

2 years ago Support all LLaMA models + change Q4_0 quantization storage
Georgi Gerganov [Sat, 11 Mar 2023 08:47:09 +0000 (10:47 +0200)]
Support all LLaMA models + change Q4_0 quantization storage

2 years ago Include Python dependencies in README (#6)
Simon Willison [Sat, 11 Mar 2023 05:47:26 +0000 (21:47 -0800)]
Include Python dependencies in README (#6)

2 years ago Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 23:30:47 +0000 (01:30 +0200)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 23:22:58 +0000 (01:22 +0200)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 23:18:10 +0000 (01:18 +0200)]
Update README.md

2 years ago Add missing headers for memcpy and assert (#3)
Jean-Michaël Celerier [Fri, 10 Mar 2023 23:04:06 +0000 (18:04 -0500)]
Add missing headers for memcpy and assert (#3)

2 years ago Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 22:55:22 +0000 (00:55 +0200)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 22:51:46 +0000 (00:51 +0200)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 22:09:19 +0000 (00:09 +0200)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 21:53:11 +0000 (23:53 +0200)]
Update README.md

2 years ago Fix a bug in the rope calculation
Georgi Gerganov [Fri, 10 Mar 2023 21:46:39 +0000 (23:46 +0200)]
Fix a bug in the rope calculation

2 years ago Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 19:52:27 +0000 (21:52 +0200)]
Update README.md

2 years ago Final touches
Georgi Gerganov [Fri, 10 Mar 2023 19:50:46 +0000 (21:50 +0200)]
Final touches

2 years ago Create README.md
Georgi Gerganov [Fri, 10 Mar 2023 19:47:46 +0000 (21:47 +0200)]
Create README.md

2 years ago Initial release
Georgi Gerganov [Fri, 10 Mar 2023 18:40:58 +0000 (20:40 +0200)]
Initial release