git.djapps.eu Git - pkg/ggml/sources/llama.cpp/log
Alex Nguyen [Sat, 18 Mar 2023 13:51:49 +0000 (20:51 +0700)]
Remove unused code since n_vocab is model.hparams.n_vocab (#262)
Justin Suess [Sat, 18 Mar 2023 11:44:09 +0000 (07:44 -0400)]
fixed warning with std::ignore about unused function result (#151)
fixed warning with std::ignore about unused function result
Gary Linscott [Sat, 18 Mar 2023 11:17:19 +0000 (04:17 -0700)]
Fix n^2 loop in tokenization (#254)
This causes long prompts to parse very slowly.
anzz1 [Sat, 18 Mar 2023 07:27:12 +0000 (09:27 +0200)]
CI Improvements (#230)
* CI Improvements
Manual build feature, autoreleases for Windows
* better CI naming convention
use branch name in releases and tags
Niklas Korz [Fri, 17 Mar 2023 22:03:48 +0000 (23:03 +0100)]
Nix flake (#40)
* Nix flake
* Nix: only add Accelerate framework on macOS
* Nix: development shell, direnv and compatibility
* Nix: use python packages supplied by withPackages
* Nix: remove channel compatibility
* Nix: fix ARM neon dotproduct on macOS
---------
Co-authored-by: Pavol Rusnak <redacted>
thement [Fri, 17 Mar 2023 20:05:58 +0000 (21:05 +0100)]
Implement non-greedy tokenizer that tries to maximize token lengths (#242)
* Implement non-greedy tokenizer that tries to maximize token lengths
* Insert single space in front of the prompt
- this is to match original llama tokenizer behavior
---------
Co-authored-by: Jakub Horak <redacted>
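One way to read "maximize token lengths" is as a dynamic program over prefix lengths. The sketch below is illustrative only and is not taken from the commit: the function name `tokenize_max_length`, the toy vocabulary, and the squared-length score are assumptions chosen to show why this differs from a greedy longest-match tokenizer.

```python
def tokenize_max_length(text, vocab):
    """Split text into vocabulary tokens, preferring fewer, longer tokens.

    Hypothetical helper: score[i] is the best sum of squared token lengths
    found for text[:i], and prev[i] is where the last token of that split starts.
    """
    n = len(text)
    max_len = max((len(t) for t in vocab), default=0)
    score = [-1] * (n + 1)  # -1 marks a prefix that cannot be covered
    prev = [0] * (n + 1)
    score[0] = 0

    # Forward pass: extend every reachable prefix by every matching token.
    for i in range(n):
        if score[i] < 0:
            continue
        for length in range(1, min(max_len, n - i) + 1):
            piece = text[i:i + length]
            if piece in vocab:
                candidate = score[i] + length * length
                if candidate > score[i + length]:
                    score[i + length] = candidate
                    prev[i + length] = i

    if score[n] < 0:
        raise ValueError("text cannot be covered by the vocabulary")

    # Backward pass: walk the prev links to recover the chosen pieces.
    pieces = []
    i = n
    while i > 0:
        pieces.append(text[prev[i]:i])
        i = prev[i]
    pieces.reverse()
    return [vocab[p] for p in pieces]


# Toy vocabulary (an assumption for illustration): token string -> token id.
toy_vocab = {"ab": 0, "c": 1, "a": 2, "bcd": 3, "d": 4}
print(tokenize_max_length("abcd", toy_vocab))
```

With this toy vocabulary a greedy longest-match pass would pick "ab", "c", "d", while the scored split picks "a", "bcd". The leading space mentioned in the commit would be added by the caller, for example `tokenize_max_length(" " + prompt, vocab)`.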
Georgi Gerganov [Fri, 17 Mar 2023 19:46:46 +0000 (21:46 +0200)]
Default to 4 threads (#243)
Georgi Gerganov [Fri, 17 Mar 2023 18:30:04 +0000 (20:30 +0200)]
Update Contributing section
Stephan Walter [Fri, 17 Mar 2023 17:47:35 +0000 (17:47 +0000)]
Don't tell users to use a bad number of threads (#243)
The readme tells people to use the command line option "-t 8", causing 8
threads to be started. On systems with fewer than 8 cores, this causes a
significant slowdown. Remove the option from the example command lines
and use /proc/cpuinfo on Linux to determine a sensible default.
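A minimal sketch of the idea described above, assuming the default is derived by counting `processor` entries in `/proc/cpuinfo`. The function names and the fallback value of 4 are assumptions for illustration, not the project's code.

```python
def count_processors(cpuinfo_text):
    """Count logical CPUs listed in the text of /proc/cpuinfo."""
    return sum(
        1 for line in cpuinfo_text.splitlines() if line.startswith("processor")
    )


def default_thread_count(cpuinfo_path="/proc/cpuinfo", fallback=4):
    """Return a thread count based on /proc/cpuinfo, or `fallback` elsewhere.

    The fallback of 4 is an assumed value used when the file is missing
    (for example on non-Linux systems) or lists no processors.
    """
    try:
        with open(cpuinfo_path, encoding="utf-8") as f:
            n = count_processors(f.read())
    except OSError:
        return fallback
    return n if n > 0 else fallback


print(default_thread_count())
```

Deriving the default from the machine avoids oversubscribing systems with fewer cores than a hardcoded `-t 8` would assume.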
mmyjona [Fri, 17 Mar 2023 16:38:24 +0000 (00:38 +0800)]
add pthread link to fix cmake build under linux (#114)
* add pthread link to fix cmake build under linux
* add cmake to linux and macos platform
* separate make and cmake workflow
---------
Co-authored-by: Sebastián A <redacted>
Bernat Vadell [Fri, 17 Mar 2023 09:47:06 +0000 (10:47 +0100)]
Dockerize llamacpp (#132)
* feat: dockerize llamacpp
* feat: split build & runtime stages
* split dockerfile into main & tools
* add quantize into tool docker image
* Update .devops/tools.sh
Co-authored-by: Georgi Gerganov <redacted>
* add docker action pipeline
* change CI to publish at github docker registry
* fix name runs-on macOS-latest is macos-latest (lowercase)
* include docker versioned images
* fix github action docker
* fix docker.yml
* feat: include all-in-one command tool & update readme.md
---------
Co-authored-by: Georgi Gerganov <redacted>
Matvey Soloviev [Fri, 17 Mar 2023 04:48:39 +0000 (05:48 +0100)]
Q4_1 quantization (#193)
* Add AVX2 version of ggml_vec_dot_q4_1
* Small optimisations to q4_1 dot product (@Const-me)
* Rearrange Q4_1 quantization to work for multipart models. (Fix #152)
* Fix ggml_vec_mad_q4_1 too
* Fix non-vectorised q4_1 vec mul
Georgi Gerganov [Thu, 16 Mar 2023 13:00:09 +0000 (15:00 +0200)]
Update README.md
Georgi Gerganov [Thu, 16 Mar 2023 06:55:13 +0000 (08:55 +0200)]
Expand "Contributing" section
Georgi Gerganov [Thu, 16 Mar 2023 05:12:12 +0000 (07:12 +0200)]
Update hot topics - RMSnorm
Nebula [Wed, 15 Mar 2023 23:29:25 +0000 (19:29 -0400)]
Fix RMS norm in GGML (#191)
hoangmit [Wed, 15 Mar 2023 22:41:38 +0000 (18:41 -0400)]
Add RMS norm and use it (#187)
* add ggml_rms_norm
* update op num
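For reference, RMS normalization divides a vector by its root mean square and, unlike layer normalization, does not subtract the mean first. The sketch below is a plain-Python illustration, not the `ggml_rms_norm` implementation; the `eps` value is an assumption.

```python
import math


def rms_norm(x, eps=1e-6):
    """Scale x so that its root mean square is approximately 1.

    eps (an assumed value) guards against division by zero.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]


y = rms_norm([3.0, 4.0])
print(y)
```

In a transformer layer the normalized vector is then typically multiplied elementwise by a learned weight; that step is omitted here.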
moritzbrantner [Wed, 15 Mar 2023 20:35:25 +0000 (21:35 +0100)]
fixed typo (#178)
Rickey Bowers Jr [Wed, 15 Mar 2023 19:56:24 +0000 (13:56 -0600)]
add SIGINT support for _WIN32 environments (#120)
* add SIGINT support for _WIN32 environments
* perhaps more consistent
Justin Suess [Wed, 15 Mar 2023 19:42:40 +0000 (15:42 -0400)]
added ctx_size parameter (#148)
* added ctx_size parameter
* added it in more places
* Apply suggestions from code review
---------
Co-authored-by: Georgi Gerganov <redacted>
Justin Suess [Wed, 15 Mar 2023 19:39:38 +0000 (15:39 -0400)]
fixed color reset on exit (#149)
* fixed color reset on exit
* added sigint handler for ansi_color_reset
* Update main.cpp
---------
Co-authored-by: Georgi Gerganov <redacted>
Musab Gultekin [Wed, 15 Mar 2023 19:39:06 +0000 (22:39 +0300)]
Fix potential licensing issue (#126)
* Update README.md
* Update README.md
remove facebook
Ronsor [Wed, 15 Mar 2023 19:37:50 +0000 (12:37 -0700)]
Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)
There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.
hoangmit [Wed, 15 Mar 2023 19:05:14 +0000 (15:05 -0400)]
inline -> static inline for "bytesFromNibbles" (#161)
Without "static" prefix, it fails to compile in clang
Ronsor [Tue, 14 Mar 2023 19:34:37 +0000 (12:34 -0700)]
Don't use vdotq_s32 if it's not available (#139)
* Don't use vdotq_s32 if it's not available
`dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available.
Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined.
* Update ggml.c
---------
Co-authored-by: Georgi Gerganov <redacted>
Radoslav Gerganov [Tue, 14 Mar 2023 13:30:08 +0000 (15:30 +0200)]
Add section to README on how to run the project on Android (#130)
Georgi Gerganov [Tue, 14 Mar 2023 07:43:52 +0000 (09:43 +0200)]
Add Misc section + update hot topics + minor fixes
Sebastián A [Mon, 13 Mar 2023 20:29:10 +0000 (17:29 -0300)]
Add windows to the CI (#98)
Georgi Gerganov [Mon, 13 Mar 2023 19:22:15 +0000 (21:22 +0200)]
CMake build in Release by default (#75)
Georgi Gerganov [Mon, 13 Mar 2023 17:21:51 +0000 (19:21 +0200)]
Update contribution section, hot topics, limitations, etc.
Georgi Gerganov [Mon, 13 Mar 2023 17:15:08 +0000 (19:15 +0200)]
Print system information
Sebastián A [Mon, 13 Mar 2023 17:12:33 +0000 (14:12 -0300)]
Initial support for CMake (#75)
Thomas Klausner [Mon, 13 Mar 2023 16:40:54 +0000 (17:40 +0100)]
Add NetBSD support. (#90)
Pavol Rusnak [Mon, 13 Mar 2023 16:39:56 +0000 (17:39 +0100)]
Use fprintf for diagnostic output (#48)
keep printf only for printing model output
one can now use ./main ... 2>/dev/null to suppress any diagnostic output
Georgi Gerganov [Mon, 13 Mar 2023 16:36:44 +0000 (18:36 +0200)]
Use vdotq_s32 to improve performance (#67)
* 10% performance boost on ARM
* Back to original change
uint256_t [Mon, 13 Mar 2023 16:33:43 +0000 (01:33 +0900)]
Reduce model loading time (#43)
* Use buffering
* Use vector
* Minor
---------
Co-authored-by: Georgi Gerganov <redacted>
Val Kharitonov [Mon, 13 Mar 2023 16:24:18 +0000 (12:24 -0400)]
Fix UTF-8 handling (including colors) (#79)
Pavol Rusnak [Mon, 13 Mar 2023 16:15:20 +0000 (17:15 +0100)]
Add quantize script for batch quantization (#92)
* Add quantize script for batch quantization
* Indentation
* README for new quantize.sh
* Fix script name
* Fix file list on Mac OS
---------
Co-authored-by: Georgi Gerganov <redacted>
Georgi Gerganov [Mon, 13 Mar 2023 07:42:26 +0000 (09:42 +0200)]
Add initial contribution guidelines
Matvey Soloviev [Mon, 13 Mar 2023 03:08:01 +0000 (04:08 +0100)]
Gate signal support on being on a unixoid system. (#74)
Matvey Soloviev [Sun, 12 Mar 2023 23:35:51 +0000 (00:35 +0100)]
Fix token count accounting
Georgi Gerganov [Sun, 12 Mar 2023 23:28:08 +0000 (01:28 +0200)]
Revert "10% performance boost on ARM"
This reverts commit 113a9e83ebc0f788f861394437087bf3ca0e019b.
There are some reports of illegal instruction errors.
Moved this stuff to the vdotq_s32 branch until resolved
Georgi Gerganov [Sun, 12 Mar 2023 23:21:03 +0000 (01:21 +0200)]
Check for vdotq_s32 availability
Georgi Gerganov [Sun, 12 Mar 2023 23:05:24 +0000 (01:05 +0200)]
Amend to previous commit - forgot to update non-QRDMX branch
Georgi Gerganov [Sun, 12 Mar 2023 22:56:10 +0000 (00:56 +0200)]
10% performance boost on ARM
Matvey Soloviev [Sun, 12 Mar 2023 22:07:34 +0000 (23:07 +0100)]
Fix color getting reset before prompt output done (#65)
(cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)
Georgi Gerganov [Sun, 12 Mar 2023 21:39:01 +0000 (23:39 +0200)]
Update README.md
Matvey Soloviev [Sun, 12 Mar 2023 21:13:28 +0000 (22:13 +0100)]
Add interactive mode (#61)
* Initial work on interactive mode.
* Improve interactive mode. Make rev. prompt optional.
* Update README to explain interactive mode.
* Fix OS X build
Marc Köhlbrugge [Sun, 12 Mar 2023 20:30:08 +0000 (03:30 +0700)]
Fix typo in README (#45)
Ben Garney [Sun, 12 Mar 2023 20:28:36 +0000 (13:28 -0700)]
Allow using prompt files (#59)
beiller [Sun, 12 Mar 2023 20:23:15 +0000 (16:23 -0400)]
Add back top_k (#56)
* Add back top_k
* Update utils.cpp
* Update utils.h
---------
Co-authored-by: Bill Hamilton <redacted>
Co-authored-by: Georgi Gerganov <redacted>
Sebastián A [Sun, 12 Mar 2023 20:15:00 +0000 (17:15 -0300)]
Windows fixes (#31)
* Apply fixes suggested to build on windows
Issue: https://github.com/ggerganov/llama.cpp/issues/22
* Remove unsupported VLAs
* MSVC: Remove features that are only available on MSVC C++20.
* Fix zero initialization of the other fields.
* Change the use of vector for stack allocations.
Georgi Gerganov [Sun, 12 Mar 2023 20:09:26 +0000 (22:09 +0200)]
Update README.md
Georgi Gerganov [Sun, 12 Mar 2023 20:08:24 +0000 (22:08 +0200)]
Add CI (#60)
Georgi Gerganov [Sun, 12 Mar 2023 18:59:01 +0000 (20:59 +0200)]
Revert "weights_only" arg - this is causing more trouble than help
Oleksandr Nikitin [Sun, 12 Mar 2023 12:16:33 +0000 (14:16 +0200)]
python/pytorch compat notes (#44)
beiller [Sun, 12 Mar 2023 09:27:42 +0000 (05:27 -0400)]
Add repetition penalty (#20)
* Adding repeat penalization
* Update utils.h
* Update utils.cpp
* Numeric fix
Should probably still scale by temp even if penalized
* Update comments, more proper application
I see that numbers can go negative so a fix from a referenced commit
* Minor formatting
---------
Co-authored-by: Georgi Gerganov <redacted>
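The two notes in this commit (still scale by temperature, and handle negative values) can be illustrated with a short sketch. It is an assumption-laden illustration rather than the code in `utils.cpp`: the function name, the parameter names, and the use of a set for the recent-token window are all hypothetical.

```python
def apply_repeat_penalty(logits, recent_tokens, repeat_penalty, temp):
    """Return temperature-scaled logits with recently seen tokens penalized.

    Dividing a negative logit by a penalty > 1 would move it toward zero and
    make the token more likely, so negative logits are multiplied instead.
    """
    scale = 1.0 / temp
    recent = set(recent_tokens)
    out = []
    for token_id, logit in enumerate(logits):
        if token_id in recent:
            if logit < 0.0:
                out.append(logit * scale * repeat_penalty)
            else:
                out.append(logit * scale / repeat_penalty)
        else:
            out.append(logit * scale)
    return out


print(apply_repeat_penalty([2.0, -2.0, 1.0], [0, 1], 2.0, 1.0))
```

Penalized and unpenalized tokens both go through the same temperature scaling, so the penalty changes only the relative likelihood of repeated tokens.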
Georgi Gerganov [Sun, 12 Mar 2023 07:03:25 +0000 (09:03 +0200)]
Clarify meaning of hacking
Georgi Gerganov [Sun, 12 Mar 2023 06:41:54 +0000 (08:41 +0200)]
README: add "Supported platforms" + update hot topics
deepdiffuser [Sun, 12 Mar 2023 06:36:35 +0000 (22:36 -0800)]
use weights_only in conversion script (#32)
this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries
Pavol Rusnak [Sun, 12 Mar 2023 06:36:03 +0000 (07:36 +0100)]
Add LICENSE (#21)
Georgi Gerganov [Sat, 11 Mar 2023 23:26:32 +0000 (01:26 +0200)]
Update README.md
Juraj Bednar [Sat, 11 Mar 2023 17:32:20 +0000 (18:32 +0100)]
Fix a typo in model name (#16)
Georgi Gerganov [Sat, 11 Mar 2023 16:10:18 +0000 (18:10 +0200)]
Update README.md
Georgi Gerganov [Sat, 11 Mar 2023 15:58:18 +0000 (17:58 +0200)]
Add AVX2 support for x86 architectures thanks to @Const-me !
Georgi Gerganov [Sat, 11 Mar 2023 15:40:14 +0000 (17:40 +0200)]
Fix un-initialized FP16 tables on x86 (#15, #2)
Georgi Gerganov [Sat, 11 Mar 2023 10:44:21 +0000 (12:44 +0200)]
Bump memory buffer
Georgi Gerganov [Sat, 11 Mar 2023 10:31:21 +0000 (12:31 +0200)]
Update README.md
Georgi Gerganov [Sat, 11 Mar 2023 10:26:46 +0000 (12:26 +0200)]
.gitignore models/
Georgi Gerganov [Sat, 11 Mar 2023 10:26:16 +0000 (12:26 +0200)]
Update Makefile var + add comment
Georgi Gerganov [Sat, 11 Mar 2023 09:34:25 +0000 (11:34 +0200)]
Update README.md
Georgi Gerganov [Sat, 11 Mar 2023 09:34:11 +0000 (11:34 +0200)]
Update README.md
Georgi Gerganov [Sat, 11 Mar 2023 08:47:09 +0000 (10:47 +0200)]
Support all LLaMA models + change Q4_0 quantization storage
Simon Willison [Sat, 11 Mar 2023 05:47:26 +0000 (21:47 -0800)]
Include Python dependencies in README (#6)
Georgi Gerganov [Fri, 10 Mar 2023 23:30:47 +0000 (01:30 +0200)]
Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 23:22:58 +0000 (01:22 +0200)]
Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 23:18:10 +0000 (01:18 +0200)]
Update README.md
Jean-Michaël Celerier [Fri, 10 Mar 2023 23:04:06 +0000 (18:04 -0500)]
Add missing headers for memcpy and assert (#3)
Georgi Gerganov [Fri, 10 Mar 2023 22:55:22 +0000 (00:55 +0200)]
Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 22:51:46 +0000 (00:51 +0200)]
Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 22:09:19 +0000 (00:09 +0200)]
Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 21:53:11 +0000 (23:53 +0200)]
Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 21:46:39 +0000 (23:46 +0200)]
Fix a bug in the rope calculation
Georgi Gerganov [Fri, 10 Mar 2023 19:52:27 +0000 (21:52 +0200)]
Update README.md
Georgi Gerganov [Fri, 10 Mar 2023 19:50:46 +0000 (21:50 +0200)]
Final touches
Georgi Gerganov [Fri, 10 Mar 2023 19:47:46 +0000 (21:47 +0200)]
Create README.md
Georgi Gerganov [Fri, 10 Mar 2023 18:40:58 +0000 (20:40 +0200)]
Initial release