git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/log
Georgi Gerganov [Sat, 29 Apr 2023 07:42:14 +0000 (10:42 +0300)]
whisper : fix bug from previous commit
Georgi Gerganov [Sat, 29 Apr 2023 07:36:50 +0000 (10:36 +0300)]
whisper : avoid designated initializers
AsukaMinato [Sat, 29 Apr 2023 07:06:25 +0000 (16:06 +0900)]
minor : improve C++ and Python style (#768)
* use some STL functions
* use self.field rather than setattr, use pathlib.Path
* recover some format
* const some iter
* Keep the original
* 2 space
Georgi Gerganov [Fri, 28 Apr 2023 19:41:29 +0000 (22:41 +0300)]
readme : add logo
Laytan Laats [Sun, 23 Apr 2023 16:01:59 +0000 (18:01 +0200)]
main : escape quotes in csv output (#815)
Taras Glek [Sun, 23 Apr 2023 14:00:30 +0000 (17:00 +0300)]
stream : flush upon finishing inference (#811)
Philipp Zabel [Sun, 23 Apr 2023 13:52:52 +0000 (15:52 +0200)]
examples : add missing #include <cstdint> (#798)
common.cpp uses uint8_t and uint64_t, which are defined in <cstdint>.
Tauseef Mohiuddin [Sun, 23 Apr 2023 13:47:30 +0000 (08:47 -0500)]
main : update escape_double_quotes() function (#776)
Updated the escape_double_quotes() function such that the function now escapes both double quotes and backslashes in the input string.
Changes Made:
- Renamed the function to escape_quotes_and_backslashes
- Modified the condition in the first loop to increment the value of 'escaped_length' for both double quotes and backslashes.
- Modified the condition in the second loop to add a backslash before the current character if it is a double quote or a backslash.
Resolves: #769
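The two-loop approach described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `std::string` signature is an assumption made to keep the example short, and only the function name and the `escaped_length` counter come from the message above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical re-implementation based only on the commit description;
// the std::string-based signature is an assumption.
std::string escape_quotes_and_backslashes(const std::string & str) {
    // First loop: compute the escaped length. Every double quote and every
    // backslash needs one extra character (the escaping backslash).
    std::size_t escaped_length = str.size();
    for (char c : str) {
        if (c == '"' || c == '\\') {
            escaped_length++;
        }
    }

    std::string escaped;
    escaped.reserve(escaped_length);

    // Second loop: copy the input, adding a backslash before each
    // double quote or backslash.
    for (char c : str) {
        if (c == '"' || c == '\\') {
            escaped += '\\';
        }
        escaped += c;
    }

    return escaped;
}
```

With this sketch, the input `say "hi"` becomes `say \"hi\"` and `a\b` becomes `a\\b`, while a string without quotes or backslashes is returned unchanged.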
Georgi Gerganov [Sat, 15 Apr 2023 14:30:44 +0000 (17:30 +0300)]
release : v1.3.0
Georgi Gerganov [Sat, 15 Apr 2023 14:18:43 +0000 (17:18 +0300)]
whisper : pad audio instead of spectrogram (#579)
Also, fall back only if more temperatures are available and if we are
at least 3 seconds before the end of the audio
Georgi Gerganov [Sat, 15 Apr 2023 13:04:07 +0000 (16:04 +0300)]
whisper : restore decoder temperature fallbacks
I disabled this because there were many complaints about slow decoding.
The current implementation does not allow batching the decoders when
using the "best of" or "beam size" parameters, so the decoding time is
proportional to the number of decoders, which is obviously not great.
However, now there are even more complaints about wrong decodings and
repetition.
So, making a compromise by re-enabling the fallbacks, but defaulting to
just 2 "best of" / "beam size" decoders. Also, the temperature step is
increased from 0.2 to 0.4 - i.e. from maximum of 5 fallbacks to maximum
of 2.
Also, the stream example now has fallbacks enabled by default.
close #471 #477 #508 #612 #719 #731
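The fallback counts quoted in this message (5 for a step of 0.2, 2 for a step of 0.4) follow from simple arithmetic if the temperature starts at 0.0 and is raised by the step until it would exceed 1.0. Both bounds are assumptions inferred from those numbers, and the helper names below are hypothetical; this is a sketch of the arithmetic, not the decoder code.

```cpp
#include <vector>

// Hypothetical helper: list the temperatures tried by the decoder,
// assuming a schedule t_start, t_start + t_inc, ... capped at t_max.
std::vector<float> temperature_schedule(float t_start, float t_inc, float t_max = 1.0f) {
    std::vector<float> temps;
    if (t_inc <= 0.0f) {
        // No positive step means no fallbacks: only the initial temperature.
        temps.push_back(t_start);
        return temps;
    }
    // Index-based loop to avoid accumulating floating-point rounding error.
    for (int i = 0; ; ++i) {
        const float t = t_start + i * t_inc;
        if (t > t_max + 1e-6f) {
            break;
        }
        temps.push_back(t);
    }
    return temps;
}

// Every temperature after the first one is a fallback.
// Assumes the schedule starts at 0.0 (an assumption, see above).
int max_fallbacks(float t_inc) {
    return static_cast<int>(temperature_schedule(0.0f, t_inc).size()) - 1;
}
```

Under these assumptions, `max_fallbacks(0.2f)` gives 5 (temperatures 0.0, 0.2, 0.4, 0.6, 0.8, 1.0) and `max_fallbacks(0.4f)` gives 2 (temperatures 0.0, 0.4, 0.8), matching the numbers in the message.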
Jhen-Jie Hong [Sat, 15 Apr 2023 11:21:58 +0000 (19:21 +0800)]
ggml, ci : fix build on whisper.android (ARM_NEON) + add CI (#764)
* ggml : fix undefined symbol by removing inline handle
* ggml : make own ggml_aligned_malloc function
* ci: add ios/android build
Georgi Gerganov [Sat, 15 Apr 2023 11:18:46 +0000 (14:18 +0300)]
whisper : slightly faster Log Mel computation + n-1 FFT threads (#568)
Georgi Gerganov [Sat, 15 Apr 2023 10:30:36 +0000 (13:30 +0300)]
readme : fix link
Georgi Gerganov [Sat, 15 Apr 2023 10:30:07 +0000 (13:30 +0300)]
readme : add usage instructions for Core ML
Georgi Gerganov [Sat, 15 Apr 2023 10:21:27 +0000 (13:21 +0300)]
whisper : add Core ML support (#566)
* coreml : use Core ML encoder inference
* coreml : simplify whisper_encode + log messages
* whisper : resolve rebase conflicts
* coreml : add scripts for CoreML model generation
* bench-all : recognize COREML flag
Maximiliano Levi [Fri, 14 Apr 2023 19:35:34 +0000 (16:35 -0300)]
whisper : do not launch log_mel threads when n_thread is 1 (#763)
AfryMask [Fri, 14 Apr 2023 17:35:03 +0000 (01:35 +0800)]
whisper : fix the bug related to word splitting errors in the "tokenize" function. (#760)
Co-authored-by: AfryMask <redacted>
Aaron Taylor [Fri, 14 Apr 2023 17:24:00 +0000 (13:24 -0400)]
readme : add SwiftWhisper to listed bindings (#755)
Georgi Gerganov [Fri, 14 Apr 2023 17:13:47 +0000 (20:13 +0300)]
gitignore : add .test
Bader-eddine Ouaich [Fri, 14 Apr 2023 17:05:56 +0000 (17:05 +0000)]
whisper : fix potential memory leaks (#740)
* fix potential memory leak if whisper_init_state failed
* fix potential memory leak if gpt2_init failed
Anton Kostin [Fri, 14 Apr 2023 17:04:42 +0000 (00:04 +0700)]
license : update year (#739)
GitAritron [Fri, 14 Apr 2023 17:03:16 +0000 (20:03 +0300)]
whisper : fix typos in whisper.h (#737)
Fixed a couple of typos (in comments, so nothing major). Keep up the great work 😄
Ali Alameh [Fri, 14 Apr 2023 17:02:18 +0000 (20:02 +0300)]
stream : support language auto-detect (#501)
Fixes #445: the language auto-detect "auto" flag does not work when using the stream tool
Alex Evgrashin [Fri, 14 Apr 2023 16:59:44 +0000 (19:59 +0300)]
readme : add unity bindings (#733)
DGdev91 [Fri, 14 Apr 2023 16:53:58 +0000 (18:53 +0200)]
talk, talk-llama : add basic example script for eleven-labs tts (#728)
Ivan Gorin [Fri, 14 Apr 2023 16:50:39 +0000 (19:50 +0300)]
models : change convert-pt-to-ggml to use .tiktoken tokenizer files (#725)
LittleLoli [Fri, 14 Apr 2023 16:36:38 +0000 (00:36 +0800)]
cmake : add msvc compiler args /utf-8 fix error C3688 (#721)
* force msvc compiler use utf-8 encode
* only enable on msvc
Maciek [Fri, 14 Apr 2023 16:36:09 +0000 (18:36 +0200)]
talk-llama : correct default speak.sh path (#720)
There is a `speak.sh` file in `./examples/talk-llama`, as described in the README.
However, `talk-llama.cpp` uses `./examples/talk/speak.sh`. This commit corrects that.
LittleLoli [Fri, 14 Apr 2023 16:35:33 +0000 (00:35 +0800)]
main : add lrc output support (#718)
* add lrc output support.
* fix wrong comment
Sam [Fri, 14 Apr 2023 16:33:06 +0000 (16:33 +0000)]
readme : make the quick start instructions clearer. (#716)
Users with no prior knowledge of C/C++ who want to use this implementation of the Whisper model may download the model but then fail to run the "make" command as specified, because they forgot or did not know that they need to clone the repository first. Hope this modification clears things up.
duthils [Fri, 14 Apr 2023 16:31:51 +0000 (12:31 -0400)]
make : disable avx in case f16c is not available (#706)
Why:
* ggml.c does not support AVX without F16C
bocytko [Fri, 14 Apr 2023 16:25:23 +0000 (18:25 +0200)]
readme : add shell command example for --print-colors (#710)
The section of the readme file explaining `--print-colors` includes only a screenshot with directories that are inconsistent with other examples. This commit adds an example shell command, consistent with the remaining examples.
Georgi Gerganov [Fri, 14 Apr 2023 16:20:39 +0000 (19:20 +0300)]
ggml : sync latest ggml
Georgi Gerganov [Fri, 14 Apr 2023 16:16:34 +0000 (19:16 +0300)]
whisper : fix bug in prompt processing (close #705)
Was dereferencing a dangling pointer
Brian Murray [Fri, 14 Apr 2023 15:52:10 +0000 (09:52 -0600)]
go : exposed various parts to the Go Interface (#697)
novag [Fri, 14 Apr 2023 10:34:20 +0000 (12:34 +0200)]
ggml : fix q4_1 dot product types (#759)
Co-authored-by: Georgi Gerganov <redacted>
Georgi Gerganov [Thu, 13 Apr 2023 15:53:44 +0000 (18:53 +0300)]
ggml : sync latest changes from ggml and llama.cpp
Georgi Gerganov [Mon, 10 Apr 2023 20:18:29 +0000 (23:18 +0300)]
ggml : fix WASM build
Georgi Gerganov [Mon, 10 Apr 2023 20:09:15 +0000 (23:09 +0300)]
talk-llama : increase context to 2048
Georgi Gerganov [Mon, 10 Apr 2023 19:59:13 +0000 (22:59 +0300)]
talk-llama : update to latest llama.cpp (improved performance)
Georgi Gerganov [Mon, 10 Apr 2023 19:28:54 +0000 (22:28 +0300)]
ggml : backport llama.cpp updates (close #709)
- About x2 overall performance improvement on Apple Silicon
- Results should now be the same for different number of threads (not
tested)
pajowu [Thu, 30 Mar 2023 17:29:29 +0000 (19:29 +0200)]
whisper : add progress callback (#600)
Zigfrid Zvezdin [Thu, 30 Mar 2023 04:51:33 +0000 (01:51 -0300)]
misc : typo (#688)
InconsolableCellist [Wed, 29 Mar 2023 21:10:20 +0000 (15:10 -0600)]
talk-llama : fixing usage message for talk-llama (#687)
"-ml" instead of "-mg" for specifying the llama file
Georgi Gerganov [Wed, 29 Mar 2023 20:59:45 +0000 (23:59 +0300)]
main : add <cstring> header
Lucas Zanek [Wed, 29 Mar 2023 20:59:17 +0000 (17:59 -0300)]
whisper.addon : fixed test to new async implementation (#686)
* fixed blocking code on node addon
* modify the example to run async
* format
* added logic to see the whisper output
* removed extra function for more clean example
* fixed whisper test to new async implementation
be-next [Wed, 29 Mar 2023 20:38:33 +0000 (22:38 +0200)]
models : handle spaces and special characters in shell script paths (#677)
This commit modifies the `get_script_path` function to correctly handle
spaces and special characters in directory paths. The fix involves adding
double quotes around variables and commands where needed to ensure proper
parsing of paths with spaces and special characters.
Egor Egorov [Wed, 29 Mar 2023 20:26:39 +0000 (23:26 +0300)]
main : fix typo in JSON output (#648)
* typo in JSON output
* fix double quotes in JSON output
Jhen-Jie Hong [Wed, 29 Mar 2023 20:23:23 +0000 (04:23 +0800)]
whisper : add initial_prompt param (#645)
clach04 [Wed, 29 Mar 2023 20:11:35 +0000 (13:11 -0700)]
make : 32-bit ARM flags (#486)
* issue #470 - working 32-bit ARM
* Update Makefile
* Update Makefile
---------
Co-authored-by: Georgi Gerganov <redacted>
Jonno [Wed, 29 Mar 2023 20:04:38 +0000 (06:04 +1000)]
whisper.swiftui : update README.md (#682)
- Slight tweaks to README for improved comprehension.
Evan Jones [Wed, 29 Mar 2023 20:01:14 +0000 (16:01 -0400)]
talk-llama : add alpaca support (#668)
Georgi Gerganov [Tue, 28 Mar 2023 07:50:49 +0000 (10:50 +0300)]
whisper : bump "large" scratch buffer even more (close #671)
Georgi Gerganov [Tue, 28 Mar 2023 07:36:16 +0000 (10:36 +0300)]
whisper : increase scratch buffers after recent change (#671)
Should fix the error:
ggml_new_tensor_impl: not enough space in the scratch memory
Georgi Gerganov [Tue, 28 Mar 2023 07:11:34 +0000 (10:11 +0300)]
talk-llama : add discussion link
Georgi Gerganov [Mon, 27 Mar 2023 18:28:00 +0000 (21:28 +0300)]
talk-llama : try to fix windows build ..
Georgi Gerganov [Mon, 27 Mar 2023 18:02:35 +0000 (21:02 +0300)]
readme : add talk-llama example to the table
Georgi Gerganov [Mon, 27 Mar 2023 18:00:32 +0000 (21:00 +0300)]
talk-llama : add new example + sync ggml from llama.cpp (#664)
* talk-llama : talk with LLaMA AI
* talk.llama : disable EOS token
* talk-llama : add README instructions
* ggml : fix build in debug
Georgi Gerganov [Wed, 22 Mar 2023 20:34:39 +0000 (22:34 +0200)]
whisper : disable fallbacks until the performance is improved (#588)
Andrew Huynh [Wed, 22 Mar 2023 20:30:40 +0000 (13:30 -0700)]
cmake : add a flag to disable F16C (#628)
jwijffels [Wed, 22 Mar 2023 20:28:22 +0000 (21:28 +0100)]
Include link to R wrapper in README (#626)
Lucas Zanek [Wed, 22 Mar 2023 20:19:22 +0000 (17:19 -0300)]
Nodejs Addon blocking main thread. Implemented Napi::AsyncWorker (#642)
* fixed blocking code on node addon
* modify the example to run async
* format
* added logic to see the whisper output
* removed extra function for more clean example
Jhen-Jie Hong [Wed, 22 Mar 2023 20:16:04 +0000 (04:16 +0800)]
whisper.objc : add `-O3 -DNDEBUG` in release mode (#640)
sandrohanea [Wed, 22 Mar 2023 19:47:09 +0000 (20:47 +0100)]
fixed language auto-detection for state provided processing (#627)
Co-authored-by: Sandro Hanea <redacted>
Jhen-Jie Hong [Wed, 22 Mar 2023 19:39:02 +0000 (03:39 +0800)]
readme : add react-native bindings (#619)
Leo Moll [Wed, 22 Mar 2023 19:37:36 +0000 (20:37 +0100)]
main : provide option for creating JSON output (#615)
* examples : provide option for exporting also as JSON file (ggerganov/whisper.cpp#614)
* main : remove leftovers
---------
Co-authored-by: Georgi Gerganov <redacted>
Kamilake [Wed, 22 Mar 2023 19:17:24 +0000 (04:17 +0900)]
models : change default encoding to utf8 (#605)
Georgi Gerganov [Wed, 22 Mar 2023 18:51:42 +0000 (20:51 +0200)]
make : fix MUSL Linux build (#576)
Georgi Gerganov [Wed, 22 Mar 2023 18:44:56 +0000 (20:44 +0200)]
models : change HF hosting from dataset to model
Takeshi Inoue [Tue, 7 Mar 2023 19:36:30 +0000 (04:36 +0900)]
whisper.android : support benchmark for Android example. (#542)
* whisper.android: Support benchmark for Android example.
* whisper.android: update screenshot in README.
* update: Make text selectable for copy & paste.
* Update whisper.h to restore API name
Co-authored-by: Georgi Gerganov <redacted>
* whisper.android: Restore original API names.
---------
Co-authored-by: tinoue <redacted>
Co-authored-by: Georgi Gerganov <redacted>
Georgi Gerganov [Mon, 6 Mar 2023 19:06:27 +0000 (21:06 +0200)]
readme : add bench-wts.sh demo
Georgi Gerganov [Mon, 6 Mar 2023 19:02:24 +0000 (21:02 +0200)]
bench-wts.sh : rename script + add execute permission
venkr [Mon, 6 Mar 2023 17:18:11 +0000 (09:18 -0800)]
qual-bench.sh : add quality comparison tool, and update main.cpp to allow using a font file (#569)
Takeshi Inoue [Mon, 6 Mar 2023 17:15:57 +0000 (02:15 +0900)]
whisper.android : enable fp16 intrinsics (FP16_VA), which are supported by ARMv8.2 or later. (#572)
sandrohanea [Sun, 5 Mar 2023 19:42:19 +0000 (20:42 +0100)]
whisper : add whisper_state + default state on the whisper_context (#523)
* Added whisper state + default state on the whisper_context
* Fixed some examples and bindings
* Fixed whisper_n_len (which was used in some binding) and added whisper_n_len_from_state
* Fixed comments
* whisper : reuse kv_cache_free() and fix compiler warnings
* whisper : clean-up the API comments
---------
Co-authored-by: Sandro Hanea <redacted>
Co-authored-by: Georgi Gerganov <redacted>
Georgi Gerganov [Sun, 5 Mar 2023 18:53:43 +0000 (20:53 +0200)]
whisper : set no_context == true by default (#537)
polarmoon [Sun, 5 Mar 2023 18:50:25 +0000 (10:50 -0800)]
go : NewContext now returns a clean context (#537)
Co-authored-by: Ming <redacted>
HY. Kelvin Lee [Thu, 2 Mar 2023 16:32:16 +0000 (11:32 -0500)]
main : add csv header (#552)
Georgi Gerganov [Tue, 28 Feb 2023 21:27:54 +0000 (23:27 +0200)]
make : add -DNDEBUG compile flag
Georgi Gerganov [Tue, 28 Feb 2023 20:29:12 +0000 (22:29 +0200)]
release : v1.2.1
FlippFuzz [Mon, 27 Feb 2023 19:04:16 +0000 (03:04 +0800)]
make : add "-mcpu=native" when building for aarch64 (#532)
Aaron Pham [Mon, 27 Feb 2023 19:02:11 +0000 (11:02 -0800)]
readme : add pybind11 bindings (#538)
Georgi Gerganov [Fri, 24 Feb 2023 06:46:06 +0000 (08:46 +0200)]
readme : add cython bindings (#9)
Georgi Gerganov [Tue, 21 Feb 2023 17:00:42 +0000 (19:00 +0200)]
whisper : zero-initialize some more context variables
Just in case
Finn Voorhees [Tue, 21 Feb 2023 11:42:10 +0000 (11:42 +0000)]
whisper : fix uninitialized exp_n_audio_ctx
Georgi Gerganov [Sun, 19 Feb 2023 16:35:01 +0000 (18:35 +0200)]
whisper : add API for applying custom logits filters during decoding
Georgi Gerganov [Sat, 18 Feb 2023 07:42:31 +0000 (09:42 +0200)]
yt-wsp.sh : print help on empty args
Georgi Gerganov [Wed, 15 Feb 2023 19:48:49 +0000 (21:48 +0200)]
whisper : by default disable non-speech tokens suppression (#473)
This seems to be causing hallucinations at the end of the audio, e.g.:
"Thank you for listening"
"Amen"
..
Georgi Gerganov [Wed, 15 Feb 2023 17:51:54 +0000 (19:51 +0200)]
readme : add Ruby discussion + update .NET discussion
Todd [Wed, 15 Feb 2023 17:46:55 +0000 (12:46 -0500)]
bindings : add Ruby (#500)
* adding ruby bindings
* avoid adding these they are copied in via extconf.rb
* ignore these files here
* add definitions for boolean params
* initial transcribe for ruby
* use en model and transcribe jfk with assertion
* possibly this works for building ruby binding
* ci : try to add ruby workflow
---------
Co-authored-by: Georgi Gerganov <redacted>
conradg [Wed, 15 Feb 2023 17:31:16 +0000 (17:31 +0000)]
main : fix stdin input (#503)
If we don't add this as an explicit check, then we get an "error: unknown argument: -" later on
Georgi Gerganov [Wed, 15 Feb 2023 17:28:10 +0000 (19:28 +0200)]
examples : refactor in order to reuse code and reduce duplication (#482)
* examples : refactor common code into a library
* examples : refactor common SDL code into a library
* make : update Makefile to use common libs
* common : fix MSVC M_PI ..
* addon.node : link common lib
shikokuchuo [Wed, 15 Feb 2023 17:08:25 +0000 (17:08 +0000)]
whisper : fix signedness compiler warning (#506)
genevera (she/her) [Tue, 14 Feb 2023 18:12:51 +0000 (13:12 -0500)]
yt-wsp.sh : add unique filename generation (#495)
Co-authored-by: genevera <redacted>
Georgi Gerganov [Tue, 14 Feb 2023 18:04:03 +0000 (20:04 +0200)]
readme : add another .NET repo (#303)
Georgi Gerganov [Sat, 11 Feb 2023 15:35:33 +0000 (17:35 +0200)]
readme : add .NET repo (#303)
Avik Sengupta [Sat, 11 Feb 2023 07:13:32 +0000 (07:13 +0000)]
cmake : install whisper.h header (#485)
Including the header file in the install bundle helps projects that ship binaries.
shibukazu [Wed, 8 Feb 2023 07:05:34 +0000 (16:05 +0900)]
whisper : suppress non-speech-related token outputs (#473)
* add non-speech-token suppression
* add suppress non-speech_tokens param
sandrohanea [Wed, 8 Feb 2023 07:01:47 +0000 (08:01 +0100)]
whisper : fixed Beam Search Strategy and exposed whisper_pcm_to_mel_phase_vocoder (#474)
Co-authored-by: Sandro Hanea <redacted>