git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/log

Georgi Gerganov [Sat, 10 Dec 2022 14:54:57 +0000 (16:54 +0200)]

Update README.md


Georgi Gerganov [Sat, 10 Dec 2022 14:48:15 +0000 (16:48 +0200)]

talk : fix build for MSVC


Georgi Gerganov [Fri, 9 Dec 2022 18:38:10 +0000 (20:38 +0200)]

talk : talk with AI in the terminal


bert hubert [Sat, 10 Dec 2022 12:09:31 +0000 (13:09 +0100)]

Fix a potential bug when reading model data into a small-string-optimized (SSO) std::string, which could lead to memory corruption. In an SSO string, you can't write data to &str[0] and expect it to work well.

Also added a small wrapper function to read model data more safely, without having to get the sizeof right. I tested this on the tiny, base and large models; there was no change in behaviour.
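
The idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the helper names `read_safe` and `read_string` and the length-prefixed layout are hypothetical. The size of each raw read is deduced from the destination type, and string bytes go through a separate buffer instead of being written through `&str[0]`.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: read one value of type T from a binary stream.
// sizeof is taken from the destination, so the caller cannot get it wrong.
template <typename T>
bool read_safe(std::istream & in, T & dest) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw reads only make sense for trivially copyable types");
    in.read(reinterpret_cast<char *>(&dest), sizeof(T));
    return static_cast<bool>(in);
}

// Hypothetical helper: read a string stored as a 32-bit length followed by
// that many bytes (the layout is an assumption made for this sketch).
bool read_string(std::istream & in, std::string & out) {
    uint32_t len = 0;
    if (!read_safe(in, len)) {
        return false;
    }
    // Read into a buffer that really owns len bytes, then copy into the string.
    std::vector<char> buf(len);
    if (len > 0) {
        in.read(buf.data(), len);
        if (!in) {
            return false;
        }
    }
    out.assign(buf.begin(), buf.end());
    return true;
}
```

Usage: wrap the bytes in a `std::istringstream` (or open a `std::ifstream` in binary mode) and call `read_string(in, word)`; a `false` return means the stream ended early.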


Georgi Gerganov [Sat, 10 Dec 2022 11:38:26 +0000 (13:38 +0200)]

whisper : minor improvement in decoding strategy (#244)

Do not allow for text segments to go beyond end of audio.
This partially mitigates some issues when the last audio window is 1-2
seconds just before the end of the audio file and the decoding spirals
into a repetition of the last transcribed phrase.
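
One way to picture the rule above is a clamp on segment end times. This is a sketch only: the `Segment` type, its time units and the function name are hypothetical, and the commit may enforce the limit elsewhere in the decoder.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical segment type: start and end times in arbitrary integer units.
struct Segment {
    int64_t t0;
    int64_t t1;
};

// Clamp a decoded segment so that it never extends past the end of the audio.
// Returns false when the segment starts at or after the end of the audio,
// in which case there is nothing left to keep.
bool clamp_to_audio(Segment & seg, int64_t audio_end) {
    if (seg.t0 >= audio_end) {
        return false;
    }
    seg.t1 = std::min(seg.t1, audio_end);
    return true;
}
```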


Georgi Gerganov [Thu, 8 Dec 2022 21:48:04 +0000 (23:48 +0200)]

ggml : add alternative cblas_sgemm call


Georgi Gerganov [Thu, 8 Dec 2022 17:42:06 +0000 (19:42 +0200)]

make : indentation + .gitignore


Reinis Muiznieks [Wed, 7 Dec 2022 12:44:58 +0000 (14:44 +0200)]

Flag for Position Independent Code


Georgi Gerganov [Thu, 8 Dec 2022 17:17:24 +0000 (19:17 +0200)]

twitch.sh : various fixes and polishing

- check if streamlink is installed
- fix audio chunking
- change default threads to 4


keyehzy [Thu, 8 Dec 2022 02:45:54 +0000 (23:45 -0300)]

Allow for Twitch.tv live transcription

We rely on streamlink library to give us a stream, then we proceed similarly to
the radio livestream example.


Kartik Saranathan [Thu, 8 Dec 2022 04:18:30 +0000 (23:18 -0500)]

Fix paths echoed after the download

Was using models path instead of root path


Al Hoang [Thu, 8 Dec 2022 05:34:19 +0000 (05:34 +0000)]

fix compilation on haiku


Georgi Gerganov [Wed, 7 Dec 2022 20:12:08 +0000 (22:12 +0200)]

yt-wsp.sh : improve usage instructions


Georgi Gerganov [Wed, 7 Dec 2022 19:12:55 +0000 (21:12 +0200)]

yt-wsp.sh : fix usage instruction + comment


Georgi Gerganov [Wed, 7 Dec 2022 03:15:46 +0000 (05:15 +0200)]

Update README.md


Georgi Gerganov [Wed, 7 Dec 2022 02:41:43 +0000 (04:41 +0200)]

livestream.sh : remove obsolete comment


Georgi Gerganov [Tue, 6 Dec 2022 20:12:57 +0000 (22:12 +0200)]

ggml : fix typo in previous commit


Georgi Gerganov [Tue, 6 Dec 2022 20:05:33 +0000 (22:05 +0200)]

ggml : use macros to inline FP16 <-> FP32 conversions


Georgi Gerganov [Tue, 6 Dec 2022 19:56:56 +0000 (21:56 +0200)]

ggml : add F16C CPU flag check


katsu560 [Tue, 6 Dec 2022 18:32:48 +0000 (03:32 +0900)]

add fp16/fp32 convert intrinsics


Georgi Gerganov [Tue, 6 Dec 2022 16:48:57 +0000 (18:48 +0200)]

models : add the new "large" model release by OpenAI

The old "large" model is now renamed "large-v1".
If you have been using it, make sure to rename it and download the new
"large" model for best results.


Georgi Gerganov [Tue, 6 Dec 2022 16:47:48 +0000 (18:47 +0200)]

bench : add commit hash to bench-all.sh results


Georgi Gerganov [Fri, 2 Dec 2022 19:51:50 +0000 (21:51 +0200)]

Try to improve the token sampling strategy (#193)

* whisper : try to improve the token sampling strategy

- Add the "max_initial_timestamp" token logic from OpenAI
- Disallow sampling timestamps that are in the past

* whisper : fix the max initial timestamp logic + fallback decoding
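
The two sampling rules listed above can be sketched as a logit mask. This is a minimal illustration, not the project's implementation: the function name, the parameter names and the token layout (timestamp tokens occupy the ids from `token_beg` upward, one id per timestamp step) are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical layout: ids >= token_beg are timestamp tokens, and id
// (token_beg + k) stands for k timestamp steps into the current window.
// last_ts_token is the most recently sampled timestamp id, or -1 if none.
void mask_timestamps(std::vector<float> & logits,
                     int  token_beg,
                     int  last_ts_token,
                     bool is_first_token,
                     int  max_initial_steps) {
    assert(token_beg >= 0 && max_initial_steps >= 0);
    const float neg_inf = -std::numeric_limits<float>::infinity();
    const int   n       = static_cast<int>(logits.size());

    // Rule 1: the first sampled token may not be a timestamp later than
    // max_initial_steps into the window.
    if (is_first_token) {
        for (int id = token_beg + max_initial_steps + 1; id < n; ++id) {
            logits[id] = neg_inf;
        }
    }

    // Rule 2: never sample a timestamp earlier than the previous one.
    if (last_ts_token >= token_beg) {
        for (int id = token_beg; id < last_ts_token && id < n; ++id) {
            logits[id] = neg_inf;
        }
    }
}
```

Setting a logit to negative infinity gives the token zero probability after softmax, so the sampler can stay unchanged.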


Georgi Gerganov [Mon, 28 Nov 2022 20:44:01 +0000 (22:44 +0200)]

tests : adding transcription tests


Georgi Gerganov [Thu, 1 Dec 2022 20:15:12 +0000 (22:15 +0200)]

ggml : remove inline specifier from fp16 <-> fp32 converters


Georgi Gerganov [Thu, 1 Dec 2022 18:49:09 +0000 (20:49 +0200)]

livestream : handle ffmpeg errors gracefully and stabilize transcript


Georgi Gerganov [Thu, 1 Dec 2022 17:47:58 +0000 (19:47 +0200)]

livestream : minor changes


semiformal-net [Thu, 1 Dec 2022 17:18:22 +0000 (12:18 -0500)]

livestream : fix losing words across audio chunk (#195)

* improve livestream script

* Update examples/livestream.sh

Co-authored-by: Georgi Gerganov <redacted>
Co-authored-by: Paul Edwards <redacted>


Tienshiao Ma [Tue, 29 Nov 2022 07:29:34 +0000 (23:29 -0800)]

Fix Darwin flags - was incorrectly always using the Linux else clause


Georgi Gerganov [Sun, 27 Nov 2022 18:28:36 +0000 (20:28 +0200)]

whisper : add mechanism for aborting the whisper_full() computation


Georgi Gerganov [Sun, 27 Nov 2022 09:30:32 +0000 (11:30 +0200)]

Update README.md


Georgi Gerganov [Sun, 27 Nov 2022 08:48:59 +0000 (10:48 +0200)]

whisper.objc : fix context + broken readme links


Georgi Gerganov [Sat, 26 Nov 2022 15:28:28 +0000 (17:28 +0200)]

whisper.objc : add real-time processing (#97)

Similar to the "stream" app


Georgi Gerganov [Sat, 26 Nov 2022 14:27:04 +0000 (16:27 +0200)]

whisper.objc : fix build warnings


Georgi Gerganov [Sat, 26 Nov 2022 11:07:54 +0000 (13:07 +0200)]

minor : remove "examples/" prefix from the README


Georgi Gerganov [Sat, 26 Nov 2022 10:53:23 +0000 (12:53 +0200)]

yt-wsp.sh : script to easily transcribe VODs

Thanks to @DaniruKun
ref: https://gist.github.com/DaniruKun/96f763ec1a037cc92fe1a059b643b818

Usage:

  cd whisper.cpp
  make

  ./examples/yt-wsp.sh <video-url>


Georgi Gerganov [Sat, 26 Nov 2022 09:56:55 +0000 (11:56 +0200)]

Update README.md


Georgi Gerganov [Sat, 26 Nov 2022 09:40:06 +0000 (11:40 +0200)]

command.wasm : add voice assistant example for the Web (#171)

Same as the command-line tool "command", but runs in the browser

Also, added helper script "extra/deploy-wasm.sh" and fixed some timing
constants for the WASM examples.


Georgi Gerganov [Sat, 26 Nov 2022 08:22:42 +0000 (10:22 +0200)]

minor : add comment for using "generate_karaoke.sh"


Georgi Gerganov [Sat, 26 Nov 2022 08:05:37 +0000 (10:05 +0200)]

livestream.sh : simple tool to transcribe audio livestreams (#185)


Georgi Gerganov [Fri, 25 Nov 2022 21:57:46 +0000 (23:57 +0200)]

stream.wasm : add web-based real-time transcription (#112)


Georgi Gerganov [Fri, 25 Nov 2022 21:07:42 +0000 (23:07 +0200)]

whisper.wasm : do not block page while processing (close #86)


Georgi Gerganov [Fri, 25 Nov 2022 20:08:58 +0000 (22:08 +0200)]

main : add stereo-channel-based diarization (#64)

Not tested - I don't have stereo dialog audio
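
For orientation, channel-based diarization can be sketched as comparing per-channel energy over a segment. This is a guess at the general technique, not the code from this commit: the function names, the sample-index interface and the 1.1 margin are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sum of absolute sample values over [i0, i1), clipped to the buffer size.
static double channel_energy(const std::vector<float> & pcm, size_t i0, size_t i1) {
    double e = 0.0;
    for (size_t i = i0; i < i1 && i < pcm.size(); ++i) {
        e += pcm[i] < 0.0f ? -pcm[i] : pcm[i];
    }
    return e;
}

// Hypothetical labelling rule: attribute the segment to the louder channel.
// The margin avoids a coin flip when both channels are about equally loud;
// "?" means neither channel clearly dominates.
std::string guess_speaker(const std::vector<float> & left,
                          const std::vector<float> & right,
                          size_t i0, size_t i1,
                          double margin = 1.1) {
    const double el = channel_energy(left,  i0, i1);
    const double er = channel_energy(right, i0, i1);
    if (el > margin * er) return "0";
    if (er > margin * el) return "1";
    return "?";
}
```

This only works when each speaker is recorded mostly on one channel, which is the premise of the commit title.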


Georgi Gerganov [Fri, 25 Nov 2022 18:23:58 +0000 (20:23 +0200)]

command : add demonstration video


Georgi Gerganov [Fri, 25 Nov 2022 17:53:50 +0000 (19:53 +0200)]

command : fix build + fix README + add bold printing


Georgi Gerganov [Fri, 25 Nov 2022 17:06:56 +0000 (19:06 +0200)]

examples : add "command" tool (#171)


Georgi Gerganov [Fri, 25 Nov 2022 17:08:51 +0000 (19:08 +0200)]

refactoring : more readable code


vicalloy [Fri, 25 Nov 2022 03:24:08 +0000 (11:24 +0800)]

correct model name display on running samples


Georgi Gerganov [Thu, 24 Nov 2022 21:13:26 +0000 (23:13 +0200)]

wasm : refactor wasm example + reuse fetch mechanism


Georgi Gerganov [Thu, 24 Nov 2022 18:15:07 +0000 (20:15 +0200)]

talk.wasm : update video link + some minor fixes


Georgi Gerganov [Thu, 24 Nov 2022 18:09:45 +0000 (20:09 +0200)]

Update README.md

Use a less cringy video to demo talk.wasm lol


Georgi Gerganov [Thu, 24 Nov 2022 18:06:51 +0000 (20:06 +0200)]

Update README.md


Georgi Gerganov [Thu, 24 Nov 2022 16:24:06 +0000 (18:24 +0200)]

talk.wasm : move to https://whisper.ggerganov.com/talk

This way, we can share the same models across different WASM examples
and not have to download them for each page


Georgi Gerganov [Thu, 24 Nov 2022 15:54:41 +0000 (17:54 +0200)]

models : add instructions for using HF fine-tuned models


Georgi Gerganov [Thu, 24 Nov 2022 15:54:16 +0000 (17:54 +0200)]

whisper : improve printfs


Georgi Gerganov [Thu, 24 Nov 2022 15:53:51 +0000 (17:53 +0200)]

main : fix dangling pointer when using stdin for input (#65)


Georgi Gerganov [Thu, 24 Nov 2022 15:52:04 +0000 (17:52 +0200)]

main, stream : remove --verbose flag (#178)


Georgi Gerganov [Wed, 23 Nov 2022 22:34:00 +0000 (00:34 +0200)]

talk.wasm : add audio pre-processing + bump memory


Georgi Gerganov [Wed, 23 Nov 2022 22:08:57 +0000 (00:08 +0200)]

talk.wasm : refactoring + update README.md


Georgi Gerganov [Wed, 23 Nov 2022 21:22:40 +0000 (23:22 +0200)]

models : add usage comments to the HF convert script (#157)


Georgi Gerganov [Wed, 23 Nov 2022 21:14:11 +0000 (23:14 +0200)]

models : fix HF fine-tuned model conversion script (#157)

It works now


Georgi Gerganov [Wed, 23 Nov 2022 20:40:06 +0000 (22:40 +0200)]

ggml : fix the fix


Georgi Gerganov [Wed, 23 Nov 2022 20:27:49 +0000 (22:27 +0200)]

ggml : fix cross-compile Linux -> Windows with mingw (#168)


Georgi Gerganov [Wed, 23 Nov 2022 20:16:50 +0000 (22:16 +0200)]

Revert "update README.md"

This reverts commit 6a84147113669bed68bbc4d31e3c14f914092bf8.


katsu560 [Wed, 23 Nov 2022 13:59:54 +0000 (22:59 +0900)]

update README.md


katsu560 [Wed, 23 Nov 2022 13:54:21 +0000 (22:54 +0900)]

ggml: change inline ggml_fp16_to_fp32, ggml_fp16_t ggml_fp32_to_fp16


katsu560 [Wed, 23 Nov 2022 12:31:05 +0000 (21:31 +0900)]

add gprof option


katsu560 [Wed, 23 Nov 2022 11:23:35 +0000 (20:23 +0900)]

fix AVX,AVX2,FMA,F16C detection on Linux and add flags for OpenBLAS


katsu560 [Wed, 23 Nov 2022 11:23:24 +0000 (20:23 +0900)]

add AVX support


Tamotsu Takahashi [Wed, 23 Nov 2022 06:17:13 +0000 (15:17 +0900)]

Build with OpenBLAS and SDL2 on windows


Georgi Gerganov [Wed, 23 Nov 2022 20:07:20 +0000 (22:07 +0200)]

models : minor changes to the HF convert script (#157)


Georgi Gerganov [Wed, 23 Nov 2022 15:17:31 +0000 (17:17 +0200)]

models : add "convert-h5-to-ggml.py" script (#157)

Converts transformers models to ggml.
Although the conversion succeeds, the converted model does not work for some reason.
Not sure why.


Georgi Gerganov [Wed, 23 Nov 2022 15:17:01 +0000 (17:17 +0200)]

minor : update a few prints + fix buttons in whisper.wasm


Georgi Gerganov [Wed, 23 Nov 2022 07:53:55 +0000 (09:53 +0200)]

Update README.md


Georgi Gerganov [Wed, 23 Nov 2022 07:52:36 +0000 (09:52 +0200)]

Update README.md


Tamotsu Takahashi [Wed, 23 Nov 2022 00:46:56 +0000 (09:46 +0900)]

Find libopenblas.dll.a on windows

"lib" is needed for windows.

With this change, you can build whisper.cpp with OpenBLAS's prebuilt DLL.
1. extract a zip from https://github.com/xianyi/OpenBLAS/releases
2. copy the headers in (openblas)/include to the root directory of whisper.cpp
3. invoke cmake with -DCMAKE_LIBRARY_PATH=(openblas)\lib -DWHISPER_SUPPORT_OPENBLAS=ON
4. copy (openblas)/bin/libopenblas.dll to the same directory as whisper.dll after msbuild

https://github.com/ggerganov/whisper.cpp/issues/89#issuecomment-1324391258


Georgi Gerganov [Wed, 23 Nov 2022 06:24:29 +0000 (08:24 +0200)]

unicode : fix character replacement (thanks to @tamo)


Georgi Gerganov [Tue, 22 Nov 2022 20:48:56 +0000 (22:48 +0200)]

close #109 : add fetching of the model over HTTP (whisper.wasm)


Georgi Gerganov [Tue, 22 Nov 2022 20:22:17 +0000 (22:22 +0200)]

talk.wasm : final touches


Georgi Gerganov [Tue, 22 Nov 2022 18:10:20 +0000 (20:10 +0200)]

talk.wasm : polishing + adding many AI personalities


Georgi Gerganov [Tue, 22 Nov 2022 16:20:05 +0000 (18:20 +0200)]

stream : "-kc" now enables context keeping from previous segment (#90)

By default, the context keeping is disabled


M. Eren Akbiyik [Tue, 22 Nov 2022 16:10:35 +0000 (17:10 +0100)]

Prompt previous tokens for streaming (#163)

* feat: prompt previous tokens for streaming

I used a vector pointer instead of vector itself because it gave weird errors, and why not

* convert vector to use with C api

* feat: remove old refs, check for prompt size

* feat: use better way of getting the pointer
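
The steps above (convert the vector for a C API, check the prompt size, get the pointer cleanly) might look roughly like this. It is a sketch under assumptions: `decode_params` is a hypothetical stand-in for a C-style parameter struct, and `make_params` and `max_prompt` are names invented for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a C-style parameter struct: a pointer plus a
// count, with no ownership of the token memory.
struct decode_params {
    const int * prompt_tokens;
    int         prompt_n_tokens;
};

// Point the params at the tokens of the previous segment, keeping at most
// max_prompt of the most recent ones. An empty history yields {nullptr, 0}.
decode_params make_params(const std::vector<int> & prev_tokens, int max_prompt) {
    decode_params p = { nullptr, 0 };
    if (prev_tokens.empty() || max_prompt <= 0) {
        return p;
    }
    const int n_total = static_cast<int>(prev_tokens.size());
    const int n_keep  = n_total < max_prompt ? n_total : max_prompt;
    // data() yields the contiguous storage that a C API expects.
    p.prompt_tokens   = prev_tokens.data() + (n_total - n_keep);
    p.prompt_n_tokens = n_keep;
    return p;
}
```

The struct only borrows the memory, so the vector has to outlive every call that reads the params.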


Georgi Gerganov [Mon, 21 Nov 2022 20:42:29 +0000 (22:42 +0200)]

talk.wasm : update README.md


Georgi Gerganov [Mon, 21 Nov 2022 20:20:42 +0000 (22:20 +0200)]

talk.wasm : GPT-2 meets Whisper in WebAssembly (#155)

* talk : initial real-time transcription in the browser

* talk : polishing the UI

* talk : ready for beta testing

* talk.wasm : rename example


Georgi Gerganov [Mon, 21 Nov 2022 16:52:20 +0000 (18:52 +0200)]

Update README.md


Georgi Gerganov [Sun, 20 Nov 2022 20:43:32 +0000 (22:43 +0200)]

ggml : fix Windows build


Georgi Gerganov [Sun, 20 Nov 2022 20:39:39 +0000 (22:39 +0200)]

ci : add Windows build


Georgi Gerganov [Sun, 20 Nov 2022 19:22:02 +0000 (21:22 +0200)]

stream : add "max_tokens" cli arg

Controls the max tokens per segment for the stream example


Georgi Gerganov [Sun, 20 Nov 2022 19:12:01 +0000 (21:12 +0200)]

stream : add "audio_ctx" parameter

Used to override the audio context size of the Encoder.
For example, setting "audio_ctx = 512" will make it run about 3 times
faster, processing about 10s of audio, instead of 30s.

The transcription quality drops, but this can be used for real-time
streaming purposes where performance is important.
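
The figures above follow from simple proportions. The sketch below assumes that the full encoder context is 1500 positions covering the 30 s window; the 1500 comes from the Whisper model architecture, not from this commit message. The function names are made up for the example.

```cpp
#include <cassert>
#include <cmath>

// Assumption: the full encoder context of 1500 positions covers 30 s of audio.
constexpr int    kFullAudioCtx  = 1500;
constexpr double kWindowSeconds = 30.0;

// Seconds of audio covered by a reduced encoder context.
double audio_ctx_seconds(int audio_ctx) {
    return kWindowSeconds * audio_ctx / kFullAudioCtx;
}

// Ratio between the full context and the reduced one. This is only a rough
// proxy for the speed-up, which also depends on how each layer scales.
double context_reduction(int audio_ctx) {
    return static_cast<double>(kFullAudioCtx) / audio_ctx;
}
```

With `audio_ctx = 512` this gives 10.24 s of audio and a context about 2.93 times smaller, in line with the "about 10s" and "about 3 times faster" quoted above.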


Georgi Gerganov [Sun, 20 Nov 2022 18:52:24 +0000 (20:52 +0200)]

stream : add "max_tokens" parameter

Used to limit the number of tokens in a segment.
Useful for combating word repetition when using a partial encoder context


Georgi Gerganov [Sun, 20 Nov 2022 18:45:10 +0000 (20:45 +0200)]

stream : add "single_segment" option

Force the entire audio chunk to be transcribed into a single segment


Georgi Gerganov [Fri, 11 Nov 2022 20:33:10 +0000 (22:33 +0200)]

stream : partial encoder experiments


greeshmay [Thu, 17 Nov 2022 20:12:51 +0000 (12:12 -0800)]

fix: free ggml_context (close #149) (#150)

* fix: free ggml_context

* ggml : free the model's contexts in whisper_free()

Co-authored-by: Georgi Gerganov <redacted>


Georgi Gerganov [Wed, 16 Nov 2022 17:21:43 +0000 (19:21 +0200)]

models : simplify the conversion script

"transformers" dependency is not actually needed


Dody Suria Wijaya [Wed, 16 Nov 2022 16:53:01 +0000 (23:53 +0700)]

Update download-ggml-model.sh

follow curl redirect to new hosting site


Georgi Gerganov [Tue, 15 Nov 2022 17:47:06 +0000 (19:47 +0200)]

models : change default hosting to Hugging Face

My Linode is running out of monthly bandwidth due to the big interest in
the project


Georgi Gerganov [Sat, 12 Nov 2022 16:03:49 +0000 (18:03 +0200)]

whisper : add option to speed up the audio tempo by x2

Using a Phase Vocoder for speeding up the audio tempo by scaling down
the frequencies in the frequency domain.

This reduces the computation in the Encoder by a factor of 2.
The transcription accuracy is degraded, but for slow to normal speech
it still seems to be very good.

I think this can find application for real-time transcription - i.e. the
"stream" example.


Georgi Gerganov [Sun, 13 Nov 2022 07:08:33 +0000 (09:08 +0200)]

make : add libwhisper.so target (#144)


Chidi Williams [Fri, 11 Nov 2022 16:10:01 +0000 (16:10 +0000)]

Add WHISPER_NO_AVX and WHISPER_NO_AVX2 to CMakeLists (#136)

* Check for AVX and AVX2 on Darwin

* Add AVX options to CMakeLists


Georgi Gerganov [Fri, 11 Nov 2022 16:02:58 +0000 (18:02 +0200)]

minor : remove one more redundant line

Packaging of ggerganov/whisper.cpp
