git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/log

Georgi Gerganov [Fri, 25 Nov 2022 17:08:51 +0000 (19:08 +0200)]

refactoring : more readable code

vicalloy [Fri, 25 Nov 2022 03:24:08 +0000 (11:24 +0800)]

correct model name display on running samples

Georgi Gerganov [Thu, 24 Nov 2022 21:13:26 +0000 (23:13 +0200)]

wasm : refactor wasm example + reuse fetch mechanism

Georgi Gerganov [Thu, 24 Nov 2022 18:15:07 +0000 (20:15 +0200)]

talk.wasm : update video link + some minor fixes

Georgi Gerganov [Thu, 24 Nov 2022 18:09:45 +0000 (20:09 +0200)]

Update README.md

Use a less cringy video to demo talk.wasm lol

Georgi Gerganov [Thu, 24 Nov 2022 18:06:51 +0000 (20:06 +0200)]

Update README.md

Georgi Gerganov [Thu, 24 Nov 2022 16:24:06 +0000 (18:24 +0200)]

talk.wasm : move to https://whisper.ggerganov.com/talk

This way, we can share the same models across different WASM examples
and not have to download them for each page

Georgi Gerganov [Thu, 24 Nov 2022 15:54:41 +0000 (17:54 +0200)]

models : add instructions for using HF fine-tuned models

Georgi Gerganov [Thu, 24 Nov 2022 15:54:16 +0000 (17:54 +0200)]

whisper : improve printfs

Georgi Gerganov [Thu, 24 Nov 2022 15:53:51 +0000 (17:53 +0200)]

main : fix dangling pointer when using stdin for input (#65)

Georgi Gerganov [Thu, 24 Nov 2022 15:52:04 +0000 (17:52 +0200)]

main, stream : remove --verbose flag (#178)

Georgi Gerganov [Wed, 23 Nov 2022 22:34:00 +0000 (00:34 +0200)]

talk.wasm : add audio pre-processing + bump memory

Georgi Gerganov [Wed, 23 Nov 2022 22:08:57 +0000 (00:08 +0200)]

talk.wasm : refactoring + update README.md

Georgi Gerganov [Wed, 23 Nov 2022 21:22:40 +0000 (23:22 +0200)]

models : add usage comments to the HF convert script (#157)

Georgi Gerganov [Wed, 23 Nov 2022 21:14:11 +0000 (23:14 +0200)]

models : fix HF fine-tuned model conversion script (#157)

It works now

Georgi Gerganov [Wed, 23 Nov 2022 20:40:06 +0000 (22:40 +0200)]

ggml : fix the fix

Georgi Gerganov [Wed, 23 Nov 2022 20:27:49 +0000 (22:27 +0200)]

ggml : fix cross-compile Linux -> Windows with mingw (#168)

Georgi Gerganov [Wed, 23 Nov 2022 20:16:50 +0000 (22:16 +0200)]

Revert "update README.md"

This reverts commit 6a84147113669bed68bbc4d31e3c14f914092bf8.

katsu560 [Wed, 23 Nov 2022 13:59:54 +0000 (22:59 +0900)]

update README.md

katsu560 [Wed, 23 Nov 2022 13:54:21 +0000 (22:54 +0900)]

ggml: change inline ggml_fp16_to_fp32, ggml_fp16_t ggml_fp32_to_fp16

katsu560 [Wed, 23 Nov 2022 12:31:05 +0000 (21:31 +0900)]

add gprof option

katsu560 [Wed, 23 Nov 2022 11:23:35 +0000 (20:23 +0900)]

fix AVX,AVX2,FMA,F16C detection on Linux and add flags for OpenBLAS

katsu560 [Wed, 23 Nov 2022 11:23:24 +0000 (20:23 +0900)]

add AVX support

Tamotsu Takahashi [Wed, 23 Nov 2022 06:17:13 +0000 (15:17 +0900)]

Build with OpenBLAS and SDL2 on windows

Georgi Gerganov [Wed, 23 Nov 2022 20:07:20 +0000 (22:07 +0200)]

models : minor changes to the HF convert script (#157)

Georgi Gerganov [Wed, 23 Nov 2022 15:17:31 +0000 (17:17 +0200)]

models : add "convert-h5-to-ggml.py" script (#157)

Converts transformers models to ggml.
Although the conversion is successful, it does not work for some reason.
Not sure why

Georgi Gerganov [Wed, 23 Nov 2022 15:17:01 +0000 (17:17 +0200)]

minor : update a few prints + fix buttons in whisper.wasm

Georgi Gerganov [Wed, 23 Nov 2022 07:53:55 +0000 (09:53 +0200)]

Update README.md

Georgi Gerganov [Wed, 23 Nov 2022 07:52:36 +0000 (09:52 +0200)]

Update README.md

Tamotsu Takahashi [Wed, 23 Nov 2022 00:46:56 +0000 (09:46 +0900)]

Find libopenblas.dll.a on windows

"lib" is needed for windows.

With this change, you can build whisper.cpp with OpenBLAS's prebuilt DLL.
1. extract a zip from https://github.com/xianyi/OpenBLAS/releases
2. copy the headers in (openblas)/include to the root directory of whisper.cpp
3. invoke cmake with -DCMAKE_LIBRARY_PATH=(openblas)\lib -DWHISPER_SUPPORT_OPENBLAS=ON
4. copy (openblas)/bin/libopenblas.dll to the same directory as whisper.dll after msbuild

https://github.com/ggerganov/whisper.cpp/issues/89#issuecomment-1324391258

Georgi Gerganov [Wed, 23 Nov 2022 06:24:29 +0000 (08:24 +0200)]

unicode : fix character replacement (thanks to @tamo)

Georgi Gerganov [Tue, 22 Nov 2022 20:48:56 +0000 (22:48 +0200)]

close #109 : add fetching of the model over HTTP (whisper.wasm)

Georgi Gerganov [Tue, 22 Nov 2022 20:22:17 +0000 (22:22 +0200)]

talk.wasm : final touches

Georgi Gerganov [Tue, 22 Nov 2022 18:10:20 +0000 (20:10 +0200)]

talk.wasm : polishing + adding many AI personalities

Georgi Gerganov [Tue, 22 Nov 2022 16:20:05 +0000 (18:20 +0200)]

stream : "-kc" now enables context keeping from previous segment (#90)

By default, the context keeping is disabled

M. Eren Akbiyik [Tue, 22 Nov 2022 16:10:35 +0000 (17:10 +0100)]

Prompt previous tokens for streaming (#163)

* feat: prompt previous tokens for streaming

I used a vector pointer instead of vector itself because it gave weird errors, and why not

* convert vector to use with C api

* feat: remove old refs, check for prompt size

* feat: use better way of getting the pointer
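The vector-to-C-API step in the bullets above can be sketched roughly as follows. This is only an illustration, not the code from the commit: `prompt_params` and `make_prompt` are hypothetical names, and the assumption is that the C side accepts a token pointer plus a count.

```cpp
#include <cassert>
#include <vector>

// Hypothetical C-style parameter block: a C API cannot take std::vector,
// so the tokens are passed as a pointer plus a length.
struct prompt_params {
    const int * tokens;
    int         n_tokens;
};

// Expose at most max_prompt of the most recent tokens as pointer + length.
// The pointer stays valid only while `history` is alive and unmodified.
inline prompt_params make_prompt(const std::vector<int> & history, int max_prompt) {
    assert(max_prompt >= 0);
    const int n    = static_cast<int>(history.size());
    const int keep = n < max_prompt ? n : max_prompt;  // check the prompt size
    return { history.data() + (n - keep), keep };
}
```

Using `data()` avoids keeping a separate pointer to the vector itself; the caller only has to keep the vector alive for as long as the C API reads the tokens.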

Georgi Gerganov [Mon, 21 Nov 2022 20:42:29 +0000 (22:42 +0200)]

talk.wasm : update README.md

Georgi Gerganov [Mon, 21 Nov 2022 20:20:42 +0000 (22:20 +0200)]

talk.wasm : GPT-2 meets Whisper in WebAssembly (#155)

* talk : initial real-time transcription in the browser

* talk : polishing the UI

* talk : ready for beta testing

* talk.wasm : rename example

Georgi Gerganov [Mon, 21 Nov 2022 16:52:20 +0000 (18:52 +0200)]

Update README.md

Georgi Gerganov [Sun, 20 Nov 2022 20:43:32 +0000 (22:43 +0200)]

ggml : fix Windows build

Georgi Gerganov [Sun, 20 Nov 2022 20:39:39 +0000 (22:39 +0200)]

ci : add Windows build

Georgi Gerganov [Sun, 20 Nov 2022 19:22:02 +0000 (21:22 +0200)]

stream : add "max_tokens" cli arg

Controls the max tokens per segment for the stream example

Georgi Gerganov [Sun, 20 Nov 2022 19:12:01 +0000 (21:12 +0200)]

stream : add "audio_ctx" parameter

Used to override the audio context size of the Encoder.
For example, setting "audio_ctx = 512" will make it run about 3 times
faster, processing about 10s of audio, instead of 30s.

The transcription quality drops, but this can be used for real-time
streaming purposes where performance is important.
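
The "about 10s instead of 30s" figure follows from simple proportionality. A minimal sketch, assuming the full encoder context of the Whisper models is 1500 positions for a 30 s window (the function and constant names are illustrative, not part of whisper.cpp):

```cpp
#include <cassert>

// Assumption: the full Whisper encoder context is 1500 positions per 30 s window.
constexpr int   kFullAudioCtx  = 1500;
constexpr float kFullWindowSec = 30.0f;

// Seconds of audio covered by a reduced audio context.
inline float audio_seconds_for_ctx(int audio_ctx) {
    assert(audio_ctx > 0 && audio_ctx <= kFullAudioCtx);
    return kFullWindowSec * static_cast<float>(audio_ctx) / kFullAudioCtx;
}
```

With `audio_ctx = 512` this gives 30 * 512 / 1500 = 10.24 s, which matches the roughly 10 s quoted above.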

Georgi Gerganov [Sun, 20 Nov 2022 18:52:24 +0000 (20:52 +0200)]

stream : add "max_tokens" parameter

Used to limit the number of tokens in a segment.
Useful to battle with word repetition when using partial encoder context

Georgi Gerganov [Sun, 20 Nov 2022 18:45:10 +0000 (20:45 +0200)]

stream : add "single_segment" option

Force the entire audio chunk to be transcribed into a single segment

Georgi Gerganov [Fri, 11 Nov 2022 20:33:10 +0000 (22:33 +0200)]

stream : partial encoder experiments

greeshmay [Thu, 17 Nov 2022 20:12:51 +0000 (12:12 -0800)]

fix: free ggml_context (close #149) (#150)

* fix: free ggml_context

* ggml : free the model's contexts in whisper_free()

Co-authored-by: Georgi Gerganov <redacted>

Georgi Gerganov [Wed, 16 Nov 2022 17:21:43 +0000 (19:21 +0200)]

models : simplify the conversion script

"transformers" dependency is not actually needed

Dody Suria Wijaya [Wed, 16 Nov 2022 16:53:01 +0000 (23:53 +0700)]

Update download-ggml-model.sh

follow curl redirect to new hosting site

Georgi Gerganov [Tue, 15 Nov 2022 17:47:06 +0000 (19:47 +0200)]

models : change default hosting to Hugging Face

My Linode is running out of monthly bandwidth due to the big interest in
the project

Georgi Gerganov [Sat, 12 Nov 2022 16:03:49 +0000 (18:03 +0200)]

whisper : add option to speed up the audio tempo by x2

Using a Phase Vocoder for speeding up the audio tempo by scaling down
the frequencies in the frequency domain.

This reduces the computation in the Encoder by a factor of 2.
The transcription accuracy is degraded, but for slow to normal speech
it still seems to be very good.

I think this can find application for real-time transcription - i.e. the
"stream" example.

Georgi Gerganov [Sun, 13 Nov 2022 07:08:33 +0000 (09:08 +0200)]

make : add libwhisper.so target (#144)

Chidi Williams [Fri, 11 Nov 2022 16:10:01 +0000 (16:10 +0000)]

Add WHISPER_NO_AVX and WHISPER_NO_AVX2 to CMakeLists (#136)

* Check for AVX and AVX2 on Darwin

* Add AVX options to CMakeLists

Georgi Gerganov [Fri, 11 Nov 2022 16:02:58 +0000 (18:02 +0200)]

minor : remove one more redundant line

Georgi Gerganov [Fri, 11 Nov 2022 15:58:51 +0000 (17:58 +0200)]

minor : fix double float32 conversion in python script

Georgi Gerganov [Wed, 9 Nov 2022 19:41:21 +0000 (21:41 +0200)]

ref #40 : start working on the documentation

Alan [Wed, 9 Nov 2022 18:24:06 +0000 (15:24 -0300)]

Adds support for stdin wav input

Georgi Gerganov [Wed, 9 Nov 2022 17:32:58 +0000 (19:32 +0200)]

js : update whisper.js to latest

Chidi Williams [Wed, 9 Nov 2022 00:28:36 +0000 (00:28 +0000)]

Check for AVX and AVX2 on Darwin

boolemancer [Tue, 8 Nov 2022 11:04:23 +0000 (03:04 -0800)]

Fix the Windows pthread_create shim

The current implementation doesn't actually set the out parameter,
and it returns 0 on failure instead of on success.
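
The contract described above (set the out parameter, return 0 on success and non-zero on failure) can be sketched in a portable way as follows. This is not the code from the commit: `thread_handle`, `create_fn` and `pthread_create_shim` are hypothetical stand-ins, with `create_fn` playing the role that a call such as Win32 `CreateThread` would play in a real shim.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-ins so the sketch compiles on any platform.
using thread_handle = void *;                                       // e.g. a Windows HANDLE
using create_fn     = thread_handle (*)(void * (*)(void *), void *); // e.g. a CreateThread wrapper

// pthread_create-like shim: returns 0 on success, an error code on failure.
inline int pthread_create_shim(thread_handle * out, create_fn create,
                               void * (*func)(void *), void * arg) {
    assert(out != nullptr && create != nullptr);
    thread_handle h = create(func, arg);
    if (h == nullptr) {
        return EAGAIN;  // failure: non-zero, *out left untouched
    }
    *out = h;           // success: the out parameter is actually set
    return 0;
}
```

Callers written against POSIX check `pthread_create(...) != 0` for errors, so a shim that returns 0 on failure makes every failed thread creation look like a success.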

Georgi Gerganov [Mon, 7 Nov 2022 19:48:13 +0000 (21:48 +0200)]

sync : submodule whisper.spm

Georgi Gerganov [Mon, 7 Nov 2022 18:50:24 +0000 (20:50 +0200)]

cmake : add submodule whisper.spm

Georgi Gerganov [Mon, 7 Nov 2022 18:14:52 +0000 (20:14 +0200)]

ref #22 : add "duration" option

Can be used to partially process a recording

Georgi Gerganov [Sun, 6 Nov 2022 19:04:21 +0000 (21:04 +0200)]

Update README.md

Georgi Gerganov [Sun, 6 Nov 2022 07:22:50 +0000 (09:22 +0200)]

examples : add simple script for generating Karaoke video

Georgi Gerganov [Sat, 5 Nov 2022 06:44:41 +0000 (08:44 +0200)]

Update README.md

Georgi Gerganov [Fri, 4 Nov 2022 20:26:08 +0000 (22:26 +0200)]

Update README.md

Georgi Gerganov [Fri, 4 Nov 2022 16:30:38 +0000 (18:30 +0200)]

main : fix generated bash script

Georgi Gerganov [Thu, 3 Nov 2022 18:53:44 +0000 (20:53 +0200)]

ggml : multi-thread the ggml_add operator

Georgi Gerganov [Thu, 3 Nov 2022 18:18:57 +0000 (20:18 +0200)]

cmake : fix passing GGML_PERF compile option

Georgi Gerganov [Wed, 2 Nov 2022 20:03:27 +0000 (22:03 +0200)]

Update README.md

Georgi Gerganov [Wed, 2 Nov 2022 19:18:20 +0000 (21:18 +0200)]

whisper : token-level timestamp refactoring (#49, #120)

This turned out pretty good overall. The algorithm has been moved from
main.cpp to whisper.cpp and can be reused for all subtitles types. This
means that now you can specify the maximum length of the generated
lines. Simply provide the "-ml" argument specifying the max length in
number of characters
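
To illustrate what a maximum line length in characters means in practice, here is a simplified greedy sketch. It is not the algorithm in whisper.cpp, which also tracks token timestamps; `wrap_tokens` is a hypothetical helper, and the token strings are assumed to carry their own leading spaces.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: greedily pack token strings into lines of at most
// max_len characters. A single token longer than max_len gets its own line.
inline std::vector<std::string> wrap_tokens(const std::vector<std::string> & tokens,
                                            std::size_t max_len) {
    assert(max_len > 0);
    std::vector<std::string> lines;
    std::string cur;
    for (const auto & tok : tokens) {
        // Start a new line when appending this token would exceed the limit.
        if (!cur.empty() && cur.size() + tok.size() > max_len) {
            lines.push_back(cur);
            cur.clear();
        }
        cur += tok;
    }
    if (!cur.empty()) {
        lines.push_back(cur);
    }
    return lines;
}
```

For example, the tokens "And", " so", " my", " fellow", " Americans" with a limit of 10 characters produce three lines: "And so my", " fellow" and " Americans".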

Georgi Gerganov [Wed, 2 Nov 2022 16:33:29 +0000 (18:33 +0200)]

Update README.md

Georgi Gerganov [Wed, 2 Nov 2022 16:31:55 +0000 (18:31 +0200)]

extra : compute SHA of all models files

Georgi Gerganov [Wed, 2 Nov 2022 16:31:18 +0000 (18:31 +0200)]

whisper : fix extra memory usage after recent processor changes

Had increased the memory buffer to the size of the model and forgot to
bring it down.

Syed Jafri [Wed, 2 Nov 2022 16:00:19 +0000 (10:00 -0600)]

Allow building with Accelerate for x86_64 Macs (#123)

* Cross compile windows

* set env properly

* rm log

* fix review

* Add back space

* Don't force architecture

* Allow building x86_64 with accelerate

Georgi Gerganov [Wed, 2 Nov 2022 15:52:24 +0000 (17:52 +0200)]

ggml : fix the check for NEON support (#7)

Was using the wrong preprocessor macro

Syed Jafri [Wed, 2 Nov 2022 06:46:49 +0000 (00:46 -0600)]

Cross compilation (#121)

* Cross compile windows

* set env properly

* rm log

* fix review

* Add back space

Georgi Gerganov [Tue, 1 Nov 2022 20:47:58 +0000 (22:47 +0200)]

Update README.md

Georgi Gerganov [Tue, 1 Nov 2022 20:35:21 +0000 (22:35 +0200)]

main : add some comments for the word-level timestamp algorithm

Georgi Gerganov [Tue, 1 Nov 2022 20:09:25 +0000 (22:09 +0200)]

main : fix some edge cases for word-level timestamps

Georgi Gerganov [Mon, 31 Oct 2022 20:06:05 +0000 (22:06 +0200)]

Update README.md

Georgi Gerganov [Mon, 31 Oct 2022 18:19:41 +0000 (20:19 +0200)]

Update README.md

Mikhail Grigorev [Sun, 30 Oct 2022 19:51:29 +0000 (00:51 +0500)]

Added download-ggml-model.cmd script for Windows

Mikhail Grigorev [Sun, 30 Oct 2022 17:19:24 +0000 (22:19 +0500)]

Fixed sched_yield

Mikhail Grigorev [Sun, 30 Oct 2022 10:29:27 +0000 (15:29 +0500)]

Implemented sched_yield function for Windows

Georgi Gerganov [Sun, 30 Oct 2022 15:11:37 +0000 (17:11 +0200)]

Update README.md

Georgi Gerganov [Sun, 30 Oct 2022 15:10:46 +0000 (17:10 +0200)]

Update README.md

Georgi Gerganov [Sun, 30 Oct 2022 08:05:58 +0000 (10:05 +0200)]

main : add option for word-level timestamps (very experimental)

Georgi Gerganov [Sun, 30 Oct 2022 06:27:04 +0000 (08:27 +0200)]

stream : add "--capture" option to select capture device (ref #10)

Georgi Gerganov [Sun, 30 Oct 2022 06:23:52 +0000 (08:23 +0200)]

close #113 : fix struct whisper_token_data

Georgi Gerganov [Sat, 29 Oct 2022 18:28:06 +0000 (21:28 +0300)]

minor : update whisper.js

Georgi Gerganov [Sat, 29 Oct 2022 17:30:05 +0000 (20:30 +0300)]

whisper.wasm : update system info print

Georgi Gerganov [Sat, 29 Oct 2022 16:41:50 +0000 (19:41 +0300)]

ref #5 : update CMake for Windows build

- __AVX2__ should already be defined due to /arch:AVX2
- _CRT_SECURE_NO_WARNINGS should be defined both for shared and static lib

Georgi Gerganov [Sat, 29 Oct 2022 11:14:23 +0000 (14:14 +0300)]

minor : fix multiple definitions of to_timestamp()

Georgi Gerganov [Sat, 29 Oct 2022 11:08:23 +0000 (14:08 +0300)]

parallel : print time of audio boundaries + fix timings

Georgi Gerganov [Sat, 29 Oct 2022 09:38:29 +0000 (12:38 +0300)]

ggml : fix barrier

Georgi Gerganov [Sat, 29 Oct 2022 09:26:03 +0000 (12:26 +0300)]

main : merge parallel example in main