talk.wasm : update README.md

author Georgi Gerganov <redacted>

Mon, 21 Nov 2022 20:42:29 +0000 (22:42 +0200)

committer Georgi Gerganov <redacted>

Mon, 21 Nov 2022 20:42:29 +0000 (22:42 +0200)
diff --git a/examples/talk.wasm/README.md b/examples/talk.wasm/README.md

index f20bb0105ecb3a7f0fc63c545ab852594813e434..43501cd6f6632b40bd664edac6024f49565ea18d 100644 (file)
--- a/examples/talk.wasm/README.md
+++ b/examples/talk.wasm/README.md
@@ -1,7 +1,38 @@
-# talk
+# talk.wasm
  
-WIP IN PROGRESS
+Talk with an Artificial Intelligence entity in your browser:
  
-ref: https://github.com/ggerganov/whisper.cpp/issues/154
+https://user-images.githubusercontent.com/1991296/202914175-115793b1-d32e-4aaa-a45b-59e313707ff6.mp4
  
-demo: https://talk.ggerganov.com
+Online demo: https://talk.ggerganov.com
+
+## How does it work?
+
+This demo leverages 2 modern neural network models to create a high-quality voice chat directly in your browser:
+
+- [OpenAI's Whisper](https://github.com/openai/whisper) speech recognition model is used to process your voice and understand what you are saying
+- Upon receiving some voice input, the AI generates a text response using [OpenAI's GPT-2](https://github.com/openai/gpt-2) language model
+- The AI then vocalizes the response using the browser's [Web Speech API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API)
+
+The web page does the processing locally on your machine. However, in order to run the models, it first needs to
+download the model data, which is about 350 MB. The model data is then stored in your browser's cache and can be reused
+on future visits without downloading it again.
+
+Running these heavy neural network models in the browser is made possible by implementing them efficiently in C/C++
+and using WebAssembly SIMD capabilities for extra performance. For more detailed information, check out the
+[current repository](https://github.com/ggerganov/whisper.cpp).
+
+## Requirements
+
+In order to run this demo efficiently, you need to have the following:
+
+- Latest Chrome or Firefox browser (Safari is not supported)
+- Run this on a desktop or laptop with a modern CPU (a mobile phone will likely not be good enough)
+- Speak phrases that are no longer than 10 seconds - this is the audio context of the AI
+- The web page uses about 1.4 GB of RAM
+
+## Feedback
+
+If you have any comments or ideas for improvement, please drop a comment in the following discussion:
+
+https://github.com/ggerganov/whisper.cpp/discussions/167
diff --git a/examples/talk.wasm/index-tmpl.html b/examples/talk.wasm/index-tmpl.html

index abaea137e0c6f73d52192764af2cfc63445153ad..86e2ceac8f1d8c9afb4688be82f1d878d61d6f1e 100644 (file)
--- a/examples/talk.wasm/index-tmpl.html
+++ b/examples/talk.wasm/index-tmpl.html
@@ -46,6 +46,10 @@
  
              <hr>
  
+            Select the models you would like to use and click the "Start" button to begin the conversation.
+
+            <br><br>
+
              <div id="model-whisper">
                  <span id="model-whisper-status">Whisper model:</span>
                  <button id="fetch-whisper-tiny-en" onclick="loadWhisper('tiny.en')">tiny.en (75 MB)</button>