From: Johannes Gäßler Date: Mon, 29 Jul 2024 13:03:08 +0000 (+0200) Subject: examples: add TensorFlow to requirements.txt (#902) X-Git-Tag: upstream/0.0.1642~488 X-Git-Url: https://git.djapps.eu/?a=commitdiff_plain;h=49164e6672f1297b13316fbd876071a93783ac25;p=pkg%2Fggml%2Fsources%2Fggml examples: add TensorFlow to requirements.txt (#902) --- diff --git a/README.md b/README.md index 1a255440..f15f3808 100644 --- a/README.md +++ b/README.md @@ -44,20 +44,28 @@ Some of the development is currently happening in the [llama.cpp](https://github - [X] Example of multiple LLMs inference [foldl/chatllm.cpp](https://github.com/foldl/chatllm.cpp) - [X] SeamlessM4T inference *(in development)* https://github.com/facebookresearch/seamless_communication/tree/main/ggml -## GPT inference (example) - -With ggml you can efficiently run [GPT-2](examples/gpt-2) and [GPT-J](examples/gpt-j) inference on the CPU. - -Here is how to run the example programs: +## Python environment setup and building the examples ```bash -# Build ggml + examples git clone https://github.com/ggerganov/ggml cd ggml +# Install python dependencies in a virtual environment +python3.10 -m venv ggml_env +source ./ggml_env/bin/activate +pip install -r requirements.txt +# Build the examples mkdir build && cd build cmake .. -make -j4 gpt-2-backend gpt-j +cmake --build . --config Release -j 8 +``` + +## GPT inference (example) + +With ggml you can efficiently run [GPT-2](examples/gpt-2) and [GPT-J](examples/gpt-j) inference on the CPU. +Here is how to run the example programs: + +```bash # Run the GPT-2 small 117M model ../examples/gpt-2/download-ggml-model.sh 117M ./bin/gpt-2-backend -m models/gpt-2-117M/ggml-model.bin -p "This is an example" @@ -66,9 +74,6 @@ make -j4 gpt-2-backend gpt-j ../examples/gpt-j/download-ggml-model.sh 6B ./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin -p "This is an example" -# Install Python dependencies -python3 -m pip install -r ../requirements.txt - # Run the Cerebras-GPT 111M model # Download from: https://huggingface.co/cerebras python3 ../examples/gpt-2/convert-cerebras-to-ggml.py /path/to/Cerebras-GPT-111M/ diff --git a/examples/gpt-2/README.md b/examples/gpt-2/README.md index 45c932c9..10571621 100644 --- a/examples/gpt-2/README.md +++ b/examples/gpt-2/README.md @@ -28,7 +28,7 @@ Sample performance on MacBook M1 Pro: Sample output: -``` +```bash $ ./bin/gpt-2 -h usage: ./bin/gpt-2 [options] @@ -82,7 +82,7 @@ via the [convert-ckpt-to-ggml.py](convert-ckpt-to-ggml.py) python script. Here is the entire process for the GPT-2 117M model (download from official site + conversion): -``` +```bash cd ggml/build ../examples/gpt-2/download-model.sh 117M @@ -111,7 +111,7 @@ Clone the respective repository from here: https://huggingface.co/cerebras Use the [convert-cerebras-to-ggml.py](convert-cerebras-to-ggml.py) script to convert the model to `ggml` format: -``` +```bash cd ggml/build git clone https://huggingface.co/cerebras/Cerebras-GPT-111M models/ python ../examples/gpt-2/convert-cerebras-to-ggml.py models/Cerebras-GPT-111M/ @@ -125,7 +125,7 @@ way, you can directly download a single binary file and start using it. No pytho Here is how to get the 117M ggml model: -``` +```bash cd ggml/build ../examples/gpt-2/download-ggml-model.sh 117M @@ -146,7 +146,7 @@ You can also try to quantize the `ggml` models via 4-bit integer quantization. Keep in mind that for smaller models, this will render them completely useless. You generally want to quantize larger models. 
-```
+```bash
 # quantize GPT-2 F16 to Q4_0 (faster but less precise)
 ./bin/gpt-2-quantize models/gpt-2-1558M/ggml-model-f16.bin models/gpt-2-1558M/ggml-model-q4_0.bin 2
 ./bin/gpt-2 -m models/gpt-2-1558M/ggml-model-q4_0.bin -p "This is an example"
@@ -163,7 +163,7 @@ You can try the batched generation from a given prompt using the gpt-2-batched b

 Sample output:

-```
+```bash
 $ gpt-2-batched -np 5 -m models/gpt-2-117M/ggml-model.bin -p "Hello my name is" -n 50

 main: seed = 1697037431
diff --git a/examples/gpt-j/README.md b/examples/gpt-j/README.md
index eac5a731..d63458bf 100644
--- a/examples/gpt-j/README.md
+++ b/examples/gpt-j/README.md
@@ -24,7 +24,7 @@ typically consists of 1 or 2 tokens).

 Here is a sample run with prompt `int main(int argc, char ** argv) {`:

-```
+```bash
 $ time ./bin/gpt-j -p "int main(int argc, char ** argv) {"

 gptj_model_load: loading model from 'models/gpt-j-6B/ggml-model.bin' - please wait ...
@@ -76,7 +76,7 @@ looks like to be the beginning of a networking program in C. Pretty cool!

 Here is another run, just for fun:

-```
+```bash
 time ./bin/gpt-j -n 500 -t 8 -p "Ask HN: Inherited the worst code and tech team I have ever seen. How to fix it?
"

@@ -199,13 +199,6 @@ that is not important).

 If you want to give this a try and you are on Linux or Mac OS, simply follow these instructions:

 ```bash
-# Clone the ggml library and build the gpt-j example
-git clone https://github.com/ggerganov/ggml
-cd ggml
-mkdir build && cd build
-cmake ..
-make -j4 gpt-j
-
 # Download the ggml-compatible GPT-J 6B model (requires 12GB disk space)
 ../examples/gpt-j/download-ggml-model.sh 6B

diff --git a/examples/magika/README.md b/examples/magika/README.md
index 8e1ca27d..5a1a979c 100644
--- a/examples/magika/README.md
+++ b/examples/magika/README.md
@@ -7,11 +7,11 @@ Simple example that shows how to use GGML for inference with the [Google Magika]
 - Obtain the Magika model in H5 format
   - Pinned version: https://github.com/google/magika/blob/4460acb5d3f86807c3b53223229dee2afa50c025/assets_generation/models/standard_v1/model.h5
 - Use `convert.py` to convert the model to gguf format:
-```sh
+```bash
 $ python examples/magika/convert.py /path/to/model.h5
 ```
 - Invoke the program with the model file and a list of files to identify:
-```sh
+```bash
 $ build/bin/magika model.h5.gguf examples/sam/example.jpg examples/magika/convert.py README.md src/ggml.c /bin/gcc write.exe jfk.wav
   examples/sam/example.jpg   : jpeg (100.00%) pptx (0.00%) smali (0.00%) shell (0.00%) sevenzip (0.00%)
   examples/magika/convert.py : python (99.99%) javascript (0.00%) txt (0.00%) asm (0.00%) scala (0.00%)
diff --git a/examples/mnist/README.md b/examples/mnist/README.md
index cf3f8d0d..796b49ae 100644
--- a/examples/mnist/README.md
+++ b/examples/mnist/README.md
@@ -3,30 +3,16 @@
 These are simple examples of how to use GGML for inferencing. The first example uses convolutional neural network (CNN), the second one uses fully connected neural network.

-## Python environment setup and build the examples
-
-```bash
-git clone https://github.com/ggerganov/ggml
-cd ggml
-# Install python dependencies in a virtual environment
-python3 -m venv ggml_env
-source ./ggml_env/bin/activate
-pip install -r requirements.txt
-# Build the examples
-mkdir build && cd build
-cmake ..
-make -j4 mnist-cnn mnist
-```
-
 ## MNIST with CNN

 This implementation achieves ~99% accuracy on the MNIST test set.

 ### Training the model

+Set up the Python environment and build the examples according to the main README.
 Use the `mnist-cnn.py` script to train the model and convert it to GGUF format:

-```
+```bash
 $ python3 ../examples/mnist/mnist-cnn.py train mnist-cnn-model
 ...
 Keras model saved to 'mnist-cnn-model'

@@ -34,7 +20,7 @@ Keras model saved to 'mnist-cnn-model'

 Convert the model to GGUF format:

-```
+```bash
 $ python3 ../examples/mnist/mnist-cnn.py convert mnist-cnn-model
 ...
 Model converted and saved to 'mnist-cnn-model.gguf'
diff --git a/examples/mnist/mnist-cnn.py b/examples/mnist/mnist-cnn.py
index 35dda60a..ee5fc82e 100755
--- a/examples/mnist/mnist-cnn.py
+++ b/examples/mnist/mnist-cnn.py
@@ -5,7 +5,11 @@ import numpy as np
 from tensorflow import keras
 from tensorflow.keras import layers

+
 def train(model_name):
+    if not model_name.endswith(".keras") and not model_name.endswith(".h5"):
+        model_name += ".keras"
+
     # Model / data parameters
     num_classes = 10
     input_shape = (28, 28, 1)
@@ -52,9 +56,19 @@ def train(model_name):
     model.save(model_name)
     print("Keras model saved to '" + model_name + "'")

+
 def convert(model_name):
+    if not model_name.endswith(".keras") and not model_name.endswith(".h5"):
+        model_name += ".keras"
+
     model = keras.models.load_model(model_name)
-    gguf_model_name = model_name + ".gguf"
+    if model_name.endswith(".keras"):
+        gguf_model_name = model_name[:-6] + ".gguf"
+    elif model_name.endswith(".h5"):
+        gguf_model_name = model_name[:-3] + ".gguf"
+    else:
+        gguf_model_name = model_name + ".gguf"
+
     gguf_writer = gguf.GGUFWriter(gguf_model_name, "mnist-cnn")

     kernel1 = model.layers[0].weights[0].numpy()
@@ -88,6 +102,7 @@ def convert(model_name):
     gguf_writer.close()
     print("Model converted and saved to '{}'".format(gguf_model_name))

+
 if __name__ == '__main__':
     if len(sys.argv) < 3:
         print("Usage: %s ".format(sys.argv[0]))
diff --git a/examples/sam/README.md b/examples/sam/README.md
index 1eef7e78..7354a4a4 100644
--- a/examples/sam/README.md
+++ b/examples/sam/README.md
@@ -20,23 +20,15 @@ The example currently supports only the [ViT-B SAM model checkpoint](https://hug
 - [ ] GPU support

 ## Quick start
-```bash
-git clone https://github.com/ggerganov/ggml
-cd ggml
-
-# Install Python dependencies
-python3 -m pip install -r requirements.txt
+Set up the Python environment and build the examples according to the main README.

+```bash
 # Download PTH model
 wget -P examples/sam/ https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth

 # Convert PTH model to ggml
 python examples/sam/convert-pth-to-ggml.py examples/sam/sam_vit_b_01ec64.pth examples/sam/ 1

-# Build ggml + examples
-mkdir build && cd build
-cmake .. && make -j4
-
 # run inference
 ./bin/sam -t 16 -i ../examples/sam/example.jpg -m ../examples/sam/ggml-model-f16.bin
 ```
diff --git a/requirements.txt b/requirements.txt
index 6adb1d8c..cae85914 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -5,6 +5,7 @@ torchvision>=0.15.2
 transformers>=4.35.2,<5.0.0
 gguf>=0.1.0
 keras==2.15.0
+tensorflow==2.15.0
 --extra-index-url https://download.pytorch.org/whl/cpu
 torch~=2.2.1
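
For reference, the filename handling this patch adds to `examples/mnist/mnist-cnn.py` can be restated as a small standalone sketch. It mirrors the `+` lines in the diff above; the helper names `normalize_model_name` and `gguf_output_name` are illustrative only and do not appear in the patch, which inlines the same logic in `train()` and `convert()`.

```python
# Sketch of the model-filename convention introduced in mnist-cnn.py by this patch:
# train() appends ".keras" when the user passes a bare model name, and convert()
# derives the GGUF output name by swapping a ".keras"/".h5" suffix for ".gguf".

def normalize_model_name(model_name: str) -> str:
    # Keras 2.15 saves either the native ".keras" format or the legacy ".h5" format.
    if not model_name.endswith(".keras") and not model_name.endswith(".h5"):
        model_name += ".keras"
    return model_name


def gguf_output_name(model_name: str) -> str:
    # Strip a recognized Keras suffix before appending ".gguf".
    if model_name.endswith(".keras"):
        return model_name[:-len(".keras")] + ".gguf"
    if model_name.endswith(".h5"):
        return model_name[:-len(".h5")] + ".gguf"
    return model_name + ".gguf"


if __name__ == "__main__":
    name = normalize_model_name("mnist-cnn-model")  # -> "mnist-cnn-model.keras"
    print(name, "->", gguf_output_name(name))       # -> "mnist-cnn-model.gguf"
```

Under this convention, passing the bare name `mnist-cnn-model` (as the mnist README does) yields `mnist-cnn-model.keras` on disk after the `train` step and `mnist-cnn-model.gguf` after the `convert` step.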