From: Daniel Bevenius
Date: Wed, 18 Sep 2024 11:42:36 +0000 (+0200)
Subject: llama : use reserve/emplace_back in sampler_sample (#9534)
X-Git-Tag: upstream/0.0.4488~704
X-Git-Url: https://git.djapps.eu/?a=commitdiff_plain;h=6443ddd98576a9da904ef9f07df4e4398bb6a01a;p=pkg%2Fggml%2Fsources%2Fllama.cpp

llama : use reserve/emplace_back in sampler_sample (#9534)

This commit updates the llama_sampler_sample function to use reserve and
emplace_back for the vector of llama_token_data structs.

The motivation for this change is to avoid the creation of n_vocab
default-constructed llama_token_data structs which are then immediately
overwritten.
---

diff --git a/src/llama-sampling.cpp b/src/llama-sampling.cpp
index 5275b1d6..5299f511 100644
--- a/src/llama-sampling.cpp
+++ b/src/llama-sampling.cpp
@@ -236,9 +236,10 @@ llama_token llama_sampler_sample(struct llama_sampler * smpl, struct llama_conte
     const int n_vocab = llama_n_vocab(llama_get_model(ctx));
 
     // TODO: do not allocate each time
-    std::vector<llama_token_data> cur(n_vocab);
+    std::vector<llama_token_data> cur;
+    cur.reserve(n_vocab);
     for (llama_token token_id = 0; token_id < n_vocab; token_id++) {
-        cur[token_id] = llama_token_data{token_id, logits[token_id], 0.0f};
+        cur.emplace_back(llama_token_data{token_id, logits[token_id], 0.0f});
     }
 
     llama_token_data_array cur_p = {
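
For readers outside the llama.cpp tree, here is a minimal standalone sketch of the
pattern this commit applies. The struct below is a stand-in that mirrors the three
fields the diff's aggregate initializer fills in (token id, logit, probability); it
is not the actual definition from llama.h, and the toy logits array is invented for
the example.

#include <cstdio>
#include <vector>

// Stand-in for llama.cpp's llama_token_data (assumed field layout,
// matching the aggregate initializer used in the diff).
struct llama_token_data {
    int   id;
    float logit;
    float p;
};

int main() {
    const int n_vocab = 4;
    const float logits[n_vocab] = {0.1f, 0.7f, 0.2f, 0.0f}; // toy values

    // Before the commit: vector(n_vocab) value-initializes n_vocab elements,
    // each of which the loop then immediately overwrites:
    //
    //     std::vector<llama_token_data> cur(n_vocab);
    //     for (int token_id = 0; token_id < n_vocab; token_id++) {
    //         cur[token_id] = llama_token_data{token_id, logits[token_id], 0.0f};
    //     }

    // After the commit: reserve() allocates capacity without constructing any
    // elements, and emplace_back() appends each element once, so no placeholder
    // structs are created and no reallocation happens inside the loop.
    std::vector<llama_token_data> cur;
    cur.reserve(n_vocab);
    for (int token_id = 0; token_id < n_vocab; token_id++) {
        cur.emplace_back(llama_token_data{token_id, logits[token_id], 0.0f});
    }

    for (const auto & td : cur) {
        printf("id=%d logit=%.2f p=%.2f\n", td.id, td.logit, td.p);
    }
    return 0;
}

Note that passing a braced llama_token_data temporary to emplace_back behaves like
push_back of an rvalue; the saving here comes from skipping the n_vocab
value-initializations, which is worthwhile because this code runs on every sampled
token.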