MiniCPM models are built with the llm_build_granite constructor, which the Granite Four PR changed to read hparams.rope_finetuned instead of taking a use_rope parameter. MiniCPM needs rope enabled by default, so the flag is now set when loading MiniCPM hparams.
This fixes MiniCPM inference, which previously produced gibberish.
ml.get_key(LLM_KV_RESIDUAL_SCALE, hparams.f_residual_scale);
ml.get_key(LLM_KV_LOGIT_SCALE, hparams.f_logit_scale);
+ // MiniCPM uses rope by default, unlike Granite which uses it as a switch
+ hparams.rope_finetuned = true;
+
switch (hparams.n_layer) {
case 52: type = LLM_TYPE_1B; break;
case 40: type = LLM_TYPE_2B; break;