If you only ever have one set of configuration parameters per model (same temp, top_p, system prompt...), then I guess you can put them in a gguf file (as the format is extensible).
But what if you want two different sets? You still need to keep them somewhere. That could be a shell script for llama.cpp, or a Modelfile for ollama.
(Assuming you don't want to create a new (massive) gguf file for each permutation of parameters.)
https://github.com/ggml-org/llama.cpp/blob/master/examples/m...
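To make that concrete, a rough sketch (file names and values here are made up): one shared gguf on disk, two tiny Modelfiles, each carrying its own parameter set.

    # Modelfile.creative  (one parameter set, pointing at the shared gguf)
    FROM ./my-model.Q4_K_M.gguf
    PARAMETER temperature 1.0
    PARAMETER top_p 0.95
    SYSTEM "You are a free-wheeling brainstorming assistant."

    # Modelfile.precise  (a second set, same gguf, no copy of the weights)
    FROM ./my-model.Q4_K_M.gguf
    PARAMETER temperature 0.2
    PARAMETER top_p 0.9
    SYSTEM "You answer tersely and factually."

Something like "ollama create my-model-creative -f Modelfile.creative" then registers each variant, and the llama.cpp equivalent is just a one-line shell script passing --temp / --top-p on the command line. Either way the parameters live in a small text file next to the weights, not inside a duplicated gguf.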