Is it just me, or has the llama.cpp CLI forgotten how to strip the reverse prompt from ChatML replies?
$ llama-cli -m ./models/openhermes-2.5-neural-chat-7b-v3-1-7b/ggml-model-q5_k_m.gguf \
    -n -1 -t 4 --color -f prompts/chat-with-chatml.txt \
    --prompt-cache ./models/openhermes-2.5-neural-chat-7b-v3-1-7b/ggml-model-q5_k_m_chat-with-chatml.txt.prompt \
    --ctx-size 2048 -cnv \
    --in-prefix '<|im_start|>user\n' \
    --in-suffix '<|im_end|>\n<|im_start|>assistant\n' \
    --reverse-prompt '<|im_end|>' \
    --chat-template chatml
...
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to the AI.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
Hello<|im_end|>
<|im_start|>assistant
Hi there<|im_end|>
<|im_start|>user
Who are you<|im_end|>
<|im_start|>assistant
I am an assistant<|im_end|>
> Howdy?
Hello again<|im_end|>
>
Why on earth is it showing me that "<|im_end|>"?! The reverse prompt stops generation as expected, but the marker itself ends up in the visible output.
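Until someone confirms whether this is a regression or intended behavior, one stopgap is to scrub the marker outside of llama-cli. Below is a minimal sketch in Python (my own workaround, not a llama.cpp facility): it runs llama-cli with whatever arguments you pass it, deletes every "<|im_end|>" from the output stream, and holds back a partial match in case the marker is split across two reads. It assumes llama-cli is on PATH and keeps flushing per token even when stdout is a pipe, which may not hold on every build.

#!/usr/bin/env python3
# Workaround sketch, not a llama.cpp facility: run llama-cli with the
# same arguments and scrub the reverse-prompt marker from its output.
import subprocess
import sys

MARKER = b"<|im_end|>"

# stdin is inherited from the terminal, so interactive input still works;
# only stdout is filtered. Assumes llama-cli flushes per token into a pipe.
proc = subprocess.Popen(["llama-cli", *sys.argv[1:]], stdout=subprocess.PIPE)
tail = b""  # partial marker that may be completed by the next chunk
while True:
    chunk = proc.stdout.read1(4096)
    if not chunk:
        break
    buf = (tail + chunk).replace(MARKER, b"")
    # hold back the longest prefix of MARKER dangling at the end of buf
    keep = next((i for i in range(len(MARKER) - 1, 0, -1)
                 if buf.endswith(MARKER[:i])), 0)
    tail = buf[len(buf) - keep:]
    sys.stdout.buffer.write(buf[:len(buf) - keep])
    sys.stdout.buffer.flush()
sys.exit(proc.wait())

Save it under any name (say, scrub.py, a hypothetical file) and invoke it with the exact same flags as the llama-cli command above: python3 scrub.py -m ./models/... -cnv ... The filtering is purely cosmetic and doesn't touch the prompt cache or the model's context.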