Is it just me, or has the llama.cpp CLI forgotten how to strip the reverse prompt from ChatML responses?
$ llama-cli -m ./models/openhermes-2.5-neural-chat-7b-v3-1-7b/ggml-model-q5_k_m.gguf \
    -n -1 -t 4 --color -f prompts/chat-with-chatml.txt \
    --prompt-cache ./models/openhermes-2.5-neural-chat-7b-v3-1-7b/ggml-model-q5_k_m_chat-with-chatml.txt.prompt \
    --ctx-size 2048 -cnv \
    --in-prefix '<|im_start|>user\n' \
    --in-suffix '<|im_end|>\n<|im_start|>assistant\n' \
    --reverse-prompt '<|im_end|>' \
    --chat-template chatml
...
== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to the AI.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.
 <|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
Hello<|im_end|>
<|im_start|>assistant
Hi there<|im_end|>
<|im_start|>user
Who are you<|im_end|>
<|im_start|>assistant
I am an assistant<|im_end|>
> Howdy?
Hello again<|im_end|>
> 
So why is it showing me this "<|im_end|>"?!
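
What I expected is behavior along these lines (a minimal C++ sketch of the idea only, not llama.cpp's actual code; strip_reverse_prompt is a name I made up for illustration): if the text about to be echoed to the console ends with the reverse prompt, cut it off before printing.

#include <iostream>
#include <string>

// Hypothetical helper: drop the reverse prompt (antiprompt) from the
// end of a generated chunk before displaying it.
static std::string strip_reverse_prompt(std::string text, const std::string & antiprompt) {
    const auto pos = text.rfind(antiprompt);
    // only strip when the antiprompt is an exact suffix of the text
    if (pos != std::string::npos && pos + antiprompt.size() == text.size()) {
        text.erase(pos);
    }
    return text;
}

int main() {
    // With --reverse-prompt '<|im_end|>' I would expect the CLI to show
    // just "Hello again", which is what this prints:
    std::cout << strip_reverse_prompt("Hello again<|im_end|>", "<|im_end|>") << "\n";
}

That is, generation should still stop on '<|im_end|>', but the stop string itself should not leak into the displayed answer, like it does in the transcript above.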
