ollama-for-amd/fs
Jesse Gross 71cb86af3e llm: Remove unneeded warning with flash attention enabled
If flash attention is enabled without KV cache quantization, we
currently always get this warning:
level=WARN source=server.go:226 msg="kv cache type not supported by model" type=""
2025-09-10 16:40:45 -07:00
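A minimal sketch of the condition this commit addresses, assuming the fix works by skipping the warning when no cache type was requested. The helper `shouldWarnKVCacheType` and the `supported` map are hypothetical illustrations, not the actual ollama code in server.go:

```go
package main

import (
	"fmt"
	"log/slog"
)

// shouldWarnKVCacheType reports whether the "kv cache type not supported"
// warning should be emitted. Hypothetical helper: an empty type means the
// user enabled flash attention without requesting KV cache quantization,
// so the default cache is used and there is nothing to warn about.
func shouldWarnKVCacheType(kvCacheType string, supported map[string]bool) bool {
	if kvCacheType == "" {
		return false // no quantization requested: stay silent
	}
	return !supported[kvCacheType]
}

func main() {
	// Hypothetical set of cache types the loaded model supports.
	supported := map[string]bool{"q8_0": true, "q4_0": true}

	for _, t := range []string{"", "q8_0", "f16"} {
		if shouldWarnKVCacheType(t, supported) {
			slog.Warn("kv cache type not supported by model", "type", t)
		} else {
			fmt.Printf("type=%q ok\n", t)
		}
	}
}
```

With this guard, the empty-type case no longer logs the spurious `type=""` warning, while genuinely unsupported quantization types still do.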