mirror of https://github.com/likelovewant/ollama-for-amd.git
synced 2025-12-22 06:43:57 +00:00
When creating a quantized model from safetensors, the array KV values need to be loaded. Changing this value to -1 loads the KV values on the returned layer so that they can be used and saved during quantization.
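The effect of the -1 setting can be illustrated with a small sketch. It is written in Python for brevity and is not the project's code: `ArrayKV`, `read_array`, `max_array_size`, and the limit of 1024 are all hypothetical names and values chosen for illustration. The assumed behavior is that a non-negative limit keeps only the size of arrays that exceed it, while a negative limit keeps every array's values.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class ArrayKV:
    """Hypothetical stand-in for an array-valued metadata entry."""
    size: int                                         # declared element count
    values: List[Any] = field(default_factory=list)   # empty when skipped


def read_array(items: List[Any], max_array_size: int) -> ArrayKV:
    """Keep the array values only if the limit allows it.

    Assumed semantics: a negative limit means "no limit", so every
    array is kept; otherwise arrays longer than the limit keep only
    their size and drop their values.
    """
    if max_array_size < 0 or len(items) <= max_array_size:
        return ArrayKV(size=len(items), values=list(items))
    return ArrayKV(size=len(items))


# A large array KV, such as a tokenizer vocabulary (illustrative data).
tokens = [f"tok{i}" for i in range(5000)]

capped = read_array(tokens, 1024)  # values dropped, only the size is kept
full = read_array(tokens, -1)      # values kept, available to save later

print(capped.size, len(capped.values))
print(full.size, len(full.values))
```

With a cap in place, the large array comes back empty, so a quantization step that writes the metadata out again would have nothing to save for it. Passing -1 avoids that loss at the cost of holding the full arrays in memory.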
18 KiB