mirror of
https://github.com/likelovewant/ollama-for-amd.git
synced 2025-12-21 14:26:30 +00:00
If we create a memory layout that should fit based on report free VRAM but allocation still fails, we start applying a backoff. This reduces free VRAM by an exponential percentage (1%, 2%, 4%...). However, the points chosen tend to be too dense at the beginning and too sparse at the end. Therefore, this switches to an incremental backoff (10%, 20%, 30%...).