ollama-for-amd

mirror of https://github.com/likelovewant/ollama-for-amd.git synced 2025-12-24 23:48:01 +00:00

Files

Jesse Gross a27462b708 ollamarunner: Temporarily disable worst case graph preallocation

When we later have a large batch running purely on a CPU, this
results in the error:
GGML_ASSERT(talloc->buffer_id >= 0)

Disabling this means that we will incrementally reallocate memory
as the graph grows.

Fixes #10410

2025-04-29 11:04:58 -07:00
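
The trade-off described in the commit message above can be illustrated with a small sketch. It is not the ollamarunner or GGML code: the arena type and the newArena, reserve and run functions are hypothetical names invented for this example, and a byte slice stands in for the graph buffer. Reserving the worst-case size up front avoids later reallocations, while starting empty means the buffer is replaced each time a larger graph arrives.

```go
package main

// arena is a hypothetical stand-in for a compute-graph memory buffer.
// It is an illustration only, not the ollamarunner or GGML allocator.
type arena struct {
	buf      []byte
	reallocs int // number of times the backing buffer had to be replaced
}

// newArena optionally reserves worstCase bytes up front (preallocation).
// Passing 0 models incremental allocation: start empty and grow on demand.
func newArena(worstCase int) *arena {
	return &arena{buf: make([]byte, 0, worstCase)}
}

// reserve makes sure the arena can hold a graph that needs n bytes,
// reallocating only when the current capacity is too small.
func (a *arena) reserve(n int) {
	if n <= cap(a.buf) {
		a.buf = a.buf[:n]
		return
	}
	a.buf = make([]byte, n)
	a.reallocs++
}

// run replays a sequence of graph sizes against the arena and
// returns how many reallocations were needed in total.
func run(a *arena, sizes []int) int {
	for _, n := range sizes {
		a.reserve(n)
	}
	return a.reallocs
}
```

With graph sizes of 64, 128, 96 and 512 bytes, run(newArena(512), sizes) needs no reallocation, whereas run(newArena(0), sizes) reallocates three times (the 96-byte graph fits in the existing 128-byte buffer).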

cache_test.go

ollamarunner: Preallocate worst case graph at startup

2025-04-08 10:01:28 -07:00

cache.go

kvcache: Add check for values that fall out of sliding window cache

2025-04-02 11:55:48 -07:00

runner.go

ollamarunner: Temporarily disable worst case graph preallocation

2025-04-29 11:04:58 -07:00