ollama-for-amd/llama/patches/0036-ggml-cuda-skip-large-batches.patch at dba62ff3a572af4af845711c2091b70606b06af4

mirror of https://github.com/likelovewant/ollama-for-amd.git synced 2025-12-22 06:43:57 +00:00


Michael Yang 0796d79d19 cuda: skip large batches

cuda panics on batches larger than 1024, so skip those and fall back to
cpu
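
A minimal sketch of the behavior the commit message describes, not the patch itself. The names `Backend`, `kMaxCudaBatch`, and `pick_backend` are hypothetical and chosen for illustration only. The only detail taken from the commit message is the 1024 limit.

```cpp
#include <cstdint>

// Hypothetical names for illustration; these are not identifiers from the patch.
enum class Backend { CUDA, CPU };

// Largest batch size the CUDA path is assumed to handle, per the commit message.
constexpr int64_t kMaxCudaBatch = 1024;

// Decide where a batch runs: anything larger than the limit is skipped by
// the CUDA path and handled on the CPU instead.
inline Backend pick_backend(int64_t batch_size) {
    return batch_size > kMaxCudaBatch ? Backend::CPU : Backend::CUDA;
}
```

Under these assumptions, a batch of exactly 1024 still runs on CUDA, and 1025 is the first size routed to the CPU.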

2025-11-18 16:11:37 -08:00
