ollama-for-amd/integration/concurrency_test.go at 730ed6e9e12d0bf182d554a54dee8bbbef6a88c7

mirror of https://github.com/likelovewant/ollama-for-amd.git synced 2025-12-22 06:43:57 +00:00

Daniel Hiltgen 6745182885 tests: reduce stress on CPU to 2 models (#12161)

* tests: reduce stress on CPU to 2 models

This should avoid flakes due to systems getting overloaded with 3 (or more) models running concurrently.

* tests: allow slow systems to pass on timeout

If a slow system is still streaming a response, and the response
will pass validation, don't fail just because the system is slow.
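
A minimal sketch of the idea in the note above, not the code from this commit. It assumes the test validates a response by looking for any one of a set of expected substrings, case-insensitively. The name `validate` and its parameters are hypothetical and used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// validate is a hypothetical helper: it decides whether a (possibly partial)
// streamed response should count as a pass. A timeout alone is not treated as
// a failure; only missing expected content is.
func validate(partial string, timedOut bool, anyOf []string) error {
	lower := strings.ToLower(partial)
	for _, want := range anyOf {
		if strings.Contains(lower, strings.ToLower(want)) {
			// The text received so far already satisfies the check,
			// so a slow system still passes.
			return nil
		}
	}
	if timedOut {
		return fmt.Errorf("timed out before any of %q appeared in the response", anyOf)
	}
	return fmt.Errorf("none of %q found in the response", anyOf)
}
```

Under these assumptions, a caller would gather whatever text has streamed in by the deadline and pass it to `validate` with `timedOut` set to true. The test then fails only when the partial text does not yet contain an expected answer.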

* test: unload embedding models more quickly

2025-09-09 09:32:15 -07:00

6.1 KiB
