Commit Graph

144 Commits

Author SHA1 Message Date
likelovewant
72bcdc1d4e Merge branch 'ollama:main' into main 2024-07-08 16:02:24 +08:00
Jeffrey Morgan
f8241bfba3 gpu: report system free memory instead of 0 (#5521) 2024-07-06 19:35:04 -04:00
likelovewant
dc1d1a121b Merge branch 'ollama:main' into main 2024-07-05 21:48:45 +08:00
Daniel Hiltgen
ef757da2c9 Better nvidia GPU discovery logging
Refine the way we log GPU discovery to improve the non-debug
output, and report more actionable log messages when possible
to help users troubleshoot on their own.
2024-07-03 10:50:40 -07:00
likelovewant
b8fdb0387c remove igpu limits 2024-07-02 11:06:26 +08:00
likelovewant
d772472225 Merge branch 'ollama:main' into main 2024-07-02 01:17:34 +08:00
likelovewant
1c648e512e remove code to support igpu 2024-06-29 22:32:45 +08:00
likelovewant
0e42bf50ca Merge upstream/main and resolve conflicts 2024-06-25 00:54:58 +08:00
Daniel Hiltgen
9929751cc8 Disable concurrency for AMD + Windows
Until ROCm v6.2 ships, we wont be able to get accurate free memory
reporting on windows, which makes automatic concurrency too risky.
Users can still opt-in but will need to pay attention to model sizes otherwise they may thrash/page VRAM or cause OOM crashes.
All other platforms and GPUs have accurate VRAM reporting wired
up now, so we can turn on concurrency by default.
2024-06-21 15:45:05 -07:00
Josh Yan
662568d453 err!=nil check 2024-06-20 09:30:59 -07:00
Josh Yan
4ebb66c662 reformat error check 2024-06-20 09:23:43 -07:00
Josh Yan
23e899f32d skip os.removeAll() if PID does not exist 2024-06-20 08:51:35 -07:00
Daniel Hiltgen
96624aa412 Merge pull request #5072 from dhiltgen/windows_path
Move libraries out of users path
2024-06-19 09:13:39 -07:00
Daniel Hiltgen
10f33b8537 Merge pull request #5146 from dhiltgen/backout
Put back temporary intel GPU env var
2024-06-19 09:12:45 -07:00
Daniel Hiltgen
d34d88e417 Revert "Revert "gpu: add env var for detecting Intel oneapi gpus (#5076)""
This reverts commit 755b4e4fc2.
2024-06-19 08:57:41 -07:00
Daniel Hiltgen
52ce350b7a Fix bad symbol load detection
pointer deref's weren't correct on a few libraries, which explains
some crashes on older systems or miswired symlinks for discovery libraries.
2024-06-19 08:39:07 -07:00
Wang,Zhe
badf975e45 get real func ptr. 2024-06-19 09:00:51 +08:00
Wang,Zhe
755b4e4fc2 Revert "gpu: add env var for detecting Intel oneapi gpus (#5076)"
This reverts commit 163cd3e77c.
2024-06-19 08:59:58 +08:00
Daniel Hiltgen
b2799f111b Move libraries out of users path
We update the PATH on windows to get the CLI mapped, but this has
an unintended side effect of causing other apps that may use our bundled
DLLs to get terminated when we upgrade.
2024-06-17 13:12:18 -07:00
Lei Jitang
4ad0d4d6d3 Fix a build warning (#5096)
Signed-off-by: Lei Jitang <leijitang@outlook.com>
2024-06-17 14:47:48 -04:00
Jeffrey Morgan
163cd3e77c gpu: add env var for detecting Intel oneapi gpus (#5076)
* gpu: add env var for detecting intel oneapi gpus

* fix build error
2024-06-16 20:09:05 -04:00
Daniel Hiltgen
fd1e6e0590 Add some more debugging logs for intel discovery
Also removes an unused overall count variable
2024-06-16 07:42:52 -07:00
Daniel Hiltgen
07d143f412 Merge pull request #5058 from coolljt0725/fix_build_warning
gpu: Fix build warning
2024-06-15 11:52:36 -07:00
Daniel Hiltgen
17ce203a26 Merge pull request #4875 from dhiltgen/rocm_gfx900_workaround
Rocm gfx900 workaround
2024-06-15 07:38:58 -07:00
Lei Jitang
225f0d1219 gpu: Fix build warning
Signed-off-by: Lei Jitang <leijitang@outlook.com>
2024-06-15 14:26:23 +08:00
Daniel Hiltgen
6be309e1bd Centralize GPU configuration vars
This should aid in troubleshooting by capturing and reporting the GPU
settings at startup in the logs along with all the other server settings.
2024-06-14 15:59:10 -07:00
Daniel Hiltgen
da3bf23354 Workaround gfx900 SDMA bugs
Implement support for GPU env var workarounds, and leverage
this for the Vega RX 56 which needs
HSA_ENABLE_SDMA=0 set to work properly
2024-06-14 15:38:13 -07:00
Daniel Hiltgen
6f351bf586 review comments and coverage 2024-06-14 14:55:50 -07:00
Daniel Hiltgen
fc37c192ae Refine CPU load behavior with system memory visibility 2024-06-14 14:51:40 -07:00
Daniel Hiltgen
434dfe30c5 Reintroduce nvidia nvml library for windows
This library will give us the most reliable free VRAM reporting on windows
to enable concurrent model scheduling.
2024-06-14 14:51:40 -07:00
Daniel Hiltgen
4e2b7e181d Refactor intel gpu discovery 2024-06-14 14:51:40 -07:00
Daniel Hiltgen
6fd04ca922 Improve multi-gpu handling at the limit
Still not complete, needs some refinement to our prediction to understand the
discrete GPUs available space so we can see how many layers fit in each one
since we can't split one layer across multiple GPUs we can't treat free space
as one logical block
2024-06-14 14:51:40 -07:00
Daniel Hiltgen
43ed358f9a Refine GPU discovery to bootstrap once
Now that we call the GPU discovery routines many times to
update memory, this splits initial discovery from free memory
updating.
2024-06-14 14:51:40 -07:00
Daniel Hiltgen
b32ebb4f29 Use DRM driver for VRAM info for amd
The amdgpu drivers free VRAM reporting omits some other apps, so leverage the
upstream DRM driver which keeps better tabs on things
2024-06-14 14:51:40 -07:00
Daniel Hiltgen
efac488675 Revert "Limit GPU lib search for now (#4777)"
This reverts commit 476fb8e892.
2024-06-14 14:51:40 -07:00
Daniel Hiltgen
aac367636d Actually skip PhysX on windows 2024-06-13 13:17:19 -07:00
likelovewant
a6390a8992 Merge branch 'ollama:main' into main 2024-06-07 17:25:53 +08:00
Michael Yang
e919f6811f lint windows 2024-06-04 11:13:30 -07:00
Michael Yang
bf7edb0d5d lint linux 2024-06-04 11:13:30 -07:00
Michael Yang
e40145a39d lint 2024-06-04 11:13:30 -07:00
likelovewant
a4a435bf8f Update amd_windows.go 2024-06-03 14:55:48 +08:00
likelovewant
2490a69f7b Merge branch 'ollama:main' into main 2024-06-03 14:15:05 +08:00
Jeffrey Morgan
476fb8e892 Limit GPU lib search for now (#4777)
* fix oneapi errors on windows 10
2024-06-01 19:24:33 -07:00
likelovewant
cafde1f8ce Merge branch 'ollama:main' into main 2024-05-29 19:33:39 +08:00
Daniel Hiltgen
646371f56d Merge pull request #3278 from zhewang1-intc/rebase_ollama_main
Enabling ollama to run on Intel GPUs with SYCL backend
2024-05-28 16:30:50 -07:00
likelovewant
2a80d6f743 Merge branch 'ollama:main' into main 2024-05-26 11:57:21 +08:00
Patrick Devine
4cc3be3035 Move envconfig and consolidate env vars (#4608) 2024-05-24 14:57:15 -07:00
likelovewant
73c49d57e8 Update amd_windows.go
remove this will broken the installer build
2024-05-24 20:06:28 +08:00
likelovewant
6f03952a62 Update amd_linux.go
remove old gpu limits due to the new rocm supports
2024-05-24 15:30:41 +08:00
likelovewant
0e5b263a60 Update amd_windows.go
add igpu support ,remove those ,otherwise ,gfx1035 report not work
2024-05-24 15:26:24 +08:00