ollama-for-amd/server at ea7657b54a000b9cf381e6e83463f50aaa40a161 - ollama-for-amd - Git.NotJustAnna.net

mirror of https://github.com/likelovewant/ollama-for-amd.git synced 2025-12-21 22:33:56 +00:00

Daniel Andersen ea7657b54a sched: Add support for grouping GPUs (#10678)

This patch modifies Ollama to allow grouping GPUs to memory-fit the requested model, instead of the former algorithm of using one GPU or distributing over all available GPUs.

Benefits:
 - Less (PCIe) bus communication between GPUs, especially when the links between them are not very fast
 - Unallocated GPUs can enter power-saving mode.
 - Significantly reduced VRAM allocation in systems with more than 2 GPUs
 - Due to the reduced memory allocation, you can run more models simultaneously.
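
The grouping idea described above can be illustrated with a minimal sketch. This is not the code from sched.go: the `gpu` type, the `pickGroup` function, and the greedy "largest free memory first" strategy are all assumptions made for illustration. It only shows how a model might be fitted onto the smallest group of GPUs rather than spread over every device.

```go
package main

import (
	"fmt"
	"sort"
)

// gpu is a hypothetical, simplified GPU record (not Ollama's real type).
type gpu struct {
	ID      string
	FreeMiB uint64
}

// pickGroup returns the smallest group of GPUs whose combined free memory
// covers requiredMiB, preferring the GPUs with the most free memory.
// It returns nil when even all GPUs together cannot hold the model.
func pickGroup(gpus []gpu, requiredMiB uint64) []gpu {
	// Work on a copy so the caller's slice order is left untouched.
	sorted := make([]gpu, len(gpus))
	copy(sorted, gpus)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].FreeMiB > sorted[j].FreeMiB
	})

	var total uint64
	for i, g := range sorted {
		total += g.FreeMiB
		if total >= requiredMiB {
			return sorted[:i+1]
		}
	}
	return nil
}

func main() {
	gpus := []gpu{{"gpu0", 8000}, {"gpu1", 24000}, {"gpu2", 8000}, {"gpu3", 8000}}
	for _, g := range pickGroup(gpus, 30000) {
		fmt.Println(g.ID, g.FreeMiB)
	}
}
```

In this sketch, GPUs left out of the returned group receive no allocation, which is the property the benefits above rely on (idle devices, less cross-GPU traffic, room for further models).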

2025-08-11 13:59:38 -07:00

..

cache: fix comment function name in cache.go (#11110)

2025-06-18 05:21:45 -07:00

auth.go

fix nil deref in auth.go

2024-07-26 14:14:48 -07:00

create_test.go

server: validate local path on safetensor create (#9379)

2025-02-28 16:10:43 -08:00

create.go

remove support for multiple ggufs in a single file (#10722)

2025-05-21 13:55:31 -07:00

download.go

server: abort download on empty digest

2025-05-27 11:28:48 -07:00

fixblobs_test.go

server: replace blob prefix separator from ':' to '-' (#3146)

2024-03-14 20:18:06 -07:00

fixblobs.go

server: replace blob prefix separator from ':' to '-' (#3146)

2024-03-14 20:18:06 -07:00

harmonyparser_test.go

gpt-oss (#11672)

2025-08-05 12:21:16 -07:00

harmonyparser.go

gpt-oss (#11672)

2025-08-05 12:21:16 -07:00

images_test.go

Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)

2025-06-20 11:11:40 -07:00

images.go

gpt-oss (#11672)

2025-08-05 12:21:16 -07:00

layer.go

One corrupt manifest should not wedge model operations (#7515)

2024-11-05 14:21:45 -08:00

manifest_test.go

One corrupt manifest should not wedge model operations (#7515)

2024-11-05 14:21:45 -08:00

manifest.go

One corrupt manifest should not wedge model operations (#7515)

2024-11-05 14:21:45 -08:00

model.go

tools: refactor tool call parsing and enable streaming (#10415)

2025-05-23 14:19:31 -07:00

modelpath_test.go

lint: enable usetesting, disable tenv (#10594)

2025-05-08 11:42:14 -07:00

modelpath.go

server: add hint to the error message when model path access fails (#10843)

2025-05-24 13:17:04 -07:00

prompt_test.go

gpt-oss (#11672)

2025-08-05 12:21:16 -07:00

prompt.go

gpt-oss (#11672)

2025-08-05 12:21:16 -07:00

quantization_test.go

Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)

2025-06-20 11:11:40 -07:00

quantization.go

skip quantizing per_layer_token_embd (#11207)

2025-06-26 21:49:35 -07:00

routes_create_test.go

Move quantization to new backend (#10363)

2025-05-06 11:20:48 -07:00

routes_delete_test.go

Update the /api/create endpoint to use JSON (#7935)

2024-12-31 18:02:30 -08:00

routes_generate_test.go

tools: support anyOf types

2025-08-05 16:46:24 -07:00

routes_harmony_streaming_test.go

tools: support anyOf types

2025-08-05 16:46:24 -07:00

routes_list_test.go

Update the /api/create endpoint to use JSON (#7935)

2024-12-31 18:02:30 -08:00

routes_test.go

server: use slices.Equal to simplify code (#11502)

2025-07-23 14:25:39 -07:00

routes.go

server: Reduce gpt-oss context length for small VRAM GPUs

2025-08-07 14:23:55 -07:00

sched_test.go

Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)

2025-06-20 11:11:40 -07:00

sched.go

sched: Add support for grouping GPUs (#10678)

2025-08-11 13:59:38 -07:00

sparse_common.go

Don't hard fail on sparse setup error

2024-08-09 12:16:19 -07:00

sparse_windows.go

Don't hard fail on sparse setup error

2024-08-09 12:16:19 -07:00

upload.go

server: always print upload/download part info (#8832)

2025-02-04 19:30:49 -08:00