Commit Graph

  • 76eb7d0fff testing: test more models with tool calling (#12867) Patrick Devine 2025-10-30 13:19:21 -07:00
  • f67a6df110 interleaved mrope (#12807) Michael Yang 2025-10-30 11:29:00 -07:00
  • 75e75d9afe qwen3vl: enable flash attention by default (#12862) Michael Yang 2025-10-30 10:51:37 -07:00
  • ed78e127d0 fix(cmd): unload model before removal (#12832) Michael Yang 2025-10-30 10:41:49 -07:00
  • d432ade714 fix: qwen2.5vl, qwen3vl composite image (#12841) Michael Yang 2025-10-30 10:33:19 -07:00
  • 06b3422d5f tests: add tests and docs for commonly used ops (#12844) Michael Yang 2025-10-30 10:32:45 -07:00
  • cbe1cf06c4 Update README.md (#12822) Athiban Sharon 2025-10-30 17:14:39 +00:00
  • 51e1480751 Merge branch 'ollama:main' into main v0.12.7 likelovewant 2025-10-30 10:15:05 +08:00
  • 0a2d92081b Removing whitespace between Thinking and Content in Qwen3VL (#12838) Grace 2025-10-29 15:14:28 -07:00
  • c88647104d int: harden server lifecycle (#12835) Daniel Hiltgen 2025-10-29 11:50:56 -07:00
  • 05aff4a4f1 tests: fix embeddinggemma integration test (#12830) Patrick Devine 2025-10-29 11:07:28 -07:00
  • 0d140bd1af fix: conv2d bias (#12834) Michael Yang 2025-10-29 11:03:43 -07:00
  • 93e45f0f0d docs: temporarily restore api.md and cleanup docs paths (#12818) Jeffrey Morgan 2025-10-28 23:25:48 -07:00
  • a342160803 docs: fix root api documentation page (#12813) Jeffrey Morgan 2025-10-28 19:17:54 -07:00
  • f6c29409dc docs: add new cloud model + fix openai redirect (#12812) Jeffrey Morgan 2025-10-28 19:09:07 -07:00
  • 7d25b9e194 feat(model): add qwen3vl (#12665) Michael Yang 2025-10-28 17:39:47 -07:00
  • 36d64fb531 embed: add distance correlation test for library embed models (#12796) Patrick Devine 2025-10-28 16:57:27 -07:00
  • d828517e78 docs: update readme and links (#12809) Parth Sareen 2025-10-28 16:20:02 -07:00
  • 14977a9350 Fix vulkan PCI ID and ID handling (#12775) Daniel Hiltgen 2025-10-28 15:15:35 -07:00
  • 29f63f37c8 Revert "server: Consolidate embedding truncation in runner (#12730)" (#12810) Patrick Devine 2025-10-28 14:49:14 -07:00
  • 3d99d9779a docs: add docs for docs.ollama.com (#12805) Parth Sareen 2025-10-28 13:18:48 -07:00
  • 6d02a43a75 docs: rename to mdx to setup docs site (#12804) Parth Sareen 2025-10-28 13:04:31 -07:00
  • 5483497d7a Revert "docs: add reference to docs.ollama.com (#12800)" (#12803) Parth Sareen 2025-10-28 12:52:49 -07:00
  • 934dd9e196 docs: add reference to docs.ollama.com (#12800) Parth Sareen 2025-10-28 12:44:02 -07:00
  • 1188f408dd s/From*Slice/From*s/ (#12255) Michael Yang 2025-10-28 12:08:49 -07:00
  • 15c7d30d9a embedding tests: added check against exact base64 string (#12790) nicole pardal 2025-10-28 10:37:20 -07:00
  • 9862317174 Merge pull request #12793 from ollama/drifkin/12792_renderer-parser-from Devon Rifkin 2025-10-28 00:15:46 -07:00
  • ec9eb28f4c gemma3: make embedding non-causal (#12297) Michael Yang 2025-10-27 19:54:08 -07:00
  • 1bdd816910 create: inherit FROM model's renderer/parser Devon Rifkin 2025-10-27 15:14:19 -07:00
  • 5d347f6d6f server: Consolidate embedding truncation in runner (#12730) nicole pardal 2025-10-27 11:59:12 -07:00
  • b97eb2b858 cloud: set the proxy content-type to the same as local models (#12759) Patrick Devine 2025-10-25 10:57:10 -07:00
  • ad6f6a1d29 llm: Change memory allocation backoff from exponential to incremental Jesse Gross 2025-10-23 11:31:25 -07:00
  • 6723a40be6 readme: add VT Code project to terminal community integrations (#12749) Vinh Nguyen 2025-10-24 02:29:50 +07:00
  • 3258a89b6e DRY out the runner lifecycle code (#12540) Daniel Hiltgen 2025-10-23 11:20:02 -07:00
  • 1c093e97af kvcache: Remove special case for reservation mask Jesse Gross 2025-10-22 16:00:43 -07:00
  • a8d9c2648e llamarunner: Record the time for all batches during prompt processing Jesse Gross 2025-10-16 16:27:45 -07:00
  • 0334e67ffd tools: parse tool calls that don't conform to {"name": name, "arguments": args} (#12738) frob 2025-10-22 20:34:27 +02:00
  • e0ead1adee embeddings: base64 encoding fix (#12715) nicole pardal 2025-10-22 11:27:44 -07:00
  • d515aed6c3 cloud: don't error sending empty messages (#12724) Patrick Devine 2025-10-21 18:12:14 -07:00
  • 7f551c41e7 Merge branch 'ollama:main' into main likelovewant 2025-10-21 19:38:31 +08:00
  • 5fe7ba1b9b runner: always truncate embeddings requests (#12714) Jeffrey Morgan 2025-10-20 16:47:05 -07:00
  • d2b63c19b3 fs(ggml): fill in arch prefix if necessary (#12646) Michael Yang 2025-10-20 16:42:18 -07:00
  • 94f110b35a model/parsers: remove warning for missing <think> tag for qwen3-vl (#12713) Jeffrey Morgan 2025-10-20 16:03:43 -07:00
  • 5d22953ba7 cuda: get driver version after props (#12707) Daniel Hiltgen 2025-10-20 10:57:27 -07:00
  • d245dffed8 rocm: give it more time to bootstrap (#12681) Daniel Hiltgen 2025-10-20 09:43:05 -07:00
  • cb13784a11 merge update v0.12.6 likelovewant 2025-10-18 23:03:13 +08:00
  • bc1a818fdc contiguous input per layer (#12686) Daniel Hiltgen 2025-10-17 18:39:18 -07:00
  • ba2253dc30 win: more verbose load failures (#12683) Daniel Hiltgen 2025-10-17 17:13:16 -07:00
  • 68e04c7ff8 test: harden scheduler tests (#12662) Daniel Hiltgen 2025-10-17 08:56:44 -07:00
  • 270679932f cuda: tidy up CC settings (#12668) Daniel Hiltgen 2025-10-16 16:39:30 -07:00
  • 65fb3ff49d renderers: add global flag for setting [img] tags (#12669) Jeffrey Morgan 2025-10-16 16:37:32 -07:00
  • e2a0b24435 Grace/qwen3 thinking (#12647) Grace 2025-10-16 15:29:41 -07:00
  • 1813ff85a0 cuda: bring back CC 5.2 (#12666) Daniel Hiltgen 2025-10-16 13:07:41 -07:00
  • b531777a66 test: add a few missing embedding models (#12661) Daniel Hiltgen 2025-10-16 09:36:25 -07:00
  • fe3ec8dbf0 Revert "Workaround broken NVIDIA iGPU free VRAM data (#12490)" (#12642) Daniel Hiltgen 2025-10-16 09:09:48 -07:00
  • c744134287 vulkan: Get FilterID from Backend for Vulkan (#12655) Thomas Stocker 2025-10-16 18:07:35 +02:00
  • 4be41d2d45 readme: add achatbot-go to community integrations (#12629) weedge 2025-10-16 12:54:15 +08:00
  • de670570c9 fs/ggml: fix function name in comment (#12630) zhetaicheleba 2025-10-16 13:53:38 +09:00
  • 201d93716e Merge pull request #12651 from ollama/drifkin/oai-conversion Devon Rifkin 2025-10-15 21:10:30 -07:00
  • 160cecc8e2 openai: make tool call conversion fns public Devon Rifkin 2025-10-15 20:54:58 -07:00
  • 8b6e5baee7 CI: Set up temporary opt-out Vulkan support (#12614) Daniel Hiltgen 2025-10-15 14:18:01 -07:00
  • 75d17fc6c2 perf: backport cuda iGPU sched spin (#12641) Daniel Hiltgen 2025-10-15 11:52:14 -07:00
  • 8fafc8af77 ml/backend/ggml: NVML fallback for unified memory GPUs (#12619) Santosh Bhavani 2025-10-15 13:40:06 -05:00
  • c3c85aa06c llm: Enable flash attention by default for gemma3 Jesse Gross 2025-10-15 10:22:03 -07:00
  • 0d713051a2 envconfig: default to port 443 when connecting to ollama.com (#12617) Jeffrey Morgan 2025-10-14 23:38:24 -07:00
  • c4c5a4a01e types: send index for tool calls (#12625) Parth Sareen 2025-10-14 19:35:15 -07:00
  • 3dcfd5f69e llm: Perform eviction when num_gpu is set with new estimates Jesse Gross 2025-10-14 17:21:16 -07:00
  • 53a969d509 Merge pull request #12621 from ollama/drifkin/any-of Devon Rifkin 2025-10-14 15:51:24 -07:00
  • 08fbb60bb2 qwen3-coder: support anyOf when parsing tool calls Devon Rifkin 2025-10-14 15:33:05 -07:00
  • 850da848c5 logs: fix bogus "0 MiB free" log line (#12590) Daniel Hiltgen 2025-10-14 11:26:28 -07:00
  • 2aba569a2a Vulkan based on #9650 (#11835) Thomas Stocker 2025-10-14 19:59:58 +02:00
  • fd8aa947f3 Merge pull request #12562 from ollama/drifkin/registries Devon Rifkin 2025-10-14 02:01:53 -07:00
  • ddaca643d0 add registries for parsers/renderers Devon Rifkin 2025-10-14 01:13:54 -07:00
  • 05982a95cb Qwen3VL Cloud Parser and Renderer (#12526) Grace 2025-10-13 16:52:33 -07:00
  • 4987f13d34 Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) Gabe Goodhart 2025-10-13 16:26:18 -06:00
  • e638f2acb6 runner: fix shifting on llama runner (#12604) Jeffrey Morgan 2025-10-13 13:46:33 -07:00
  • 18087f2ec7 Revert "use llama runner for qwen3 (#12556)" Michael Yang 2025-10-13 13:21:06 -07:00
  • 6c833d5f8d fix(qwen3): deepseek distill Michael Yang 2025-10-13 12:09:53 -07:00
  • 6544e14735 Reapply "add truncate and shift parameters" (#12582) Jeffrey Morgan 2025-10-11 16:06:14 -07:00
  • 5db8a818a1 Merge pull request #12581 from ollama/drifkin/renderer-api-generate Devon Rifkin 2025-10-11 14:10:23 -07:00
  • 6db8da9958 routes: fix built-in renderers for api/generate Devon Rifkin 2025-10-11 13:57:43 -07:00
  • 0c68ec8d6a discover: fix typo (#12565) frob 2025-10-11 21:06:02 +02:00
  • 70d9e363e1 doc: remove AMD EOL GPUs (#12567) Daniel Hiltgen 2025-10-10 17:16:29 -07:00
  • 1a2feb2a97 ollamarunner: fix deadlock Michael Yang 2025-10-10 16:38:12 -07:00
  • aab2190420 implement nvml for linux (#12517) Daniel Hiltgen 2025-10-10 15:15:56 -07:00
  • 629db9dc43 comment split Michael Yang 2025-10-09 16:13:03 -07:00
  • e0cd511661 fix test Michael Yang 2025-10-07 16:46:37 -07:00
  • 207332078f fix lint Michael Yang 2025-10-07 16:39:14 -07:00
  • 93085127f4 convert: slice gate_up weight Michael Yang 2025-10-06 16:05:38 -07:00
  • c00fa9cc2b convert: split gate_up bias Michael Yang 2025-10-06 14:55:55 -07:00
  • df411c4b02 refactor: using testing.B.Loop yajianggroup 2025-09-23 16:05:59 +08:00
  • 3d32249c74 use llama runner for qwen3 (#12556) Jeffrey Morgan 2025-10-09 19:08:21 -07:00
  • d681cd7c29 thinking: allow "think": false for non-thinking models (#12555) Patrick Devine 2025-10-09 18:46:00 -07:00
  • 47298fce39 refactor: use builtin max and min shengxinjing 2025-09-28 23:06:33 +01:00
  • 4a48937ef1 refactor: use builtin max and min shengxinjing 2025-09-25 21:25:37 +01:00
  • 967a82f52f ollamarunner: measure only active time Michael Yang 2025-09-29 12:29:26 -07:00
  • bbbc73d637 llamarunner: update metrics Michael Yang 2025-09-26 16:41:16 -07:00
  • 15e3611d3d logs: quiet down context canceled on completion and scheduler noise (#12553) Daniel Hiltgen 2025-10-09 10:37:47 -07:00
  • 77060d462c routes: structured outputs for gpt-oss (#12460) Parth Sareen 2025-10-08 19:13:38 -07:00
  • 1b91d4dda1 openai: change the reasoning_effort field to also take none Patrick Devine 2025-10-08 18:21:01 -07:00