Commit Graph

  • 76eb7d0fff testing: test more models with tool calling (#12867) Patrick Devine 2025-10-30 13:19:21 -07:00
  • f67a6df110 interleaved mrope (#12807) Michael Yang 2025-10-30 11:29:00 -07:00
  • 75e75d9afe qwen3vl: enable flash attention by default (#12862) Michael Yang 2025-10-30 10:51:37 -07:00
  • ed78e127d0 fix(cmd): unload model before removal (#12832) Michael Yang 2025-10-30 10:41:49 -07:00
  • d432ade714 fix: qwen2.5vl, qwen3vl composite image (#12841) Michael Yang 2025-10-30 10:33:19 -07:00
  • 06b3422d5f tests: add tests and docs for commonly used ops (#12844) Michael Yang 2025-10-30 10:32:45 -07:00
  • cbe1cf06c4 Update README.md (#12822) Athiban Sharon 2025-10-30 17:14:39 +00:00
  • 51e1480751 Merge branch 'ollama:main' into main v0.12.7 likelovewant 2025-10-30 10:15:05 +08:00
  • 0a2d92081b Removing whitespace between Thinking and Content in Qwen3VL (#12838) Grace 2025-10-29 15:14:28 -07:00
  • c88647104d int: harden server lifecycle (#12835) Daniel Hiltgen 2025-10-29 11:50:56 -07:00
  • 05aff4a4f1 tests: fix embeddinggemma integration test (#12830) Patrick Devine 2025-10-29 11:07:28 -07:00
  • 0d140bd1af fix: conv2d bias (#12834) Michael Yang 2025-10-29 11:03:43 -07:00
  • 93e45f0f0d docs: temporarily restore api.md and cleanup docs paths (#12818) Jeffrey Morgan 2025-10-28 23:25:48 -07:00
  • a342160803 docs: fix root api documentation page (#12813) Jeffrey Morgan 2025-10-28 19:17:54 -07:00
  • f6c29409dc docs: add new cloud model + fix openai redirect (#12812) Jeffrey Morgan 2025-10-28 19:09:07 -07:00
  • 7d25b9e194 feat(model): add qwen3vl (#12665) Michael Yang 2025-10-28 17:39:47 -07:00
  • 36d64fb531 embed: add distance correlation test for library embed models (#12796) Patrick Devine 2025-10-28 16:57:27 -07:00
  • d828517e78 docs: update readme and links (#12809) Parth Sareen 2025-10-28 16:20:02 -07:00
  • 14977a9350 Fix vulkan PCI ID and ID handling (#12775) Daniel Hiltgen 2025-10-28 15:15:35 -07:00
  • 29f63f37c8 Revert "server: Consolidate embedding truncation in runner (#12730)" (#12810) Patrick Devine 2025-10-28 14:49:14 -07:00
  • 3d99d9779a docs: add docs for docs.ollama.com (#12805) Parth Sareen 2025-10-28 13:18:48 -07:00
  • 6d02a43a75 docs: rename to mdx to setup docs site (#12804) Parth Sareen 2025-10-28 13:04:31 -07:00
  • 5483497d7a Revert "docs: add reference to docs.ollama.com (#12800)" (#12803) Parth Sareen 2025-10-28 12:52:49 -07:00
  • 934dd9e196 docs: add reference to docs.ollama.com (#12800) Parth Sareen 2025-10-28 12:44:02 -07:00
  • 1188f408dd s/From*Slice/From*s/ (#12255) Michael Yang 2025-10-28 12:08:49 -07:00
  • 15c7d30d9a embedding tests: added check against exact base64 string (#12790) nicole pardal 2025-10-28 10:37:20 -07:00
  • 9862317174 Merge pull request #12793 from ollama/drifkin/12792_renderer-parser-from Devon Rifkin 2025-10-28 00:15:46 -07:00
  • ec9eb28f4c gemma3: make embedding non-causal (#12297) Michael Yang 2025-10-27 19:54:08 -07:00
  • 1bdd816910 create: inherit FROM model's renderer/parser Devon Rifkin 2025-10-27 15:14:19 -07:00
  • 5d347f6d6f server: Consolidate embedding truncation in runner (#12730) nicole pardal 2025-10-27 11:59:12 -07:00
  • b97eb2b858 cloud: set the proxy content-type to the same as local models (#12759) Patrick Devine 2025-10-25 10:57:10 -07:00
  • ad6f6a1d29 llm: Change memory allocation backoff from exponential to incremental Jesse Gross 2025-10-23 11:31:25 -07:00
  • 6723a40be6 readme: add VT Code project to terminal community integrations (#12749) Vinh Nguyen 2025-10-24 02:29:50 +07:00
  • 3258a89b6e DRY out the runner lifecycle code (#12540) Daniel Hiltgen 2025-10-23 11:20:02 -07:00
  • 1c093e97af kvcache: Remove special case for reservation mask Jesse Gross 2025-10-22 16:00:43 -07:00
  • a8d9c2648e llamarunner: Record the time for all batches during prompt processing Jesse Gross 2025-10-16 16:27:45 -07:00
  • 0334e67ffd tools: parse tool calls that don't conform to {"name": name, "arguments": args} (#12738) frob 2025-10-22 20:34:27 +02:00
  • e0ead1adee embeddings: base64 encoding fix (#12715) nicole pardal 2025-10-22 11:27:44 -07:00
  • d515aed6c3 cloud: don't error sending empty messages (#12724) Patrick Devine 2025-10-21 18:12:14 -07:00
  • 7f551c41e7 Merge branch 'ollama:main' into main likelovewant 2025-10-21 19:38:31 +08:00
  • 5fe7ba1b9b runner: always truncate embeddings requests (#12714) Jeffrey Morgan 2025-10-20 16:47:05 -07:00
  • d2b63c19b3 fs(ggml): fill in arch prefix if necessary (#12646) Michael Yang 2025-10-20 16:42:18 -07:00
  • 94f110b35a model/parsers: remove warning for missing <think> tag for qwen3-vl (#12713) Jeffrey Morgan 2025-10-20 16:03:43 -07:00
  • 5d22953ba7 cuda: get driver version after props (#12707) Daniel Hiltgen 2025-10-20 10:57:27 -07:00
  • d245dffed8 rocm: give it more time to bootstrap (#12681) Daniel Hiltgen 2025-10-20 09:43:05 -07:00
  • cb13784a11 merge update v0.12.6 likelovewant 2025-10-18 23:03:13 +08:00
  • bc1a818fdc contiguous input per layer (#12686) Daniel Hiltgen 2025-10-17 18:39:18 -07:00
  • ba2253dc30 win: more verbose load failures (#12683) Daniel Hiltgen 2025-10-17 17:13:16 -07:00
  • 68e04c7ff8 test: harden scheduler tests (#12662) Daniel Hiltgen 2025-10-17 08:56:44 -07:00
  • 270679932f cuda: tidy up CC settings (#12668) Daniel Hiltgen 2025-10-16 16:39:30 -07:00
  • 65fb3ff49d renderers: add global flag for setting [img] tags (#12669) Jeffrey Morgan 2025-10-16 16:37:32 -07:00
  • e2a0b24435 Grace/qwen3 thinking (#12647) Grace 2025-10-16 15:29:41 -07:00
  • 1813ff85a0 cuda: bring back CC 5.2 (#12666) Daniel Hiltgen 2025-10-16 13:07:41 -07:00
  • b531777a66 test: add a few missing embedding models (#12661) Daniel Hiltgen 2025-10-16 09:36:25 -07:00
  • fe3ec8dbf0 Revert "Workaround broken NVIDIA iGPU free VRAM data (#12490)" (#12642) Daniel Hiltgen 2025-10-16 09:09:48 -07:00
  • c744134287 vulkan: Get FilterID from Backend for Vulkan (#12655) Thomas Stocker 2025-10-16 18:07:35 +02:00
  • 4be41d2d45 readme: add achatbot-go to community integrations (#12629) weedge 2025-10-16 12:54:15 +08:00
  • de670570c9 fs/ggml: fix function name in comment (#12630) zhetaicheleba 2025-10-16 13:53:38 +09:00
  • 201d93716e Merge pull request #12651 from ollama/drifkin/oai-conversion Devon Rifkin 2025-10-15 21:10:30 -07:00
  • 160cecc8e2 openai: make tool call conversion fns public Devon Rifkin 2025-10-15 20:54:58 -07:00
  • 8b6e5baee7 CI: Set up temporary opt-out Vulkan support (#12614) Daniel Hiltgen 2025-10-15 14:18:01 -07:00
  • 75d17fc6c2 perf: backport cuda iGPU sched spin (#12641) Daniel Hiltgen 2025-10-15 11:52:14 -07:00
  • 8fafc8af77 ml/backend/ggml: NVML fallback for unified memory GPUs (#12619) Santosh Bhavani 2025-10-15 13:40:06 -05:00
  • c3c85aa06c llm: Enable flash attention by default for gemma3 Jesse Gross 2025-10-15 10:22:03 -07:00
  • 0d713051a2 envconfig: default to port 443 when connecting to ollama.com (#12617) Jeffrey Morgan 2025-10-14 23:38:24 -07:00
  • c4c5a4a01e types: send index for tool calls (#12625) Parth Sareen 2025-10-14 19:35:15 -07:00
  • 3dcfd5f69e llm: Perform eviction when num_gpu is set with new estimates Jesse Gross 2025-10-14 17:21:16 -07:00
  • 53a969d509 Merge pull request #12621 from ollama/drifkin/any-of Devon Rifkin 2025-10-14 15:51:24 -07:00
  • 08fbb60bb2 qwen3-coder: support anyOf when parsing tool calls Devon Rifkin 2025-10-14 15:33:05 -07:00
  • 850da848c5 logs: fix bogus "0 MiB free" log line (#12590) Daniel Hiltgen 2025-10-14 11:26:28 -07:00
  • 2aba569a2a Vulkan based on #9650 (#11835) Thomas Stocker 2025-10-14 19:59:58 +02:00
  • fd8aa947f3 Merge pull request #12562 from ollama/drifkin/registries Devon Rifkin 2025-10-14 02:01:53 -07:00
  • ddaca643d0 add registries for parsers/renderers Devon Rifkin 2025-10-14 01:13:54 -07:00
  • 05982a95cb Qwen3VL Cloud Parser and Renderer (#12526) Grace 2025-10-13 16:52:33 -07:00
  • 4987f13d34 Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) Gabe Goodhart 2025-10-13 16:26:18 -06:00
  • e638f2acb6 runner: fix shifting on llama runner (#12604) Jeffrey Morgan 2025-10-13 13:46:33 -07:00
  • 18087f2ec7 Revert "use llama runner for qwen3 (#12556)" Michael Yang 2025-10-13 13:21:06 -07:00
  • 6c833d5f8d fix(qwen3): deepseek distill Michael Yang 2025-10-13 12:09:53 -07:00
  • 6544e14735 Reapply "add truncate and shift parameters" (#12582) Jeffrey Morgan 2025-10-11 16:06:14 -07:00
  • 5db8a818a1 Merge pull request #12581 from ollama/drifkin/renderer-api-generate Devon Rifkin 2025-10-11 14:10:23 -07:00
  • 6db8da9958 routes: fix built-in renderers for api/generate Devon Rifkin 2025-10-11 13:57:43 -07:00
  • 0c68ec8d6a discover: fix typo (#12565) frob 2025-10-11 21:06:02 +02:00
  • 70d9e363e1 doc: remove AMD EOL GPUs (#12567) Daniel Hiltgen 2025-10-10 17:16:29 -07:00
  • 1a2feb2a97 ollamarunner: fix deadlock Michael Yang 2025-10-10 16:38:12 -07:00
  • aab2190420 implement nvml for linux (#12517) Daniel Hiltgen 2025-10-10 15:15:56 -07:00
  • 629db9dc43 comment split Michael Yang 2025-10-09 16:13:03 -07:00
  • e0cd511661 fix test Michael Yang 2025-10-07 16:46:37 -07:00
  • 207332078f fix lint Michael Yang 2025-10-07 16:39:14 -07:00
  • 93085127f4 convert: slice gate_up weight Michael Yang 2025-10-06 16:05:38 -07:00
  • c00fa9cc2b convert: split gate_up bias Michael Yang 2025-10-06 14:55:55 -07:00
  • df411c4b02 refactor: using testing.B.Loop yajianggroup 2025-09-23 16:05:59 +08:00
  • 3d32249c74 use llama runner for qwen3 (#12556) Jeffrey Morgan 2025-10-09 19:08:21 -07:00
  • d681cd7c29 thinking: allow "think": false for non-thinking models (#12555) Patrick Devine 2025-10-09 18:46:00 -07:00
  • 47298fce39 refactor: use builtin max and min shengxinjing 2025-09-28 23:06:33 +01:00
  • 4a48937ef1 refactor: use builtin max and min shengxinjing 2025-09-25 21:25:37 +01:00
  • 967a82f52f ollamarunner: measure only active time Michael Yang 2025-09-29 12:29:26 -07:00
  • bbbc73d637 llamarunner: update metrics Michael Yang 2025-09-26 16:41:16 -07:00
  • 15e3611d3d logs: quiet down context canceled on completion and scheduler noise (#12553) Daniel Hiltgen 2025-10-09 10:37:47 -07:00
  • 77060d462c routes: structured outputs for gpt-oss (#12460) Parth Sareen 2025-10-08 19:13:38 -07:00
  • 1b91d4dda1 openai: change the reasoning_effort field to also take none Patrick Devine 2025-10-08 18:21:01 -07:00