Commit Graph

  • 20b53eaa72 tests: add tool calling integration test (#12232) Parth Sareen 2025-09-09 14:01:11 -07:00
  • 6745182885 tests: reduce stress on CPU to 2 models (#12161) Daniel Hiltgen 2025-09-09 09:32:15 -07:00
  • f810ec741c readme: add Clueless to community integrations (#12188) Kashyap Tanuku 2025-09-09 00:31:29 -04:00
  • e119783e66 llm: Clamp batch size to context size Jesse Gross 2025-09-08 17:33:31 -07:00
  • 1a558f98e2 runner: move harmony to runner (#12052) Parth Sareen 2025-09-08 15:07:59 -07:00
  • 7b91c9ce51 Hybrid and recurrent memory estimates (#12186) Gabe Goodhart 2025-09-08 15:53:22 -06:00
  • 950d33aa30 docs: show how to debug nvidia init failures (#12216) Daniel Hiltgen 2025-09-08 11:39:00 -07:00
  • 9714e38dd0 fix: nil pointer dereference if cache is nil (#12215) Michael Yang 2025-09-08 09:53:59 -07:00
  • 4378ae4ffa parser: don't check the file type of safetensors to prevent false negatives. (#12176) frob 2025-09-06 01:27:40 +02:00
  • 501cb38b8c Merge branch 'ollama:main' into main v0.11.10 likelovewant 2025-09-05 17:58:44 +08:00
  • 5994e8e8fd embedding gemma model (#12181) Michael Yang 2025-09-04 09:09:07 -07:00
  • 59e3a35203 Merge branch 'ollama:main' into main v0.11.9 likelovewant 2025-09-04 19:34:11 +08:00
  • b3e6120736 more logutil.Trace (#12177) Michael Yang 2025-09-03 17:24:39 -07:00
  • fb92b61754 logutil: add Trace and TraceContext helpers (#12110) Michael Yang 2025-09-02 13:09:12 -07:00
  • 8149a3c86e llm: Avoid underflow in free memory logging Jesse Gross 2025-09-02 10:47:33 -07:00
  • 0cc90a8186 harden uncaught exception registration (#12120) Daniel Hiltgen 2025-09-02 09:43:55 -07:00
  • e42300f25b ml: fix struct field name in comment (#12123) pxwanglu 2025-09-01 07:26:11 +08:00
  • 66e73809a1 readme: add NOMYO Router to community integrations (#12129) alpha-nerd-nomyo 2025-08-31 22:49:10 +02:00
  • c632fdbad8 Merge branch 'ollama:main' into main v0.11.8 likelovewant 2025-08-31 19:44:41 +08:00
  • 517807cdf2 perf: build graph for next batch async to keep GPU busy (#11863) Daniel Hiltgen 2025-08-29 14:20:28 -07:00
  • ead4a9a1d0 Always filter devices (#12108) Daniel Hiltgen 2025-08-29 12:17:31 -07:00
  • 4383a3ab7a readme: add Neuro SAN to community integrations (#12109) ofrancon 2025-08-28 12:27:13 -07:00
  • 9d97e6a9f1 ggml: Avoid allocating CUDA primary context on unused GPUs Jesse Gross 2025-08-26 14:17:43 -07:00
  • 1081532430 fix keep alive (#12041) Michael Yang 2025-08-27 11:51:25 -07:00
  • 59412fbb43 convert(gptoss): mxfp4 to ggml layout to avoid jit conversion (#12018) Michael Yang 2025-08-26 16:41:02 -07:00
  • 86834a2797 convert: fix tensor sorting (#12015) Michael Yang 2025-08-26 13:57:46 -07:00
  • 85ccf7354d gptoss: enable flash attention by default (#11996) Michael Yang 2025-08-26 13:34:45 -07:00
  • 30fb7e19f8 remove extra field attr (#11205) Michael Yang 2025-08-25 09:58:16 -07:00
  • d3450dd52e api: implement stringer for ToolFunctionParameters (#12038) Jeffrey Morgan 2025-08-22 16:26:48 -07:00
  • 4bcb04ad88 tools: avoid matching braces that are part of tool content (#12039) Jeffrey Morgan 2025-08-22 15:22:14 -07:00
  • e3d5708754 Merge pull request #12021 from ollama/drifkin/thinking-double-emit Devon Rifkin 2025-08-22 12:01:37 -07:00
  • 4be4dc8717 server: skip parsing initial <think> if provided in the prompt (#12024) Jeffrey Morgan 2025-08-22 12:00:16 -07:00
  • 109d4fc3b4 chore: remove redundant words in comment (#12028) zoupingshi 2025-08-23 02:00:27 +08:00
  • 2cb0a580f3 thinking: fix double emit when no opening tag Devon Rifkin 2025-08-21 21:03:12 -07:00
  • 7cce5aac76 harmony: move harmony parsing into a package (#12016) Parth Sareen 2025-08-21 13:56:22 -07:00
  • 131c496340 merge upstream and fix conflicts v0.11.6 likelovewant 2025-08-21 11:24:55 +08:00
  • 4ae4f47b16 gpt-oss: convert from hugging face format (#11907) Michael Yang 2025-08-20 15:39:18 -07:00
  • 073fa31df5 llm: Don't always evict models in CPU-only mode Jesse Gross 2025-08-20 12:51:45 -07:00
  • 91fc3c48e3 openai: remove reasoning as an api.Options (#11993) Michael Yang 2025-08-20 12:21:42 -07:00
  • 6de62664d9 Merge pull request #11973 from ollama/drifkin/bpe Devon Rifkin 2025-08-19 22:58:33 -07:00
  • 463a6caad8 model: add bpe roundtripping tests Devon Rifkin 2025-08-19 22:05:48 -07:00
  • fc5fb09f51 model: fix boundary in bpe Devon Rifkin 2025-08-19 18:34:49 -07:00
  • 05ccb17c6e kvcache: Use Cast instead of Copy for flash attention masks Jesse Gross 2025-08-19 09:52:18 -07:00
  • f804e8a460 disable output_all (#11959) Michael Yang 2025-08-18 17:45:40 -07:00
  • 9cfbffafc5 readme: add any-agent to community integrations (#11950) Kostis 2025-08-19 00:21:36 +03:00
  • 470d580205 readme: add Andes to community integrations (#11952) Ruslan Suleymanov 2025-08-19 02:20:28 +05:00
  • b517bb1c19 Merge pull request #11910 from ollama/drifkin/harmony-fn-names Devon Rifkin 2025-08-18 14:17:47 -07:00
  • e3ade453a8 llm: Check for nil memory data before printing Jesse Gross 2025-08-18 13:52:07 -07:00
  • 048bd4472a harmony: convert fn names to be valid ts identifiers Devon Rifkin 2025-08-14 17:17:25 -07:00
  • ec8bf5e6c5 Merge pull request #11875 from ollama/drifkin/print-template Devon Rifkin 2025-08-18 14:03:14 -07:00
  • 709bbb0b6d readme: add any-llm to community integrations (#11956) Kostis 2025-08-18 23:13:26 +03:00
  • abeec240f9 readme: add Serene Pub to community integrations (#11946) Jody Doolittle 2025-08-18 13:12:41 -07:00
  • df335aac09 gpt-oss: disable quantized kv cache (#11929) Michael Yang 2025-08-15 15:01:05 -07:00
  • 026bc29237 cli: show the default context length env setting in online help (#11928) Patrick Devine 2025-08-15 14:59:52 -07:00
  • 883d031268 docs: added missing comma in 'Ollama's Javascript library'' (#11915) Thomas Pelster 2025-08-15 23:45:01 +02:00
  • 5271ff8559 handle cgo flags in docker build (#11909) Daniel Hiltgen 2025-08-15 14:39:35 -07:00
  • d6f7233a1c test: improve scheduler/concurrency stress tests (#11906) Daniel Hiltgen 2025-08-15 14:37:54 -07:00
  • 8de1da4767 server: add debug option for printing out prompt instead of calling model Devon Rifkin 2025-08-15 13:52:50 -07:00
  • d925b5350c Revert "cuda: leverage JIT for smaller footprint (#11635)" (#11913) Daniel Hiltgen 2025-08-14 21:19:23 -07:00
  • 6eaf194b85 fix arm linux build when HWCAP2_SVE2 undefined (#11908) Daniel Hiltgen 2025-08-14 16:38:53 -07:00
  • d5a0d8d904 llm: New memory management Jesse Gross 2025-05-29 12:21:48 -07:00
  • ef7d26ba2c convert: skip reading into memory when possible (#11507) Michael Yang 2025-08-14 15:03:57 -07:00
  • 1a19df1f3a update vendored llama.cpp and ggml (#11823) Michael Yang 2025-08-14 14:42:58 -07:00
  • 7ccfd97a93 doc: clarify both rocm and main bundle necessary (#11900) Daniel Hiltgen 2025-08-14 12:54:55 -07:00
  • c385ca8672 test: add valid responses (#11902) Daniel Hiltgen 2025-08-14 11:07:13 -07:00
  • 837379a94c discovery: fix cudart driver version (#11614) Daniel Hiltgen 2025-08-13 15:43:33 -07:00
  • a24f90604f int: adjust a few models for integration tests (#11872) Daniel Hiltgen 2025-08-13 15:42:36 -07:00
  • dc5a645434 cuda: leverage JIT for smaller footprint (#11635) Daniel Hiltgen 2025-08-13 15:42:16 -07:00
  • bb71654ebe chore: fix some inconsistent function name in comment youzichuan 2025-08-13 16:22:45 +08:00
  • d4af9f04f9 Merge branch 'ollama:main' into main likelovewant 2025-08-13 12:36:50 +08:00
  • a343ae53a4 ggml: Use ordinal IDs for AMD GPUs on Linux when UUID is unavailable Jesse Gross 2025-08-11 17:01:07 -07:00
  • d0cf6c8281 fix(openai): handle reasoning_effort (#11868) Michael Yang 2025-08-12 11:02:01 -07:00
  • 8f4ec9ab28 discover: CPU supports flash attention Jesse Gross 2025-08-11 14:45:45 -07:00
  • dbfd7bd027 Merge pull request #11861 from ollama/drifkin/fix-parsing-error Devon Rifkin 2025-08-11 14:59:57 -07:00
  • ee04dbba51 server: fix error when parsing bad harmony tool calls Devon Rifkin 2025-08-11 14:09:13 -07:00
  • ea7657b54a sched: Add support for grouping GPUs (#10678) Daniel Andersen 2025-08-11 22:59:38 +02:00
  • 2c776f0780 CONTRIBUTING: Explicitly note docs:... as a good example (#11755) Michael Vorburger 2025-08-10 03:12:30 +02:00
  • 79f6376f5b ggml: No-alloc mode Jesse Gross 2025-07-23 14:18:24 -07:00
  • 756c78cfc7 ggml: Support closing backends Jesse Gross 2025-04-17 17:12:01 -07:00
  • d7f4f788d1 ggml: Use GGML's typedef'ed pointer types Jesse Gross 2025-08-06 11:39:08 -07:00
  • 114c3f2265 tests: add integration coverage for oss-gpt (#11696) Daniel Hiltgen 2025-08-07 15:06:57 -07:00
  • f2e9c9aff5 server: Reduce gpt-oss context length for small VRAM GPUs Jesse Gross 2025-08-07 13:49:26 -07:00
  • aa9d889522 Merge pull request #11765 from ollama/drifkin/thinking-without-content Devon Rifkin 2025-08-06 19:02:23 -07:00
  • 735c41f9ca openai: always provide reasoning Devon Rifkin 2025-08-06 18:54:20 -07:00
  • 223a619468 Merge pull request #11761 from ollama/drifkin/openai-tool-names Devon Rifkin 2025-08-06 17:53:25 -07:00
  • 759dd78dd6 openai: when converting role=tool messages, propagate the tool name Devon Rifkin 2025-08-06 17:00:24 -07:00
  • 44bc36d063 docs: update the faq (#11760) Patrick Devine 2025-08-06 16:55:57 -07:00
  • 8f14e1f5f6 Merge pull request #11759 from ollama/drifkin/oai-tool-calling Devon Rifkin 2025-08-06 16:11:31 -07:00
  • 203c137810 openai: allow for content _and_ tool calls in the same message Devon Rifkin 2025-08-06 15:50:30 -07:00
  • fa8be9e35c clean up debugging (#11756) Daniel Hiltgen 2025-08-06 13:31:22 -07:00
  • 8a75e9ee15 Update downloading to pulling in api.md (#11170) Gao feng 2025-08-07 02:33:09 +08:00
  • 9231379bce remove gfx900 likelovewant 2025-08-06 09:46:23 +08:00
  • c7ba6128b4 remove gfx900 likelovewant 2025-08-06 09:43:21 +08:00
  • 8970233a2b add likelovewant 2025-08-06 09:36:32 +08:00
  • cde948f976 fix gfx1200 likelovewant 2025-08-06 09:29:22 +08:00
  • 7c8aba0d83 Merge branch 'ollama:main' into main v0.11.2 likelovewant 2025-08-06 09:25:22 +08:00
  • 4742e12c23 docs: update turbo model name (#11707) Parth Sareen 2025-08-05 17:29:08 -07:00
  • 2d06977ade Merge pull request #11705 from ollama/drifkin/fn-schema Devon Rifkin 2025-08-05 17:02:42 -07:00
  • 30f8a68c4c tools: support anyOf types Devon Rifkin 2025-08-05 16:46:24 -07:00
  • e378e33421 win: static link msvc libs (#11612) Daniel Hiltgen 2025-08-05 16:10:42 -07:00