Commit Graph

658 Commits

likelovewant
63a5f509ed remove official support arches to reduce size 2024-08-02 13:30:46 +08:00
likelovewant
ca4c0c1a8f Merge branch 'ollama:main' into main 2024-08-02 09:28:09 +08:00
Michael Yang
0ff42e84b0 Merge pull request #4756 from ollama/mxyng/convert2
refactor convert
2024-08-01 14:16:30 -07:00
likelovewant
0d4292b4b1 Merge branch 'ollama:main' into main 2024-08-01 18:30:28 +08:00
Michael Yang
df993fa37b comments 2024-07-31 15:58:55 -07:00
Michael Yang
5e9db9fb0b refactor convert 2024-07-31 15:58:33 -07:00
Michael Yang
0f3271db88 patches: phi3 default sliding window attention 2024-07-31 14:58:34 -07:00
Michael Yang
6b252918fb update convert test to check result data 2024-07-31 10:59:38 -07:00
Michael Yang
5c1912769e Merge pull request #5473 from ollama/mxyng/environ
fix: environ lookup
2024-07-31 10:18:05 -07:00
likelovewant
1eb1dc32d2 Merge branch 'ollama:main' into main 2024-07-31 14:52:26 +08:00
likelovewant
ad5ad895fb fix 2024-07-31 13:37:19 +08:00
jmorganca
afa8d6e9d5 patch gemma support 2024-07-30 18:07:29 -07:00
royjhan
1b44d873e7 Add Metrics to api/embed response (#5709)
* add prompt tokens to embed response

* rm slog

* metrics

* types

* prompt n

* clean up

* reset submodule

* update tests

* test name

* list metrics
2024-07-30 13:12:21 -07:00
likelovewant
fc296fd744 Remove llm/llama.cpp from Git index 2024-07-30 22:37:32 +08:00
likelovewant
e628246970 Restore llama.cpp from commit 6eeaeba 2024-07-30 20:43:59 +08:00
likelovewant
776aa9ceb2 resolve merge conflicts 2024-07-30 18:53:59 +08:00
Jeffrey Morgan
68ee42f995 update llama.cpp submodule to 6eeaeba1 (#6039) 2024-07-29 13:20:26 -07:00
Tibor Schmidt
f3d7a481b7 feat: add support for min_p (resolve #1142) (#1825) 2024-07-27 14:37:40 -07:00
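
min_p sampling, referenced in the commit above, keeps only tokens whose probability is at least min_p times that of the most likely token, so the cutoff adapts to the model's confidence. A minimal sketch of the idea in Go (hypothetical helper, not the actual sampler code):

```go
package main

import "fmt"

// minPFilter returns the indices of tokens whose probability is at least
// minP times the probability of the most likely token. Sketch of the
// min_p idea only; not ollama's sampler implementation.
func minPFilter(probs []float64, minP float64) []int {
	maxProb := 0.0
	for _, p := range probs {
		if p > maxProb {
			maxProb = p
		}
	}
	threshold := minP * maxProb
	var kept []int
	for i, p := range probs {
		if p >= threshold {
			kept = append(kept, i)
		}
	}
	return kept
}

func main() {
	probs := []float64{0.5, 0.3, 0.15, 0.04, 0.01}
	// With min_p = 0.1 the cutoff is 0.05, keeping the first three tokens.
	fmt.Println(minPFilter(probs, 0.1)) // [0 1 2]
}
```
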
likelovewant
91ba40fc45 Merge branch 'ollama:main' into main 2024-07-27 12:18:55 +08:00
Jeffrey Morgan
f2a96c7d77 llm: keep patch for llama 3 rope factors (#5987) 2024-07-26 15:20:52 -07:00
likelovewant
86a1575ee3 fix api 2024-07-23 14:57:33 +08:00
likelovewant
fbfc13b6ca Merge branch 'ollama:main' into main 2024-07-23 14:49:32 +08:00
Daniel Hiltgen
e12fff8810 Enable windows error dialog for subprocess startup
Make sure that if something goes wrong spawning the process, the user
gets enough info to be able to self-correct, or at least file a bug with
details so we can fix it. Once the process starts, we immediately change
back to the recommended setting to prevent the blocking dialog. This
ensures that if the model fails to load (OOM, unsupported model type,
etc.) the process will exit quickly and we can scan the stdout/stderr of
the subprocess for the reason to report via the API.
2024-07-22 14:07:27 -07:00
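
A sketch of the approach this commit describes, assuming the standard kernel32 SetErrorMode API (an illustration, not ollama's actual code): clear the process error mode so a failed spawn surfaces the system dialog, then restore the previous quiet mode once the child is running.

```go
//go:build windows

package main

import (
	"os/exec"
	"syscall"
)

var (
	kernel32     = syscall.NewLazyDLL("kernel32.dll")
	setErrorMode = kernel32.NewProc("SetErrorMode")
)

// spawnWithDialog temporarily allows the Windows error dialog while the
// subprocess starts (children inherit the parent's error mode at spawn),
// then restores the quiet mode so a later crash exits fast and its
// output can be scanned for the failure reason. Hypothetical sketch.
func spawnWithDialog(cmd *exec.Cmd) error {
	prev, _, _ := setErrorMode.Call(0) // mode 0: show error dialogs
	err := cmd.Start()
	setErrorMode.Call(prev) // restore the recommended quiet mode
	return err
}

func main() {
	cmd := exec.Command("cmd", "/c", "echo", "runner started")
	if err := spawnWithDialog(cmd); err != nil {
		panic(err)
	}
	cmd.Wait()
}
```
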
Michael Yang
e2c3f6b3e2 string 2024-07-22 11:27:52 -07:00
Michael Yang
55cd3ddcca bool 2024-07-22 11:27:21 -07:00
Michael Yang
35b89b2eab rfc: dynamic environ lookup 2024-07-22 11:25:30 -07:00
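
The "dynamic environ lookup" idea above replaces values cached at package init with functions that read the environment on every call, so changes take effect at runtime. A sketch under that assumption (hypothetical helper, not ollama's envconfig package):

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Bool returns a function that reads the environment variable on each
// call instead of caching it at startup. Hypothetical sketch of the
// dynamic-lookup pattern.
func Bool(key string) func() bool {
	return func() bool {
		v := strings.ToLower(os.Getenv(key))
		return v == "1" || v == "true"
	}
}

func main() {
	debug := Bool("OLLAMA_DEBUG")
	fmt.Println(debug()) // false
	os.Setenv("OLLAMA_DEBUG", "1")
	fmt.Println(debug()) // true: looked up dynamically, not cached
}
```
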
Daniel Hiltgen
5784c05397 Merge pull request #5854 from dhiltgen/win_exit_status
Refine error reporting for subprocess crash
2024-07-22 10:40:22 -07:00
Jeffrey Morgan
f8fedbda20 Update llama.cpp submodule commit to d94c6e0c (#5805) 2024-07-22 12:42:00 -04:00
Daniel Hiltgen
a3c20e3f18 Refine error reporting for subprocess crash
On Windows, the exit status winds up being the search term many users
search for, and they end up piling onto unrelated issues that share it.
This refines the reporting so that if we have a more detailed message
we'll suppress the exit status portion of the message.
2024-07-22 08:52:16 -07:00
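
The reporting rule this commit describes could look like the following sketch (hypothetical names, not the actual code): prefer the captured detail and fall back to the bare exit status only when nothing better is available.

```go
package main

import "fmt"

// crashMessage returns the detailed failure message captured from the
// subprocess when one exists, suppressing the generic exit status that
// users would otherwise search for. Illustrative sketch only.
func crashMessage(detail string, exitCode uint32) string {
	if detail != "" {
		return detail
	}
	return fmt.Sprintf("runner process has terminated: exit status %#x", exitCode)
}

func main() {
	fmt.Println(crashMessage("", 0xc0000005))                 // no detail: report status
	fmt.Println(crashMessage("CUDA error: out of memory", 1)) // detail wins
}
```
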
likelovewant
c44ff579a3 fix mismatch 2024-07-22 19:47:58 +08:00
likelovewant
04325ba40a fix typo 2024-07-22 19:35:43 +08:00
likelovewant
3f03ae5808 update gen_windows.ps1, keep in sync with upstream 2024-07-22 19:00:40 +08:00
likelovewant
24641ae3a5 update gen_windows.ps1, keep in sync with upstream 2024-07-22 18:48:21 +08:00
likelovewant
5cae567ee8 merge upstream update and resolve the conflicts 2024-07-22 17:00:43 +08:00
likelovewant
a8890fd2c6 fix conflicts 2024-07-22 08:10:12 +08:00
Jeffrey Morgan
5534f2cc6a llm: consider head_dim in llama arch (#5817) 2024-07-20 21:48:12 -04:00
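
Considering head_dim means preferring an explicitly configured head dimension over the usual hidden_size / num_heads derivation, since some llama-architecture models set it directly. A sketch of that fallback (field names hypothetical):

```go
package main

import "fmt"

// headDim prefers an explicitly configured head dimension and only falls
// back to the conventional derivation when it is absent. Sketch of the
// idea; not the actual conversion code.
func headDim(configured, hiddenSize, numHeads int) int {
	if configured > 0 {
		return configured
	}
	return hiddenSize / numHeads
}

func main() {
	fmt.Println(headDim(0, 4096, 32))   // derived: 128
	fmt.Println(headDim(256, 4096, 32)) // explicit value wins: 256
}
```
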
Daniel Hiltgen
283948c83b Adjust windows ROCm discovery
The v5 HIP library returns unsupported GPUs which won't enumerate at
inference time in the runner, so this makes sure we align discovery. The
gfx906 cards are no longer supported, so we shouldn't compile with that
GPU type as it won't enumerate at runtime.
2024-07-20 15:17:50 -07:00
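
Aligning discovery with the compiled targets amounts to filtering the GPUs the HIP library reports down to the gfx types the build actually supports. A sketch with a hypothetical target list (the real supported set depends on the ROCm build):

```go
package main

import "fmt"

// supportedGfx lists the gfx targets this hypothetical build was
// compiled for; gfx906 is intentionally absent per the commit above.
var supportedGfx = map[string]bool{
	"gfx1030": true,
	"gfx1100": true,
}

// filterGPUs keeps only GPUs the runner will be able to enumerate at
// inference time, so discovery and runtime stay aligned. Sketch only.
func filterGPUs(found []string) []string {
	var usable []string
	for _, gfx := range found {
		if supportedGfx[gfx] {
			usable = append(usable, gfx)
		} else {
			fmt.Printf("skipping unsupported GPU type %s\n", gfx)
		}
	}
	return usable
}

func main() {
	fmt.Println(filterGPUs([]string{"gfx906", "gfx1030"})) // [gfx1030]
}
```
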
Jeffrey Morgan
1475eab95f add patch for tekken (#5807) 2024-07-20 13:41:21 -04:00
likelovewant
5cfa607627 Merge branch 'ollama:main' into main 2024-07-17 22:29:55 +08:00
Michael Yang
4a565cbf94 add chat and generate tests with mock runner 2024-07-16 09:39:31 -07:00
royjhan
b9f5e16c80 Introduce /api/embed endpoint supporting batch embedding (#5127)
* Initial Batch Embedding

* Revert "Initial Batch Embedding"

This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29.

* Initial Draft

* mock up notes

* api/embed draft

* add server function

* check normalization

* clean up

* normalization

* playing around with truncate stuff

* Truncation

* Truncation

* move normalization to go

* Integration Test Template

* Truncation Integration Tests

* Clean up

* use float32

* move normalize

* move normalize test

* refactoring

* integration float32

* input handling and handler testing

* Refactoring of legacy and new

* clear comments

* merge conflicts

* touches

* embedding type 64

* merge conflicts

* fix hanging on single string

* refactoring

* test values

* set context length

* clean up

* testing clean up

* testing clean up

* remove function closure

* Revert "remove function closure"

This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787.

* remove function closure

* remove redundant error check

* clean up

* more clean up

* clean up
2024-07-15 12:14:24 -07:00
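
The normalization these messages describe moving into Go is L2 (unit-norm) scaling: divide each component by the vector's Euclidean norm. A sketch, assuming float32 vectors as the later messages suggest (not the actual server code):

```go
package main

import (
	"fmt"
	"math"
)

// normalize scales a vector to unit L2 norm, accumulating the squared
// sum in float64 for accuracy before converting back. Sketch of the
// post-processing step, not ollama's implementation.
func normalize(v []float32) []float32 {
	var sum float64
	for _, x := range v {
		sum += float64(x) * float64(x)
	}
	norm := math.Sqrt(sum)
	if norm == 0 {
		return v // avoid dividing a zero vector by zero
	}
	out := make([]float32, len(v))
	for i, x := range v {
		out[i] = float32(float64(x) / norm)
	}
	return out
}

func main() {
	fmt.Println(normalize([]float32{3, 4})) // [0.6 0.8]
}
```
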
likelovewant
8c0f922c48 Merge branch 'ollama:main' into main 2024-07-14 00:23:59 +08:00
Jeffrey Morgan
ef98803d63 llm: looser checks for minimum memory (#5677) 2024-07-13 09:20:05 -07:00
likelovewant
5505a018b2 Resolved merge conflicts 2024-07-12 20:44:04 +08:00
Josh
10e768826c fix: quant err message (#5616) 2024-07-11 17:24:29 -07:00
Jeffrey Morgan
c4cf8ad559 llm: avoid loading model if system memory is too small (#5637)
* llm: avoid loading model if system memory is too small

* update log

* Instrument swap free space

On Linux and Windows, expose how much swap space is available
so we can take that into consideration when scheduling models

* use `systemSwapFreeMemory` in check

---------

Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2024-07-11 16:42:57 -07:00
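
The check this commit describes compares the model's estimated requirement against free system memory plus free swap (the systemSwapFreeMemory value mentioned above) and errors out up front rather than letting the load fail midway. A sketch with hypothetical names:

```go
package main

import "fmt"

// canLoad refuses a model whose estimated memory requirement exceeds
// free system memory plus free swap, so the scheduler errors cleanly
// instead of the runner being OOM-killed mid-load. Illustrative sketch.
func canLoad(estimate, systemFreeMemory, systemSwapFreeMemory uint64) error {
	available := systemFreeMemory + systemSwapFreeMemory
	if estimate > available {
		return fmt.Errorf("model requires more system memory (%d bytes) than is available (%d bytes)", estimate, available)
	}
	return nil
}

func main() {
	// Needs 8 GiB but only 4 GiB free + 2 GiB swap is available.
	fmt.Println(canLoad(8<<30, 4<<30, 2<<30))
}
```
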
Jeffrey Morgan
791650ddef sched: only error when over-allocating system memory (#5626) 2024-07-11 00:53:12 -07:00
Jeffrey Morgan
efbf41ed81 llm: dont link cuda with compat libs (#5621) 2024-07-10 20:01:52 -07:00
Michael Yang
37a570f962 Merge pull request #5612 from ollama/mxyng/mem
chatglm graph
2024-07-10 14:18:33 -07:00
Michael Yang
5a739ff4cb chatglm graph 2024-07-10 13:43:47 -07:00