likelovewant
706449c10d
Merge branch 'ollama:main' into main
2024-07-14 09:51:57 +08:00
jmorganca
f7ee012300
server: prepend system message in chat handler
2024-07-13 15:08:00 -07:00
likelovewant
d63280cf56
change back to 5.7
2024-07-14 01:02:00 +08:00
likelovewant
90807b2ad0
Merge branch 'ollama:main' into main
2024-07-14 00:58:33 +08:00
Jeffrey Morgan
1ed0aa8fea
server: fix context, load_duration and total_duration fields (#5676)
...
* server: fix `context`, `load_duration` and `total_duration` fields
* Update server/routes.go
2024-07-13 09:25:31 -07:00
likelovewant
8c0f922c48
Merge branch 'ollama:main' into main
v0.2.3-alpha
2024-07-14 00:23:59 +08:00
Jeffrey Morgan
ef98803d63
llm: looser checks for minimum memory (#5677)
2024-07-13 09:20:05 -07:00
likelovewant
59254ee1f5
Merge branch 'ollama:main' into main
2024-07-13 23:55:36 +08:00
Jarek
02fea420e5
Add Kerlig AI, an app for macOS (#5675)
2024-07-13 08:33:46 -07:00
Michael Yang
22c5451fc2
fix system prompt (#5662)
...
* fix system prompt
* execute template when hitting previous roles
* fix tests
---------
Co-authored-by: jmorganca <jmorganca@gmail.com>
2024-07-12 21:04:44 -07:00
Michael Yang
ebc529cbb3
autodetect stop parameters from template
2024-07-12 16:01:23 -07:00
Patrick Devine
23ebbaa46e
Revert "remove template from tests"
...
This reverts commit 9ac0a7a50b.
2024-07-12 15:47:17 -07:00
Patrick Devine
9ac0a7a50b
remove template from tests
2024-07-12 15:41:31 -07:00
Michael Yang
e5c65a85df
Merge pull request #5653 from ollama/mxyng/collect-system
...
template: preprocess message and collect system
2024-07-12 12:32:34 -07:00
Jeffrey Morgan
33627331a3
app: also clean up tempdir runners on install (#5646)
2024-07-12 12:29:23 -07:00
Michael Yang
36c87c433b
template: preprocess message and collect system
2024-07-12 12:26:43 -07:00
likelovewant
5505a018b2
Resolved merge conflicts
2024-07-12 20:44:04 +08:00
likelovewant
c8d0651277
Resolved merge conflicts
2024-07-12 20:38:17 +08:00
Jeffrey Morgan
179737feb7
Clean up old files when installing on Windows (#5645)
...
* app: always clean up install dir; force close applications
* remove wildcard
* revert `CloseApplications`
* whitespace
* update `LOCALAPPDATA` var
2024-07-11 22:53:46 -07:00
Michael Yang
47353f5ee4
Merge pull request #5639 from ollama/mxyng/unaggregated-system
2024-07-11 17:48:50 -07:00
Josh
10e768826c
fix: quant err message (#5616)
2024-07-11 17:24:29 -07:00
Michael Yang
5056bb9c01
rename aggregate to contents
2024-07-11 17:00:26 -07:00
Jeffrey Morgan
c4cf8ad559
llm: avoid loading model if system memory is too small (#5637)
...
* llm: avoid loading model if system memory is too small
* update log
* Instrument swap free space
On Linux and Windows, expose how much swap space is available
so we can take that into consideration when scheduling models.
* use `systemSwapFreeMemory` in check
---------
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2024-07-11 16:42:57 -07:00
Michael Yang
57ec6901eb
revert embedded templates to use prompt/response
...
This reverts commit 19753c18c0
for compatibility. Messages will be added at a later date.
2024-07-11 14:49:35 -07:00
Michael Yang
e64f9ebb44
do not automatically aggregate system messages
2024-07-11 14:49:35 -07:00
likelovewant
514e9186d3
update the igpu support
2024-07-11 23:28:08 +08:00
Jeffrey Morgan
791650ddef
sched: only error when over-allocating system memory (#5626)
2024-07-11 00:53:12 -07:00
Jeffrey Morgan
efbf41ed81
llm: don't link cuda with compat libs (#5621)
2024-07-10 20:01:52 -07:00
Michael Yang
cf15589851
Merge pull request #5620 from ollama/mxyng/templates
...
update embedded templates
2024-07-10 17:16:24 -07:00
Michael Yang
19753c18c0
update embedded templates
2024-07-10 17:03:08 -07:00
Michael Yang
41be28096a
add system prompt to first legacy template
2024-07-10 17:03:08 -07:00
Michael Yang
37a570f962
Merge pull request #5612 from ollama/mxyng/mem
...
chatglm graph
2024-07-10 14:18:33 -07:00
Michael Yang
5a739ff4cb
chatglm graph
2024-07-10 13:43:47 -07:00
Jeffrey Morgan
4e262eb2a8
remove GGML_CUDA_FORCE_MMQ=on from build (#5588)
2024-07-10 13:17:13 -07:00
Daniel Hiltgen
4cfcbc328f
Merge pull request #5124 from dhiltgen/amd_windows
...
Wire up windows AMD driver reporting
2024-07-10 12:50:23 -07:00
Daniel Hiltgen
79292ff3e0
Merge pull request #5555 from dhiltgen/msvc_deps
...
Bundle missing CRT libraries
2024-07-10 12:50:02 -07:00
Daniel Hiltgen
8ea500441d
Merge pull request #5580 from dhiltgen/cuda_overhead
...
Detect CUDA OS overhead
2024-07-10 12:47:31 -07:00
Daniel Hiltgen
b50c818623
Merge pull request #5607 from dhiltgen/win_rocm_v6
...
Bump ROCm on windows to 6.1.2
2024-07-10 12:47:10 -07:00
Daniel Hiltgen
b99e750b62
Merge pull request #5605 from dhiltgen/merge_glitch
...
Remove duplicate merge glitch
2024-07-10 11:47:08 -07:00
Daniel Hiltgen
1f50356e8e
Bump ROCm on windows to 6.1.2
...
This also adjusts our algorithm to favor our bundled ROCm.
I've confirmed VRAM reporting still doesn't work properly so we
can't yet enable concurrency by default.
2024-07-10 11:01:22 -07:00
Daniel Hiltgen
22c81f62ec
Remove duplicate merge glitch
2024-07-10 09:01:33 -07:00
likelovewant
00beadf67e
update
2024-07-10 23:40:16 +08:00
likelovewant
61494fdb05
Update amd_windows.go
2024-07-10 23:28:53 +08:00
likelovewant
b0a43b1700
Update amd_windows.go
2024-07-10 21:43:21 +08:00
likelovewant
d788d8748b
Merge branch 'ollama:main' into main
v0.2.1-alpha
2024-07-10 12:32:28 +08:00
Daniel Hiltgen
73e2c8f68f
Fix context exhaustion integration test for small gpus
...
On the smaller GPUs, the initial model load of llama2 took over 30s (the
default timeout for the DoGenerate helper).
2024-07-09 16:24:14 -07:00
Daniel Hiltgen
f4408219e9
Refine scheduler unit tests for reliability
...
This breaks up some of the test scenarios to create a
more reliable set of tests, as well as adding a little more
coverage.
2024-07-09 16:00:08 -07:00
Daniel Hiltgen
2d1e3c3229
Merge pull request #5503 from dhiltgen/dual_rocm
...
Workaround broken ROCm p2p copy
2024-07-09 15:44:16 -07:00
royjhan
4918fae535
OpenAI v1/completions: allow stop token list (#5551)
...
* stop token parsing fix
* add stop test
2024-07-09 14:01:26 -07:00
royjhan
0aff67877e
separate request tests (#5578)
2024-07-09 13:48:31 -07:00