docs: add docs for docs.ollama.com (#12805)
@@ -1,9 +1,8 @@
-# Ollama Model File
+---
+title: Modelfile Reference
+---
 
-> [!NOTE]
-> `Modelfile` syntax is in development
-
-A model file is the blueprint to create and share models with Ollama.
+A Modelfile is the blueprint to create and share customized models using Ollama.
 
 ## Table of Contents
 
@@ -73,26 +72,23 @@ To view the Modelfile of a given model, use the `ollama show --modelfile` comman
 ollama show --modelfile llama3.2
 ```
 
-> **Output**:
->
-> ```
-> # Modelfile generated by "ollama show"
-> # To build a new Modelfile based on this one, replace the FROM line with:
-> # FROM llama3.2:latest
-> FROM /Users/pdevine/.ollama/models/blobs/sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29
-> TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
->
-> {{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
->
-> {{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
->
-> {{ .Response }}<|eot_id|>"""
-> PARAMETER stop "<|start_header_id|>"
-> PARAMETER stop "<|end_header_id|>"
-> PARAMETER stop "<|eot_id|>"
-> PARAMETER stop "<|reserved_special_token"
-> ```
+```
+# Modelfile generated by "ollama show"
+# To build a new Modelfile based on this one, replace the FROM line with:
+# FROM llama3.2:latest
+FROM /Users/pdevine/.ollama/models/blobs/sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29
+TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
+
+{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
+
+{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
+
+{{ .Response }}<|eot_id|>"""
+PARAMETER stop "<|start_header_id|>"
+PARAMETER stop "<|end_header_id|>"
+PARAMETER stop "<|eot_id|>"
+PARAMETER stop "<|reserved_special_token"
+```
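
As the generated comments suggest, this output can serve as the starting point for a new model. A minimal sketch of that workflow, where `my-llama3.2` is only an illustrative placeholder name:

```
ollama show --modelfile llama3.2 > Modelfile
# edit the FROM line (and any other instructions) in Modelfile, then build it:
ollama create my-llama3.2 -f Modelfile
```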
 
 ## Instructions
 
@@ -110,10 +106,13 @@ FROM <model name>:<tag>
 FROM llama3.2
 ```
 
-A list of available base models:
-<https://github.com/ollama/ollama#model-library>
-Additional models can be found at:
-<https://ollama.com/library>
+<Card title="Base Models" href="https://github.com/ollama/ollama#model-library">
+A list of available base models
+</Card>
+
+<Card title="Base Models" href="https://ollama.com/library">
+Additional models can be found at
+</Card>
 
 #### Build from a Safetensors model
 
@@ -124,10 +123,11 @@ FROM <model directory>
 The model directory should contain the Safetensors weights for a supported architecture.
 
 Currently supported model architectures:
-* Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2)
-* Mistral (including Mistral 1, Mistral 2, and Mixtral)
-* Gemma (including Gemma 1 and Gemma 2)
-* Phi3
+
+- Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2)
+- Mistral (including Mistral 1, Mistral 2, and Mixtral)
+- Gemma (including Gemma 1 and Gemma 2)
+- Phi3
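
For one of the architectures above, the `FROM` line points at the directory holding the weights. A minimal sketch, with a hypothetical local path:

```
# directory containing the Safetensors weights for a supported architecture
FROM /path/to/my-safetensors-model
```

The model can then be built as usual with `ollama create <name> -f Modelfile`.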
 
 #### Build from a GGUF file
 
@@ -137,7 +137,6 @@ FROM ./ollama-model.gguf
 
 The GGUF file location should be specified as an absolute path or relative to the `Modelfile` location.
 
-
 ### PARAMETER
 
 The `PARAMETER` instruction defines a parameter that can be set when the model is run.
@@ -148,18 +147,21 @@ PARAMETER <parameter> <parametervalue>
 
 #### Valid Parameters and Values
 
-| Parameter | Description | Value Type | Example Usage |
-| -------------- | ----------- | ---------- | -------------------- |
-| num_ctx | Sets the size of the context window used to generate the next token. (Default: 4096) | int | num_ctx 4096 |
-| repeat_last_n | Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | int | repeat_last_n 64 |
-| repeat_penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1) | float | repeat_penalty 1.1 |
-| temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8) | float | temperature 0.7 |
-| seed | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0) | int | seed 42 |
-| stop | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate `stop` parameters in a modelfile. | string | stop "AI assistant:" |
-| num_predict | Maximum number of tokens to predict when generating text. (Default: -1, infinite generation) | int | num_predict 42 |
-| top_k | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40) | int | top_k 40 |
-| top_p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9) | float | top_p 0.9 |
-| min_p | Alternative to the top_p, and aims to ensure a balance of quality and variety. The parameter *p* represents the minimum probability for a token to be considered, relative to the probability of the most likely token. For example, with *p*=0.05 and the most likely token having a probability of 0.9, logits with a value less than 0.045 are filtered out. (Default: 0.0) | float | min_p 0.05 |
+| Parameter | Description | Value Type | Example Usage |
+| -------------- | ----------- | ---------- | -------------------- |
+| mirostat | Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0) | int | mirostat 0 |
+| mirostat_eta | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1) | float | mirostat_eta 0.1 |
+| mirostat_tau | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0) | float | mirostat_tau 5.0 |
+| num_ctx | Sets the size of the context window used to generate the next token. (Default: 2048) | int | num_ctx 4096 |
+| repeat_last_n | Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | int | repeat_last_n 64 |
+| repeat_penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1) | float | repeat_penalty 1.1 |
+| temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8) | float | temperature 0.7 |
+| seed | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0) | int | seed 42 |
+| stop | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate `stop` parameters in a modelfile. | string | stop "AI assistant:" |
+| num_predict | Maximum number of tokens to predict when generating text. (Default: -1, infinite generation) | int | num_predict 42 |
+| top_k | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40) | int | top_k 40 |
+| top_p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9) | float | top_p 0.9 |
+| min_p | Alternative to the top_p, and aims to ensure a balance of quality and variety. The parameter _p_ represents the minimum probability for a token to be considered, relative to the probability of the most likely token. For example, with _p_=0.05 and the most likely token having a probability of 0.9, logits with a value less than 0.045 are filtered out. (Default: 0.0) | float | min_p 0.05 |
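
Parameters are set one per line. A minimal sketch combining values from the table above, using the `llama3.2` base model from the earlier examples:

```
FROM llama3.2
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
PARAMETER stop "AI assistant:"
```

Repeating `PARAMETER stop ...` lines adds multiple stop sequences, as noted in the `stop` row of the table.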
 
 ### TEMPLATE
 
@@ -201,9 +203,10 @@ ADAPTER <path to safetensor adapter>
 ```
 
 Currently supported Safetensor adapters:
-* Llama (including Llama 2, Llama 3, and Llama 3.1)
-* Mistral (including Mistral 1, Mistral 2, and Mixtral)
-* Gemma (including Gemma 1 and Gemma 2)
+
+- Llama (including Llama 2, Llama 3, and Llama 3.1)
+- Mistral (including Mistral 1, Mistral 2, and Mixtral)
+- Gemma (including Gemma 1 and Gemma 2)
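
A minimal sketch of applying a fine-tuned (e.g. LoRA) Safetensors adapter; the path and model name are illustrative, and the adapter is typically paired with the same base model it was fine-tuned from:

```
FROM llama3.2
ADAPTER /path/to/my-lora-adapter
```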
 
 #### GGUF adapter
 
@@ -237,7 +240,6 @@ MESSAGE <role> <message>
 | user | An example message of what the user could have asked. |
 | assistant | An example message of how the model should respond. |
 
-
 #### Example conversation
 
 ```
@@ -249,7 +251,6 @@ MESSAGE user Is Ontario in Canada?
 MESSAGE assistant yes
 ```
 
-
 ## Notes
 
 - the **`Modelfile` is not case sensitive**. In the examples, uppercase instructions are used to make it easier to distinguish it from arguments.
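
Putting the instructions from this reference together, a small end-to-end sketch; the model name is a placeholder and the values reuse examples shown above:

```
FROM llama3.2
PARAMETER temperature 0.7
PARAMETER stop "AI assistant:"
MESSAGE user Is Ontario in Canada?
MESSAGE assistant yes
```

Built with `ollama create geo-helper -f Modelfile`, the resulting model starts each session with that example conversation in its history, so it is primed to respond in the same style.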