backend: Support graph computation that does not return an output

There are two cases where we may not have an output after computing:
- Prompt processing where the length of the input exceeds the batch size
- Internal memory management operations such as cache defrag and shift
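
As a rough illustration of the two call patterns a variadic Compute allows, here is a minimal self-contained sketch. Everything in it other than the Forward(Tensor) and Compute(...Tensor) signatures is hypothetical: fakeTensor, fakeContext, the tensor names, and the bookkeeping fields are stand-ins for illustration and are not the real backend, and what the real backend does with the tensors passed to Compute is not shown in this change.

```go
package main

import "fmt"

// Tensor is a hypothetical stand-in for the real tensor interface.
type Tensor interface{ Name() string }

// fakeTensor is an illustrative tensor that only carries a name.
type fakeTensor struct{ name string }

func (t fakeTensor) Name() string { return t.name }

// Context mirrors only the two methods relevant to this change.
type Context interface {
	Forward(Tensor)
	Compute(...Tensor)
}

// fakeContext records calls instead of running a graph (assumption:
// the real backend executes the graph built by Forward).
type fakeContext struct {
	graph     []string // names of tensors added with Forward
	runs      int      // number of Compute calls
	requested []string // names of tensors passed to Compute
}

func (c *fakeContext) Forward(t Tensor) { c.graph = append(c.graph, t.Name()) }

// Compute accepts zero or more tensors, so a caller with no output
// to read can simply pass nothing.
func (c *fakeContext) Compute(outputs ...Tensor) {
	c.runs++
	for _, t := range outputs {
		c.requested = append(c.requested, t.Name())
	}
}

func main() {
	c := &fakeContext{}
	var ctx Context = c

	// Case 1: a memory management step (e.g. a cache shift) with no output.
	ctx.Forward(fakeTensor{"kv_shift"})
	ctx.Compute()

	// Case 2: a step whose result the caller wants to read back.
	logits := fakeTensor{"logits"}
	ctx.Forward(logits)
	ctx.Compute(logits)

	fmt.Println(c.runs, c.requested)
}
```

With the previous signature, Compute(Tensor) Tensor, the first call would have needed a placeholder tensor and a discarded return value.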
2025-12-21 22:33:56 +00:00 · 2025-02-03 19:35:12 -08:00
parent 0e38297f87
commit 4d4463b2bd
3 changed files with 22 additions and 14 deletions
--- a/ml/backend.go
+++ b/ml/backend.go
@@ -49,7 +49,7 @@ type Context interface {
 	FromIntSlice(s []int32, shape ...int) (Tensor, error)

 	Forward(Tensor)
-	Compute(Tensor) Tensor
+	Compute(...Tensor)
 	Close()
 }