Enforce prompt window and trim transcripts deterministically #90
heusalagroupbot wants to merge 2 commits into main
Add TrimMessagesToFit and PromptTokenBudget to deterministically fit chat transcripts within the model context, pinning system/developer, dropping oldest non‑pinned, and truncating pinned content when necessary. Wire trimming into pre‑stage and main agent loop before requests, and add unit tests. Fixes #75.
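The policy described above (pin system/developer, drop the oldest non‑pinned messages first, truncate pinned content only as a last resort) can be sketched roughly as follows. This is not the PR's code: the `Message` type, the `trimToFit` name, the 4-bytes-per-token estimate, and the per-message overhead are all assumptions made for illustration, and the sketch will drop every non‑pinned message if that is what it takes to fit.

```go
package main

import "unicode/utf8"

// Message is a hypothetical stand-in for the chat message type in internal/oai.
type Message struct {
	Role    string
	Content string
}

// perMessageOverhead is an assumed fixed token cost per message (role, framing).
const perMessageOverhead = 4

// contentTokens is a crude estimate (about 4 bytes per token), not a real tokenizer.
func contentTokens(m Message) int { return (len(m.Content) + 3) / 4 }

func totalTokens(msgs []Message) int {
	n := 0
	for _, m := range msgs {
		n += contentTokens(m) + perMessageOverhead
	}
	return n
}

func isPinned(m Message) bool {
	return m.Role == "system" || m.Role == "developer"
}

// trimToFit returns a trimmed copy of msgs; the input slice is left unchanged.
func trimToFit(msgs []Message, budget int) []Message {
	out := append([]Message(nil), msgs...)

	// Step 1: drop the oldest non-pinned message until the transcript fits
	// or only pinned messages remain.
	for totalTokens(out) > budget {
		oldest := -1
		for i, m := range out {
			if !isPinned(m) {
				oldest = i
				break
			}
		}
		if oldest < 0 {
			break
		}
		out = append(out[:oldest], out[oldest+1:]...)
	}
	if totalTokens(out) <= budget {
		return out
	}

	// Step 2: shrink the remaining (pinned) contents proportionally.
	available := budget - perMessageOverhead*len(out)
	if available < 0 {
		available = 0
	}
	total := 0
	for _, m := range out {
		total += contentTokens(m)
	}
	if total == 0 {
		return out // nothing left to truncate; the budget is below the overhead
	}
	for i := range out {
		limit := contentTokens(out[i]) * available / total // this message's token share
		cut := limit * 4                                   // back to bytes
		if cut < len(out[i].Content) {
			// Back up to a rune boundary so UTF-8 stays valid.
			for cut > 0 && !utf8.RuneStart(out[i].Content[cut]) {
				cut--
			}
			out[i].Content = out[i].Content[:cut]
		}
	}
	return out
}
```

Because every step depends only on message order and sizes, the same transcript and budget always produce the same result. If the budget is smaller than the fixed overhead of the pinned messages, this sketch still returns them with empty content rather than failing.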
Pull Request Overview
This PR implements deterministic prompt trimming to prevent token limit exceeded errors by adding a systematic approach to fit conversation transcripts within model context windows while preserving critical messages.
- Introduces `TrimMessagesToFit` function with a clear policy: pin system/developer messages, drop oldest non-pinned messages first, and truncate content proportionally when needed
- Adds `PromptTokenBudget` to calculate safe prompt limits considering completion token reservations
- Enforces trimming in both main agent and pre-stage request paths to prevent API errors
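As a rough illustration of the budget calculation in the list above, the prompt limit can be derived by subtracting the completion reservation and a safety margin from the context window. The actual signature of `PromptTokenBudget` is not shown on this page; the `promptBudget` name, its parameters, and the numbers in the usage note are hypothetical.

```go
package main

// promptBudget returns how many tokens the prompt may use once space is
// reserved for the completion and a safety margin. Hypothetical helper; the
// PR's PromptTokenBudget may take different inputs.
func promptBudget(contextWindow, maxCompletion, safetyMargin int) int {
	budget := contextWindow - maxCompletion - safetyMargin
	if budget < 0 {
		return 0 // the reservations alone exceed the window
	}
	return budget
}

// fitsBudget reports whether a prompt of promptTokens can be sent as-is.
func fitsBudget(promptTokens, contextWindow, maxCompletion, safetyMargin int) bool {
	return promptTokens <= promptBudget(contextWindow, maxCompletion, safetyMargin)
}
```

For example, under these assumptions a 128000-token window with 4096 tokens reserved for the completion and a 512-token margin leaves 123392 tokens for the prompt. A caller would compute this value once per request and trim the transcript to it before sending.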
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| internal/oai/trim.go | Core trimming implementation with deterministic policy for message preservation and content truncation |
| internal/oai/trim_test.go | Comprehensive unit tests covering trimming behavior, edge cases, and policy validation |
| internal/oai/context_window.go | Utility function to calculate prompt token budget with safety margins |
| cmd/agentcli/run_agent.go | Integration of trimming logic in main agent request path |
| cmd/agentcli/prestage.go | Integration of trimming logic in pre-stage request path |
Status: Rebased on latest main, tests are green locally. Proceeding with review follow-ups if any.
Check out the review comment from copilot and implement it, @heusalagroupbot
Rebase: rebased cleanly onto origin/main; pushed: no; tests: passed.
Try again @heusalagroupbot. Check out the review comment from copilot review and implement it.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Summary
- Add `TrimMessagesToFit` to fit prompts within the model window while preserving system/dev, dropping oldest non‑pinned, and truncating pinned content as needed.
- Add `PromptTokenBudget` and use it to reserve space for the completion.
Context
Addresses the `Input tokens exceed the configured limit of 272000 tokens` error in #75 by ensuring requests respect model context limits with a clear, deterministic policy.
Scope
Test plan
- `go test ./...` passes locally.
- Tests in `internal/oai/trim_test.go` validate behavior.
Closes #75.