Preserve tool-call JSON for deterministic local inference #22
Open
latent-variable wants to merge 1 commit into MiniMax-AI:main from
Conversation
2eba424 to 91d7951
Collaborator
Thanks for this excellent PR! This is a well-designed optimization for local LLM inference with KV cache. The solution is elegant.
The performance improvement (33% → 99.88% cache hit rate) is impressive! 🚀 However, there are merge conflicts with the current main branch. Looking forward to merging this! 🙏
Summary
Add arguments_json to FunctionCall so we persist the exact tool-call payload returned by the model.
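For illustration, here is a rough sketch of the idea, assuming a dataclass-style FunctionCall; the field and helper names below are placeholders, not the actual Mini-Agent code:

```python
from dataclasses import dataclass
from typing import Any, Optional
import json


@dataclass
class FunctionCall:
    name: str
    arguments: dict[str, Any]
    # Raw JSON string exactly as the model emitted it (hypothetical field).
    arguments_json: Optional[str] = None


def serialize_tool_call_arguments(call: FunctionCall) -> str:
    """Return the argument payload to replay in later requests."""
    # Prefer the preserved payload so the rebuilt transcript stays
    # byte-identical to what the model produced in the earlier turn.
    if call.arguments_json is not None:
        return call.arguments_json
    # Fall back to deterministic re-serialization when no raw payload exists.
    return json.dumps(call.arguments, sort_keys=True)
```

The key point of the sketch is that re-serialization only happens as a fallback; the normal path replays the model's original bytes unchanged.
Problem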
When Mini-Agent is configured to send requests to a local LM Studio endpoint (or any local serving stack with a KV cache), each subsequent request must be byte-identical for the cached portion of the context. Today the request builder re-serializes every tool call in the transcript using json.dumps(..., sort_keys=True). That changes key ordering, whitespace, or float formatting compared to what the model actually emitted, meaning the tool call prepended to Request #2 is different from the one the model saw during Request #1. LM Studio therefore treats the assistant history as a cache miss, reprocessing all prior tokens (~12k tokens per turn in our setup) and wasting latency and compute.
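A tiny, self-contained illustration of the mismatch (the payload values are made up):

```python
import json

# Arguments exactly as the model emitted them in Request #1.
raw = '{"path": "notes.md", "create": true}'

# What a sort_keys=True re-serialization sends back in Request #2.
rebuilt = json.dumps(json.loads(raw), sort_keys=True)

print(rebuilt)         # {"create": true, "path": "notes.md"}
print(raw == rebuilt)  # False -> cached prefix no longer matches byte-for-byte
```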
Testing