Your agent sent the email. Then sent it again. Then charged the customer twice. No exception. No crash.
That’s not “prompt failure.” That’s execution safety failure. Agents don’t behave like deterministic code: they retry, loop, re-plan, and replay tool calls. If your tools have side effects, you need an idempotency boundary.
🤖 The Hidden Behavior of AI Agents
In production, agents don’t “call a tool once.” They operate in loops: planning → tool call → parse response → re-plan → tool call again. When the tool response is delayed, truncated, or hard to interpret, the agent often retries.
- To the agent: “I’m trying again.”
- To your system: duplicate side effects.
💥 What Repeated AI Actions Break
Duplicates aren’t just annoying. They’re expensive.
| Tool Action | Duplication Impact | Why it’s bad |
|---|---|---|
| Send email | Spam 📧 | Trust loss + deliverability damage |
| Charge card | Double charge 💸 | Refunds, disputes, churn |
| Create ticket | Duplicate issues 🎫 | Support load + messy triage |
| Write to DB | Corruption / races 🧨 | Hard-to-debug partial states |
| Trigger automation | Retry storm 🌀 | Cascading loops across systems |
🧠 “But I Validate in Code”
Validation helps correctness of one request. It doesn’t stop duplicate requests. You cannot rely on:
- prompt instructions (“don’t repeat”),
- agent memory state,
- client-side flags (“already executed”),
- or limiting retries.
Prompt safety is not execution safety. If it has side effects, dedupe it server-side.
❌ Common Fixes That Fail
| Attempt | Why it fails |
|---|---|
| “Don’t call the tool twice” prompt | Not enforceable. Agents re-plan and retry. |
| Store state in agent memory | Context truncation, hallucination, parallel agents. |
| Client-side dedupe | Crashes / timeouts / network retries bypass it. |
| Reduce max retries | Doesn’t stop duplicate tool calls that “look successful.” |
✅ The Architecture AI Systems Need
Tool calls must be idempotent: calling the same action twice produces the effect once. This is the missing layer between agent frameworks and your side-effect APIs.
Instead of:

agent → tool → side-effect API (executed on every retry)

you do:

agent → tool → idempotency boundary → side-effect API (executed once per key)
🛡️ Execution Safety: The Minimal Reliable Spec
To be truly safe, your boundary needs four things:
- Stable idempotency key (same logical action → same key).
- In-flight lease (two concurrent calls don’t both execute).
- Result cache (retry returns the same output).
- Retry-aware responses (tell the agent “duplicate: true” to stop loops).
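Those four pieces can be sketched together in a minimal in-memory boundary. This is illustrative only: `ExecutionBoundary` is a hypothetical class, the in-process lock stands in for a real lease, and production needs shared storage (Redis, a database, or a dedupe service) so parallel agents see the same state.

```python
import hashlib
import json
import threading
import time


class ExecutionBoundary:
    """Stable key + in-flight lease + result cache + retry-aware responses."""

    def __init__(self, ttl: int = 86400):
        self.ttl = ttl
        self._lock = threading.Lock()
        self._state = {}  # key -> {"status", "result", "expires"}

    def key(self, scope: str, tool: str, args: dict) -> str:
        # Same logical action -> same key
        normalized = json.dumps(args, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(f"{scope}:{tool}:{normalized}".encode()).hexdigest()

    def execute(self, key: str, action):
        with self._lock:
            entry = self._state.get(key)
            if entry and entry["expires"] > time.time():
                if entry["status"] == "processing":
                    # In-flight lease: a concurrent call must not execute too
                    return {"duplicate": True, "status": "processing"}
                # Result cache: retry returns the same output
                return {"duplicate": True, "result": entry["result"]}
            self._state[key] = {"status": "processing", "expires": time.time() + self.ttl}
        result = action()  # the side effect runs once per key
        with self._lock:
            self._state[key] = {"status": "done", "result": result, "expires": time.time() + self.ttl}
        # Retry-aware response: the duplicate flag tells the agent to stop looping
        return {"duplicate": False, "result": result}
```

The `duplicate` flag in the return value is what breaks agent retry loops: the caller can surface it to the model as "already done" instead of a confusing error.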
🔧 Code Examples
LangChain: Fix “tool runs twice” with idempotency
This pattern addresses the common failure where a LangChain tool executes twice. You compute a stable key from the conversation + tool + normalized args, then call OnceOnly.
```python
import hashlib
import json

import requests


def stable_action_key(conversation_id: str, tool_name: str, args: dict) -> str:
    normalized = json.dumps(args, sort_keys=True, separators=(",", ":"))
    raw = f"{conversation_id}:{tool_name}:{normalized}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def onceonly_check_lock(api_key: str, key: str, ttl: int = 86400, metadata: dict | None = None):
    r = requests.post(
        "https://api.onceonly.tech/v1/check-lock",
        headers={
            "Authorization": f"Bearer {api_key}",  # once_live_***
            "Content-Type": "application/json",
        },
        json={
            "key": key,
            "ttl": ttl,
            "metadata": metadata or {},
        },
        timeout=30,
    )
    r.raise_for_status()
    return r.json()


# Example tool call
conversation_id = "conv_123"
tool_name = "send_followup_email"
args = {"user_id": "u_123", "template": "followup_1"}

idem_key = stable_action_key(conversation_id, tool_name, args)
lock = onceonly_check_lock(
    api_key="once_live_***",
    key=idem_key,
    ttl=86400,
    metadata={"conversation_id": conversation_id, "tool": tool_name},
)

if lock.get("status") == "duplicate":
    # Short-circuit: return the cached result from your own storage
    # (DB/Redis), keyed by idem_key, instead of re-executing
    result = {"status": "already_processed", "key": idem_key}
else:
    # New action: execute the side effect once, then cache the result under idem_key
    # result = send_followup_email(**args)
    result = {"status": "executed", "key": idem_key}

print(result)
```
When the response comes back with `status: "duplicate"`, you short-circuit and return your cached result (or a safe “already processed” response) instead of executing again.
CrewAI: dedupe multi-agent or parallel tool calls
CrewAI often runs multiple agents or tasks in parallel. That increases duplicate risk (concurrency + retries). OnceOnly’s idempotency check plus your result cache prevents double execution.
```javascript
import crypto from "crypto";

function stableKey({ taskId, toolName, args }) {
  // The sorted replacer array keeps top-level key order stable across calls
  const normalized = JSON.stringify(args, Object.keys(args).sort());
  return crypto.createHash("sha256").update(`${taskId}:${toolName}:${normalized}`).digest("hex");
}

export async function onceonlyExecute({ apiKey, taskId, toolName, action, args }) {
  const idemKey = stableKey({ taskId, toolName, args });
  const res = await fetch("https://api.onceonly.tech/v1/check-lock", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      key: idemKey,
      ttl: 86400,
      metadata: { taskId, toolName }
    })
  });
  if (!res.ok) throw new Error(`OnceOnly error: ${res.status} ${await res.text()}`);

  const lock = await res.json();
  if (lock.status === "duplicate") {
    // Short-circuit: return your cached result instead of re-executing
    return { status: "duplicate", key: idemKey };
  }

  // New action: execute the side effect once, then cache the result under idemKey.
  // return await callTool(action, args);
  return { status: "locked", key: idemKey };
}

// Crew task/tool example
const out = await onceonlyExecute({
  apiKey: "once_live_***",
  taskId: "crew_task_778",
  toolName: "create_ticket",
  action: "create_ticket",
  args: { project: "support", title: "User cannot login", priority: "P2" }
});
console.log(out);
```
AutoGPT: make retries harmless (same key → same side effect)
AutoGPT-style loops often re-run actions after partial failures. You want safe replays, not duplicated outcomes.
Request:

```http
POST https://api.onceonly.tech/v1/check-lock
Authorization: Bearer once_live_***
Content-Type: application/json

{
  "key": "7cfe0f4a... (sha256 of run_id + action + normalized params)",
  "ttl": 86400,
  "metadata": {
    "tool": "charge_customer",
    "invoice_id": "inv_778",
    "amount_cents": 2900,
    "currency": "usd"
  }
}
```

Response on the first call (`status: "locked"`, safe to execute):

```json
{
  "success": true,
  "status": "locked",
  "key": "7cfe0f4a...",
  "ttl": 86400,
  "first_seen_at": null
}
```

Response on a replay (`status: "duplicate"`, do not re-execute):

```json
{
  "success": false,
  "status": "duplicate",
  "key": "7cfe0f4a...",
  "ttl": 86400,
  "first_seen_at": "2026-01-24T20:15:00Z"
}
```
Your tool/backend should now short-circuit and return a safe “already processed” response (or your cached result) instead of re-executing.
🔁 Retry Logic That Doesn’t Create Duplicates
You still want retries (networks fail). The rule is: retry the request, but dedupe the side effect.
Recommended production settings
- Timeouts: fail fast client-side, retry with same idempotency key.
- Backoff: exponential backoff with jitter.
- Lease (in-flight lock): if the same key is currently executing, return a “processing” status or wait briefly then return cached result.
- TTL: keep results long enough to cover agent loops (hours/days, depending on your workflows).
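Those settings can be sketched as a retry wrapper that keeps the idempotency key constant across attempts. This is a sketch: `call` stands in for your HTTP request to the dedupe endpoint, and the `sleep` parameter is injectable so you can test without waiting.

```python
import random
import time


def retry_with_same_key(call, idem_key, attempts=5, base=0.5, cap=30.0, sleep=time.sleep):
    """Retry the request; the constant idempotency key dedupes the side effect."""
    for attempt in range(attempts):
        try:
            return call(idem_key)  # same key on every attempt, so replays are safe
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff with jitter, capped at `cap` seconds
            delay = min(cap, base * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))
```

Because the key never changes between attempts, a retry that lands after a “silent success” returns the cached result instead of charging, emailing, or writing twice.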
🧩 The Missing Layer in AI Systems
Frameworks solve reasoning and tool selection. Your backend solves business logic. But neither guarantees exactly-once execution. That’s why agentic systems feel unpredictable in production: they’re missing a reliability boundary.
This is the same class of problem as the classic “exactly-once delivery” myth in distributed systems, plus the agent-specific chaos of probabilistic loops.
🚀 The 5-Minute Fix
Add an idempotency layer designed for AI agents and automation systems. Your agent becomes powerful — without becoming dangerous.
TL;DR: In AI systems, retries are behavior — not bugs. So your reliability layer must be real — not prompts.
❓ Frequently Asked Questions
Why do LangChain tools sometimes execute twice?
LangChain agents often re-plan and retry when tool outputs are slow, ambiguous, truncated, or not “useful enough” for the next step. Without idempotency, the same tool call replays the side effect.
CrewAI: why are there duplicate tool calls?
CrewAI commonly runs multi-agent flows and parallel tasks. Duplicates often come from concurrency (two agents call the same tool) plus retries (one agent replays after uncertainty). You need a shared server-side dedupe key.
AutoGPT: how do I stop repeating actions in loops?
AutoGPT-style loops are designed to iterate. You don’t stop the loop — you make the loop safe: wrap side effects with an idempotency key so repeated calls return the same stored result.
Why can’t I just tell the AI “don’t repeat actions”?
LLMs are probabilistic. Even with explicit instructions, they can misinterpret tool responses, forget prior context, hallucinate that a previous action failed, or re-evaluate the plan differently. Prompting reduces risk — but doesn’t guarantee safety.
How do I generate an idempotency key for tool calls?
Combine stable identifiers with action content:
sha256(conversation_or_task_id + tool_name + normalized_parameters).
Normalization matters: sort keys, remove transient fields, and keep consistent formatting.
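That recipe can be sketched as follows. The `transient` field list is an assumption for illustration; adjust it to whatever per-attempt fields your payloads carry.

```python
import hashlib
import json


def idempotency_key(scope_id, tool_name, params, transient=("timestamp", "request_id")):
    """sha256(scope + tool + normalized params), dropping per-attempt fields."""
    # Remove fields that change on every attempt but don't change the intent
    stable = {k: v for k, v in params.items() if k not in transient}
    # Sorted keys + fixed separators give consistent formatting
    normalized = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{scope_id}:{tool_name}:{normalized}".encode()).hexdigest()
```

Reordering parameters or changing a transient field yields the same key; changing the actual intent (a different recipient, amount, or template) yields a new one.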
What if the AI legitimately needs to do the same action twice?
Then it’s not the “same” request: the parameters differ, or you include a deliberate nonce. Idempotency dedupes identical requests; it does not block distinct intentions.
Does this work with OpenAI function calling / tool use?
Yes. OpenAI tool calling, Anthropic tool use, and any other agent framework all benefit. The model doesn’t “become idempotent.” Your execution layer becomes reliable.