Engineering

Stop AI Agents from Repeating Tool Calls
(Idempotency for LangChain, CrewAI, AutoGPT)

10 min read Topics: execution safety, retry logic, idempotency

Your agent sent the email. Then sent it again. Then charged the customer twice. No exception. No crash.

That’s not “prompt failure.” That’s execution safety failure. Agents don’t behave like deterministic code: they retry, loop, re-plan, and replay tool calls. If your tools have side effects, you need an idempotency boundary.

  • Symptom: tool executes twice. LangChain / CrewAI / AutoGPT repeats the same call.
  • Root cause: retry logic + uncertainty. Slow or ambiguous responses trigger replays.
  • Fix: idempotency layer. Same key → same result → no duplicates.

🤖 The Hidden Behavior of AI Agents

In production, agents don’t “call a tool once.” They operate in loops: planning → tool call → parse response → re-plan → tool call again. When the tool response is delayed, truncated, or hard to interpret, the agent often retries.

  • To the agent: “I’m trying again.”
  • To your system: duplicate side effects.

💥 What Repeated AI Actions Break

Duplicates aren’t just annoying. They’re expensive.

| Tool action        | Duplication impact     | Why it’s bad                       |
|--------------------|------------------------|------------------------------------|
| Send email         | Spam 📧                | Trust loss + deliverability damage |
| Charge card        | Double charge 💸       | Refunds, disputes, churn           |
| Create ticket      | Duplicate issues 🎫    | Support load + messy triage        |
| Write to DB        | Corruption / races 🧨  | Hard-to-debug partial states       |
| Trigger automation | Retry storm 🌀         | Cascading loops across systems     |

🧠 “But I Validate in Code”

Validation helps correctness of one request. It doesn’t stop duplicate requests. You cannot rely on:

  • prompt instructions (“don’t repeat”),
  • agent memory state,
  • client-side flags (“already executed”),
  • or limiting retries.

Prompt safety is not execution safety. If it has side effects, dedupe it server-side.

❌ Common Fixes That Fail

| Attempt                             | Why it fails                                              |
|-------------------------------------|-----------------------------------------------------------|
| “Don’t call the tool twice” prompt  | Not enforceable. Agents re-plan and retry.                |
| Store state in agent memory         | Context truncation, hallucination, parallel agents.       |
| Client-side dedupe                  | Crashes / timeouts / network retries bypass it.           |
| Reduce max retries                  | Doesn’t stop duplicate tool calls that “look successful.” |

✅ The Architecture AI Systems Need

Tool calls must be idempotent: calling the same action twice produces the effect once. This is the missing layer between agent frameworks and your side-effect APIs.

Instead of:

AI Agent → Your API (side effects)

you do:

AI Agent → OnceOnly Idempotency → Your API

🛡️ Execution Safety: The Minimal Reliable Spec

To be truly safe, your boundary needs four things:

  1. Stable idempotency key (same logical action → same key).
  2. In-flight lease (two concurrent calls don’t both execute).
  3. Result cache (retry returns the same output).
  4. Retry-aware responses (tell the agent “duplicate: true” to stop loops).
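
The four points above can be sketched in one place. This is a minimal in-memory illustration (class and method names are mine, not OnceOnly's API); in production this state must live server-side, e.g. in Redis or a service like OnceOnly, so it survives restarts and is shared by parallel workers:

```python
import threading
import time

class IdempotencyBoundary:
    """In-memory sketch of the four-point spec. The caller supplies
    point 1, a stable idempotency key, for each logical action."""

    def __init__(self, ttl: float = 86400):
        self.ttl = ttl
        self._lock = threading.Lock()           # guards the state below
        self._in_flight: set[str] = set()       # 2. in-flight lease
        self._results: dict[str, tuple[float, object]] = {}  # 3. result cache

    def execute(self, key: str, action):
        with self._lock:
            cached = self._results.get(key)
            if cached and time.time() - cached[0] < self.ttl:
                # 4. retry-aware response: tell the agent this is a duplicate
                return {"duplicate": True, "result": cached[1]}
            if key in self._in_flight:
                # 2. lease: a concurrent call with the same key must not execute
                return {"duplicate": True, "status": "processing"}
            self._in_flight.add(key)
        try:
            result = action()                   # the side effect runs once
            with self._lock:
                self._results[key] = (time.time(), result)
            return {"duplicate": False, "result": result}
        finally:
            with self._lock:
                self._in_flight.discard(key)
```

Retrying `execute` with the same key returns the cached result without re-running the action, which is exactly the behavior the spec asks for.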

🔧 Code Examples

LangChain: Fix “tool runs twice” with idempotency

This pattern fixes LangChain tool double execution: compute a stable key from the conversation + tool + normalized args, then call OnceOnly before executing the tool.

import hashlib
import json
import requests

def stable_action_key(conversation_id: str, tool_name: str, args: dict) -> str:
    normalized = json.dumps(args, sort_keys=True, separators=(",", ":"))
    raw = f"{conversation_id}:{tool_name}:{normalized}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()

def onceonly_check_lock(api_key: str, key: str, ttl: int = 86400, metadata: dict | None = None):
    r = requests.post(
        "https://api.onceonly.tech/v1/check-lock",
        headers={
            "Authorization": f"Bearer {api_key}",  # once_live_***
            "Content-Type": "application/json",
        },
        json={
            "key": key,
            "ttl": ttl,
            "metadata": metadata or {},
        },
        timeout=30,
    )
    r.raise_for_status()
    return r.json()

# Example tool call
conversation_id = "conv_123"
tool_name = "send_followup_email"
args = {"user_id": "u_123", "template": "followup_1"}
idem_key = stable_action_key(conversation_id, tool_name, args)

lock = onceonly_check_lock(
    api_key="once_live_***",
    key=idem_key,
    ttl=86400,
    metadata={"conversation_id": conversation_id, "tool": tool_name},
)

if lock.get("status") == "duplicate":
    # Short-circuit: look up the cached result in your own storage
    # (DB/Redis) under idem_key and return it instead of re-executing.
    raise RuntimeError("duplicate tool call")

# New action: execute the side effect once, then cache the result under idem_key.
result = send_followup_email(**args)  # your side-effecting tool implementation
print(result)

What you get: if LangChain retries the same tool call, OnceOnly returns status=duplicate. You short-circuit and return your cached result (or a safe “already processed” response) instead of executing again.
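
The cached-result half of that flow can be sketched like this; `RESULT_CACHE` and `run_tool_once` are hypothetical stand-ins for your own DB/Redis storage and glue code, keyed by the same idem_key you sent to OnceOnly:

```python
# Illustrative result cache; in production this lives in your DB or Redis.
RESULT_CACHE: dict[str, object] = {}

def run_tool_once(idem_key: str, lock_status: str, execute_tool):
    """Short-circuit duplicates with the cached result instead of raising."""
    if lock_status == "duplicate":
        cached = RESULT_CACHE.get(idem_key)
        # Safe "already processed" reply: no side effect runs.
        return {"duplicate": True, "result": cached}
    result = execute_tool()           # side effect executes exactly once
    RESULT_CACHE[idem_key] = result   # cache under the same key for retries
    return {"duplicate": False, "result": result}
```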

CrewAI: dedupe multi-agent or parallel tool calls

CrewAI often runs multiple agents or tasks in parallel. That increases duplicate risk (concurrency + retries). OnceOnly’s idempotency check plus your result cache prevents double execution.

import crypto from "crypto";

// Recursively sort object keys so logically identical args always hash the same.
// (Passing a key array as JSON.stringify's replacer would drop nested keys.)
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value && typeof value === "object") {
    return Object.fromEntries(Object.keys(value).sort().map((k) => [k, canonicalize(value[k])]));
  }
  return value;
}

function stableKey({ taskId, toolName, args }) {
  const normalized = JSON.stringify(canonicalize(args));
  return crypto.createHash("sha256").update(`${taskId}:${toolName}:${normalized}`).digest("hex");
}

export async function onceonlyExecute({ apiKey, taskId, toolName, action, args }) {
  const idemKey = stableKey({ taskId, toolName, args });

  const res = await fetch("https://api.onceonly.tech/v1/check-lock", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      key: idemKey,
      ttl: 86400,
      metadata: { taskId, toolName }
    })
  });

  if (!res.ok) throw new Error(`OnceOnly error: ${res.status} ${await res.text()}`);
  const lock = await res.json();
  // Short-circuit; in production, return your cached result for idemKey instead.
  if (lock.status === "duplicate") throw new Error("duplicate tool call");

  // New action: execute the side effect once, then cache the result under idemKey.
  // return await callTool(action, args);
  return { status: "locked", key: idemKey };
}

// Crew task/tool example
const out = await onceonlyExecute({
  apiKey: "once_live_***",
  taskId: "crew_task_778",
  toolName: "create_ticket",
  action: "create_ticket",
  args: { project: "support", title: "User cannot login", priority: "P2" }
});

console.log(out);

AutoGPT: make retries harmless (same key → same side effect)

AutoGPT-style loops often re-run actions after partial failures. You want safe replays, not duplicated outcomes.

POST https://api.onceonly.tech/v1/check-lock
Authorization: Bearer once_live_***
Content-Type: application/json

{
  "key": "7cfe0f4a... (sha256 of run_id + action + normalized params)",
  "ttl": 86400,
  "metadata": {
    "tool": "charge_customer",
    "invoice_id": "inv_778",
    "amount_cents": 2900,
    "currency": "usd"
  }
}

Response (first call): the key is now locked, and the same key will keep returning the same result.
{
  "success": true,
  "status": "locked",
  "key": "7cfe0f4a...",
  "ttl": 86400,
  "first_seen_at": null
}

Retry behavior (duplicate detected):
{
  "success": false,
  "status": "duplicate",
  "key": "7cfe0f4a...",
  "ttl": 86400,
  "first_seen_at": "2026-01-24T20:15:00Z"
}

Your tool/backend should now short-circuit and return a safe “already processed” response (or your cached result) instead of re-executing.

🔁 Retry Logic That Doesn’t Create Duplicates

You still want retries (networks fail). The rule is: retry the request, but dedupe the side effect.

Recommended production settings

  • Timeouts: fail fast client-side, retry with same idempotency key.
  • Backoff: exponential backoff with jitter.
  • Lease (in-flight lock): if the same key is currently executing, return a “processing” status or wait briefly then return cached result.
  • TTL: keep results long enough to cover agent loops (hours/days, depending on your workflows).

Rule of thumb: if the tool can create a real-world side effect (email, charge, ticket, write, webhook), it must be idempotent. If it must be idempotent, it must have a stable key.
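
Those settings combine into a small retry helper. This is a sketch (function names are mine), assuming the underlying call raises `TimeoutError`/`ConnectionError` on transient failures and that duplicates are deduped server-side under the key:

```python
import random
import time

def retry_with_same_key(call, idem_key: str, attempts: int = 4, base: float = 0.5):
    """Retry the request with exponential backoff + full jitter, reusing ONE
    idempotency key so the side effect happens at most once server-side."""
    for attempt in range(attempts):
        try:
            return call(idem_key)              # the SAME key on every attempt
        except (TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise                          # out of attempts: surface the error
            # full jitter: sleep a random amount in [0, base * 2**attempt)
            time.sleep(random.uniform(0, base * 2 ** attempt))
```

Because every attempt carries the same key, a request that actually succeeded server-side but timed out client-side is returned from the result cache rather than re-executed.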

🧩 The Missing Layer in AI Systems

Frameworks solve reasoning and tool selection. Your backend solves business logic. But neither guarantees exactly-once execution. That’s why agentic systems feel unpredictable in production: they’re missing a reliability boundary.

This is the same class of problem as exactly-once processing myths, plus the agent-specific chaos of probabilistic loops.

🚀 The 5-Minute Fix

Add an idempotency layer designed for AI agents and automation systems. Your agent becomes powerful — without becoming dangerous.

TL;DR: In AI systems, retries are behavior — not bugs. So your reliability layer must be real — not prompts.

❓ Frequently Asked Questions

Why do LangChain tools sometimes execute twice?

LangChain agents often re-plan and retry when tool outputs are slow, ambiguous, truncated, or not “useful enough” for the next step. Without idempotency, the same tool call replays the side effect.

CrewAI: why are there duplicate tool calls?

CrewAI commonly runs multi-agent flows and parallel tasks. Duplicates often come from concurrency (two agents call the same tool) plus retries (one agent replays after uncertainty). You need a shared server-side dedupe key.

AutoGPT: how do I stop repeating actions in loops?

AutoGPT-style loops are designed to iterate. You don’t stop the loop — you make the loop safe: wrap side effects with an idempotency key so repeated calls return the same stored result.

Why can’t I just tell the AI “don’t repeat actions”?

LLMs are probabilistic. Even with explicit instructions, they can misinterpret tool responses, forget prior context, hallucinate that a previous action failed, or re-evaluate the plan differently. Prompting reduces risk — but doesn’t guarantee safety.

How do I generate an idempotency key for tool calls?

Combine stable identifiers with action content: sha256(conversation_or_task_id + tool_name + normalized_parameters). Normalization matters: sort keys, remove transient fields, and keep consistent formatting.
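
A minimal sketch of that recipe; the transient-field list is illustrative, so adjust it to whichever fields your tools add per request:

```python
import hashlib
import json

# Fields that change on every request but don't change the logical action.
TRANSIENT_FIELDS = {"timestamp", "request_id", "trace_id"}

def normalized_key(task_id: str, tool_name: str, params: dict) -> str:
    """sha256(task_id + tool_name + normalized params), transient fields dropped."""
    stable = {k: v for k, v in params.items() if k not in TRANSIENT_FIELDS}
    # sort_keys + fixed separators: identical args always serialize identically
    normalized = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{task_id}:{tool_name}:{normalized}".encode()).hexdigest()
```

Two calls that differ only in key order or transient fields produce the same key, so they dedupe to one side effect.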

What if the AI legitimately needs to do the same action twice?

Then it’s not the “same” request: the parameters differ, or you include a deliberate nonce. Idempotency dedupes identical requests; it does not block distinct intentions.

Does this work with OpenAI function calling / tool use?

Yes. OpenAI tool calling, Anthropic tool use, and any agent framework benefits. The model doesn’t “become idempotent.” Your execution layer becomes reliable.