Webhook-driven systems look simple until real traffic arrives. Duplicate deliveries, retries, out-of-order events, and temporary downtime can silently corrupt data if your consumer is not idempotent. In this practical guide, you will build a production-ready Node.js webhook processor using Express, BullMQ, Redis, and PostgreSQL so each event is verified, queued, processed safely, and acknowledged with confidence.
Why idempotent webhook processing matters
Most providers (payments, CRM, chat, logistics) deliver webhooks with an “at least once” guarantee. That means duplicates are normal behavior, not edge cases. If your endpoint directly executes business logic on every delivery, you can trigger double refunds, duplicate orders, or repeated emails.
A safer model is:
- Verify webhook signatures before trusting payloads.
- Persist an idempotency record keyed by provider event ID.
- Queue accepted events for background processing.
- Use transactional updates so one event changes state exactly once.
If you are also tuning API performance, this pairs well with our PostgreSQL performance guide and trusted Docker CI pipeline setup.
Architecture and flow
Request path
- Provider sends a webhook to /webhooks/provider.
- Server validates the signature and timestamp.
- Server inserts provider_event_id into a dedupe table with a unique constraint.
- New events are enqueued in BullMQ; duplicates return 200 OK and stop.
- Worker processes the job inside a database transaction.
Data model
Create a compact table to track delivery status:
```sql
CREATE TABLE webhook_events (
  id BIGSERIAL PRIMARY KEY,
  provider_event_id TEXT NOT NULL UNIQUE,
  event_type TEXT NOT NULL,
  payload JSONB NOT NULL,
  status TEXT NOT NULL DEFAULT 'received',
  received_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  processed_at TIMESTAMPTZ
);

CREATE INDEX idx_webhook_events_status_received_at
  ON webhook_events (status, received_at DESC);
```

This gives you deterministic deduplication via the unique key and operational visibility via the status index.
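The (status, received_at) index exists to answer operational questions such as "which events are stuck in 'received'?". A minimal sketch of that check, assuming rows already fetched from webhook_events; the findStuckEvents helper is hypothetical, and in production the same filter would run in SQL against the index:

```javascript
// Hypothetical helper: given rows from webhook_events, return events that
// have sat in the non-terminal 'received' status longer than maxAgeMs.
function findStuckEvents(rows, maxAgeMs, now = Date.now()) {
  return rows.filter(row =>
    row.status === "received" &&
    now - new Date(row.received_at).getTime() > maxAgeMs
  );
}
```

Running this on a schedule (or its SQL equivalent) and alerting when it returns anything gives you an early warning that workers have stalled.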
Step 1: Verify signatures and enqueue safely
Never parse or trust a payload before cryptographic validation. The example below assumes HMAC-SHA256 with a provider-supplied secret and timestamp protection against replay.
```javascript
import express from "express";
import crypto from "crypto";
import { Queue } from "bullmq";
import { Pool } from "pg";
import IORedis from "ioredis";

const app = express();
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
// BullMQ expects an ioredis connection; pass the URL to ioredis directly.
const connection = new IORedis(process.env.REDIS_URL, { maxRetriesPerRequest: null });
const queue = new Queue("webhooks", { connection });

app.post("/webhooks/provider", express.raw({ type: "application/json" }), async (req, res) => {
  const signature = req.header("x-provider-signature") || "";
  const timestamp = req.header("x-provider-timestamp") || "";
  const rawBody = req.body; // Buffer, because express.raw() skips JSON parsing

  const signedPayload = `${timestamp}.${rawBody.toString("utf8")}`;
  const expected = crypto
    .createHmac("sha256", process.env.WEBHOOK_SECRET)
    .update(signedPayload)
    .digest("hex");

  // Constant-time comparison; check length first because timingSafeEqual
  // throws on buffers of different lengths.
  const safeEqual =
    signature.length === expected.length &&
    crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(expected));

  // Reject stale or future-dated timestamps to limit replay attacks.
  const ageSec = Math.abs(Date.now() / 1000 - Number(timestamp));
  if (!safeEqual || ageSec > 300) return res.status(401).send("invalid signature");

  let event;
  try {
    event = JSON.parse(rawBody.toString("utf8"));
  } catch {
    return res.status(400).send("malformed payload");
  }
  const eventId = event.id;

  try {
    const insert = await pool.query(
      `INSERT INTO webhook_events (provider_event_id, event_type, payload)
       VALUES ($1, $2, $3)
       ON CONFLICT (provider_event_id) DO NOTHING
       RETURNING id`,
      [eventId, event.type, event]
    );

    if (insert.rowCount === 0) {
      return res.status(200).send("duplicate ignored");
    }

    await queue.add("process-webhook", { eventId }, {
      jobId: eventId, // queue-level dedupe mirrors the DB unique constraint
      attempts: 8,
      backoff: { type: "exponential", delay: 2000 },
      removeOnComplete: 1000,
      removeOnFail: 5000
    });

    return res.status(202).send("accepted");
  } catch (err) {
    console.error("ingest error", err);
    return res.status(500).send("server error");
  }
});
```

The key design choice is ON CONFLICT DO NOTHING plus jobId = eventId, which gives two independent layers of protection against duplicate processing.
Step 2: Process jobs transactionally
Workers should be restart-safe and race-safe. Use explicit transactions and state transitions so failures are visible and retriable.
```javascript
import { Worker } from "bullmq";
import { Pool } from "pg";
import IORedis from "ioredis";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
// BullMQ workers require maxRetriesPerRequest: null on the Redis connection.
const connection = new IORedis(process.env.REDIS_URL, { maxRetriesPerRequest: null });

new Worker("webhooks", async job => {
  const { eventId } = job.data;
  const client = await pool.connect();
  try {
    await client.query("BEGIN");

    // Lock the event row so concurrent workers cannot process it twice.
    const row = await client.query(
      `SELECT id, status, payload
       FROM webhook_events
       WHERE provider_event_id = $1
       FOR UPDATE`,
      [eventId]
    );
    if (row.rowCount === 0) {
      await client.query("ROLLBACK");
      return;
    }

    const event = row.rows[0];
    if (event.status === "processed") {
      await client.query("COMMIT");
      return; // already handled; this job is a retry or duplicate
    }

    // Example business action: upsert payment/order state exactly once
    await client.query(
      `INSERT INTO orders (provider_order_id, amount, status)
       VALUES ($1, $2, $3)
       ON CONFLICT (provider_order_id)
       DO UPDATE SET status = EXCLUDED.status`,
      [event.payload.data.order_id, event.payload.data.amount, "paid"]
    );

    await client.query(
      `UPDATE webhook_events
       SET status = 'processed', processed_at = NOW()
       WHERE provider_event_id = $1`,
      [eventId]
    );

    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err; // rethrow so BullMQ retries with backoff
  } finally {
    client.release();
  }
}, { connection, concurrency: 20 });
```

Reliability patterns you should add in production
1) Dead-letter strategy
After retry exhaustion, move failed jobs to a dead-letter queue and alert on count increase. This prevents silent data drift and supports rapid recovery.
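A minimal sketch of the dead-letter decision, assuming the worker's "failed" event delivers BullMQ-style job metadata (attemptsMade, opts.attempts); the shouldDeadLetter and moveToDeadLetter helpers are illustrative, not BullMQ APIs:

```javascript
// Decide whether a failed job has exhausted its configured retries and
// should be copied to a dead-letter destination (a separate queue or
// table) and alerted on, instead of waiting for another retry.
function shouldDeadLetter(job) {
  const maxAttempts = (job.opts && job.opts.attempts) || 1;
  return job.attemptsMade >= maxAttempts;
}

// Example wiring (moveToDeadLetter is a hypothetical helper):
// worker.on("failed", async (job, err) => {
//   if (job && shouldDeadLetter(job)) {
//     await moveToDeadLetter(job.data.eventId, err.message);
//   }
// });
```

Keeping the decision in a small pure function makes it trivial to unit-test against your retry configuration.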
2) Observability
Track: ingest latency, duplicate rate, queue depth, retry count, and end-to-end processing time. If you are standardizing telemetry, our OpenTelemetry-focused API article offers practical metric and tracing ideas transferable to Node.js.
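As a starting point, queue depth and failure rate can be derived from the object returned by BullMQ's queue.getJobCounts(). A minimal sketch, assuming counts shaped like that result; the metric names are illustrative:

```javascript
// Turn a BullMQ-style job-counts object into a few gauge values worth
// exporting. "Depth" is everything not yet finished; a rising failure
// rate is an early incident signal before the dead-letter queue fills.
function queueMetrics(counts) {
  const waiting = counts.waiting || 0;
  const delayed = counts.delayed || 0;
  const active = counts.active || 0;
  const failed = counts.failed || 0;
  const completed = counts.completed || 0;
  const finished = failed + completed;
  return {
    depth: waiting + delayed + active,
    failureRate: finished === 0 ? 0 : failed / finished
  };
}
```

Polling this every 15 to 60 seconds and exporting the result as gauges is usually enough to catch backlog growth early.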
3) Replay tooling
Build an admin-safe replay command that requeues events by ID range and state. Controlled replay is essential after outages, schema changes, or downstream incidents.
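The core of a replay command is re-enqueueing stored events in controlled batches rather than all at once. A minimal sketch of the batching step, assuming event IDs already selected from webhook_events by range and status; buildReplayBatches is a hypothetical helper, and the actual requeue would call queue.add per event as in Step 1:

```javascript
// Split a list of event IDs into bounded batches so a replay after an
// outage cannot flood the queue or the downstream system in one burst.
function buildReplayBatches(eventIds, batchSize) {
  if (batchSize < 1) throw new Error("batchSize must be >= 1");
  const batches = [];
  for (let i = 0; i < eventIds.length; i += batchSize) {
    batches.push(eventIds.slice(i, i + batchSize));
  }
  return batches;
}
```

Pausing between batches (or gating on queue depth) keeps a large replay from competing with live traffic.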
4) Event contract versioning
Store provider version in payload metadata, and route to versioned handlers. This reduces breakage when providers add fields or deprecate formats.
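A minimal routing sketch, assuming the provider version is read from payload metadata; the version strings, handler bodies, and meta.version field are illustrative:

```javascript
// Route an event to a version-specific handler, falling back to a
// default version when the payload carries no version metadata.
const handlers = {
  "2023-01": event => ({ handledBy: "v2023-01", id: event.id }),
  "2024-06": event => ({ handledBy: "v2024-06", id: event.id })
};
const DEFAULT_VERSION = "2024-06";

function routeEvent(event) {
  const version = (event.meta && event.meta.version) || DEFAULT_VERSION;
  const handler = handlers[version];
  if (!handler) throw new Error(`no handler for provider version ${version}`);
  return handler(event);
}
```

Throwing on an unknown version (rather than guessing) surfaces contract drift in your dead-letter queue instead of silently misprocessing events.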
How this connects to broader architecture
Idempotent webhooks complement event-driven systems and outbox workflows. If you are designing distributed consistency end to end, revisit our transaction-safe outbox pattern walkthrough. For auth-critical callback endpoints, pair this with our passkey-first account security guide.
Conclusion
Reliable webhook processing is less about frameworks and more about guarantees: verify every request, deduplicate at write time, process in transactions, and observe everything. With this architecture, your Node.js services can accept retries and partial failures without corrupting business state. Start with one provider integration, add metrics from day one, and your webhook layer will scale with confidence instead of becoming an operational risk.
FAQ
Should I return 200 or 202 for webhook ingestion?
If you enqueue for async processing, 202 Accepted is semantically accurate for new events. For known duplicates, 200 OK is usually fine to stop provider retries.
Is Redis enough for idempotency?
Redis is useful for queueing and short-lived locks, but database-level uniqueness on provider_event_id is the stronger source of truth for durable idempotency.
How long should I keep webhook event payloads?
Commonly 30 to 90 days depending on compliance and debugging needs. Keep enough retention to support replay and incident investigation.
Can I process webhooks synchronously?
You can for low volume, but async queues improve resilience under burst traffic and downstream slowness, especially when providers aggressively retry.
