Developers

Quick Start

Request a partner key from the VETS Coin team.
Use the OpenAPI spec to generate a client (or integrate directly).
Sign requests using the partner signing scheme described in the API guide.
Start in low-volume mode, monitor errors, then scale.

API Query Presets

Saved preset queries for quick copy/run examples in the Developers hub.

Preset	Request	Description	Actions
System Status (24h)	GET /api/public/system-status/uptime?hours=24	Quick uptime check for the last 24 hours.	Run
Incidents (7d)	GET /api/public/system-status/incidents?hours=168&limit=30	Recent incident windows with active/resolved spans.	Run
Latency Percentiles	GET /api/public/latency-percentiles?hours=24	Public p50/p95/p99 latency telemetry.	Run
Trust Manifest	GET /trust.json	Machine-readable trust controls and evidence pointers.	Run

Integration Wizard + Schema Explorer

Use guided setup for your first call, then inspect endpoints/fields/examples in the explorer.

Getting Started Wizard • Wizard JSON • Schema Explorer • Schema Explorer JSON

Audio Share Link Helper

Generate direct destination share links for track pages or audio files with channel-specific actions.

Open Audio Share Helper • Audio Share Cookbook • Landing KPI Definitions • Audio Share API (JSON)

curl -sS "https://vets-coin.com/api/public/audio-share-links?track_url=/faq&title=VETS%20Audio&text=Open%20this%20audio%20link%20directly.&channels=x,telegram,email"

curl -sS "https://vets-coin.com/api/public/audio-share-links/validate?track_url=/faq&channels=x,email"

curl -sS "https://vets-coin.com/api/public/audio-share-links/preview?track_url=/faq&title=Audio%20Preview&text=Campaign%20Preview"

curl -sS -X POST "https://vets-coin.com/api/public/audio-share-links/validate/batch" -H "Content-Type: application/json" -d '{"defaults":{"channels":"x,email"},"items":[{"track_url":"/faq"},{"track_url":"/faq","title":"Campaign B"}]}'

curl -sS "https://vets-coin.com/api/public/audio-share-links/expand?short_url=https://vets-coin.com/s/audio/abc123"

curl -sS "https://vets-coin.com/api/public/audio-share-links/channels.json"

curl -sS "https://vets-coin.com/api/public/audio-share-links/errors.json"

curl -sS "https://vets-coin.com/api/public/audio-share-links/warnings.json"

curl -sS "https://vets-coin.com/api/public/audio-share-links/guidance.json"

curl -sS "https://vets-coin.com/api/public/audio-share-links/policy.json"

curl -sS "https://vets-coin.com/api/public/audio-share-links/health"

Quickstart Sandbox Verification

Deterministic API-key sandbox check for client bootstrap automation.

curl -sS "https://vets-coin.com/api/public/quickstart/verify-key?api_key=sandbox_demo_key&client=cli"

Webhook Simulator

Canned webhook payload scenarios for receiver validation and replay drills.

Scenarios JSON • Run Endpoint

Scenario	Event Type	Run
Donation Claim Created	donation.claim
Donation Claim Redeemed	donation.claim_redeemed
Webhook Delivery Failed	webhook.delivery_failed

API Compatibility Canary

Strict-validation canary routes for early adopter testing before broad rollout.

Canary Registry JSON

curl -sS -X POST "https://vets-coin.com/api/canary/echo" -H "Content-Type: application/json" -d '{"message":"hello","request_id":"canary-1"}'

Strict validation: on

SDK Starter Kits (Generated)

Download pre-generated starters built from the live OpenAPI specs.

Python SDK Starter ZIP • TypeScript SDK Starter ZIP • SDK Manifest JSON • SDK Manifest Signature • Release Manifest JSON • Release Manifest Signature

python flask_api/scripts/generate_sdk_starters.py --out-dir flask_api/docs/sdk

OpenAPI Changelog

Baseline-to-current API diff published for integration planning and change audits.

OpenAPI Changelog (Markdown) • OpenAPI Changelog (JSON) • OpenAPI Changelog (RSS) • OpenAPI Changelog (Atom) • Latency Percentiles (JSON) • Latency Percentiles (RSS) • Status Feed Index (JSON) • Status Feed Health (JSON) • Status Feed Health (CSV)

python flask_api/scripts/generate_openapi_changelog.py --spec flask_api/docs/openapi.yaml --spec flask_api/docs/openapi-transparency.yaml --baseline-dir flask_api/docs/openapi/baseline

Auth Headers Quickref

Partner mutation requests should include these headers.

X-Partner-Key: <KEY_ID>

X-Partner-Timestamp: <UNIX_SECONDS>

X-Partner-Signature: <HMAC_SHA256_HEX>

X-Partner-Nonce: <NONCE_HEX>

Idempotency-Key: <UUID_OR_UNIQUE_TOKEN>

python flask_api/scripts/sign_partner_request.py --base-url https://vets-coin.com --method POST --path /api/partner/capabilities --key-id "<KEY_ID>" --secret "<SECRET>" --json '{"capability":"claims"}' --print-only

Partner Error Taxonomy

Use error_code for branching logic. Keep error for operator logs/UI.

Error Code	HTTP	Retry Class	Client Action
rate_limited	429	retryable	Back off; honor Retry-After and X-RateLimit-Reset.
replay_detected	409	retryable	Regenerate nonce/idempotency key and retry once.
idempotency_replay	409	safe-noop	Treat as duplicate success path; fetch latest state.
unauthorized	401	fail-fast	Rotate/check partner credentials and request signature inputs.
forbidden	403	fail-fast	Missing scope; request scope upgrade or use correct key.
db_unavailable	503	retryable	Retry with jittered backoff; open incident if persistent.
server_error	500	retryable	Retry with capped backoff and capture request_id.
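
The taxonomy above can be encoded as a lookup table so client code branches on error_code rather than on message text. The following is a minimal Python sketch; the function name classify_error and the "manual_triage" fallback for unlisted codes are assumptions for illustration, not part of the API.

```python
# Retry classes copied from the Partner Error Taxonomy table.
RETRY_CLASS = {
    "rate_limited": "retryable",
    "replay_detected": "retryable",
    "idempotency_replay": "safe-noop",
    "unauthorized": "fail-fast",
    "forbidden": "fail-fast",
    "db_unavailable": "retryable",
    "server_error": "retryable",
}


def classify_error(payload: dict) -> str:
    """Return the retry class for a parsed error payload.

    Unknown or missing codes fall back to "manual_triage" (an assumption,
    chosen so that new codes are never retried blindly).
    """
    code = payload.get("error_code")
    if not isinstance(code, str):
        return "manual_triage"
    return RETRY_CLASS.get(code, "manual_triage")
```

Keeping the table in one place means a new error code only needs a single new entry, and anything unrecognized is surfaced for review instead of being retried.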

Webhook Replay & Verification

For partner webhook receivers: verify each event signature and keep replay/testing commands handy.

Incoming headers: X-Webhook-Id, X-Webhook-Event, X-Webhook-Timestamp, X-Webhook-Signature

Signature formula: hex(HMAC_SHA256(webhook_secret, f"{timestamp}.{raw_body_json}"))

Payload schema: {"event_id":"123","event_type":"salutes.credit","data":{"...event payload..."}}

Verification tip: compute HMAC against the exact raw request body string before JSON reserialization.
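
The signature formula above can be sketched with the Python standard library. This assumes the secret, timestamp, and body are UTF-8 text; the function names are illustrative only.

```python
import hashlib
import hmac


def webhook_signature(secret: str, timestamp: str, raw_body: str) -> str:
    """Compute hex(HMAC_SHA256(secret, f"{timestamp}.{raw_body}"))."""
    message = f"{timestamp}.{raw_body}".encode("utf-8")  # UTF-8 is an assumption
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_webhook(secret: str, timestamp: str, raw_body: str, signature: str) -> bool:
    """Constant-time comparison of the received and expected signatures."""
    expected = webhook_signature(secret, timestamp, raw_body)
    # Compare as bytes so a non-ASCII header value cannot raise TypeError.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Because the HMAC covers the exact body string, re-serializing the JSON (for example, adding a space after a colon) changes the digest and verification fails, which is why the raw body must be captured before parsing.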

python flask_api/scripts/sign_partner_request.py --base-url https://vets-coin.com --method POST --path /api/partner/webhooks/42/test --key-id "<KEY_ID>" --secret "<SECRET>" --print-only

Admin replay route (admin session required): POST /admin/partners/webhook-events/<event_id>/replay

Webhook Receiver Pseudo-Handler (Flask)

Minimal receiver pattern: verify signature, block replay, and ack idempotently.

import hmac
from flask import request

# hmac_sha256_hex, replay_cache_seen, and process_event_idempotently are helpers you supply.
def handle_webhook():
    raw = request.get_data(as_text=True)  # exact raw body, before any JSON parsing
    ts = request.headers.get("X-Webhook-Timestamp", "")
    sig = request.headers.get("X-Webhook-Signature", "")
    event_id = request.headers.get("X-Webhook-Id", "")
    expected = hmac_sha256_hex(secret, f"{ts}.{raw}")
    if not hmac.compare_digest(sig, expected):
        return {"success": False, "error": "unauthorized"}, 401
    if replay_cache_seen(event_id):
        return {"success": True, "replayed": True}, 200
    process_event_idempotently(event_id, raw)
    return {"success": True}, 200

Webhook Replay Cache TTL Guidance

Recommended retention window for webhook event-id dedupe keys.

Store each X-Webhook-Id in a fast replay cache for at least 24h (48h preferred for delayed retries).

Example: redis SETEX webhook:event:<event_id> 172800 1
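
To make the retention behavior concrete, here is a small in-memory stand-in for the Redis key above. It is only a sketch for tests and local development; the class name ReplayCache and the injectable clock are assumptions, and a single-process dict is not a substitute for a shared cache.

```python
import time


class ReplayCache:
    """In-memory stand-in for `SETEX webhook:event:<event_id> 172800 1`."""

    def __init__(self, ttl_seconds: int = 172800, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so tests can control time
        self._expires_at = {}       # event_id -> expiry time (epoch seconds)

    def seen_before(self, event_id: str) -> bool:
        """Record event_id; return True if it was already present and unexpired."""
        now = self.clock()
        expiry = self._expires_at.get(event_id)
        if expiry is not None and expiry > now:
            return True
        self._expires_at[event_id] = now + self.ttl_seconds
        return False
```

With Redis itself, a single `SET webhook:event:<event_id> 1 NX EX 172800` performs the check and the write atomically, which avoids a race between two concurrent deliveries of the same event.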

Webhook Secret Rotation Overlap (Receiver)

During secret rotation, accept either active secret for a short overlap window, then retire old.

valid = False
for candidate in [WEBHOOK_SECRET_CURRENT, WEBHOOK_SECRET_PREVIOUS]:
    expected = hmac_sha256_hex(candidate, f"{ts}.{raw}")
    valid = valid or hmac.compare_digest(sig, expected)
if not valid:
    return {"success": False, "error": "unauthorized"}, 401

Rotation rule: keep previous secret for <=24h overlap, then remove it from verifier list.

Webhook Timestamp Skew Guard

Reject signatures outside a short timestamp window to reduce replay surface.

now = int(time.time())
try:
    ts = int(request.headers.get("X-Webhook-Timestamp", "0"))
except ValueError:
    ts = 0  # a non-numeric header is treated as stale
if abs(now - ts) > 300:
    return {"success": False, "error": "stale_timestamp"}, 401

Clock source guidance: sync receivers with NTP/chrony so valid requests are not rejected by drift.
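
The skew check is easy to isolate into a pure function so it can be unit-tested without a running receiver. A minimal sketch, assuming the 300-second window used above; the function name and parameters are illustrative.

```python
import time


def timestamp_within_skew(header_value, now=None, max_skew_seconds=300):
    """Return True only if the header parses as an integer within the window."""
    try:
        ts = int(header_value)
    except (TypeError, ValueError):
        return False  # missing or non-numeric header is treated as stale
    if now is None:
        now = int(time.time())
    return abs(now - ts) <= max_skew_seconds
```

Using abs() rejects timestamps that are too far in the future as well as too far in the past, so a sender with a fast clock is caught by the same guard.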

Webhook Event-Type Allowlist Guard

Acknowledge unknown event types without side effects to keep receiver pipelines resilient.

allowed = {"salutes.credit", "salutes.debit", "donation.claimed"}
event_type = request.headers.get("X-Webhook-Event", "")
if event_type not in allowed:
    log_unknown_event(event_type)
    return {"success": True, "ignored": True}, 200

Webhook Delivery-ID Persistence Guard

Persist webhook event IDs with a unique key so retries cannot duplicate state changes.

Schema rule: CREATE UNIQUE INDEX ux_webhook_events_event_id ON webhook_events(event_id);

Receiver rule: insert event_id before side effects; on duplicate-key return {"success":True,"duplicate":True}, 200.
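
The insert-before-side-effects rule can be demonstrated with SQLite from the Python standard library. This is a sketch only: SQLite stands in for whatever database the receiver actually uses, and the single-column table is a simplification of a real webhook_events schema.

```python
import sqlite3


def record_event_once(conn: sqlite3.Connection, event_id: str) -> bool:
    """Insert event_id; return False if the unique index rejects a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO webhook_events (event_id) VALUES (?)", (event_id,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


# In-memory database with the unique index from the schema rule above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE webhook_events (event_id TEXT NOT NULL)")
conn.execute(
    "CREATE UNIQUE INDEX ux_webhook_events_event_id ON webhook_events(event_id)"
)
```

A receiver would call record_event_once first and apply side effects only when it returns True; on False it would return {"success":True,"duplicate":True}, 200 as described above.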

Webhook Async-Ack Processing Pattern

Acknowledge quickly, process safely in background workers, and retry from queue on transient failures.

enqueue_result = queue_push({"event_id": event_id, "payload": raw})
if not enqueue_result.ok:
    return {"success": False, "error": "queue_unavailable"}, 503
return {"success": True, "queued": True}, 200

Worker rule: process queued event idempotently; on transient error requeue with capped backoff + dead-letter threshold.

Webhook Dead-Letter Replay Pattern

Support operator-triggered re-drive by delivery ID so failed events can be replayed safely.

Replay API sketch: POST /admin/partners/webhook-events/<event_id>/replay -> {"success":true,"event_id":"...","requeued":true}

Worker rule: before re-drive, check event_id already processed; if yes, ack duplicate and skip side effects.

Webhook Processing-State Lifecycle

Track a simple state model so dashboards and alerts can identify stuck or failing deliveries.

State path: queued -> processing -> succeeded | failed

Schema suggestion: webhook_events(event_id, state, attempts, last_error, updated_at_utc)

Webhook Retry Policy Pattern

Use bounded retries with exponential backoff to avoid hot-loop failures.

Retry schedule example (seconds): [5, 15, 60, 300, 900] with max_attempts=5 then dead-letter.

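Pseudocode: delay=min(900, 5 * (2 ** (attempt-1))); attempt>=5 -> state=failed_dead_letter. Note that this doubling formula yields 5, 10, 20, 40, 80 for attempts 1-5, a gentler curve than the schedule example above; pick one and apply it consistently.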
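
The listed schedule and the doubling formula are two different backoff curves, which is easiest to see by evaluating both. A short sketch, assuming attempts are counted from 1; the function names are illustrative.

```python
SCHEDULE = [5, 15, 60, 300, 900]  # seconds, from the retry schedule example
MAX_ATTEMPTS = 5


def delay_from_schedule(attempt: int) -> int:
    """Look up the delay for a 1-based attempt, clamping to the last entry."""
    return SCHEDULE[min(attempt, len(SCHEDULE)) - 1]


def delay_from_formula(attempt: int) -> int:
    """delay = min(900, 5 * 2 ** (attempt - 1)) for a 1-based attempt."""
    return min(900, 5 * (2 ** (attempt - 1)))


def should_dead_letter(attempt: int) -> bool:
    """True once the attempt count reaches the dead-letter threshold."""
    return attempt >= MAX_ATTEMPTS
```

For attempts 1 through 5 the table gives 5, 15, 60, 300, 900 seconds while the formula gives 5, 10, 20, 40, 80; the formula only reaches the 900-second cap at attempt 9.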

Webhook Observability Metrics

Track a minimal metrics set so operators can detect reliability regressions quickly.

Core metrics: webhook_success_rate_5m, webhook_retry_rate_5m, webhook_dead_letter_count_24h.

Example formulas: success_rate = succeeded / total; retry_rate = retried / total; dead_letter_count = count(state="failed_dead_letter").

Webhook Alert Threshold Starters

Baseline thresholds to start with before tuning to real traffic patterns.

Page if success_rate_5m < 0.98 OR dead_letter_count_24h > 0 OR retry_rate_5m > 0.10.

Warn if success_rate_5m < 0.995 for 3 consecutive windows.
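
The starter thresholds can be combined into one evaluation function. This is a sketch under stated assumptions: the function name, the low_success_windows counter, and the choice to treat a window with no traffic as healthy are all illustrative and not taken from the platform.

```python
def webhook_alert_level(succeeded, retried, total, dead_letter_count_24h,
                        low_success_windows=0):
    """Return "page", "warn", or "ok" using the starter thresholds above.

    low_success_windows is a hypothetical counter of consecutive 5m windows
    (including the current one) with success_rate < 0.995.
    """
    if dead_letter_count_24h > 0:
        return "page"
    if total == 0:
        return "ok"  # assumption: an empty window is not a breach
    success_rate = succeeded / total
    retry_rate = retried / total
    if success_rate < 0.98 or retry_rate > 0.10:
        return "page"
    if success_rate < 0.995 and low_success_windows >= 3:
        return "warn"
    return "ok"
```

For example, 990 successes out of 1000 events is a 0.99 success rate: above the paging threshold, but a warning once it has persisted for three consecutive windows.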

Webhook SQL Rollup Query (Hourly)

Use an hourly rollup query to power reliability widgets without scanning raw event logs each request.

SELECT date_trunc('hour', updated_at_utc) AS hour_utc,
       COUNT(*) AS total,
       SUM(CASE WHEN state='succeeded' THEN 1 ELSE 0 END) AS succeeded,
       SUM(CASE WHEN attempts > 1 THEN 1 ELSE 0 END) AS retried,
       SUM(CASE WHEN state='failed_dead_letter' THEN 1 ELSE 0 END) AS dead_letter
FROM webhook_events
WHERE updated_at_utc >= NOW() - INTERVAL '24 hours'
GROUP BY 1
ORDER BY 1 DESC;

Webhook Prometheus Query Starters

Starter PromQL-style panels for success, retry, and dead-letter trend visibility.

Success rate (5m): sum(rate(vets_webhook_events_total{state="succeeded"}[5m])) / sum(rate(vets_webhook_events_total[5m]))

Retry rate (5m): sum(rate(vets_webhook_events_total{retried="true"}[5m])) / sum(rate(vets_webhook_events_total[5m]))

Dead-letter count (24h): increase(vets_webhook_events_total{state="failed_dead_letter"}[24h])

Webhook Triage Action Matrix

Map metric breaches to immediate actions so incident response is deterministic.

Signal	Threshold	Immediate Action
success_rate_5m	< 0.98	Page on-call, inspect queue backlog and signature failures.
retry_rate_5m	> 0.10	Check upstream latency/error spikes, raise worker concurrency temporarily.
dead_letter_count_24h	> 0	Run dead-letter replay flow by event_id after fix validation.

Webhook Incident Timeline Pattern

Capture first/last seen timestamps so postmortems can quantify blast radius and duration.

Track fields: incident_id, first_seen_utc, last_seen_utc, duration_seconds, affected_event_count.

Duration formula: duration_seconds = EXTRACT(EPOCH FROM (last_seen_utc - first_seen_utc)).

Webhook Error-Budget Burn-Rate

Track burn-rate against your webhook success SLO to detect fast reliability erosion.

Burn-rate formula: (1 - success_rate_window) / (1 - target_slo). Example target_slo=0.999.

Action hint: burn_rate_5m > 2.0 AND burn_rate_1h > 1.0 => page and gate risky deploys.
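
A worked example helps calibrate the formula: with target_slo=0.999 the error budget is 0.001, so a window with success rate 0.995 burns budget at 0.005 / 0.001 = 5 times the sustainable rate. The sketch below implements the formula and the two-window paging hint; the function names are illustrative.

```python
def burn_rate(success_rate_window: float, target_slo: float = 0.999) -> float:
    """(1 - success_rate_window) / (1 - target_slo)."""
    if not 0.0 <= target_slo < 1.0:
        raise ValueError("target_slo must be in [0, 1)")
    return (1.0 - success_rate_window) / (1.0 - target_slo)


def should_page(burn_rate_5m: float, burn_rate_1h: float) -> bool:
    """Page only when both the fast and the slow window exceed their limits."""
    return burn_rate_5m > 2.0 and burn_rate_1h > 1.0
```

Requiring both windows keeps a brief spike from paging on its own while still catching sustained erosion.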

Webhook Postmortem Checklist

Use a fixed checklist to keep incident learning loops consistent and auditable.

Checklist: impact, customer scope, first_seen_utc, last_seen_utc, root_cause, corrective_action, owner, due_date_utc.

Closure rule: incident stays open until corrective action is merged, deployed, and replay validation passes.

Webhook Runbook Escalation Pattern

Define escalation ownership and update cadence before incidents happen.

Role assignment: designate Incident Commander (IC), Communications Lead, and Technical Owner at incident open.

Cadence: status updates every 15 minutes while active; trigger executive update if duration >= 60 minutes or user impact is severe.

Webhook Status-Page Messaging Pattern

Use consistent status phases so users and partners understand incident progression.

Phase order: degraded -> investigating -> monitoring -> resolved

Message template: "[phase] webhook delivery latency elevated; next update in 15 minutes."

Webhook Stakeholder Update Template

Keep partner, internal, and executive updates aligned from one structured template.

Partner update: current_status, affected_endpoints, expected_next_update_utc, workaround_available.

Internal ops update: suspected_root_cause, mitigation_progress, blockers, owner_on_point.

Executive summary: user_impact_level, ETA_confidence, decision_requests, reputational_risk_notes.

Webhook Integration-Release Checklist

Use a pre-release checklist to reduce deployment risk for partner webhook changes.

Checklist: run preflight, deploy canary partner key, monitor retry/dead-letter metrics for 30 minutes, keep rollback switch ready.

Rollback rule: if success_rate_5m drops below SLO or dead_letter_count increases, rollback immediately and replay impacted event_ids.

Webhook Key-Rotation Rollout Checklist

Rotate keys without downtime by running old/new credentials in a controlled overlap window.

Step 1: issue new key + secret and validate against sandbox/test webhook route.

Step 2: dual-run old+new key for 24h and monitor auth failures.

Step 3: after the stable window, confirm no traffic on old key_id for 15m, then revoke the old key.

Webhook Signature-Version Migration Checklist

Migrate signing schemes with overlap windows and a fixed deprecation cutoff.

Migration plan: accept v1 + v2 signatures for overlap window, emit v2-only from sender, track v1 traffic decay.

Cutoff rule: publish cutoff_date_utc, alert partners 14d/7d/1d, reject v1 after cutoff with explicit upgrade error.

Webhook Payload-Schema Versioning Pattern

Version payload contracts explicitly so receivers can parse safely during schema evolution.

Envelope example: {"schema_version":"2","event_id":"...","event_type":"...","data":{...}}

Compatibility rule: keep backward parsing support for at least one release window before removing old fields.
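
A receiver can apply the compatibility rule by checking schema_version up front and ignoring fields it does not know. The sketch below is illustrative: the supported-version set and the decision to treat a missing schema_version as "1" are assumptions, not documented behavior.

```python
import json

SUPPORTED_SCHEMA_VERSIONS = {"1", "2"}  # hypothetical dual-support window
REQUIRED_FIELDS = ("event_id", "event_type", "data")


def parse_envelope(raw_body: str) -> dict:
    """Parse a webhook envelope, tolerating unknown top-level fields.

    A missing schema_version is treated as "1" (an assumption for payloads
    that predate explicit versioning).
    """
    envelope = json.loads(raw_body)
    if not isinstance(envelope, dict):
        raise ValueError("envelope must be a JSON object")
    version = str(envelope.get("schema_version", "1"))
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        raise ValueError(f"unsupported schema_version: {version}")
    for field in REQUIRED_FIELDS:
        if field not in envelope:
            raise ValueError(f"missing required field: {field}")
    parsed = {field: envelope[field] for field in REQUIRED_FIELDS}
    parsed["schema_version"] = version
    return parsed
```

Returning only the known fields means a sender can add new top-level keys without breaking this receiver, while an unsupported version fails loudly instead of being misparsed.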

Webhook Schema Deprecation Timeline

Publish a fixed timeline so partners can migrate before breaking schema removals.

Timeline: announce deprecation_date_utc, run dual-support window, enforce removal_date_utc.

Communication cadence: notify at T-30d, T-14d, T-7d, and T-1d with upgrade examples.

API Deprecation Calendar

Machine-readable deprecation schedule for endpoint sunset planning.

curl -sS "https://vets-coin.com/developers/deprecations.json"

curl -sS "https://vets-coin.com/developers/deprecations-playbook.md"

curl -sS "https://vets-coin.com/developers/deprecations-playbook.json"

curl -sS "https://vets-coin.com/developers/deprecations.rss"

Generated migration playbook: /developers/deprecations-playbook.md • /developers/deprecations-playbook.json • /developers/deprecations.rss

No active endpoint sunsets are currently scheduled.

Webhook Compatibility Test Matrix

Validate sender/receiver version combinations before changing production defaults.

Sender Version	Receiver Version	Expected Outcome
v1	v1	Pass (legacy baseline)
v2	v1	Pass only during dual-support window
v1	v2	Pass only during dual-support window
v2	v2	Pass (post-cutover target)

Webhook Contract-Test Checklist

Run deterministic contract tests before promoting webhook schema changes.

Checklist: required_fields_present, optional_fields_tolerated, unknown_fields_ignored, signature_verification_passes.

Gate rule: block release if any contract test fails on canary receiver fixtures.

Webhook Replay-Test Scenario

Simulate duplicate deliveries to verify idempotent receiver behavior.

Scenario: send identical payload + event_id twice within replay-cache window.

Expected: first delivery applies side effects; second returns success with duplicate/replayed indicator and no additional mutation.

Webhook Latency SLO Targets

Use percentile-based SLO targets to detect delivery-path regressions before failures spike.

Target example: P50 < 250ms, P95 < 1000ms, P99 < 3000ms for end-to-end webhook processing latency.

Alert starter: page when P95 > 1500ms for 3 consecutive 5m windows OR P99 > 5000ms in any 5m window.

Webhook Queue-Backlog SLO Targets

Track queue depth and oldest-message age so delayed processing is detected early.

Target example: queue_depth < 500 and oldest_message_age_seconds < 120 during steady state.

Alert starter: page when queue_depth > 2000 OR oldest_message_age_seconds > 600 for 10 minutes.

Webhook DLQ-Drain Runbook

Replay dead-lettered events in controlled batches to avoid reintroducing overload.

Batch strategy: replay 100 events per batch, wait 60s cooldown, then re-check latency + backlog before next batch.

Verification checks: error rate stable, queue_depth recovering, no duplicate side effects, replayed event_ids marked succeeded.
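
The batch strategy above can be sketched as a small loop. The replay and healthy callables are hypothetical hooks into your own queue and dashboards, and sleep is injectable so the loop can be tested without waiting.

```python
import time


def drain_dead_letters(event_ids, replay, healthy, batch_size=100,
                       cooldown_seconds=60, sleep=time.sleep):
    """Replay event_ids in batches; stop early if a health check fails.

    replay(event_id) and healthy() are caller-supplied (hypothetical) hooks.
    Returns the number of events replayed.
    """
    replayed = 0
    for start in range(0, len(event_ids), batch_size):
        if start > 0:
            sleep(cooldown_seconds)   # cooldown between batches
            if not healthy():         # re-check latency + backlog
                break
        for event_id in event_ids[start:start + batch_size]:
            replay(event_id)
            replayed += 1
    return replayed
```

Stopping on the first failed health check leaves the remaining event IDs in the dead-letter queue, so the drain can resume later from where it stopped.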

Webhook Canary-Failure Rollback

If canary delivery quality regresses, roll back quickly before broad partner impact.

Immediate action: disable canary key_id, stop new canary deliveries, and revert sender route to stable key.

Recovery action: replay canary window events (start_ts..end_ts) through stable pipeline with idempotency safeguards enabled.

Exit criteria: success_rate_5m returns above SLO, retry/dead-letter rates normalize, and canary replay backlog is fully drained.

Webhook Canary-Success Promotion Checklist

If canary quality remains healthy, promote traffic in controlled steps with rollback guardrails.

Promotion plan: 1% -> 5% -> 25% -> 50% -> 100%; hold each step for at least 15 minutes.

Gate each step on stable success_rate_5m, retry_rate_5m, dead_letter_count_24h, queue_depth, and latency percentiles.

Rollback guardrail: immediately revert to previous step if SLO breach persists for 2 consecutive 5m windows.
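
The promotion plan and rollback guardrail can be expressed as a single step function. This is a sketch: the inputs (minutes held at the current step, count of consecutive breached 5m windows) are assumed to come from your own monitoring, and staying at 1% when a breach occurs on the first step is an illustrative choice.

```python
PROMOTION_STEPS = [1, 5, 25, 50, 100]  # percent of traffic


def next_traffic_percent(current, minutes_at_step, consecutive_breach_windows):
    """Return the traffic percentage to use next.

    Reverts one step after 2 consecutive breached 5m windows, advances after
    a 15-minute hold with no current breach, and otherwise holds.
    """
    index = PROMOTION_STEPS.index(current)
    if consecutive_breach_windows >= 2:
        return PROMOTION_STEPS[max(index - 1, 0)]
    can_advance = minutes_at_step >= 15 and consecutive_breach_windows == 0
    if can_advance and index < len(PROMOTION_STEPS) - 1:
        return PROMOTION_STEPS[index + 1]
    return current
```

A single breached window neither advances nor reverts; it simply holds the current step until the next window confirms the direction.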

Webhook Rollback-Drill Cadence

Run routine rollback drills so incident response stays fast and predictable.

Cadence: run a scheduled rollback simulation at least once per month and after major webhook pipeline changes.

Drill checklist: trigger synthetic SLO breach, disable canary key, replay drill window, verify stable recovery in dashboards.

Evidence to retain: timeline timestamps, operator actions, metric screenshots, and confirmed replay completion count.

Webhook Incident Command: First 10 Minutes

Use a fixed opening sequence so critical incident actions happen immediately and in order.

Minute 0-2: assign IC + technical owner, declare incident channel, snapshot success/retry/dead-letter + backlog metrics.

Minute 2-5: decide contain action (disable canary key, pause risky rollout, cap replay) and log rationale.

Minute 5-10: publish first status update, set next update timer (15m), and open action checklist with owners.

Webhook Incident Comms Cadence

Keep predictable update clocks across audiences during active incidents.

Partner-facing updates: every 30 minutes while degraded, include affected endpoints + expected next update time.

Internal ops updates: every 15 minutes, include metric deltas, mitigation status, and current blocker owner.

Executive updates: every 60 minutes (or on major change), include user impact, risk level, and ETA confidence.

Webhook Incident Closure Checklist

Close incidents only after objective recovery verification and documented handoff.

Recovery gate: success_rate_5m above SLO for 30m, retry/dead-letter rates back to baseline, and backlog fully drained.

Data gate: replay queue empty, no unowned failed events, and incident timeline updated with final root-cause statement.

Comms gate: publish resolved update, record customer impact window, and link postmortem owner + due date.

Webhook Post-Incident Handoff Packet

Standardize handoff artifacts so follow-up work does not drift after incident closure.

Required fields: incident_id, severity, start/end_utc, affected endpoints, replay count, unresolved risks.

Action tracker: each corrective action must include owner, ETA, dependency, and verification check.

Handoff rule: schedule a 24h review checkpoint to confirm action status and detect any regression signal.

Webhook Corrective-Action Verification Ledger

Track every corrective action to completion with clear verification evidence.

Ledger columns: action_id, owner, due_date_utc, status, dependency, verification_check, verified_at_utc.

Status model: planned -> in_progress -> verified -> closed, with blocked as a detour from in_progress (only close after verification evidence is linked).

Audit trail: capture changed_by + changed_at_utc on every status transition and store immutable comment history.

Webhook Dependency-Risk Register

Track upstream dependency risks so incident response includes owner, blast radius, and fallback paths.

Register fields: dependency_name, service_owner, oncall_contact, blast_radius, fallback_mode, mitigation_runbook, last_tested_utc.

Risk scoring: classify critical/high/medium by user-impact scope + single-point-of-failure likelihood.

Governance rule: run dependency failover test at least quarterly and attach evidence link to each register row.

Webhook Dependency Failover-Drill Matrix

Define expected fallback behavior and recovery targets per dependency before incidents occur.

Dependency	Fallback Mode	RTO Target	Drill Cadence
primary_webhook_queue	Switch producer to secondary queue cluster	< 5m	Monthly
signature_validation_store	Read-through cache with strict TTL + deny-on-miss guard	< 10m	Quarterly
metrics_ingestion	Buffer locally and backfill on recovery	< 15m	Quarterly

Verification: each drill must record actual_rto, fallback_result, and follow-up action if target is missed.

Webhook Dependency Alert-Routing Matrix

Map each dependency breach signal to the right pager owner and escalation path.

Signal / Breach	Primary Pager Owner	Escalation Path
queue_depth > 2000 for 10m	Webhook Platform On-Call	Escalate to Incident Commander at +10m if unresolved
signature_validation_errors_rate > 2%	Security/API Auth On-Call	Escalate to Security Lead + IC immediately
dead_letter_count_24h increase > threshold	Reliability On-Call	Escalate to Platform Manager at +15m; start replay runbook

Routing rule: each alert route must include backup owner and escalation timeout to prevent notification dead-ends.

Webhook Dependency Escalation Decision Tree

Use a deterministic branch when dependency failures require containment, failover, or replay actions.

Branch 1 (contain): if auth/signature failure rate spikes and cause is unknown, pause risky rollout and gate new mutations.

Branch 2 (failover): if primary dependency outage is confirmed and fallback is healthy, switch traffic to fallback immediately.

Branch 3 (replay): when dependency recovers, run bounded replay batches only after queue and latency SLOs are stable.

Escalation trigger: if no branch restores SLO within 15 minutes, escalate to IC + platform lead and open incident bridge.

Webhook Dependency Freeze-Threshold Policy

Define automatic mutation-freeze gates so severe dependency failures cannot cascade into larger data integrity incidents.

Freeze gate A: trigger mutation_freeze=true when signature_validation_errors_rate > 5% for 5 minutes.

Freeze gate B: trigger mutation_freeze=true when dead_letter_rate_5m > 2% and queue_depth > 3000 simultaneously.

Unfreeze rule: require 15 minutes of SLO-stable metrics plus explicit IC approval and audit-log note.

Webhook Freeze-Override Governance

Allow emergency overrides only under strict authority, dual-approval, and timed expiry controls.

Who can override: Incident Commander + Platform Lead only (no single-user override for production freeze state, except under the Emergency-Breakglass Policy).

Approval model: require dual approval (ic_approved=true and platform_approved=true) before override_active=true.

Expiry rule: auto-expire override in 30 minutes unless re-approved; emit audit event on activate, renew, and expire.
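
The dual-approval and expiry rules can be modeled in a few lines. This sketch uses the flag names from the approval model above; the class name, the activated_at field, and the omission of audit-event emission are simplifications for illustration.

```python
from dataclasses import dataclass
from typing import Optional

OVERRIDE_TTL_SECONDS = 30 * 60  # auto-expire after 30 minutes


@dataclass
class FreezeOverride:
    ic_approved: bool = False
    platform_approved: bool = False
    activated_at: Optional[float] = None  # epoch seconds; None until activated

    def activate(self, now: float) -> bool:
        """Activate only when both approvals are present."""
        if self.ic_approved and self.platform_approved:
            self.activated_at = now
            return True
        return False

    def is_active(self, now: float) -> bool:
        """True while activated and not yet past the 30-minute expiry."""
        return (self.activated_at is not None
                and now - self.activated_at < OVERRIDE_TTL_SECONDS)
```

Deriving the active state from the activation time, rather than storing a boolean, means the override expires on its own even if no cleanup job runs.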

Webhook Override Threshold-Exception Process

Use this compact process when freeze thresholds need a time-boxed exception during active incident response.

Approver quorum: require 2 of 3 approvals (IC, Platform Lead, Security Lead) before threshold_exception_active=true.

Expiry cap: enforce hard expiry in 30 minutes; renewal requires fresh quorum + explicit incident status update.

Audit note minimums: reason, impacted endpoints, projected risk window, rollback trigger, and owner of next review.

Webhook Override Audit-Log Schema

Use a consistent audit schema so every override lifecycle action is traceable and reviewable.

Required fields: override_id, action, actor_id, actor_role, reason_code, reason_note, expires_at_utc, state, created_at_utc.

State model: requested -> approved -> active -> renewed -> expired (or revoked).

Audit guarantees: append-only records, immutable timestamps, and link to incident_id for every override event.

Webhook Override-Review Cadence

Review active overrides on a fixed cadence so emergency controls do not drift into long-lived risk.

Daily review: list all override_active=true records, verify business justification, and confirm next expiry timestamp.

Stale-alert rule: page on-call if any override remains active > 24h or has no linked incident/update note.

Closure rule: convert active override to expired/revoked within 15 minutes after risk condition clears.

Webhook Override Emergency-Breakglass Policy

Permit single-actor emergency override only for extreme availability scenarios and force rapid expiry.

Breakglass path: allow single actor only when incident severity is critical and dual-approval path is unavailable.

Forced expiry: breakglass override expires in 10 minutes with no silent extension; renewal requires fresh explicit action.

Control rule: page IC + security lead immediately and require post-incident review note within 24 hours.

Webhook Override Revocation Protocol

Revoke overrides quickly and consistently once the risk condition clears or misuse is detected.

Revocation trigger: unauthorized use, stale override, or restored system health beyond unfreeze criteria.

Execution steps: set override_active=false, restore default freeze policy, and run rollback validation checks.

Notification rule: send revoke event to IC, security lead, and operations channel with reason + timestamp.

Webhook Override Incident-Communication Template

Use consistent messaging at override activation, update, and revocation checkpoints.

Activate message: "Override activated" + override_id + reason + forced expiry + next update time.

Update message: current risk status + remaining override time + mitigation progress + expected revoke window.

Revoke message: "Override revoked" + revoke reason + restored controls + follow-up actions owner/ETA.

Webhook Override Postmortem Addendum

Capture override-specific outcomes so postmortems include control-side effects and residual risk.

Document what override changed: policy gates bypassed, mutation paths affected, and time window active.

Residual risk section: outstanding data reconciliation, delayed replay impacts, and temporary control gaps.

Closure requirement: assign explicit owner + due date for each residual risk item before incident closure.

Webhook Override KPI Tracking

Track override usage and quality metrics so governance decisions are backed by clear trend data.

Core KPIs: override_activation_count_30d, override_avg_duration_minutes, override_stale_ratio_30d.

Derived KPI: stale ratio = stale_overrides_30d / total_overrides_30d (stale means active > 24h).

Alert starter: page governance owner if stale ratio exceeds 5% or avg duration exceeds 60 minutes for 2 weeks.
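
The stale ratio and average duration can be derived from a list of override durations. A minimal sketch, assuming durations are recorded in minutes for the trailing 30 days; the function names are illustrative.

```python
STALE_AFTER_MINUTES = 24 * 60  # stale means active > 24h


def stale_ratio(durations_minutes):
    """stale_overrides / total_overrides; 0.0 when there were no overrides."""
    if not durations_minutes:
        return 0.0
    stale = sum(1 for d in durations_minutes if d > STALE_AFTER_MINUTES)
    return stale / len(durations_minutes)


def avg_duration_minutes(durations_minutes):
    """Mean override duration in minutes; 0.0 when there were no overrides."""
    if not durations_minutes:
        return 0.0
    return sum(durations_minutes) / len(durations_minutes)
```

For instance, four overrides lasting 10, 20, 30, and 1500 minutes give a stale ratio of 0.25, well above the 5% alert starter.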

Webhook Override Trend-Review Checklist

Run a weekly governance review to turn override metrics into concrete decisions and action items.

Weekly agenda: review activation_count_30d trend, avg_duration trend, stale_ratio trend, and top incident categories.

Decision checkpoint: keep, tighten, or relax override thresholds based on two-week trend direction.

Output requirement: record decisions, owner, ETA, and expected KPI impact for each approved change.

Webhook Override Policy-Change Guardrail

Apply override policy threshold changes inside controlled windows and auto-revert quickly if reliability degrades.

Change window: apply threshold updates only during low-risk windows (Tue-Thu 14:00-18:00 UTC) and never during an active incident.

Rollback criterion: if success_rate_5m drops by > 0.5% or projected stale_ratio_30d rises above 5% within 30 minutes, rollback immediately.

Change record rule: store before/after KPI snapshots, approving owner, rollback owner, and rollback ETA in the same audit event.

Partner Mutation End-to-End (curl)

Minimal credit mutation flow: prepare body, sign headers, send request, then branch on status code.

export BASE_URL="https://vets-coin.com" KEY_ID="<KEY_ID>" SECRET="<SECRET>" API_PATH="/api/salutes/credit" BODY='{"user_id":"4","amount":100,"reason":"event_participation","source":"partner_portal"}'

python flask_api/scripts/sign_partner_request.py --base-url "$BASE_URL" --method POST --path "$API_PATH" --key-id "$KEY_ID" --secret "$SECRET" --json "$BODY" --idempotency-key "idem-$(date +%s)" --print-only

curl -sS -X POST "$BASE_URL$API_PATH" -H "Content-Type: application/json" -H "X-Partner-Key: <KEY_ID>" -H "X-Partner-Timestamp: <UNIX_SECONDS>" -H "X-Partner-Signature: <HMAC_SHA256_HEX>" -H "X-Partner-Nonce: <NONCE_HEX>" -H "Idempotency-Key: <UNIQUE_KEY>" --data "$BODY"

Handling: 2xx = success; 401/403 = fail fast and rotate/fix key scope; 409/429 = retry with fresh nonce+idempotency key and exponential backoff, except idempotency_replay (409), which is a duplicate success: fetch latest state instead of retrying.

Partner Mutation Response Parsing (curl -w)

Use a deterministic branch to avoid treating auth/rate-limit errors as success.

HTTP_CODE=$(curl -sS -o /tmp/vets_partner_response.json -w "%{http_code}" -X POST "$BASE_URL$API_PATH" -H "Content-Type: application/json" -H "X-Partner-Key: <KEY_ID>" -H "X-Partner-Timestamp: <UNIX_SECONDS>" -H "X-Partner-Signature: <HMAC_SHA256_HEX>" -H "X-Partner-Nonce: <NONCE_HEX>" -H "Idempotency-Key: <UNIQUE_KEY>" --data "$BODY")

case "$HTTP_CODE" in 2*) echo "success";; 401|403) echo "fatal_auth_or_scope";; 409|429) echo "safe_retry_new_nonce_and_idempotency";; *) echo "inspect /tmp/vets_partner_response.json";; esac

Partner Error Payload Classification (jq)

Classify JSON error payloads for retry-safe vs fail-fast decisions.

ERR_CODE=$(jq -r '.error_code // "unknown"' /tmp/vets_partner_response.json)

case "$ERR_CODE" in replay_detected|rate_limited|db_unavailable|server_error) echo "retryable";; idempotency_replay) echo "safe_noop_fetch_latest_state";; unauthorized|forbidden) echo "fail_fast_auth_scope";; *) echo "manual_triage";; esac

Rate Limit Defaults

Current runtime defaults for common integration paths.

Flow	Limit	Config Key
Public API endpoints	90 / minute	RATE_LIMIT_PUBLIC_API_PER_MIN
Partner webhook endpoints	120 / minute	RATE_LIMIT_WEBHOOK_PER_MIN
Sandbox partner keys	30 / minute	PARTNER_SANDBOX_RATE_LIMIT_PER_MINUTE
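One way to stay under these limits is to space requests evenly on the client side. The sketch below converts a per-minute limit into a minimum gap between requests. min_interval_ms is a hypothetical helper, and even pacing is an assumption: the table does not say whether the server enforces a fixed or a sliding window.

```shell
# min_interval_ms REQUESTS_PER_MINUTE
# Prints the smallest whole-millisecond gap between requests that keeps
# a steady stream at or under the limit (60000 / limit, rounded up).
min_interval_ms() {
  local per_min=$1
  echo $(( (60000 + per_min - 1) / per_min ))
}
```

With the defaults above this gives 667 ms for public API endpoints (90 / minute), 500 ms for partner webhook endpoints (120 / minute), and 2000 ms for sandbox partner keys (30 / minute).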

Scope-to-Endpoint Matrix

Use this map when issuing partner keys and least-privilege scopes.

Scope	Typical Endpoints	Notes
read	GET /api/salutes/balance, GET /api/partner/user-lookup, GET /api/partner/wallet-info/<wallet>, GET /api/partner/users/<id>	Default read-only partner data access.
credit	POST /api/salutes/credit	Mutation scope; requires nonce + idempotency key.
debit	POST /api/salutes/debit	Mutation scope; requires nonce + idempotency key.
ledger	GET /api/salutes/ledger	Read-only transaction and audit history access.
donation	POST /api/partner/donation-claim	Donation claim trigger workflows.
users	POST /api/partner/users, PATCH /api/partner/users/<id>, POST /api/partner/users/<id>/wallets	Partner user lifecycle and wallet linking actions.
webhooks	GET/POST /api/partner/webhooks, DELETE /api/partner/webhooks/<id>, POST /api/partner/webhooks/<id>/test	Webhook endpoint management and test dispatch.
public	GET /api/public-stats, GET /api/public/system-status, GET /api/transactions/latest	No partner auth required.
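The matrix can also be encoded as a lookup, for example to check a key's scopes before a request is sent. This is a sketch built only from the rows above; required_scope is a hypothetical helper, it expects the path without a query string, and the real server-side routing may differ.

```shell
# required_scope METHOD PATH
# Prints the scope from the matrix, or "unknown" (exit status 1) when
# the route is not listed. A trailing * matches an <id> or <wallet> segment.
required_scope() {
  case "$1 $2" in
    "GET /api/salutes/balance"|"GET /api/partner/user-lookup"|\
    "GET /api/partner/wallet-info/"*|"GET /api/partner/users/"*)
      echo "read" ;;
    "POST /api/salutes/credit")        echo "credit" ;;
    "POST /api/salutes/debit")         echo "debit" ;;
    "GET /api/salutes/ledger")         echo "ledger" ;;
    "POST /api/partner/donation-claim") echo "donation" ;;
    "POST /api/partner/users"|"PATCH /api/partner/users/"*|\
    "POST /api/partner/users/"*"/wallets")
      echo "users" ;;
    "GET /api/partner/webhooks"|"POST /api/partner/webhooks"|\
    "DELETE /api/partner/webhooks/"*|"POST /api/partner/webhooks/"*"/test")
      echo "webhooks" ;;
    "GET /api/public-stats"|"GET /api/public/system-status"|\
    "GET /api/transactions/latest")
      echo "public" ;;
    *) echo "unknown"; return 1 ;;
  esac
}
```

For example, required_scope POST /api/salutes/credit prints credit, and required_scope PATCH /api/partner/users/42 prints users.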

Common 4xx/5xx Responses

Fast triage guide for partner integrations and automation hooks.

Status	Typical Cause	What To Do
400	Invalid payload or missing required fields.	Validate request body/query values and resend.
401	Missing/invalid partner auth signature or timestamp.	Re-sign request with current timestamp and correct secret.
403	Forbidden scope, disabled key, or admin-only route.	Verify key status/scopes and endpoint access policy.
409	Idempotency replay or business-state conflict.	Use a fresh idempotency key and re-check current state.
413	Request body exceeds API payload guardrails.	Reduce payload size or split into smaller requests.
429	Rate limit exceeded.	Back off with retry/jitter and reduce burst concurrency.
500	Unexpected server error.	Capture request ID + payload hash and report for investigation.
503	Dependency unavailable (DB/RPC) or temporary safe-mode gates.	Retry with backoff; monitor `/status` and alert endpoints.

Auth Error JSON Examples

Use these to build deterministic client error handling paths.

401 unauthorized: {"success":false,"error":"unauthorized"}

403 forbidden scope: {"success":false,"error":"forbidden"}

409 replay/idempotency: {"success":false,"error":"replay_detected"} or {"success":false,"error":"idempotency_replay"}

429 rate-limited: {"success":false,"error":"rate_limited"}

Retry/Backoff Strategy

Recommended behavior for resilient clients (especially around 429 and 503).

Retry only on 429/503/timeout; use exponential backoff with jitter (1s, 2s, 4s, 8s, max 30s).

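curl -sS --retry 5 "https://vets-coin.com/api/public/system-status"

With --retry alone, recent curl versions retry only timeouts and HTTP 408/429/500/502/503/504, waiting 1s before the first retry and doubling the wait each time. Passing --retry-delay replaces that doubling with a fixed delay, and curl adds no jitter of its own.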

Always send a fresh Idempotency-Key on mutation retries; never reuse a key for a different payload.
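The backoff schedule described above can be sketched in shell as follows. The function names are hypothetical, the 10-second per-request timeout is an assumption, and the loop targets read-only GET requests: a mutation retry would also need a fresh nonce, idempotency key, and signature on every attempt. Fractional sleep values work with GNU and BSD sleep but are not guaranteed by POSIX.

```shell
# Base delay in whole seconds for a 1-indexed attempt:
# 1, 2, 4, 8, 16, then capped at 30.
backoff_seconds() {
  local attempt=$1 delay=1 i=1
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt 30 ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt 30 ]; then delay=30; fi
  echo "$delay"
}

# Random jitter in milliseconds (0-999), read from /dev/urandom.
jitter_ms() {
  local raw
  raw=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
  echo $((raw % 1000))
}

# Only 429, 503, and transport failures (curl reports those as 000) are retried.
is_retryable() {
  case "$1" in
    429|503|000) return 0 ;;
    *)           return 1 ;;
  esac
}

# fetch_with_backoff URL [MAX_ATTEMPTS]
# Prints the final HTTP status; exit status 1 means retries were exhausted.
fetch_with_backoff() {
  local url=$1 max=${2:-5} attempt=1 code
  while :; do
    code=$(curl -sS -o /dev/null -w '%{http_code}' --max-time 10 "$url") || true
    code=${code:-000}
    if ! is_retryable "$code"; then echo "$code"; return 0; fi
    if [ "$attempt" -ge "$max" ]; then echo "$code"; return 1; fi
    # Sleep for the base delay plus a random fraction of a second.
    sleep "$(backoff_seconds "$attempt").$(printf '%03d' "$(jitter_ms)")"
    attempt=$((attempt + 1))
  done
}
```

For example, fetch_with_backoff "https://vets-coin.com/api/public/system-status" 5 makes up to five attempts and waits roughly 1s, 2s, 4s, and 8s between them.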

Transparency API Snippets

Copy/paste ready examples for anomaly JSON endpoints.

curl -sS "https://vets-coin.com/transparency/audit-anomalies/summary.json?run=latest"

curl -sS "https://vets-coin.com/transparency/audit-anomalies/runs.json"

curl -sS "https://vets-coin.com/transparency/audit-anomalies/trend.json?metric=rows&limit=200"

curl -sS "https://vets-coin.com/transparency/audit-anomalies/alerts.json?sigs_increase_threshold_pct=50&rows_increase_threshold_pct=50"

curl -sS "https://vets-coin.com/transparency/audit-anomalies/alerts.json?sigs_increase_threshold_pct=25&rows_increase_threshold_pct=25"

curl -sS "https://vets-coin.com/transparency/audit-anomalies/alerts.json?sigs_increase_threshold_pct=100&rows_increase_threshold_pct=100"

curl -sS "https://vets-coin.com/transparency/audit-anomalies/alerts.csv" -o audit_anomaly_alerts.csv

curl -sS "https://vets-coin.com/transparency/audit-anomalies/schema-registry.json"

curl -sS "https://vets-coin.com/transparency/audit-anomalies/diff/compare.json?run_a=latest&run_b=latest&include_rows=true&row_limit=25"

curl -sS "https://vets-coin.com/transparency/audit-anomalies/diff/compare.csv?run_a=latest&run_b=latest" -o audit_anomaly_compare.csv

curl -sS "https://vets-coin.com/status.json"

curl -sS "https://vets-coin.com/api/public/system-status"

curl -sS "https://vets-coin.com/api/public/system-status/trend?limit=288"

curl -sS "https://vets-coin.com/api/public/system-status/uptime?hours=168"

curl -sS "https://vets-coin.com/api/public/system-status/incidents?hours=168&limit=30"

curl -sS "https://vets-coin.com/api/public/deprecations/header-simulator?id=sample_deprecation&endpoint=/api/public/system-status"

curl -sS "https://vets-coin.com/developers/migration-status.json?id=sample_deprecation"

Transparency Endpoint Map

Use this quick table to choose the right endpoint for your integration task.

Endpoint	Best For	Output
/transparency/audit-anomalies/summary.json	Dashboard headers, run health, latest compare deltas	JSON
/transparency/audit-anomalies/runs.json	Run selectors, sync loops, available-history discovery	JSON
/transparency/audit-anomalies/trend.json	Charts, run-over-run monitoring, alert trend baselines	JSON
/transparency/audit-anomalies/alerts.json	Threshold-based alert snapshots for automation and paging	JSON
/transparency/audit-anomalies/alerts.csv	Spreadsheet-friendly current alert posture and threshold context	CSV
/transparency/audit-anomalies/schema-registry.json	Versioned field definitions and deprecation timelines for anomaly JSON payloads	JSON
/developers/deprecations.json	Endpoint deprecation calendar with announce/sunset windows and migration pointers	JSON
/developers/deprecations.rss	RSS feed of API deprecation windows for subscriber-based reminder workflows	RSS
/developers/deprecations-playbook.md	Auto-generated migration playbook with per-endpoint operational checklists	Markdown
/developers/deprecations-playbook.json	Tooling-friendly migration playbook companion with structured checklist steps	JSON
/developers/api-errors.json	Error catalog with remediation notes and retry guidance by `error_code`	JSON
/status.json	Alias for latest system status payload (same shape as `/api/public/system-status`)	JSON
/api/public/system-status	Partner-facing uptime, monitor freshness, and cron automation status	JSON
/api/public/system-status/trend	Lightweight rolling history for uptime charts and alert trend baselines	JSON
/api/public/system-status/uptime	Windowed availability percentages and per-check degraded rates	JSON
/api/public/system-status/incidents	Resolved/active incident windows with duration and affected checks	JSON
/transparency/audit-anomalies/diff/export.json	Selected-vs-latest anomaly type analysis	JSON
/transparency/audit-anomalies/diff/compare.json	Arbitrary run-to-run reconciliation and drift checks	JSON
/transparency/audit-anomalies/diff/compare.csv	Spreadsheet workflows and manual audit packs	CSV

Integration Checklist

Poll runs.json every 5-15 minutes to detect newly available complete runs.
Use summary.json for top-level status and run-to-latest deltas.
Use trend.json for charting and threshold alerts; a reasonable default is to alert when the bad count increases run-over-run.
Use alerts.json as the paging signal endpoint when your thresholds are crossed.
Use schema-registry.json to pin field compatibility checks before parser updates.
Use /api/public/system-status as a lightweight heartbeat for partner automation and uptime probes.
Use /api/public/system-status/trend for simple uptime trend charts and incident postmortems.
Use /api/public/system-status/uptime for SLO-style percentages over 24h/7d/30d windows.
Use /api/public/system-status/incidents for machine-readable outage windows and postmortem timelines.
Use diff/compare.json for machine checks and diff/compare.csv for manual reconciliation packets.
Cache responses for at least 60 seconds; these are audit snapshots, not per-transaction streaming endpoints.
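The 60-second caching guidance can be sketched as a small file-backed wrapper around curl. Everything here is illustrative: the function names, the CACHE_DIR location, and the sidecar .ts timestamp file are assumptions, not part of the API. A poller for runs.json would call cached_get from a scheduler running every 5-15 minutes.

```shell
CACHE_DIR="${CACHE_DIR:-${TMPDIR:-/tmp}/vets_cache}"  # hypothetical cache location
CACHE_TTL="${CACHE_TTL:-60}"                           # seconds, per the checklist

# Map a URL to a file path by replacing unsafe characters with underscores.
cache_path() {
  printf '%s/%s\n' "$CACHE_DIR" "$(printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_')"
}

# Succeeds when FILE exists and its recorded fetch time is under CACHE_TTL seconds old.
cache_is_fresh() {
  local file=$1 now ts
  [ -f "$file" ] && [ -f "$file.ts" ] || return 1
  now=$(date +%s)
  ts=$(cat "$file.ts")
  [ $((now - ts)) -lt "$CACHE_TTL" ]
}

# cached_get URL: print the cached body, refetching only when the cache is stale.
cached_get() {
  local url=$1 file
  file=$(cache_path "$url")
  if ! cache_is_fresh "$file"; then
    mkdir -p "$CACHE_DIR"
    # Download to a temp file first so a failed fetch never clobbers good data.
    curl -sS --fail -o "$file.tmp" "$url" \
      && mv "$file.tmp" "$file" \
      && date +%s > "$file.ts" \
      || return 1
  fi
  cat "$file"
}
```

For example, cached_get "https://vets-coin.com/transparency/audit-anomalies/runs.json" hits the network at most once per CACHE_TTL seconds no matter how often it is called.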

What You Can Build