Best Practices

Follow these best practices to build reliable, low-latency integrations with the AgentPhone API.

Voice latency

Use streaming (NDJSON) responses for voice webhooks

Streaming is the single biggest improvement you can make to perceived voice latency. Instead of making the caller wait in silence for your full response, send an interim chunk immediately and let TTS start speaking while your server continues processing.

  • Always use NDJSON streaming for voice webhooks. Return Content-Type: application/x-ndjson so TTS can start on the first chunk.
  • Send an interim filler chunk like “Let me check on that” before calling your LLM or external API. The caller hears natural speech instead of silence.
  • Stream LLM tokens as they arrive rather than waiting for the complete response. Each NDJSON line with "interim": true is spoken immediately.
  • Keep webhook response time under 5 seconds. If processing takes longer, stream an interim acknowledgement within the first second.
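The handler side of this pattern can be sketched as a generator that emits one NDJSON line per chunk. Here, chunks stands in for tokens arriving from your LLM or external API (a hypothetical source, not part of the AgentPhone API), and the sketch assumes the final line simply omits the interim flag — see the Calls guide for the exact response format:

```python
import json

def ndjson_stream(chunks):
    """Yield NDJSON lines for a voice webhook response.

    `chunks` is a placeholder for tokens streaming from your LLM or
    an external API call.
    """
    # Interim filler goes out first so TTS starts speaking immediately.
    yield json.dumps({"text": "Let me check on that.", "interim": True}) + "\n"
    parts = []
    for chunk in chunks:
        parts.append(chunk)
        # Each line with "interim": true is spoken as soon as it arrives.
        yield json.dumps({"text": chunk, "interim": True}) + "\n"
    # The final line carries the complete answer.
    yield json.dumps({"text": "".join(parts)}) + "\n"
```

Wire the generator into your framework's streaming response and set Content-Type: application/x-ndjson on it.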

See the Calls guide for the full streaming response format and handler examples.

Security

  • Never expose API keys in client-side code or public repositories. Use environment variables.
  • Always verify webhook signatures using the X-Webhook-Signature and X-Webhook-Timestamp headers to ensure requests are from AgentPhone.
  • Check the timestamp — reject requests older than 5 minutes to prevent replay attacks.
  • Use HTTPS for all webhook endpoints in production.
  • Rotate API keys periodically and revoke unused keys.
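A minimal verification sketch, assuming the signature is an HMAC-SHA256 hex digest of "<timestamp>.<body>" — confirm the exact signing scheme in Webhooks > Security before relying on this:

```python
import hashlib
import hmac
import time

def verify_webhook(secret, signature, timestamp, body, max_age=300):
    """Return True only for a fresh, correctly signed request.

    `signature` and `timestamp` come from the X-Webhook-Signature and
    X-Webhook-Timestamp headers; the HMAC construction here is an
    assumption about the signing scheme.
    """
    # Reject requests older than 5 minutes to prevent replay attacks.
    if abs(time.time() - int(timestamp)) > max_age:
        return False
    expected = hmac.new(
        secret.encode(), f"{timestamp}.{body}".encode(), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking the digest via timing.
    return hmac.compare_digest(expected, signature)
```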

See Webhooks > Security for verification code examples.

Webhook reliability

  • Return 200 OK quickly — process webhooks asynchronously if your handler needs to do heavy work. Slow responses trigger retries.
  • Handle duplicate deliveries — use the X-Webhook-ID header for idempotency. Retries can cause the same event to be delivered multiple times.
  • Expect automatic retries — failed deliveries are retried by AgentPhone (up to 6 attempts over ~24 hours), so make your handler safe to run more than once and handle transient failures gracefully on your end.
  • Log all webhook deliveries for debugging and audit purposes. Use GET /v1/webhooks/deliveries to monitor delivery status.
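The ack-fast-and-dedupe pattern can be sketched as follows; the in-memory set stands in for a durable store (Redis or a database) in production:

```python
processed_ids = set()  # use Redis or a database table in production

def handle_delivery(webhook_id, event, process):
    """Skip duplicate deliveries keyed by the X-Webhook-ID header.

    `process` is your (possibly slow) event handler; in a real server
    you would enqueue it and return 200 OK immediately so retries
    aren't triggered by slow processing.
    """
    if webhook_id in processed_ids:
        return "duplicate"   # a retry of an already-handled delivery
    processed_ids.add(webhook_id)
    process(event)           # hand off to a background queue in production
    return "processed"
```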

Pagination

Most list endpoints support offset-based pagination with limit and offset parameters. Message endpoints use cursor-based pagination with before and after timestamps.

Offset-based (agents, numbers, conversations, calls)

GET /v1/numbers?limit=20&offset=0
Parameter   Description
limit       Number of results per page (default: 20, max: 100)
offset      Number of results to skip (default: 0)
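A sketch of walking an offset-paginated endpoint such as GET /v1/numbers, using the hasMore flag from the standard list response; fetch_page is a placeholder for your HTTP client:

```python
def fetch_all(fetch_page, limit=100):
    """Collect every item from an offset-paginated list endpoint.

    `fetch_page(limit, offset)` is a stand-in for your HTTP call and
    must return the documented {"data": [...], "hasMore": bool} shape.
    """
    items, offset = [], 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        items.extend(page["data"])
        if not page["hasMore"]:   # rely on hasMore, not the page length
            return items
        offset += limit
```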

Cursor-based (messages)

GET /v1/numbers/:id/messages?limit=50&before=2025-01-15T12:00:00Z
Parameter   Description
limit       Number of results per page (default: 50, max: 200)
before      ISO 8601 timestamp — get messages before this time
after       ISO 8601 timestamp — get messages after this time

Response format

{
  "data": [...],
  "hasMore": true,
  "total": 42
}

Always check hasMore before fetching the next page rather than assuming a short page means the end. Prefer cursor-based pagination for time-ordered data: deep offsets get slower as the collection grows, while cursor queries stay efficient.
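Putting the response format and the before cursor together, message history can be paged backwards like this; fetch_page again stands in for your HTTP client, and the createdAt field name on each message is an assumption for illustration:

```python
def fetch_history(fetch_page, limit=200):
    """Page backwards through messages with the `before` cursor.

    `fetch_page(limit, before)` is a placeholder for a call to
    GET /v1/numbers/:id/messages; each returned message is assumed
    to carry a `createdAt` ISO 8601 timestamp.
    """
    messages, before = [], None
    while True:
        page = fetch_page(limit=limit, before=before)
        messages.extend(page["data"])
        if not page["hasMore"]:
            return messages
        # The oldest message in this page becomes the next cursor.
        before = page["data"][-1]["createdAt"]
```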

Performance

  • Use webhooks instead of polling for real-time updates.
  • Use cursor-based pagination (before/after) for loading message history.
  • Cache frequently accessed data like phone number lists.
  • Respect rate limits — implement exponential backoff for retries. Check the Retry-After header on 429 responses.
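The backoff bullet can be sketched like this, assuming a response object exposing a status code and headers (adapt the attribute names to your HTTP library):

```python
import random
import time

def request_with_backoff(send, max_attempts=5):
    """Retry a rate-limited call with exponential backoff.

    `send()` is a placeholder for your HTTP call; the response is
    assumed to expose `.status` and a `.headers` mapping.
    """
    for attempt in range(max_attempts):
        resp = send()
        if resp.status != 429:
            return resp
        # Prefer the server's Retry-After hint; otherwise back off
        # exponentially with jitter to avoid synchronized retries.
        retry_after = resp.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else (2 ** attempt) + random.random()
        time.sleep(delay)
    raise RuntimeError("still rate limited after retries")
```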

Error handling

  • Always check response status before processing data.
  • Implement retry logic for transient errors (429, 500, 502).
  • Log errors with context for debugging.
  • Handle validation errors gracefully — check the details array for field-level errors.
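Extracting field-level errors can look like the sketch below; the nesting of the details array under an error object is an assumption here, so check the Error Handling page for the exact response shape:

```python
def summarize_errors(response_body):
    """Map field names to messages from a validation error response.

    Assumes an error body shaped like
    {"error": {"message": ..., "details": [{"field": ..., "message": ...}]}}
    (an assumption -- verify against the Error Handling page).
    """
    details = response_body.get("error", {}).get("details", [])
    return {d["field"]: d["message"] for d in details}
```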

See the Error Handling page for the full error response format and code examples.