Conversation Flow

Stateless Conversation Completion

The stateless complete endpoint lets you run full conversation interactions without creating or managing persistent conversation records. This is ideal for one-off interactions, API integrations, or scenarios where you don't need to maintain conversation history in the platform.

Unlike the conversation-based complete endpoint, the stateless complete endpoint allows you to provide the entire conversation context in a single request, including all previous messages. This gives you complete control over conversation state and history management.

To use stateless conversation completion:

```http
POST /api/v1/conversation/complete
Content-Type: application/json

{
  "botId": "bot_abc123",
  "messages": [
    { "type": "user", "text": "What is your return policy?" }
  ]
}
```

The messages array must contain at least one message and represents the full conversation history that the AI will use as context.

Configuration Options

The stateless complete endpoint supports extensive configuration:

Bot Configuration:

  • botId: ID of an existing bot to use (optional if providing backstory)
  • backstory: Custom instructions for the AI (optional if using botId)
  • model: Specific AI model to use (overrides bot default)
  • datasetId: Dataset to provide as knowledge base
  • skillsetId: Skillset to provide abilities
  • privacy: Enable privacy mode for PII handling
  • moderation: Enable content moderation

Message History:

  • messages: Array of message objects with type and text (required)
    • Each message must specify type (user, bot, context, activity)
    • Each message must include text content
    • Messages are processed in order to build conversation context

Attachments:

  • attachments: Array of attachment URLs to include in the conversation
    • Each attachment must specify a url field
    • Attachments are processed and included as context

Contact Association:

  • contactId: Associate the interaction with a specific contact
    • Can be an existing contact ID string
    • Can be a contact object with fingerprint for trusted contact creation
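Putting several of these options together, a request might look like the following sketch. The field names come from the options above; the specific values (IDs, URLs, and flags) are placeholders for illustration.

```http
POST /api/v1/conversation/complete
Content-Type: application/json

{
  "backstory": "You are a concise support assistant",
  "datasetId": "dataset_abc123",
  "privacy": true,
  "moderation": true,
  "messages": [
    { "type": "context", "text": "Customer is on the premium plan" },
    { "type": "user", "text": "Can I change my shipping address?" }
  ],
  "attachments": [
    { "url": "https://example.com/invoice.pdf" }
  ],
  "contactId": "contact_abc123"
}
```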

Advanced Stateless Features

Function Calling:

Function calling enables the AI to invoke external functions during a conversation, allowing it to access real-time data, perform computations, or interact with external systems. The platform supports two primary patterns for providing function results: static data for predetermined responses, and channels for dynamic, real-time function execution.

Basic Function Definition:

Define functions using the standard JSON Schema format:

```http
POST /api/v1/conversation/complete
Content-Type: application/json

{
  "backstory": "You are a helpful assistant with access to weather data",
  "messages": [
    { "type": "user", "text": "What's the weather in Tokyo?" }
  ],
  "functions": [
    {
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string", "description": "City name" }
        },
        "required": ["location"]
      }
    }
  ]
}
```

When the AI determines that a function call is necessary, it will return a message indicating the function to call and its arguments.
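For example, when using the channel-based pattern described below, the streamed function call message might look like this (an illustrative sketch; the field names mirror those consumed in the complete channel-based example later in this section):

```json
{
  "type": "message",
  "data": {
    "type": "activity",
    "function": "get_weather",
    "args": { "location": "Tokyo" },
    "channel": "weather-channel-abc123xyz456"
  }
}
```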

Function Results: Static Data Pattern:

For functions with predetermined or mock data, you can provide static results directly in the function definition. This is ideal for testing, development, or scenarios where function outputs are predictable:

```http
POST /api/v1/conversation/complete
Content-Type: application/json

{
  "backstory": "You are a store assistant",
  "messages": [
    { "type": "user", "text": "What's the status of order #12345?" }
  ],
  "functions": [
    {
      "name": "get_order_status",
      "description": "Retrieve order status information",
      "parameters": {
        "type": "object",
        "properties": {
          "orderId": { "type": "string", "description": "The order ID to look up" }
        },
        "required": ["orderId"]
      },
      "result": {
        "data": {
          "status": "shipped",
          "tracking": "1Z999AA10123456784",
          "estimatedDelivery": "2025-11-25"
        }
      }
    }
  ]
}
```

With static results, the AI will automatically receive the predefined data when it calls the function, without requiring any external interaction. The conversation continues seamlessly with the function result incorporated into the AI's response.

Function Results: Channel-Based Pattern:

For dynamic function execution where results depend on real-time data or external systems, use the channel-based pattern. This approach enables true remote function calling where your application executes the function and publishes the result back to the AI:

```http
POST /api/v1/conversation/complete
Content-Type: application/json

{
  "backstory": "You are a helpful assistant with access to weather data",
  "messages": [
    { "type": "user", "text": "What's the weather in Tokyo?" }
  ],
  "functions": [
    {
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string", "description": "City name" }
        },
        "required": ["location"]
      },
      "result": {
        "channel": "weather-channel-abc123xyz456"
      }
    }
  ]
}
```

Channel-Based Function Calling Workflow:

  1. Define Function with Channel: Include a result.channel property in your function definition. The channel ID must be at least 16 characters long for security.

  2. AI Invokes Function: When the AI determines the function should be called, the streaming response will include a function call message with:

    • The function name (get_weather)
    • The function arguments ({ "location": "Tokyo" })
    • The channel name for publishing results

  3. Execute Function Locally: Your application receives the function call, executes the actual function with the provided arguments, and obtains the result.

  4. Publish Result to Channel: Send the function result back to the AI by publishing to the channel:

```http
POST /api/v1/channel/weather-channel-abc123xyz456/publish
Content-Type: application/json

{
  "message": "{\"temperature\": 18, \"conditions\": \"partly cloudy\", \"humidity\": 65}"
}
```

  5. AI Processes Result: The platform delivers the result to the AI in real-time, and the AI incorporates it into its response to the user.

Complete Channel-Based Example:

Here's a full example showing the request-response flow:

```javascript
// Step 1: Initial request with a channel-based function
const response = await fetch('/api/v1/conversation/complete', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    backstory: "You help users with weather information",
    messages: [{ type: "user", text: "What's the weather like in London?" }],
    functions: [{
      name: "get_weather",
      description: "Retrieve current weather data",
      parameters: {
        type: "object",
        properties: { location: { type: "string" } }
      },
      result: { channel: "my-weather-channel-abc123" }
    }]
  })
});

// Step 2: Process the streaming response
const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  const chunk = decoder.decode(value);
  const lines = chunk.split('\n').filter(line => line.trim());

  for (const line of lines) {
    const event = JSON.parse(line);

    // Step 3: Detect the function call
    if (event.type === 'message' && event.data.type === 'activity' && event.data.function) {
      const { function: fnName, args, channel } = event.data;

      // Step 4: Execute the function locally
      const weatherData = await fetchWeatherAPI(args.location);

      // Step 5: Publish the result to the channel, serialized to a JSON
      // string (the message field accepts a string)
      await fetch(`/api/v1/channel/${channel}/publish`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message: JSON.stringify(weatherData) })
      });
    }
  }
}
```

Important Channel Considerations:

  • Channel ID Length: Channel IDs must be at least 16 characters for security and collision avoidance. Use cryptographically random strings (see the sketch after this list).

  • Channel Uniqueness: Each channel ID should be unique to the conversation session. Don't reuse channel IDs across different interactions.

  • Real-Time Delivery: Messages are delivered in real-time to active subscribers. The AI must be actively waiting for the result when you publish.

  • No Message Persistence: Channels don't maintain message history. Results published before the AI is listening or after it has moved on will be lost.

  • Result Format: The message field in channel publish requests accepts a string. For structured data (like function results), serialize to JSON.

  • Timeout Handling: Implement timeout logic in your application. If a function takes too long, the conversation may time out (800-second maximum).
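
For example, one way to generate a channel ID that satisfies the length and randomness requirements is the standard Web Crypto API (a minimal sketch):

```javascript
// Generate a unique, cryptographically random channel ID.
// crypto.randomUUID() returns a 36-character UUID, comfortably above
// the 16-character minimum, and is available in browsers and Node.js.
const channelId = `weather-${crypto.randomUUID()}`;

// Use it in the function definition for this conversation only.
const weatherFunction = {
  name: "get_weather",
  description: "Get current weather for a location",
  parameters: {
    type: "object",
    properties: { location: { type: "string" } },
    required: ["location"]
  },
  result: { channel: channelId }
};
```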

Choosing Between Static and Channel-Based Results:

Use Static Results When:

  • Testing function calling implementations
  • Working with mock or sample data during development
  • Function outputs are predetermined and don't change
  • You want immediate responses without external function execution
  • Building demos or prototypes

Use Channel-Based Results When:

  • Function results depend on real-time external data (APIs, databases)
  • Function execution requires external system interaction
  • Results vary based on input parameters
  • You need to perform actual computations or business logic
  • Building production systems with dynamic data

Multiple Functions:

You can define multiple functions in a single request, mixing static and channel-based results:

```http
POST /api/v1/conversation/complete
Content-Type: application/json

{
  "backstory": "You are a helpful assistant",
  "messages": [{ "type": "user", "text": "Help me plan my day" }],
  "functions": [
    {
      "name": "get_weather",
      "description": "Get current weather",
      "parameters": { ... },
      "result": { "channel": "weather-channel-abc123" }
    },
    {
      "name": "get_calendar",
      "description": "Get calendar events",
      "parameters": { ... },
      "result": { "channel": "calendar-channel-def456" }
    },
    {
      "name": "get_time",
      "description": "Get current time",
      "parameters": { ... },
      "result": {
        "data": { "currentTime": "2025-11-20T10:30:00Z", "timezone": "UTC" }
      }
    }
  ]
}
```

The AI will intelligently choose which function(s) to call based on the user's query and will handle the results appropriately whether they're static or channel-based.

Extensions (Trusted Sessions Only):

For trusted API sessions, you can provide inline extensions:

  • extensions.backstory: Additional instructions for this specific interaction
  • extensions.datasets: Inline dataset records as arrays of text and metadata
  • extensions.skillsets: Inline skillset definitions with abilities
  • extensions.features: Enable specific features for this interaction
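
The exact shape of inline extensions depends on your session configuration; the sketch below is a hypothetical illustration (the record shape under datasets follows the "text and metadata" description above, but is an assumption):

```http
POST /api/v1/conversation/complete
Content-Type: application/json

{
  "botId": "bot_abc123",
  "messages": [{ "type": "user", "text": "What are your store hours?" }],
  "extensions": {
    "backstory": "For this interaction, only answer questions about the downtown location.",
    "datasets": [
      { "text": "Downtown store hours: 9am-6pm, Mon-Sat", "metadata": { "topic": "hours" } }
    ]
  }
}
```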

Multi-Turn Conversations

The stateless endpoint excels at multi-turn conversations where you manage state externally:

```http
POST /api/v1/conversation/complete
Content-Type: application/json

{
  "botId": "bot_abc123",
  "messages": [
    { "type": "user", "text": "I need help with my order" },
    { "type": "bot", "text": "I'd be happy to help! What's your order number?" },
    { "type": "user", "text": "It's #12345" }
  ]
}
```

In this example, you're providing the previous exchange as context, and the AI will generate a response considering the full conversation history. You would store this history in your own application and add the AI's response to it for the next turn.

Response Structure

The stateless complete endpoint returns streaming events similar to the stateful complete endpoint:

  • send_result: Confirmation of the user's message processing
  • Streaming tokens: The AI's response delivered incrementally
  • receive_result: The complete AI response with usage statistics
  • result: Final result with complete response and metadata
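
A minimal sketch of routing these events, assuming the same newline-delimited JSON framing as the function calling example above (the payload fields under event.data are illustrative assumptions):

```javascript
// Route streamed events by type as chunks arrive from the reader.
function handleChunk(chunk) {
  for (const line of chunk.split('\n').filter(l => l.trim())) {
    const event = JSON.parse(line);
    switch (event.type) {
      case 'send_result':    // the user's message was accepted for processing
        break;
      case 'receive_result': // complete AI response with usage statistics
        console.log('usage:', event.data?.usage);
        break;
      case 'result':         // final result with complete response and metadata
        console.log('final:', event.data);
        break;
      default:               // incremental events, e.g. streamed response tokens
        break;
    }
  }
}
```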

When to Use Stateless vs Stateful

Use Stateless Complete When:

  • You want complete control over conversation state and history
  • You're integrating with existing systems that maintain their own conversation storage
  • You need temporary or one-off AI interactions without persistence
  • You want to avoid managing conversation IDs and lifecycle
  • You have specific requirements for how conversation data is stored
  • You're building a custom conversation UI with your own state management

Use Stateful Complete When:

  • You want the platform to manage conversation history automatically
  • You need conversation persistence for later retrieval or analysis
  • You're building a traditional chat interface
  • You want to leverage platform features like conversation search and analytics
  • You need conversation metadata and timestamps managed automatically

Performance and Resource Considerations

The stateless complete endpoint:

  • Can handle conversations of any length, subject only to message array size and token limits
  • Processes all messages to build context each time (no cached state)
  • Uses streaming for real-time response delivery
  • Has the same 800-second maximum duration as stateful endpoints
  • Counts tokens and messages toward your usage limits

Integration Patterns

Pattern 1: External State Management

Store conversation history in your application's database and include it in each request. This works well for applications that already have sophisticated state management.
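
A minimal sketch of this pattern, keeping the history in an in-memory array (a real application would persist it in a database; readStreamedText is a hypothetical helper that collects the streamed response into final text):

```javascript
// External state management: the application owns the history array
// and replays it on every request.
const history = [];

async function ask(botId, userText) {
  history.push({ type: 'user', text: userText });

  const response = await fetch('/api/v1/conversation/complete', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ botId, messages: history })
  });

  // Hypothetical helper; see the streaming examples above for event handling.
  const botText = await readStreamedText(response);

  // Store the AI's reply so the next turn includes the full exchange.
  history.push({ type: 'bot', text: botText });
  return botText;
}
```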

Pattern 2: Temporary Interactions

Use stateless complete for one-time interactions where you don't need to maintain history, such as form validation, content generation, or quick queries.

Pattern 3: Hybrid Approach

Use stateless complete for testing and development, then transition to stateful conversations in production for better performance and built-in history management.

Best Practices:

  • Keep message arrays manageable - summarize or truncate very long conversations (see the truncation sketch after this list)
  • Include only the most recent and relevant messages for context
  • Use contact association to track interactions across multiple stateless requests
  • Implement proper error handling for streaming responses
  • Monitor token usage, as stateless calls can be more token-intensive than stateful ones
  • Consider rate limits when making frequent stateless requests
  • Cache bot configuration (botId, backstory, etc.) to avoid respecifying it on every request
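
For the first point, a simple truncation strategy keeps any leading context messages plus the most recent turns (a sketch; summarizing the dropped turns into a single context message is a possible refinement):

```javascript
// Bound the conversation context: retain context messages, plus the
// most recent user/bot turns.
function truncateHistory(messages, maxRecent = 20) {
  const context = messages.filter(m => m.type === 'context');
  const turns = messages.filter(m => m.type !== 'context');
  return [...context, ...turns.slice(-maxRecent)];
}
```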

Important Notes:

  • Stateless complete requests do not create conversation records in the platform
  • All context must be provided in each request's messages array
  • The AI has no memory between stateless requests unless you provide previous messages
  • Attachments are processed for each request and count toward usage limits
  • Contact tracking can link multiple stateless interactions to the same user

Execution Limits

When working with agentic conversations that involve function calling, tool usage, or iterative reasoning, you can control the execution bounds using the limits parameter. This allows you to prevent runaway conversations and manage resource consumption.

The limits object accepts three optional properties:

  • iterations: Maximum number of agentic iterations. Controls how many times the model can iterate through tool calls and responses. Each iteration represents a complete cycle of the model calling a tool and processing its result.

  • continuations: Maximum number of model continuations. Controls how many times the model can continue generating after reaching a stop condition. This is useful for limiting extended responses or chain-of-thought reasoning.

  • calls: Maximum number of function/tool calls. Controls how many total function calls can be made during the conversation, regardless of which iteration they occur in.

```http
POST /api/v1/conversation/complete
Content-Type: application/json

{
  "botId": "bot_abc123",
  "messages": [
    { "type": "user", "text": "Research the top 3 competitors and summarize their features" }
  ],
  "limits": {
    "iterations": 5,
    "calls": 10
  }
}
```

When a limit is reached, the conversation will stop and return a result with an end.reason of iteration. This allows your application to detect when processing was bounded and handle it appropriately.

Default Behavior:

If no limits are specified, the system uses sensible defaults that balance capability with resource protection. For most use cases, the defaults are sufficient, but complex agentic workflows may benefit from explicit limits.

Use Cases for Limits:

  • Preventing infinite loops in recursive tool usage
  • Controlling costs for complex multi-step operations
  • Ensuring predictable response times for user-facing applications
  • Limiting resource usage for untrusted or experimental conversations

Completion End Reasons

Every conversation completion returns an end object that explains why the completion finished. Understanding these reasons helps you build robust applications that handle different completion scenarios appropriately.

The end.reason field contains one of the following values:

  • stop: The model finished generating naturally, reaching a logical conclusion to its response. This is the most common and expected outcome for successful completions.

  • length: The response was truncated because it reached the maximum token limit. The response may be incomplete, and you might need to continue the conversation or increase token limits.

  • activity: The model invoked a function or tool during processing. When using static results, this is handled automatically. With channel-based integrations, your application must provide the result.

  • error: An error occurred during completion. Check the response for error details and handle accordingly.

  • iteration: The conversation reached one of the configured limits (iterations, continuations, or calls). The response contains whatever was generated before hitting the limit.

Example Response with End Reason:

{ "text": "Based on my research, here are the top features...", "usage": { "token": 1250 }, "end": { "reason": "stop" } }

json

Handling Different Reasons:

```javascript
const result = await client.conversation.complete({
  botId: "bot_abc123",
  messages: [{ type: "user", text: "Help me with my order" }]
});

switch (result.end.reason) {
  case "stop":
    // Normal completion, display the response
    displayMessage(result.text);
    break;
  case "length":
    // Response was truncated, might want to continue
    displayMessage(result.text);
    showWarning("Response was truncated due to length");
    break;
  case "iteration":
    // Hit execution limits, show the partial result
    displayMessage(result.text);
    showWarning("Processing was limited to prevent excessive usage");
    break;
  case "error":
    // Something went wrong
    showError("An error occurred during processing");
    break;
}
```

Best Practices:

  • Always check end.reason when processing responses programmatically
  • Handle length truncation gracefully, especially for long-form content
  • Monitor iteration occurrences to tune your limits appropriately
  • Log error reasons for debugging and monitoring
  • Use stop as the expected success case in your application logic