Say Text

Make the avatar speak the provided text using text-to-speech.

Endpoint

POST /v1/avatar-session/:sessionId/say-text

Authentication

Header: Authorization: Bearer {token}

Path Parameters

Parameter	Type	Required	Description
`sessionId`	string	Yes	Session ID from create-session

Request Body

{
  "text": "string",
  "turnId": "string",
  "isFinal": boolean
}

Parameters

Field	Type	Required	Description
`text`	string	Yes	Text for the avatar to speak
`turnId`	string	Yes	Unique identifier for this speech turn (e.g., `turn-123`, `turn-${Date.now()}-${random}`)
`isFinal`	boolean	Yes	Whether this is the final chunk (true = complete text, false = more coming)
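
Because all three fields are required, it can help to validate them on the client before sending. The following is a minimal sketch, not part of the Kaltura API: `buildSayTextBody` is a hypothetical helper name, and rejecting empty `text` is an assumption based on the 400 "Text is required" response.

```javascript
// Hypothetical helper: checks the three required fields and returns the JSON body.
function buildSayTextBody(text, turnId, isFinal) {
  // Assumption: an empty string is treated like missing text.
  if (typeof text !== 'string' || text.length === 0) {
    throw new TypeError('text must be a non-empty string');
  }
  if (typeof turnId !== 'string' || turnId.length === 0) {
    throw new TypeError('turnId must be a non-empty string');
  }
  // Reject truthy/falsy stand-ins such as "true" or 1.
  if (typeof isFinal !== 'boolean') {
    throw new TypeError('isFinal must be a boolean');
  }
  return JSON.stringify({ text, turnId, isFinal });
}

console.log(buildSayTextBody('Hello from Kaltura Avatar!', 'turn-123', true));
```

The returned string can be passed directly as the `body` of a `fetch` call.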

Response

Success (200)

{
  "success": true
}

Error Responses

400 Bad Request

{
  "success": false,
  "error": "Text is required"
}

401 Unauthorized

{
  "success": false,
  "error": "Invalid or expired token"
}

404 Not Found

{
  "success": false,
  "error": "Session not found"
}

410 Gone

{
  "success": false,
  "error": "Session ended"
}
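
One way to act on these responses is to map each status code to a recovery step. The sketch below is illustrative only: `classifySayTextError` and the action names are hypothetical, and the suggested recovery for each status is an assumption rather than documented behavior.

```javascript
// Assumed recovery step per documented status code (not prescribed by the API).
const SAY_TEXT_ACTIONS = {
  400: 'fix-request',        // e.g. "Text is required"
  401: 'refresh-token',      // "Invalid or expired token"
  404: 'check-session-id',   // "Session not found"
  410: 'create-new-session', // "Session ended"
};

// Hypothetical helper: combines the HTTP status with the parsed error body.
function classifySayTextError(status, body) {
  return {
    status,
    message: (body && body.error) || 'Unknown error',
    action: SAY_TEXT_ACTIONS[status] || 'unknown',
  };
}

console.log(classifySayTextError(410, { success: false, error: 'Session ended' }));
```

Statuses that are not listed above fall through to `'unknown'`, so callers can decide separately how to treat them.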

Examples

cURL

curl -X POST https://api.avatar.us.kaltura.ai/v1/avatar-session/session-123/say-text \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello from Kaltura Avatar!", "turnId": "turn-123", "isFinal": true}'

JavaScript (fetch) - Complete Text

const turnId = `turn-${Date.now()}-${Math.random().toString(36).slice(2, 11)}`;

await fetch(`https://api.avatar.us.kaltura.ai/v1/avatar-session/${sessionId}/say-text`, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${token}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    text: 'Hello from Kaltura Avatar!',
    turnId: turnId,
    isFinal: true,
  }),
});

With turnId and isFinal

await fetch(`https://api.avatar.us.kaltura.ai/v1/avatar-session/${sessionId}/say-text`, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${token}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    text: 'This is the final message.',
    turnId: 'turn-456',
    isFinal: true,
  }),
});

Usage Notes

Required Parameters Explained

turnId (String, Required)

Unique identifier for this speech turn
Used to track which audio corresponds to which text
Example formats: turn-123, turn-${Date.now()}-${random}
Same turnId should be used for all chunks of a single turn
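
A small generator keeps turn IDs unique within one client process. This is a sketch under assumptions: `createTurnId` is a hypothetical helper, and the extra counter segment is an addition to the `turn-${Date.now()}-${random}` format shown above, included so that two IDs created in the same millisecond still differ.

```javascript
// Per-process counter; guarantees uniqueness even within one millisecond.
let turnCounter = 0;

// Hypothetical helper: returns e.g. "turn-<timestamp>-<counter>-<random>".
function createTurnId(prefix = 'turn') {
  turnCounter += 1;
  const random = Math.random().toString(36).slice(2, 11);
  return `${prefix}-${Date.now()}-${turnCounter}-${random}`;
}

// Generate one ID per turn and reuse it for every chunk of that turn.
const turnId = createTurnId();
console.log(turnId);
```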

isFinal (Boolean, Required)

true: Text is complete; avatar starts speaking immediately
false: Text is incomplete (more chunks coming); avatar waits
Useful for LLM streaming scenarios where text arrives gradually

Text Constraints

Maximum length varies by plan (check with Kaltura)
Special characters are supported
Multiple languages are supported
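
If a message may exceed the limit for your plan, one option is to split it at sentence boundaries and send the pieces as chunks of a single turn. The sketch below assumes the limit applies per request and is counted in characters; `splitText` is a hypothetical helper and `maxLength` must come from your own plan details.

```javascript
// Hypothetical helper: splits text into chunks of at most maxLength characters,
// preferring sentence boundaries and hard-splitting only oversized sentences.
function splitText(text, maxLength) {
  if (!Number.isInteger(maxLength) || maxLength < 1) {
    throw new RangeError('maxLength must be a positive integer');
  }
  // Sentences ending in . ! or ? (with trailing whitespace), plus any remainder.
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) || [];
  const chunks = [];
  let current = '';
  for (const sentence of sentences) {
    // Start a new chunk if this sentence would overflow the current one.
    if (current.length > 0 && current.length + sentence.length > maxLength) {
      chunks.push(current);
      current = '';
    }
    current += sentence;
    // A single sentence longer than the limit is cut at the limit.
    while (current.length > maxLength) {
      chunks.push(current.slice(0, maxLength));
      current = current.slice(maxLength);
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

console.log(splitText('Hello there. How are you? Fine!', 15));
```

Each resulting chunk can then be sent with the same turnId, setting isFinal to true only on the last one.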

Turn Sequences (LLM Streaming)

Use turnId and isFinal for streaming text:

// First part of turn
await sayText({
  text: 'Hello, how are you?',
  turnId: 'turn-1',
  isFinal: false,
});

// Continue turn
await sayText({
  text: 'I hope you are doing well.',
  turnId: 'turn-1',
  isFinal: false,
});

// Final part of turn
await sayText({
  text: 'Let me know if you need anything.',
  turnId: 'turn-1',
  isFinal: true,
});
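
The snippets above call a sayText helper that this page does not define. A minimal sketch of one possible definition follows; `createSayText` and `buildSayTextRequest` are hypothetical names, the base URL is taken from the examples on this page, and a global `fetch` (browsers, Node 18+) is assumed.

```javascript
const BASE_URL = 'https://api.avatar.us.kaltura.ai/v1/avatar-session';

// Hypothetical helper: builds the URL and fetch options for one chunk.
function buildSayTextRequest(sessionId, token, chunk) {
  return {
    url: `${BASE_URL}/${encodeURIComponent(sessionId)}/say-text`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(chunk), // { text, turnId, isFinal }
    },
  };
}

// Hypothetical factory: binds a session and token, returns sayText(chunk).
function createSayText(sessionId, token) {
  return async function sayText(chunk) {
    const { url, options } = buildSayTextRequest(sessionId, token, chunk);
    const response = await fetch(url, options);
    if (!response.ok) {
      // Fall back to the status code if the error body is not JSON.
      const payload = await response.json().catch(() => ({}));
      throw new Error(payload.error || `say-text failed with HTTP ${response.status}`);
    }
    return response.json();
  };
}

// Usage: const sayText = createSayText('session-123', token);
```

Awaiting each call before issuing the next, as the sequence above does, keeps the chunks of a turn in submission order.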

Text Processing

The avatar will:

Convert text to speech using the configured voice
Generate lip-sync animation
Stream the result via WebRTC

Queue Behavior

Text messages are queued and played in order. To interrupt, use the interrupt endpoint.

Complete Example

Simple Text (Complete)

async function makeAvatarSpeak(sessionId, token, text) {
  try {
    const turnId = `turn-${Date.now()}-${Math.random().toString(36).slice(2, 11)}`;

    const response = await fetch(`https://api.avatar.us.kaltura.ai/v1/avatar-session/${sessionId}/say-text`, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        text,
        turnId,
        isFinal: true,  // Complete message
      }),
    });

    if (!response.ok) {
      const error = await response.json();
      throw new Error(error.error);
    }

    console.log('Avatar speaking:', text);
  } catch (error) {
    console.error('Failed to make avatar speak:', error);
  }
}

// Usage
await makeAvatarSpeak('session-123', 'eyJhbGciOiJIUzI1...', 'Hello from Kaltura Avatar!');

Streaming Text (LLM Streaming)

async function makeAvatarSpeakStreaming(sessionId, token, textChunks) {
  const turnId = `turn-${Date.now()}-${Math.random().toString(36).slice(2, 11)}`;

  try {
    for (let i = 0; i < textChunks.length; i++) {
      const isLastChunk = i === textChunks.length - 1;

      const response = await fetch(`https://api.avatar.us.kaltura.ai/v1/avatar-session/${sessionId}/say-text`, {
        method: 'POST',
        headers: {
          Authorization: `Bearer ${token}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          text: textChunks[i],
          turnId,  // Same for all chunks
          isFinal: isLastChunk,  // true only on last chunk
        }),
      });

      if (!response.ok) {
        const error = await response.json();
        throw new Error(error.error);
      }
    }
    console.log('Avatar completed streaming');
  } catch (error) {
    console.error('Failed to stream text:', error);
  }
}

// Usage with streaming chunks
const chunks = ['Hello, ', 'this is ', 'streaming ', 'text.'];
await makeAvatarSpeakStreaming('session-123', 'eyJhbGciOiJIUzI1...', chunks);

Next Steps

Say Audio - Use audio files instead
Interrupt - Stop the avatar mid-speech
End Session - Clean up when done
