
Say Audio

Make the avatar speak from an uploaded audio file.

Endpoint

POST /v1/avatar-session/:sessionId/say-audio

Authentication

Header: Authorization: Bearer {token}

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| sessionId | string | Yes | Session ID from create-session |

Request

Content-Type: multipart/form-data

Form Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| audio | File | Yes | Audio file; must be MP3 at 44.1 kHz |
| turnId | string | Yes | Unique identifier for this speech turn |
| duration | string | Yes | Audio duration in seconds (e.g. "3.42") |
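Because `turnId` and `duration` are both sent as strings, it can help to validate and normalize them before building the form. The following is a minimal sketch, not part of the API: the helper name `buildSayAudioFields` is hypothetical, and rounding the duration to two decimal places is an assumption based on the "3.42" example above.

```javascript
// Hypothetical helper (not part of the API): validates and normalizes the
// text fields of a say-audio request before they are appended to FormData.
function buildSayAudioFields(turnId, durationSeconds) {
  if (typeof turnId !== 'string' || turnId.trim() === '') {
    throw new TypeError('turnId must be a non-empty string');
  }
  if (!Number.isFinite(durationSeconds) || durationSeconds <= 0) {
    throw new RangeError('duration must be a positive, finite number of seconds');
  }
  // The documented example sends the duration as a string such as "3.42".
  // Two decimal places is an assumption; the docs do not state a precision.
  return { turnId, duration: durationSeconds.toFixed(2) };
}
```

The returned object can be copied into the form with `formData.append('turnId', fields.turnId)` and `formData.append('duration', fields.duration)`, alongside the `audio` file.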

Response

Success (200)

{
"success": true
}

Error Responses

  • 400 - Bad request (invalid audio file)
  • 401 - Unauthorized (invalid token)
  • 404 - Not found (session doesn't exist)
  • 410 - Gone (session ended)
  • 500 - Server error
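The status codes above can be mapped to a small lookup so that calling code reacts consistently. This is a sketch under stated assumptions: the function name `describeSayAudioStatus` is hypothetical, and the `retryable` flag is an illustrative guess (only a 500 is treated as worth retrying), not documented API behavior.

```javascript
// Messages come from the error list above. The "retryable" flag is an
// assumption added for illustration; the API docs do not define retry rules.
const SAY_AUDIO_ERRORS = {
  400: { message: 'Bad request (invalid audio file)', retryable: false },
  401: { message: 'Unauthorized (invalid token)', retryable: false },
  404: { message: "Not found (session doesn't exist)", retryable: false },
  410: { message: 'Gone (session ended)', retryable: false },
  500: { message: 'Server error', retryable: true },
};

// Hypothetical helper: turns an HTTP status into a uniform result object.
function describeSayAudioStatus(status) {
  if (status === 200) {
    return { ok: true, message: 'Success', retryable: false };
  }
  const known = SAY_AUDIO_ERRORS[status];
  if (known) {
    return { ok: false, ...known };
  }
  // Any status not listed in the docs is reported as unexpected.
  return { ok: false, message: `Unexpected status ${status}`, retryable: false };
}
```

With `fetch`, the helper would typically be called as `describeSayAudioStatus(response.status)` once the request resolves.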

Examples

cURL

curl -X POST "https://api.avatar.us.kaltura.ai/v1/avatar-session/session-123/say-audio" \
  -H "Authorization: Bearer $TOKEN" \
  -F "audio=@speech.mp3" \
  -F "turnId=turn-456" \
  -F "duration=3.42"

JavaScript (FormData)

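// audioFile is a File or Blob holding the MP3; sessionId and token are
// assumed to be defined by the caller.
const formData = new FormData();
formData.append('audio', audioFile);
formData.append('turnId', 'turn-456');
formData.append('duration', '3.42');

// Do not set Content-Type manually; the browser adds the multipart boundary.
await fetch(`https://api.avatar.us.kaltura.ai/v1/avatar-session/${sessionId}/say-audio`, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${token}`,
  },
  body: formData,
});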

With File Input

<input type="file" id="audio-file" accept=".mp3,audio/mpeg" />
<button id="upload-btn">Upload Audio</button>

<script>
  // sessionId and token are assumed to be defined elsewhere on the page.

  // Decode the file with the Web Audio API to read its duration in seconds.
  async function getAudioDuration(file) {
    const arrayBuffer = await file.arrayBuffer();
    const audioCtx = new AudioContext();
    try {
      const decoded = await audioCtx.decodeAudioData(arrayBuffer);
      return decoded.duration;
    } finally {
      // Release the audio context even if decoding fails.
      await audioCtx.close();
    }
  }

  document.getElementById('upload-btn').addEventListener('click', async () => {
    const fileInput = document.getElementById('audio-file');
    const file = fileInput.files[0];
    if (!file) return; // No file selected yet.

    const duration = await getAudioDuration(file);
    const turnId = crypto.randomUUID();

    const formData = new FormData();
    formData.append('audio', file);
    formData.append('turnId', turnId);
    formData.append('duration', duration.toString());

    await fetch(`https://api.avatar.us.kaltura.ai/v1/avatar-session/${sessionId}/say-audio`, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
      },
      body: formData,
    });
  });
</script>

Usage Notes

Audio Requirements

  • Format: MP3 (required)
  • Sample rate: 44.1 kHz (required)

The backend expects 44.1 kHz MP3. Other sample rates or formats will result in degraded or incorrect playback.
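Since a wrong sample rate degrades playback rather than failing outright, a client-side check before upload may be worthwhile. The sketch below reads the sample rate from the first MPEG audio frame header in the file. The function name `readMp3SampleRate` is hypothetical, and the scan is deliberately simple: it skips a leading ID3v2 tag and accepts the first plausible Layer III header, so it can be fooled by unusual files and is not a full MP3 parser.

```javascript
// Sample rates in Hz, keyed by the 2-bit MPEG version field of the frame
// header and indexed by the 2-bit sample rate field.
const SAMPLE_RATES = {
  3: [44100, 48000, 32000], // MPEG-1
  2: [22050, 24000, 16000], // MPEG-2
  0: [11025, 12000, 8000],  // MPEG-2.5
};

// Hypothetical helper: returns the sample rate of the first MP3 frame found
// in `bytes` (a Uint8Array), or null if no plausible frame header is found.
function readMp3SampleRate(bytes) {
  let offset = 0;
  // Skip an ID3v2 tag if present: "ID3", version (2 bytes), flags (1 byte),
  // then a 4-byte syncsafe size (7 usable bits per byte).
  if (bytes.length >= 10 && bytes[0] === 0x49 && bytes[1] === 0x44 && bytes[2] === 0x33) {
    const size =
      ((bytes[6] & 0x7f) << 21) |
      ((bytes[7] & 0x7f) << 14) |
      ((bytes[8] & 0x7f) << 7) |
      (bytes[9] & 0x7f);
    offset = 10 + size;
  }
  for (let i = offset; i + 3 < bytes.length; i++) {
    // A frame header starts with 11 set bits (frame sync).
    if (bytes[i] !== 0xff || (bytes[i + 1] & 0xe0) !== 0xe0) continue;
    const version = (bytes[i + 1] >> 3) & 0x03;      // 1 is reserved
    const layer = (bytes[i + 1] >> 1) & 0x03;        // 1 means Layer III
    const bitrateIndex = (bytes[i + 2] >> 4) & 0x0f; // 15 is invalid
    const rateIndex = (bytes[i + 2] >> 2) & 0x03;    // 3 is reserved
    if (version === 1 || layer !== 1 || bitrateIndex === 0x0f || rateIndex === 3) continue;
    return SAMPLE_RATES[version][rateIndex];
  }
  return null;
}
```

In the browser, the input can be obtained with `new Uint8Array(await file.arrayBuffer())`, and the upload skipped when the result is not `44100`. Reading the header is used here instead of `decodeAudioData` because the decoded buffer is resampled to the `AudioContext` rate and so does not report the file's original sample rate.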

File Size Limits

Check with your Kaltura representative for file size limits.

Use Cases

  • Pre-recorded speech
  • Custom voice synthesis
  • Multi-lingual content
  • Professional voiceovers
