Production Deployment

Best practices for deploying Kaltura Avatar in production.

Security

1. API Key Protection

Never Expose API Keys

Never include your Kaltura Session (KS) in client-side code, where any user can read it. Keep the KS on a backend server and have the browser call your backend instead.

Backend only:

// ✅ Good - Backend (Node.js)
const KS = process.env.AVATAR_KS;

// ❌ Bad - Frontend
const KS = 'abc123'; // Exposed to users!

Environment variables:

# .env (never commit this file)
AVATAR_KS=your-secret-key
AVATAR_BASE_URL=https://api.avatar.us.kaltura.ai/v1/avatar-session

.gitignore:

.env
.env.local
.env.production
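
A backend that reads its secrets from the environment can check them once at startup, so a missing variable fails loudly instead of surfacing later as a confusing API error. The following is a minimal sketch under that assumption; `loadConfig` is a hypothetical helper, not part of any Kaltura SDK, and it only knows about the two variables shown in the `.env` example above.

```javascript
// Hypothetical helper: read required settings once and fail fast if any is missing.
function loadConfig(env = process.env) {
  const required = ['AVATAR_KS', 'AVATAR_BASE_URL'];
  const missing = required.filter((name) => !env[name]);

  if (missing.length > 0) {
    // Report only the variable names, never their values.
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }

  // Freeze the result so the secret cannot be reassigned by accident.
  return Object.freeze({
    ks: env.AVATAR_KS,
    baseUrl: env.AVATAR_BASE_URL,
  });
}
```

Calling `loadConfig()` at the top of the server entry point keeps the check in one place; passing a plain object instead of `process.env` makes it easy to exercise in tests.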

2. Use HTTPS

Always serve your application, and call the Avatar API, over HTTPS:

// ✅ Good
const baseUrl = 'https://api.avatar.example.com';

// ❌ Bad
const baseUrl = 'http://api.avatar.example.com';
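
One way to enforce the rule above is to reject any non-HTTPS base URL before it is ever used. This is a small sketch, assuming the base URL arrives as a string from configuration; `assertHttps` is a hypothetical name, and the check relies only on the standard `URL` class.

```javascript
// Hypothetical guard: accept only well-formed https:// URLs.
function assertHttps(rawUrl) {
  const url = new URL(rawUrl); // throws a TypeError on malformed input

  if (url.protocol !== 'https:') {
    throw new Error(`Refusing non-HTTPS URL: ${url.origin}`);
  }

  return url.href;
}
```

For example, `assertHttps(process.env.AVATAR_BASE_URL)` could run once at startup so a misconfigured `http://` endpoint never reaches request code.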

3. Token Security

Tokens are session-specific
Don't store tokens long-term
Clear tokens when session ends
Don't share tokens between users

// ✅ Good - Clear after use
const { sessionId, token } = await createSession();
// ... use session ...
await endSession();
sessionStorage.removeItem('token');

// ❌ Bad - Storing indefinitely
localStorage.setItem('token', token);
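
If the token does not need to survive a page reload, it can live in a variable rather than in browser storage at all. The sketch below illustrates that idea; `createTokenHolder` is a hypothetical helper and is not part of the client SDK.

```javascript
// Hypothetical in-memory holder: the token disappears with the page
// and is cleared explicitly when the session ends.
function createTokenHolder() {
  let token = null;

  return {
    set(value) {
      token = value;
    },
    get() {
      if (token === null) {
        throw new Error('No active session token');
      }
      return token;
    },
    clear() {
      token = null;
    },
  };
}
```

A caller would `set` the token after session creation and `clear` it right after ending the session, which covers the "don't store long-term" and "clear when the session ends" points in one place.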

4. Input Validation

Validate all user input:

// Backend validation
app.post('/api/avatar/say-text', async (req, res) => {
  const { text } = req.body;

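  // Validate the type first, then trim so whitespace-only input is rejected
  if (typeof text !== 'string') {
    return res.status(400).json({ error: 'Invalid text' });
  }

  const cleanText = text.trim();

  if (cleanText.length === 0) {
    return res.status(400).json({ error: 'Invalid text' });
  }

  if (cleanText.length > 1000) {
    return res.status(400).json({ error: 'Text too long' });
  }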

  // Call API
  // ...
});

5. Rate Limiting

Implement rate limiting to prevent abuse:

import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // Limit each IP to 100 requests per windowMs
  message: 'Too many requests, please try again later',
});

app.use('/api/avatar/', limiter);

Architecture

Recommended: Backend + Frontend

┌─────────────┐      ┌─────────────┐      ┌──────────────┐
│   Browser   │◄────►│   Backend   │◄────►│ Avatar API   │
│             │      │             │      │              │
│ Client SDK  │      │ Express/    │      │ (Kaltura)    │
│ (Display)   │      │ FastAPI     │      │              │
└─────────────┘      └─────────────┘      └──────────────┘

Benefits:

Kaltura Session stays secure
Server-side validation
Audit logging
Rate limiting
User authentication

Backend Structure

// server.js
import express from 'express';
import { avatarRoutes } from './routes/avatar.js';
import { authMiddleware } from './middleware/auth.js';
import { errorHandler } from './middleware/error.js';

const app = express();

// Middleware
app.use(express.json());
app.use(authMiddleware);

// Routes
app.use('/api/avatar', avatarRoutes);

// Error handling
app.use(errorHandler);

// Start server
const PORT = process.env.PORT || 3001;
app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});

Performance

1. Session Management

Use Redis for session storage:

import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

// Store session
await redis.setex(
  `avatar:${sessionId}`,
  3600, // 1 hour TTL
  JSON.stringify({ token, createdAt: Date.now() })
);

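// Retrieve session (get() resolves to null once the key is missing or expired)
const raw = await redis.get(`avatar:${sessionId}`);
const sessionData = raw ? JSON.parse(raw) : null;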
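
The store and retrieve calls above can be wrapped in a small module so the key format, TTL, and JSON handling live in one place. This is a sketch, assuming a client that exposes `setex`, `get`, and `del` as ioredis does; `createSessionStore` and its method names are hypothetical.

```javascript
const SESSION_TTL_SECONDS = 3600; // 1 hour, matching the example above

// Hypothetical wrapper around a Redis-like client (setex / get / del).
function createSessionStore(redis) {
  const key = (sessionId) => `avatar:${sessionId}`;

  return {
    async save(sessionId, token) {
      const payload = JSON.stringify({ token, createdAt: Date.now() });
      await redis.setex(key(sessionId), SESSION_TTL_SECONDS, payload);
    },

    // Resolves to the parsed session, or null if it is missing or expired.
    async load(sessionId) {
      const raw = await redis.get(key(sessionId));
      return raw === null ? null : JSON.parse(raw);
    },

    // Delete explicitly when the session ends instead of waiting for the TTL.
    async remove(sessionId) {
      await redis.del(key(sessionId));
    },
  };
}
```

Passing the client in as an argument keeps the wrapper testable with an in-memory fake and avoids a module-level connection.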

2. Connection Pooling

Reuse HTTP connections:

import fetch from 'node-fetch';
import http from 'http';
import https from 'https';

const httpAgent = new http.Agent({ keepAlive: true });
const httpsAgent = new https.Agent({ keepAlive: true });

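// `url` is the Avatar API endpoint being called. The `agent` option is a
// node-fetch feature; Node's built-in fetch does not accept it.
const response = await fetch(url, {
  agent: url.startsWith('https') ? httpsAgent : httpAgent,
});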

3. Caching

Cache avatar and voice configuration that rarely changes:

const cache = new Map();

async function getAvatarConfig(avatarId) {
  if (cache.has(avatarId)) {
    return cache.get(avatarId);
  }

  const config = await fetchAvatarConfig(avatarId);
  cache.set(avatarId, config);

  return config;
}
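
A plain `Map` never expires its entries, so a long-running process would keep serving a configuration even after it changes upstream. One possible refinement is a time-to-live on each entry. The sketch below is an illustration only: `createTtlCache` and `getOrLoad` are hypothetical names, and the clock is injectable purely to make the expiry logic testable.

```javascript
// Hypothetical TTL cache: entries are reloaded once they are older than ttlMs.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();

  return {
    async getOrLoad(key, loader) {
      const hit = entries.get(key);
      if (hit && hit.expiresAt > now()) {
        return hit.value; // still fresh
      }

      const value = await loader(key);
      entries.set(key, { value, expiresAt: now() + ttlMs });
      return value;
    },

    invalidate(key) {
      entries.delete(key);
    },
  };
}
```

With this in place, the lookup above could become `cache.getOrLoad(avatarId, fetchAvatarConfig)`. Note that an in-process cache is per instance, so with several backend instances each one holds its own copy.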

4. Load Balancing

Use load balancers for high traffic:

# nginx.conf
upstream backend {
    server backend1:3001;
    server backend2:3001;
    server backend3:3001;
}

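server {
    # In production, terminate TLS here (listen 443 ssl) and redirect port 80 to HTTPS
    listen 80;
    location /api/ {
        proxy_pass http://backend;
        # Forward the original client details to the backend
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}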

Monitoring

1. Logging

Log all avatar operations:

import winston from 'winston';

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.File({ filename: 'error.log', level: 'error' }),
    new winston.transports.File({ filename: 'combined.log' }),
  ],
});

// Log session creation
logger.info('Session created', {
  sessionId,
  avatarId,
  userId,
  timestamp: new Date(),
});
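
Logging every operation makes it easy to leak a token or KS into log files by accident. A possible safeguard is to redact known sensitive fields before the metadata reaches the logger. This sketch assumes flat metadata objects like the one above; `redact` and the key list are hypothetical and would need adjusting to the fields your code actually logs.

```javascript
// Hypothetical list of metadata keys that must never be written to logs.
const SENSITIVE_KEYS = new Set(['token', 'ks', 'authorization']);

// Returns a shallow copy of `meta` with sensitive values masked.
function redact(meta) {
  const safe = {};
  for (const [key, value] of Object.entries(meta)) {
    safe[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? '[REDACTED]' : value;
  }
  return safe;
}
```

Usage would look like `logger.info('Session created', redact({ sessionId, token, userId }))`. The copy is shallow, so nested objects are passed through unchanged.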

2. Error Tracking

Use error tracking services:

import * as Sentry from '@sentry/node';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  environment: process.env.NODE_ENV,
});

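// Catch errors: register after all routes and before any other error middleware.
// Sentry.Handlers is the v7 SDK API; v8 replaces it with Sentry.setupExpressErrorHandler(app).
app.use(Sentry.Handlers.errorHandler());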

3. Metrics

Track key metrics:

import { Counter, Histogram } from 'prom-client';

const sessionCreations = new Counter({
  name: 'avatar_sessions_created_total',
  help: 'Total avatar sessions created',
});

const sessionDuration = new Histogram({
  name: 'avatar_session_duration_seconds',
  help: 'Avatar session duration',
});

// Increment
sessionCreations.inc();

// Record duration
const end = sessionDuration.startTimer();
// ... session lifetime ...
end();

Scaling

Horizontal Scaling

Run multiple backend instances:

# PM2 cluster mode
pm2 start server.js -i max

Stateless Design

Keep servers stateless:

Store sessions in Redis
Use JWT for authentication
Don't store state in memory

CDN

Serve static assets via CDN:

Frontend bundle
Images and assets
Not API endpoints (keep these on your backend)

Error Handling

Graceful Degradation

try {
  await session.createSession({ ... });
} catch (error) {
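  // Log error (Error objects serialize to {} in JSON, so log the message and stack explicitly)
  logger.error('Session creation failed', { error: error.message, stack: error.stack, userId });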

  // Show user-friendly message
  showError('Avatar unavailable. Please try again later.');

  // Fallback: Show static image or text chat
  enableFallbackMode();
}

Retry Logic

async function createSessionWithRetry(maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await createSession();
    } catch (error) {
      if (attempt === maxRetries) throw error;

      // Wait with exponential backoff: 2 s after the first failure, 4 s after the second
      await new Promise((r) => setTimeout(r, 1000 * Math.pow(2, attempt)));
    }
  }
}
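
When many clients fail at the same moment, fixed backoff delays make them all retry at the same moment too. A common variation adds random jitter and skips retries for errors that will not succeed on a second attempt. The following is a sketch of that variation rather than a drop-in replacement; `retryWithBackoff` and its option names are hypothetical, and `sleep` is injectable only so the delays can be observed in tests.

```javascript
// Hypothetical generic retry helper with exponential backoff and full jitter.
async function retryWithBackoff(operation, options = {}) {
  const {
    maxRetries = 3,
    baseDelayMs = 1000,
    shouldRetry = () => true, // e.g. return false for 4xx-style errors
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let attempt = 1; ; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      if (attempt >= maxRetries || !shouldRetry(error)) {
        throw error;
      }

      // Upper bound doubles each attempt; the actual wait is random in [0, cap).
      const cap = baseDelayMs * 2 ** attempt;
      await sleep(Math.random() * cap);
    }
  }
}
```

For instance, `retryWithBackoff(() => createSession())` would give the same maximum of three attempts as the function above, with randomized waits.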

Health Checks

Backend Health Endpoint

app.get('/health', async (req, res) => {
  try {
    // Check Redis
    await redis.ping();

    // Check Avatar API
    const response = await fetch(`${process.env.AVATAR_BASE_URL}/health`);

    if (!response.ok) {
      throw new Error('Avatar API unhealthy');
    }

    res.json({
      status: 'healthy',
      timestamp: new Date(),
      uptime: process.uptime(),
    });
  } catch (error) {
    res.status(503).json({
      status: 'unhealthy',
      error: error.message,
    });
  }
});
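
A health endpoint that waits indefinitely on a hung dependency can itself look unhealthy to a load balancer. One option is to bound each dependency check with a timeout. This sketch uses a hypothetical `withTimeout` helper; it stops waiting after the deadline but does not cancel the underlying operation.

```javascript
// Hypothetical helper: reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });

  // Whichever settles first wins; always clear the timer so the process can exit.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Inside the handler above this could be used as `await withTimeout(redis.ping(), 2000, 'Redis ping')`, so a stalled Redis connection produces a 503 within about two seconds.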

Deployment Checklist

Before Deployment

After Deployment

Docker Example

# Dockerfile
FROM node:20-alpine

WORKDIR /app

COPY package*.json ./
# --omit=dev replaces the deprecated --production flag
RUN npm ci --omit=dev

COPY . .

EXPOSE 3001

CMD ["node", "server.js"]

# docker-compose.yml
version: '3.8'

services:
  backend:
    build: .
    ports:
      - '3001:3001'
    environment:
      - AVATAR_KS=${AVATAR_KS}
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis

  redis:
    image: redis:alpine
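    # Publishing 6379 on the host is only needed for local debugging;
    # the backend reaches Redis over the Compose network without it
    ports:
      - '6379:6379'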
