API Docs

Video Translation – Voice Cloning

Translate spoken content in videos into other languages while preserving the original speaker’s voice using AI-powered voice cloning.

Create Video Translation with Voice Cloning Request

POST

/api/vt-w-vc

This endpoint translates spoken content in a video from one language to another while cloning the original speaker’s voice. The pipeline includes speech recognition, translation, and voice-cloned speech synthesis to generate a translated video output.

Requests are processed asynchronously. Once accepted, the API returns a unique log_id that can be used to track processing status and retrieve the final translated video.


Request Body

project_title (string, required)

A human-readable title to identify the video translation project.

Example: "My Project"

file (file, optional)

Video file to be translated. Required if youtube_url is not provided.

Example: "video.mp4"

youtube_url (string, optional)

YouTube video URL to translate. Required if file is not provided. Do not provide both file and youtube_url.

Example: "https://www.youtube.com/watch?v=XXXX"

input_language (string, required)

Language spoken in the source video.

Example: "english"

output_language (string, required)

Target language into which the video should be translated.

Example: "hindi"

avatar (string, required)

Voice avatar used for cloning the original speaker’s voice.

Example: "daaji"

ttt_model (string, required)

Text Translation model used to translate the recognized speech.

Example: "narris"

stt_model (string, required)

Speech to Text model used to extract spoken content from the video.

Example: "narris_fast"

speed (number, optional)

Playback speed of the translated, voice-cloned audio. Defaults to 1.

Example: 1

pitch (number, optional)

Pitch adjustment for the cloned voice output. Defaults to 0.

Example: 0

Note: You must provide either a video file or a youtube_url. Providing both or neither will result in a validation error.
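The either/or rule in the note above can be checked on the client before anything is uploaded. The following is a minimal sketch, not part of the API: `build_form_fields` is a hypothetical helper that collects the documented parameters, applies the documented defaults for speed and pitch, and rejects requests that supply both sources or neither.

```python
from typing import Dict, Optional


def build_form_fields(
    project_title: str,
    input_language: str,
    output_language: str,
    avatar: str,
    ttt_model: str,
    stt_model: str,
    file_path: Optional[str] = None,
    youtube_url: Optional[str] = None,
    speed: float = 1,
    pitch: float = 0,
) -> Dict[str, str]:
    """Return the text form fields for POST /api/vt-w-vc.

    Hypothetical client-side helper. It mirrors the documented rule that
    exactly one of file / youtube_url must be supplied. The video file
    itself is attached separately, so only youtube_url appears here.
    """
    # bool(...) == bool(...) is true when both are set or both are missing.
    if bool(file_path) == bool(youtube_url):
        raise ValueError("Provide exactly one of file_path or youtube_url")

    fields = {
        "project_title": project_title,
        "input_language": input_language,
        "output_language": output_language,
        "avatar": avatar,
        "ttt_model": ttt_model,
        "stt_model": stt_model,
        "speed": str(speed),  # documented default: 1
        "pitch": str(pitch),  # documented default: 0
    }
    if youtube_url:
        fields["youtube_url"] = youtube_url
    return fields
```

The server still performs its own validation; a local check of this kind only avoids a round trip for an obviously invalid request.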


Response

On successful submission, the API returns a unique log_id.
Use this log_id with the Fetch Video Translation with Voice Cloning By ID endpoint to retrieve the translated video and processing metadata.

{
  "log_id": "694d96ef7d5247d58c02984c"
}

curl

curl --location 'https://api.narris.io/api/vt-w-vc' \
--header 'x-api-key: YOUR_API_KEY' \
--form 'project_title="My Project"' \
--form 'file=@"/path/to/video.mp4"' \
--form 'input_language="english"' \
--form 'output_language="hindi"' \
--form 'avatar="daaji"' \
--form 'ttt_model="narris"' \
--form 'stt_model="narris_fast"' \
--form 'speed="1"' \
--form 'pitch="0"'
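For callers who are not using curl, the same multipart/form-data request can be assembled with the Python standard library. This is a sketch under stated assumptions: `encode_multipart` and `build_create_request` are illustrative names, the file part is sent as application/octet-stream, and the whole file is read into memory, which may be unsuitable for large videos. The request is only built here, not sent.

```python
import uuid
from pathlib import Path
from typing import Dict, Optional, Tuple
from urllib.request import Request

API_URL = "https://api.narris.io/api/vt-w-vc"


def encode_multipart(
    fields: Dict[str, str], file_path: Optional[str] = None
) -> Tuple[str, bytes]:
    """Encode text fields and an optional video file as multipart/form-data.

    Returns the Content-Type header value and the request body.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(part.encode("utf-8"))
    if file_path is not None:
        path = Path(file_path)
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        # Assumption: reading the full file into memory is acceptable.
        parts.append(header.encode("utf-8") + path.read_bytes() + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


def build_create_request(
    api_key: str, fields: Dict[str, str], file_path: Optional[str] = None
) -> Request:
    """Build (but do not send) the POST request shown in the curl example."""
    content_type, body = encode_multipart(fields, file_path)
    return Request(
        API_URL,
        data=body,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": content_type},
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; per the Response section, the body of a successful reply is the JSON object containing log_id.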

Fetch Video Translation with Voice Cloning List

GET

/api/vt-w-vc/logs

This endpoint returns a paginated list of previously created video translation requests that use voice cloning.

Each entry represents a video translation job with voice cloning and includes its current status, input video reference, and timestamps.


Query Parameters

page (number, optional)

Page number for pagination.

Example: 1

limit (number, optional)

Number of records to return per page.

Example: 10


Response

On success, the API returns a paginated list of video translation with voice cloning logs.
Each log contains a unique _id which can be used with the Fetch Video Translation with Voice Cloning By ID endpoint to retrieve detailed output and metadata.

{
  "total": 1,
  "page": 1,
  "limit": 10,
  "logs": [
    {
      "_id": "694ae800734f6600f2cfa64a",
      "project_title": "My Project",
      "input_file": "https://lingui-dev.s3.amazonaws.com/input/trim_copy_1766516735897.mp4",
      "status": "failed",
      "createdAt": "2025-12-23T19:05:36.359Z",
      "finishedAt": "2025-12-23T19:06:06.504Z"
    }
  ]
}

curl

curl --location 'https://api.narris.io/api/vt-w-vc/logs?page=1&limit=10' \
--header 'Content-Type: application/json' \
--header 'x-api-key: YOUR_API_KEY'
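To walk through every page, a client needs the URL for each page and the number of pages. The sketch below assumes that `total` in the response is the number of logs across all pages (the documentation does not state this explicitly) and uses the example values page=1 and limit=10 as function defaults; `logs_url` and `page_count` are illustrative names.

```python
import math
from urllib.parse import urlencode

LOGS_URL = "https://api.narris.io/api/vt-w-vc/logs"


def logs_url(page: int = 1, limit: int = 10) -> str:
    """Build the list URL with page and limit as query parameters."""
    query = urlencode({"page": page, "limit": limit})
    return f"{LOGS_URL}?{query}"


def page_count(total: int, limit: int) -> int:
    """Number of pages needed to list `total` logs, `limit` per page.

    Assumption: `total` counts all logs, not just those on the current page.
    """
    if limit <= 0:
        return 0
    return math.ceil(total / limit)
```

With the sample response (`total` of 1, `limit` of 10), `page_count` returns 1, so a single request retrieves everything.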

Fetch Video Translation with Voice Cloning By ID

GET

/api/vt-w-vc/{log_id}

This endpoint returns the complete details of a video translation with voice cloning request, identified by its unique log_id.

The response includes input video details, source and target languages, speech recognition and translation models used, voice cloning configuration, processing status, timestamps, and the final translated video output if available.


Path Parameters

log_id (string, required)

Unique identifier of the video translation with voice cloning request: the log_id returned at creation, or the _id of an entry in the logs list.

Example: "694ae800734f6600f2cfa64a"


Response

On success, the API returns detailed information about the video translation with voice cloning request.
If processing completes successfully, the output_file field contains the translated video URL.
If the request fails, the status field is "failed" and no output_file is returned, as in the sample below.

{
  "_id": "694ae800734f6600f2cfa64a",
  "project_title": "My Project",
  "input_file": "https://lingui-dev.s3.amazonaws.com/input/trim_copy_1766516735897.mp4",
  "input_language": "english",
  "output_language": "hindi",
  "avatar": "daaji",
  "speed": "1",
  "pitch": "0",
  "ttt_model": "narris",
  "stt_model": "narris_fast",
  "status": "failed",
  "createdAt": "2025-12-23T19:05:36.359Z",
  "updatedAt": "2025-12-23T19:06:06.504Z",
  "finishedAt": "2025-12-23T19:06:06.504Z"
}
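The createdAt and finishedAt timestamps in the sample are ISO 8601 strings in UTC, so the elapsed time of a job can be derived from them. A small sketch, assuming both fields follow the format shown and that finishedAt may be missing while a job is still running (an assumption, since the sample only shows a finished job); the function names are illustrative.

```python
from datetime import datetime
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as "2025-12-23T19:05:36.359Z"."""
    # Older Python versions of datetime.fromisoformat reject the "Z"
    # suffix, so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def elapsed_seconds(log: dict) -> Optional[float]:
    """Seconds between creation and finish, or None if not finished.

    Note: this spans createdAt to finishedAt, so it includes any time
    the job spent waiting before processing began.
    """
    if not log.get("finishedAt"):
        return None
    started = parse_timestamp(log["createdAt"])
    finished = parse_timestamp(log["finishedAt"])
    return (finished - started).total_seconds()
```

For the sample above, the two timestamps are 30.145 seconds apart.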

curl

curl --location 'https://api.narris.io/api/vt-w-vc/694ae800734f6600f2cfa64a' \
--header 'Content-Type: application/json' \
--header 'x-api-key: YOUR_API_KEY'
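Because jobs are processed asynchronously, a client typically calls this endpoint repeatedly until the job finishes. The sketch below relies only on the two outcomes the documentation describes: a status of "failed", or an output_file being present. Intermediate status values are not documented, so none are assumed. The names `fetch_log_http` and `wait_for_result`, the 10-second interval, and the attempt limit are illustrative choices, not API requirements.

```python
import json
import time
from typing import Callable, Dict
from urllib.request import Request, urlopen


def fetch_log_http(log_id: str, api_key: str) -> Dict:
    """GET /api/vt-w-vc/{log_id} and return the decoded JSON body."""
    req = Request(
        f"https://api.narris.io/api/vt-w-vc/{log_id}",
        headers={"x-api-key": api_key},
    )
    with urlopen(req) as resp:
        return json.load(resp)


def wait_for_result(
    fetch_log: Callable[[str], Dict],
    log_id: str,
    interval_seconds: float = 10.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the job has an output_file or reports status "failed"."""
    for attempt in range(max_attempts):
        log = fetch_log(log_id)
        if log.get("status") == "failed":
            raise RuntimeError(f"Job {log_id} failed")
        if log.get("output_file"):
            return log
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"Job {log_id} did not finish after {max_attempts} attempts")
```

A call might look like `wait_for_result(lambda i: fetch_log_http(i, "YOUR_API_KEY"), log_id)`. Injecting `fetch_log` and `sleep` keeps the loop testable without network access.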

Notes for Developers

• Video Translation with Voice Cloning requests are processed asynchronously. Always store the returned log_id to track progress.
• Voice cloning preserves speaker characteristics such as tone, pitch, and speaking style while translating spoken content.
• Processing time may be longer than standard video translation due to speech synthesis and voice cloning steps.
• Use the Fetch Video Translation with Voice Cloning List endpoint to view all jobs and the Fetch Video Translation with Voice Cloning By ID endpoint to retrieve the final dubbed video output.