Transcribe + Captions API

Endpoints

POST /api/v1/transcribe — Audio or video URL → word-level transcript
POST /api/v1/captions/render — Video URL + transcript → video with burned-in captions
GET /api/v1/resources/caption-styles — List all 63 caption presets

Quickstart — transcribe

curl -X POST https://api.reelsbuilder.ai/api/v1/transcribe \
  -H "Authorization: Bearer $REELSBUILDER_API_KEY" \
  -H "Idempotency-Key: $(uuidgen)" \
  -H "Content-Type: application/json" \
  -d '{
    "media_url": "https://your-cdn.example.com/podcast.mp3",
    "language_hint": "en",
    "diarization": true,
    "output_formats": ["json", "srt", "vtt"]
  }'

Initial response

{
  "success": true,
  "data": {
    "job_id": "tjob_01HKZ...",
    "status": "queued",
    "estimated_completion_sec": 35,
    "source": {
      "duration_sec": 1842,
      "format": "audio/mp3"
    }
  },
  "meta": { "request_id": "req_...", "credits_used": 7, "credits_remaining": 993 }
}
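
The envelope above can be given a type so that callers read `job_id` and the credit counters safely. The following is a minimal sketch based only on the sample response: the helper name `unwrapJob` is hypothetical, and since this page does not show an error response, the sketch simply throws whenever `success` is not `true`.

```typescript
// Field names are taken from the sample response above.
interface TranscribeJob {
  job_id: string;
  status: string; // "queued" in the sample; other values are not listed on this page
  estimated_completion_sec: number;
  source: { duration_sec: number; format: string };
}

interface Meta {
  request_id: string;
  credits_used: number;
  credits_remaining: number;
}

interface Envelope<T> {
  success: boolean;
  data: T;
  meta: Meta;
}

// Hypothetical helper: parses a response body and returns the job plus meta,
// or throws when the envelope does not describe a created job.
function unwrapJob(rawBody: string): { job: TranscribeJob; meta: Meta } {
  const parsed = JSON.parse(rawBody) as Envelope<TranscribeJob>;
  if (parsed.success !== true || typeof parsed.data?.job_id !== "string") {
    throw new Error("transcribe request did not return a job");
  }
  return { job: parsed.data, meta: parsed.meta };
}
```

With `fetch`, this would be called as `unwrapJob(await response.text())`.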

Completion webhook

POST https://your-app.example.com/webhooks/transcript
X-RB-Event-Type: transcript.completed

{
  "event_id": "evt_...",
  "event_type": "transcript.completed",
  "data": {
    "job_id": "tjob_01HKZ...",
    "status": "completed",
    "transcript": {
      "text": "So the way we ship 10x faster is by writing the documentation first...",
      "language": "en",
      "confidence": 0.97,
      "duration_sec": 1842,
      "word_count": 4218,
      "words": [
        { "text": "So", "start": 0.12, "end": 0.31, "confidence": 0.99, "speaker": "spk_1" },
        { "text": "the", "start": 0.31, "end": 0.42, "confidence": 0.99, "speaker": "spk_1" },
        { "text": "way", "start": 0.42, "end": 0.58, "confidence": 0.98, "speaker": "spk_1" }
      ],
      "speakers": [
        { "id": "spk_1", "estimated_minutes_active": 22.4 },
        { "id": "spk_2", "estimated_minutes_active": 8.1 }
      ]
    },
    "output_files": {
      "json": "https://cdn.reelsbuilder.ai/t/tjob_.../transcript.json",
      "srt": "https://cdn.reelsbuilder.ai/t/tjob_.../transcript.srt",
      "vtt": "https://cdn.reelsbuilder.ai/t/tjob_.../transcript.vtt"
    }
  }
}
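
A receiver for this payload might look like the sketch below, which turns the word list into speaker turns. It makes several assumptions: the function names are hypothetical, deduplicating on `event_id` assumes deliveries can repeat (this page does not say whether they do), and `speaker` is treated as optional because it is presumably absent when diarization is off.

```typescript
interface Word {
  text: string;
  start: number;
  end: number;
  confidence: number;
  speaker?: string; // assumed to be missing when diarization is disabled
}

interface TranscriptEvent {
  event_id: string;
  event_type: string;
  data: {
    job_id: string;
    status: string;
    transcript: { text: string; words: Word[] };
  };
}

interface Turn {
  speaker: string;
  start: number;
  end: number;
  text: string;
}

// Merge consecutive words from the same speaker into one turn.
function speakerTurns(words: Word[]): Turn[] {
  const turns: Turn[] = [];
  for (const w of words) {
    const speaker = w.speaker ?? "unknown";
    const last = turns[turns.length - 1];
    if (last && last.speaker === speaker) {
      last.end = w.end;
      last.text += " " + w.text;
    } else {
      turns.push({ speaker, start: w.start, end: w.end, text: w.text });
    }
  }
  return turns;
}

// Hypothetical handler: returns turns for a new transcript.completed event,
// and null for other event types or for an event_id that was already seen.
function handleEvent(rawBody: string, seen: Set<string>): Turn[] | null {
  const evt = JSON.parse(rawBody) as TranscriptEvent;
  if (evt.event_type !== "transcript.completed") return null;
  if (seen.has(evt.event_id)) return null;
  seen.add(evt.event_id);
  return speakerTurns(evt.data.transcript.words);
}
```

In a real service the `seen` set would live in durable storage rather than in memory.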

Parameters

Field	Type	Required	Description
`media_url`	URL	yes	Public HTTPS URL to audio (mp3, wav, m4a, flac, ogg) or video (mp4, mov, webm). Max 4 hours, 500MB.
`language_hint`	string	no	ISO 639-1 code. If omitted, language is auto-detected.
`diarization`	boolean	no	Tag each word with a `speaker` ID. Default `false`.
`output_formats`	string[]	no	Subset of `json`, `srt`, `vtt`, `ass`, `txt`. Default `["json"]`.
`caption_style`	string	no	If provided alongside `ass` in output_formats, generates an ASS file styled with one of the 63 caption presets.
`filter_profanity`	boolean	no	Replace profanity with asterisks. Default `false`.
`webhook_url`	URL	recommended	HTTPS URL for completion callback.
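
The constraints in the table can be checked on the client before a request is sent. This is a sketch under stated assumptions: `validateRequest` is a hypothetical helper, the `language_hint` check only tests for the two-letter ISO 639-1 shape (not membership in the supported list), and the duration and size limits are not checked because they cannot be known from the URL alone.

```typescript
const OUTPUT_FORMATS: readonly string[] = ["json", "srt", "vtt", "ass", "txt"];

interface TranscribeRequest {
  media_url: string;
  language_hint?: string;
  diarization?: boolean;
  output_formats?: string[];
  caption_style?: string;
  filter_profanity?: boolean;
  webhook_url?: string;
}

function isHttpsUrl(value: string): boolean {
  try {
    return new URL(value).protocol === "https:";
  } catch {
    return false; // not a parseable URL at all
  }
}

// Returns a list of problems; an empty list means the body matches the table.
function validateRequest(req: TranscribeRequest): string[] {
  const problems: string[] = [];
  if (!isHttpsUrl(req.media_url)) {
    problems.push("media_url must be an HTTPS URL");
  }
  if (req.webhook_url !== undefined && !isHttpsUrl(req.webhook_url)) {
    problems.push("webhook_url must be an HTTPS URL");
  }
  if (req.language_hint !== undefined && !/^[a-z]{2}$/.test(req.language_hint)) {
    problems.push("language_hint must be a two-letter ISO 639-1 code");
  }
  const formats = req.output_formats ?? ["json"]; // documented default
  for (const f of formats) {
    if (!OUTPUT_FORMATS.includes(f)) {
      problems.push(`unsupported output format: ${f}`);
    }
  }
  if (req.caption_style !== undefined && !formats.includes("ass")) {
    problems.push("caption_style only applies when output_formats includes ass");
  }
  return problems;
}
```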

Render captions into a video

Once the transcription job has completed, pass its job_id as transcript_job_id to burn karaoke-style captions into the source video:

curl -X POST https://api.reelsbuilder.ai/api/v1/captions/render \
  -H "Authorization: Bearer $REELSBUILDER_API_KEY" \
  -H "Idempotency-Key: $(uuidgen)" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "https://your-cdn.example.com/source.mp4",
    "transcript_job_id": "tjob_01HKZ...",
    "caption_style": "neon_outline_yellow",
    "position": "lower_third",
    "max_words_per_line": 4,
    "highlight_active_word": true
  }'
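
To make the effect of `max_words_per_line` concrete, the sketch below groups word-level timestamps into caption lines of at most N words each. It only illustrates the idea: the renderer's actual line-breaking rules are not described on this page, and `toCaptionLines` is a hypothetical helper.

```typescript
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

interface CaptionLine {
  text: string;
  start: number; // start of the first word on the line
  end: number;   // end of the last word on the line
}

// Split words into consecutive groups of at most maxWordsPerLine.
function toCaptionLines(words: TimedWord[], maxWordsPerLine: number): CaptionLine[] {
  if (!Number.isInteger(maxWordsPerLine) || maxWordsPerLine < 1) {
    throw new RangeError("maxWordsPerLine must be a positive integer");
  }
  const lines: CaptionLine[] = [];
  for (let i = 0; i < words.length; i += maxWordsPerLine) {
    const chunk = words.slice(i, i + maxWordsPerLine);
    lines.push({
      text: chunk.map((w) => w.text).join(" "),
      start: chunk[0].start,
      end: chunk[chunk.length - 1].end,
    });
  }
  return lines;
}
```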

Caption style catalog

63 presets covering 5 categories. The full list is queryable at GET /api/v1/resources/caption-styles. Notable presets:

Default styles — default_white, default_yellow, default_black_box
Neon — neon_outline_pink, neon_outline_yellow, neon_glow_cyan
Bold — impact_yellow, impact_white_outline, shadowed_red
Karaoke-highlighted — karaoke_word_bounce, karaoke_word_color_swap, karaoke_progressive_fill
Branded — Auto-derived from a brand_id; uses the brand's primary + accent colors and font.

Supported languages

Transcription is powered by ElevenLabs Scribe v2, which covers 99 languages with word-level timestamps. The highest-accuracy tier includes English, Spanish, French, German, Portuguese, Italian, Dutch, Polish, Russian, Japanese, Korean, Chinese (Mandarin), Arabic, Hindi, and Turkish.

Full list at GET /api/v1/resources/transcribe-languages.

Examples

TypeScript — transcribe + render captions

const auth = { Authorization: `Bearer ${process.env.REELSBUILDER_API_KEY}` };

// 1. Kick off transcription
const r1 = await fetch("https://api.reelsbuilder.ai/api/v1/transcribe", {
  method: "POST",
  headers: { ...auth, "Content-Type": "application/json", "Idempotency-Key": crypto.randomUUID() },
  body: JSON.stringify({
    media_url: "https://your-cdn.example.com/source.mp4",
    language_hint: "en",
    output_formats: ["json"],
  }),
});
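if (!r1.ok) throw new Error(`Transcribe request failed: HTTP ${r1.status}`);
const transcribeJob = (await r1.json()).data;

// 2. Wait for the job to complete (for example, on the transcript.completed webhook) before rendering.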

// 3. Render captions
const r2 = await fetch("https://api.reelsbuilder.ai/api/v1/captions/render", {
  method: "POST",
  headers: { ...auth, "Content-Type": "application/json", "Idempotency-Key": crypto.randomUUID() },
  body: JSON.stringify({
    video_url: "https://your-cdn.example.com/source.mp4",
    transcript_job_id: transcribeJob.job_id,
    caption_style: "karaoke_word_bounce",
    position: "lower_third",
  }),
});
if (!r2.ok) throw new Error(`Render request failed: HTTP ${r2.status}`);
const renderJob = (await r2.json()).data;
console.log(`Render job: ${renderJob.job_id}`);

Pricing

Transcribe: 1 credit per 4 minutes of audio, rounded up. A 1-hour podcast costs 15 credits (60 / 4).
Diarization: +50% on the transcribe cost. A 1-hour podcast with diarization costs 23 credits (15 × 1.5 = 22.5, rounded up).
Caption rendering: 5 credits per minute of output video.
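
The pricing rules above can be written as a small estimator. This is one reading of them, not a billing reference: it assumes the diarization surcharge is applied to the rounded base cost and then rounded up (which reproduces the 23-credit example), and that partial render minutes are also rounded up, which this page does not state. Both function names are hypothetical.

```typescript
// 1 credit per 4 minutes (240 s) of audio, rounded up; +50% with diarization.
function transcribeCredits(durationSec: number, diarization = false): number {
  const base = Math.ceil(durationSec / 240);
  return diarization ? Math.ceil(base * 1.5) : base;
}

// 5 credits per minute of output video.
// Rounding up partial minutes is an assumption.
function renderCredits(outputDurationSec: number): number {
  return Math.ceil((outputDurationSec / 60) * 5);
}

// Example: a 1-hour podcast, with and without diarization.
const oneHourSec = 60 * 60;
console.log(transcribeCredits(oneHourSec));       // 15
console.log(transcribeCredits(oneHourSec, true)); // 23
```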

Latency

Transcribe p50: ~3 seconds per minute of source audio (so a 10-min clip transcribes in ~30s)
Caption render p50: ~5 seconds per minute of source video
Max source length: 4 hours
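
These figures translate into a rough wait-time estimate, which can help when choosing a client-side timeout. The sketch below uses the p50 rates quoted above, so roughly half of all jobs would be expected to take longer; the function names are hypothetical.

```typescript
const MAX_SOURCE_SEC = 4 * 60 * 60; // 4-hour limit stated above

function assertWithinLimit(sourceSec: number): void {
  if (sourceSec < 0 || sourceSec > MAX_SOURCE_SEC) {
    throw new RangeError("source length must be between 0 and 4 hours");
  }
}

// ~3 seconds of processing per minute of source audio (p50).
function estimateTranscribeSec(sourceSec: number): number {
  assertWithinLimit(sourceSec);
  return (sourceSec / 60) * 3;
}

// ~5 seconds of processing per minute of source video (p50).
function estimateRenderSec(sourceSec: number): number {
  assertWithinLimit(sourceSec);
  return (sourceSec / 60) * 5;
}

// Example: the 10-minute clip mentioned above.
console.log(estimateTranscribeSec(10 * 60)); // 30
```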
