Speech To Text (STT)

Sync Transcribe

The Sync Transcription endpoint accepts audio files and returns transcribed text within the same HTTP request/response cycle.

POST https://api.lelapa.ai/v1/transcribe/sync

Supported Audio Formats

  • WAV
  • MP3
  • FLAC
  • AAC
  • OGG
  • MP4
  • AIFF
  • OPUS
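
Because the endpoint rejects unsupported formats with a 415, it can be worth checking the file extension before uploading. The sketch below is a minimal, hypothetical helper: the extensions come from the list above, but the MIME types in the mapping are assumptions made for illustration and are not specified by this reference.

```python
from pathlib import Path

# Extensions are taken from the supported-formats list above.
# The MIME types are ASSUMPTIONS for illustration; this reference does not list them.
SUPPORTED_AUDIO = {
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".flac": "audio/flac",
    ".aac": "audio/aac",
    ".ogg": "audio/ogg",
    ".mp4": "audio/mp4",
    ".aiff": "audio/aiff",
    ".opus": "audio/opus",
}


def guess_audio_mime(path: str) -> str:
    """Return an assumed MIME type for a supported file, or raise ValueError."""
    ext = Path(path).suffix.lower()
    if ext not in SUPPORTED_AUDIO:
        raise ValueError(f"Unsupported audio extension: {ext or '(none)'}")
    return SUPPORTED_AUDIO[ext]
```

A check like this only inspects the file name, not the audio content, so the server may still reject a mislabelled file.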

Important Note: Processing may take up to 2 minutes 30 seconds, and may fail altogether on long, low-quality, or complex audio.

Request

Headers

Header          Type    Required  Description
X-CLIENT-TOKEN  string  Yes       API token generated for the project
Content-Type    string  Yes       Must be set to multipart/form-data

Form Data

Parameter  Type  Required  Description
file       file  Yes       Audio file content in bytes

Query Parameters

Parameter     Type     Required  Description
lang_code     string   No        Language code for transcription. If not specified, the language is auto-detected.
diarise       boolean  No        Enable diarisation (Default: false)
detect_music  boolean  No        Enable music-on-hold detection (Default: false)

Supported Language Codes

  • afr - Afrikaans
  • zul - isiZulu
  • sot - Sesotho
  • eng - South African English
  • fra - African French
  • cs-zul - Code-switched isiZulu (alpha)
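
The query parameters and language codes above can be assembled and validated before a request is sent. The following is a rough sketch, not part of the API: `build_query_params` is a hypothetical helper, and it encodes enabled booleans as `1` only because the code example on this page does so. Whether other spellings (such as `true`) are accepted is not stated here.

```python
from typing import Optional

# Language codes copied from the list above.
SUPPORTED_LANG_CODES = {"afr", "zul", "sot", "eng", "fra", "cs-zul"}


def build_query_params(
    lang_code: Optional[str] = None,
    diarise: bool = False,
    detect_music: bool = False,
) -> dict:
    """Build the optional query parameters, omitting anything left at its default."""
    params = {}
    if lang_code is not None:
        if lang_code not in SUPPORTED_LANG_CODES:
            raise ValueError(f"Unsupported language code: {lang_code!r}")
        params["lang_code"] = lang_code
    # ASSUMPTION: booleans are sent as 1, mirroring this page's code example.
    if diarise:
        params["diarise"] = 1
    if detect_music:
        params["detect_music"] = 1
    return params
```

Leaving `lang_code` out of the dictionary entirely, rather than sending an empty value, is what lets the documented auto-detection apply.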

Response

🟢 200 OK
The request was successful.

{
  "id": "5f15e81b-53c2-4c5c-a779-1f6776100543",
  "upload_file_size": null,
  "audio_length_seconds": 20.0,
  "sample_rate": null,
  "channels": null,
  "frame_rate": null,
  "mime_type": null,
  "language_code": "eng",
  "warnings": null,
  "diarisation_result": {
    "timeline": [
      { "start_time": 0.0, "end_time": 2.14, "type": "silence" },
      {
        "start_time": 2.14,
        "end_time": 10.54,
        "type": "speech",
        "speaker_id": "speaker0",
        "text": "Thank you so much for calling. You're through to Thandi, how can help you?"
      },
      {
        "start_time": 10.54,
        "end_time": 15.98,
        "type": "speech",
        "speaker_id": "speaker1",
        "text": "Hi, Thandi, I would like to settle my bill for yesterday."
      },
      { "start_time": 15.98, "end_time": 20.0, "type": "music" }
    ],
    "words": [
      {
        "word": "Thank",
        "start_time": 2.14,
        "end_time": 2.45,
        "confidence": 0.94,
        "weight": null,
        "word_intensity": null,
        "best_path": true,
        "speaker_id": "speaker0"
      },
      {
        "word": "you",
        "start_time": 2.45,
        "end_time": 2.60,
        "confidence": 0.91,
        "weight": null,
        "word_intensity": null,
        "best_path": true,
        "speaker_id": "speaker0"
      }
      // More words would follow here...
    ]
  },
  "transcription_text": "Thank you so much for calling. You're through to Thandi, how can help you? Hi, Thandi, I would like to settle my bill for yesterday.",
  "transcription_status": "COMPLETED",
  "error_message": null,
  "status_datetime": "2025-04-16T10:45:00Z",
  "upload_datetime": "2025-04-16T10:43:12Z"
}
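
To illustrate how the diarised response might be consumed, the sketch below extracts speaker turns from the `timeline` array. The field names come from the sample response above; `speaker_turns` and `format_turns` are hypothetical helpers, and treating a missing or null `diarisation_result` as "no turns" is an assumption about what the API returns when diarisation is disabled.

```python
def speaker_turns(response: dict) -> list:
    """Return (speaker_id, start_time, end_time, text) for each speech segment."""
    # ASSUMPTION: diarisation_result may be absent or null when diarise is off.
    diarisation = response.get("diarisation_result") or {}
    turns = []
    for segment in diarisation.get("timeline") or []:
        # Skip non-speech segments such as "silence" and "music".
        if segment.get("type") != "speech":
            continue
        turns.append(
            (
                segment.get("speaker_id"),
                segment["start_time"],
                segment["end_time"],
                segment.get("text", ""),
            )
        )
    return turns


def format_turns(response: dict) -> str:
    """Render the speech segments as one line per speaker turn."""
    return "\n".join(
        f"[{start:.2f}-{end:.2f}] {speaker}: {text}"
        for speaker, start, end, text in speaker_turns(response)
    )
```

When diarisation is not requested, the plain transcript is still available in `transcription_text`.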

Error Responses

🔴 400 Bad Request
The request was malformed or contained invalid data.
🔴 413 Payload Too Large
The file size exceeds the limit.
🔴 415 Unsupported Media Type
The audio file format is invalid.
🟠 401 Unauthorized
The client token is missing or invalid.
🔴 500 Internal Server Error
An unexpected error occurred on the server.
🔴 504 Gateway Timeout
The service is currently experiencing exceptionally high load. Retry later.
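
Since a 504 is documented as a temporary condition to be retried later, a client could wrap the call in a small retry loop. The following is one possible sketch under stated assumptions: `post_with_retry` is a hypothetical helper, the attempt count and backoff delays are arbitrary choices rather than documented recommendations, and `send` stands for any caller-supplied function that performs the request and returns a `(status_code, body)` pair.

```python
import time
from typing import Callable, Tuple

# Only 504 is described above as worth retrying; other errors fail immediately.
RETRYABLE_STATUSES = {504}


def post_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 3,      # ASSUMPTION: arbitrary default
    base_delay: float = 5.0,    # ASSUMPTION: arbitrary default, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() until it returns 200, retrying 504s with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status in RETRYABLE_STATUSES and attempt < max_attempts:
            # Wait 5 s, 10 s, 20 s, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
            continue
        raise RuntimeError(f"Transcription failed with HTTP {status}")
    raise RuntimeError("max_attempts must be at least 1")
```

With the `requests` library, `send` could be a small function that posts the file and returns `(resp.status_code, resp.json())`. Error responses may not always carry a JSON body, so a real wrapper would need to guard that call.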

Code Examples

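import requests
from pprint import pprint

# Configuration
API_KEY = "<INSERT-TOKEN>"
WAV_FILE = "<INSERT-PATH-TO-AUDIO>"


def file_data(path: str) -> bytes:
    """Read the whole audio file into memory as bytes."""
    with open(path, "rb") as f:
        return f.read()


url = "https://api.lelapa.ai/v1/transcribe/sync"

# requests sets the multipart/form-data Content-Type (with boundary) itself
# when `files` is passed, so only the token header is set here.
headers = {
    "X-CLIENT-TOKEN": API_KEY,
}

files = {
    "file": ("audio.wav", file_data(WAV_FILE), "audio/wav"),
}

# Optional parameters
params = {
    "lang_code": "<INSERT-LANGUAGE-CODE>",
    "diarise": 1,
}

# Processing may take up to 2 minutes 30 seconds, so allow a generous timeout.
resp = requests.post(url, headers=headers, files=files, params=params, timeout=180)
pprint(resp.json())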