Speech To Text (STT)
Sync Transcription (file upload)
Transcribe V1 has been deprecated!
The Transcribe V1 endpoint is deprecated and will be removed in a future release. This documentation has been updated to reference the newer v2alpha API, which includes improvements and ongoing support. If you're still using V1, we recommend migrating to v2alpha as soon as possible to ensure continued functionality.
API Reference
The Sync File Transcription endpoint accepts audio files and returns transcribed text within the same HTTP request/response cycle.
POST
https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/sync/file
Supported Audio Formats
- WAV
- MP3
- FLAC
- AAC
- OGG
- MP4
- AIFF
- OPUS
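A client can reject unsupported files before uploading them. The sketch below is a minimal illustration that assumes the file extension reflects the actual audio format, which the API does not guarantee. The helper name `is_supported_audio` and the extension mapping are hypothetical and not part of the API.

```python
from pathlib import Path

# Extensions assumed to correspond to the formats listed above.
# This mapping is an illustration, not something the API defines.
SUPPORTED_EXTENSIONS = {
    ".wav", ".mp3", ".flac", ".aac", ".ogg", ".mp4", ".aiff", ".opus",
}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension matches a listed format (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A check like this only saves a round trip for obviously wrong files. The server remains the authority and may still respond with 415 Unsupported Media Type.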
Important Note: Processing may take up to 2 minutes 30 seconds, and may fail altogether on long, low-quality, or complex audio.
Request
Headers
Header | Type | Required | Description |
---|---|---|---|
X-CLIENT-TOKEN | string | Yes | API token generated for the project |
Content-Type | string | Yes | Must be set to multipart/form-data |
Form Data
Parameter | Type | Required | Description |
---|---|---|---|
file | file | Yes | Audio file content in bytes |
Query Parameters
Parameter | Type | Required | Description |
---|---|---|---|
lang_code | string | No | Language code for transcription. If not specified, language will be auto-detected. |
diarise | boolean | No | Enable speaker diarisation (default: false) |
detect_music | boolean | No | Enable music-on-hold detection (default: false) |
Supported Language Codes
- afr: Afrikaans
- zul: isiZulu
- sot: Sesotho
- eng: South African English
- fra: African French
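The query parameters and language codes above can be assembled and validated client-side. The following is a minimal sketch: the helper name `build_query_params` is hypothetical, and sending `1` for enabled flags mirrors the value used in this page's code example rather than a documented encoding. Flags left at their default of false are simply omitted.

```python
from typing import Optional

# Language codes listed in this section.
SUPPORTED_LANG_CODES = {"afr", "zul", "sot", "eng", "fra"}


def build_query_params(
    lang_code: Optional[str] = None,
    diarise: bool = False,
    detect_music: bool = False,
) -> dict:
    """Build the query parameters for a transcription request.

    Omitting lang_code leaves language detection to the service.
    """
    params = {}
    if lang_code is not None:
        if lang_code not in SUPPORTED_LANG_CODES:
            raise ValueError(f"Unsupported lang_code: {lang_code!r}")
        params["lang_code"] = lang_code
    # Only send flags that differ from the documented default (false).
    if diarise:
        params["diarise"] = 1
    if detect_music:
        params["detect_music"] = 1
    return params
```

The resulting dictionary can be passed as the `params` argument of `requests.post`.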
Response
🟢 200 OK
The request was successful.
```json
{
  "id": "5f15e81b-53c2-4c5c-a779-1f6776100543",
  "upload_file_size": null,
  "audio_length_seconds": 20.0,
  "sample_rate": null,
  "channels": null,
  "frame_rate": null,
  "mime_type": null,
  "language_code": "eng",
  "warnings": null,
  "diarisation_result": {
    "timeline": [
      {
        "start_time": 0.0,
        "end_time": 2.14,
        "type": "silence"
      },
      {
        "start_time": 2.14,
        "end_time": 10.54,
        "type": "speech",
        "speaker_id": "speaker0",
        "text": "Thank you so much for calling. You're through to Thandi, how can help you?"
      },
      {
        "start_time": 10.54,
        "end_time": 15.98,
        "type": "speech",
        "speaker_id": "speaker1",
        "text": "Hi, Thandi, I would like to settle my bill for yesterday."
      },
      {
        "start_time": 15.98,
        "end_time": 20.0,
        "type": "music"
      }
    ],
    "words": [
      {
        "word": "Thank",
        "start_time": 2.14,
        "end_time": 2.45,
        "confidence": 0.94,
        "weight": null,
        "word_intensity": null,
        "best_path": true,
        "speaker_id": "speaker0"
      },
      {
        "word": "you",
        "start_time": 2.45,
        "end_time": 2.60,
        "confidence": 0.91,
        "weight": null,
        "word_intensity": null,
        "best_path": true,
        "speaker_id": "speaker0"
      }
      // More words would follow here...
    ]
  },
  "transcription_text": "Thank you so much for calling. You're through to Thandi, how can help you? Hi, Thandi, I would like to settle my bill for yesterday.",
  "transcription_status": "COMPLETED",
  "error_message": null,
  "status_datetime": "2025-04-16T10:45:00Z",
  "upload_datetime": "2025-04-16T10:43:12Z"
}
```
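To show how the `timeline` in the response above might be consumed, the sketch below extracts only the speech segments as speaker turns. The helper name `speaker_turns` is hypothetical, the field names are taken from the sample response, and the handling of a null `diarisation_result` is an assumption about what the service returns when diarisation is not enabled.

```python
def speaker_turns(response: dict) -> list:
    """Return (speaker_id, start_time, end_time, text) for each speech segment."""
    # Assumption: diarisation_result may be null or absent when diarise is off.
    result = response.get("diarisation_result") or {}
    turns = []
    for segment in result.get("timeline", []):
        # Skip non-speech segments such as "silence" and "music".
        if segment.get("type") != "speech":
            continue
        turns.append(
            (
                segment.get("speaker_id"),
                segment["start_time"],
                segment["end_time"],
                segment.get("text", ""),
            )
        )
    return turns
```

Called with the parsed JSON body (for example `speaker_turns(resp.json())`), this yields one tuple per speech segment in timeline order.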
Error Responses
🔴 400 Bad Request
The request was malformed or contained invalid data.
🔴 413 Payload Too Large
The file size exceeds the limit.
🔴 415 Unsupported Media Type
The audio file format is invalid.
🟠 401 Unauthorized
The client token is missing or invalid.
🔴 500 Internal Server Error
An unexpected error occurred on the server.
🔴 504 Gateway Timeout Error
We are currently experiencing exceptionally high load. Retry later.
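Because the 504 response asks clients to retry later, a caller may want a small retry loop. The sketch below is one possible approach, not a prescribed client behaviour: the helper name `call_with_retry`, the attempt count, and the exponential delays are all assumptions. Only 504 is retried, since it is the only status this page describes as transient.

```python
import time
from typing import Callable

# Only 504 is documented as "retry later"; other errors are returned as-is.
RETRYABLE_STATUS_CODES = {504}


def call_with_retry(
    send: Callable[[], object],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call send() until it returns a non-retryable status or attempts run out.

    send must return an object with a status_code attribute, such as a
    requests.Response. Delays double after each failed attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        is_last_attempt = attempt == max_attempts
        if response.status_code not in RETRYABLE_STATUS_CODES or is_last_attempt:
            return response
        sleep(base_delay * 2 ** (attempt - 1))
```

The `send` callable should build the whole request each time it is called, including reopening the audio file, so that every attempt uploads the full content.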
Code Examples
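```python
from pprint import pprint

import requests

# Configuration
API_KEY = "<INSERT-TOKEN>"
WAV_FILE = "<INSERT-PATH-TO-AUDIO>"


def file_data(path: str) -> bytes:
    """Read the audio file and return its raw bytes."""
    with open(path, "rb") as f:
        return f.read()


url = "https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/sync/file"

# requests sets the multipart/form-data Content-Type (including the boundary)
# automatically when `files` is passed, so only the token is set here.
headers = {
    "X-CLIENT-TOKEN": API_KEY,
}

# The tuple is (filename, content, MIME type); this example uploads a WAV file.
files = {
    "file": ("audio.wav", file_data(WAV_FILE), "audio/wav"),
}

# Optional query parameters
params = {
    "lang_code": "<INSERT-LANGUAGE-CODE>",
    "diarise": 1,
}

# Processing may take up to 2 minutes 30 seconds, so allow a generous timeout.
resp = requests.post(url, headers=headers, files=files, params=params, timeout=180)
pprint(resp.json())
```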