Speech To Text (STT)

Batch Transcribe

Create Batch

POST https://api.lelapa.ai/v1/transcribe/batch

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

Terminal
curl -X 'POST' \
  'https://api.lelapa.ai/v1/transcribe/batch?lang_code=[INSERT_LANGUAGE_CODE]' \
  -H 'X-CLIENT-TOKEN: [INSERT_TOKEN]'

RESPONSE

🟢 201 Created
The request was successful.

RESPONSE example (201)
{
  "batch_id": "[BATCH_ID]",
  "blob_endpoint": "[BLOB_ENDPOINT]",
  "sas_token": "[SAS_TOKEN]"
}
🟠 401 Unauthorized
The client token is missing or invalid.
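
As a rough illustration, the Create Batch call could be prepared from Python's standard library along the following lines. The helper names `build_create_batch_request` and `parse_create_batch_response` are hypothetical and not part of any Vulavula SDK; the URL, header name, and response fields are taken from this section. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the endpoint line above.
API_BASE = "https://api.lelapa.ai/v1"


def build_create_batch_request(token, lang_code=None):
    """Build (but do not send) the POST request that creates a batch.

    Hypothetical helper for illustration only.
    """
    url = f"{API_BASE}/transcribe/batch"
    if lang_code:
        # lang_code is passed as a query parameter, as in the curl example.
        url += "?" + urllib.parse.urlencode({"lang_code": lang_code})
    return urllib.request.Request(
        url, method="POST", headers={"X-CLIENT-TOKEN": token}
    )


def parse_create_batch_response(body):
    """Extract the three fields shown in the response example."""
    data = json.loads(body)
    return data["batch_id"], data["blob_endpoint"], data["sas_token"]
```

Passing the returned object to `urllib.request.urlopen` would send it; the response body can then be handed to `parse_create_batch_response`.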

Upload Files

This command batch-uploads all files in the current folder ("-s .") to the destination container ("-d ."), using the blob_endpoint and sas_token returned by Create Batch.

Terminal
az storage blob upload-batch -s . -d . \
  --blob-endpoint "<blob_endpoint>" \
  --sas-token "<sas_token>"
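
If the upload needs to be scripted, one possible approach is to assemble the same `az` command in Python and run it with `subprocess`. This is a minimal sketch that assumes the Azure CLI is installed and on the PATH; `build_upload_command` and `upload_files` are hypothetical helper names, and the flags are exactly those from the command above.

```python
import subprocess


def build_upload_command(blob_endpoint, sas_token, source_dir="."):
    """Return the argv list for the `az storage blob upload-batch` call above."""
    return [
        "az", "storage", "blob", "upload-batch",
        "-s", source_dir,  # local folder holding the audio files
        "-d", ".",         # destination, as in the command above
        "--blob-endpoint", blob_endpoint,
        "--sas-token", sas_token,
    ]


def upload_files(blob_endpoint, sas_token, source_dir="."):
    """Run the upload; raises CalledProcessError if the CLI exits non-zero."""
    command = build_upload_command(blob_endpoint, sas_token, source_dir)
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with the SAS token, which typically contains `&` and `=` characters.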

Process Batch

POST https://api.lelapa.ai/api/v1/transcribe/batch/{batch_id}/process?diarise=1&lang_code=[INSERT_LANGUAGE_CODE]

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

PARAMS

diarise int optional

Boolean flag that enables diarisation. Set to 1 to enable diarisation. Set to 0 or omit the parameter to disable it.

lang_code string optional

The language code used to transcribe the uploaded audio. It selects which language model is applied. The following language codes are valid:

Afrikaans - "afr"
isiZulu - "zul"
Sesotho - "sot"
South African English - "eng"
African French - "fra"
Code-switched isiZulu (alpha) - "cs-zul"

If no language code is specified, our built-in language ID will select the most probable language.

Terminal
curl -X 'POST' \
  'https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/batch/{batch_id}/process?diarise=1' \
  -H 'X-CLIENT-TOKEN: [INSERT_TOKEN]'

RESPONSE

🟢 202 Accepted
The request was successful.

🟢 204 No Content
The batch has already been processed.

🟠 401 Unauthorized
The client token is missing or invalid.
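
The query parameters and status codes above could be handled with something like the following sketch. Note that the endpoint line and the curl example in this section use different base URLs, so the base is left configurable; the default below is simply the one from the endpoint line. `build_process_url` and `describe_process_status` are hypothetical names used for illustration.

```python
import urllib.parse

# Language codes listed in this section.
VALID_LANG_CODES = {"afr", "zul", "sot", "eng", "fra", "cs-zul"}

# Assumption: base URL copied from the endpoint line above. The curl example
# uses a different host and version, so callers may need to override this.
DEFAULT_BASE = "https://api.lelapa.ai/api/v1"


def build_process_url(batch_id, diarise=False, lang_code=None, base=DEFAULT_BASE):
    """Build the Process Batch URL with its optional query parameters."""
    if lang_code is not None and lang_code not in VALID_LANG_CODES:
        raise ValueError(f"unsupported lang_code: {lang_code!r}")
    params = {}
    if diarise:
        params["diarise"] = 1  # omitted entirely when diarisation is off
    if lang_code is not None:
        params["lang_code"] = lang_code
    path = f"{base}/transcribe/batch/{urllib.parse.quote(batch_id, safe='')}/process"
    return path + ("?" + urllib.parse.urlencode(params) if params else "")


def describe_process_status(status_code):
    """Map the documented status codes to a short description."""
    return {
        202: "accepted for processing",
        204: "batch has already been processed",
        401: "client token is missing or invalid",
    }.get(status_code, "unexpected status")
```

Leaving `lang_code` as `None` omits the parameter, in which case the built-in language ID described above selects the language.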

Get All Customer Batch Transcriptions

GET https://api.lelapa.ai/v1/transcribe/batch

This endpoint retrieves all batch transcriptions for a customer.

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

RESPONSE

🟢 200 OK
The request was successful.

RESPONSE example (200)
[
  {
    "id": "[batch_id]",
    "keychain_id": "00000000-0000-0000-0000-000000000000",
    "status": "PROCESSING",
    "language_code": null,
    "created_at": "2025-02-10T09:47:31.716555",
    "updated_at": "2025-02-10T09:50:20.596171"
  },
  {
    "id": "[batch_id]",
    "keychain_id": "00000000-0000-0000-0000-000000000000",
    "status": "COMPLETE",
    "language_code": null,
    "created_at": "2025-02-07T15:57:53.059573",
    "updated_at": "2025-02-07T16:00:42.100059"
  }
]
🟠 401 Unauthorized
The client token is missing or invalid.
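
Once the list has been decoded from JSON, a client would typically pick out the batches in a given state. A minimal sketch follows, assuming only the field names and the `PROCESSING` / `COMPLETE` status values visible in the example response; `batches_with_status` is a hypothetical helper.

```python
from datetime import datetime


def batches_with_status(batches, status):
    """Return the batches whose status matches, newest first.

    `batches` is the decoded JSON list from the endpoint above.
    """
    matching = [batch for batch in batches if batch["status"] == status]
    return sorted(
        matching,
        key=lambda batch: datetime.fromisoformat(batch["created_at"]),
        reverse=True,
    )
```

For example, `batches_with_status(batches, "COMPLETE")` yields the batches whose transcriptions can then be fetched by ID.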

Get Batch Transcription Results

GET https://api.lelapa.ai/v1/transcribe/batch/{batch_id}/transcriptions

This endpoint retrieves the transcriptions for a batch.

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

PARAMS

batch_id string required

The id of the batch to retrieve transcriptions for.

RESPONSE

🟢 200 OK
The request was successful.

RESPONSE example (200)
[
  {
    "id": "5f15e81b-53c2-4c5c-a779-1f6776100543",
    "upload_file_size": null,
    "audio_length_seconds": 20.0,
    "sample_rate": null,
    "channels": null,
    "frame_rate": null,
    "mime_type": null,
    "language_code": "eng",
    "warnings": null,
    "diarisation_result": {
      "timeline": [
        { "start_time": 0.0, "end_time": 2.14, "type": "silence" },
        {
          "start_time": 2.14,
          "end_time": 10.54,
          "type": "speech",
          "speaker_id": "speaker0",
          "text": "Thank you so much for calling. You're through to Thandi, how can help you?"
        },
        {
          "start_time": 10.54,
          "end_time": 15.98,
          "type": "speech",
          "speaker_id": "speaker1",
          "text": "Hi, Thandi, I would like to settle my bill for yesterday."
        },
        { "start_time": 15.98, "end_time": 20.0, "type": "music" }
      ],
      "words": [
        {
          "word": "Thank",
          "start_time": 2.14,
          "end_time": 2.45,
          "confidence": 0.94,
          "weight": null,
          "word_intensity": null,
          "best_path": true,
          "speaker_id": "speaker0"
        },
        {
          "word": "you",
          "start_time": 2.45,
          "end_time": 2.60,
          "confidence": 0.91,
          "weight": null,
          "word_intensity": null,
          "best_path": true,
          "speaker_id": "speaker0"
        }
        // More words would follow here...
      ]
    },
    "transcription_text": "Thank you so much for calling. You're through to Thandi, how can help you? Hi, Thandi, I would like to settle my bill for yesterday.",
    "transcription_status": "COMPLETED",
    "error_message": null,
    "status_datetime": "2025-04-16T10:45:00Z",
    "upload_datetime": "2025-04-16T10:43:12Z"
  }
]
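
The `diarisation_result.timeline` array in the example above lends itself to a speaker-labelled transcript. The sketch below is one way to render it, assuming the segment fields shown in the example (`type`, `start_time`, `end_time`, `speaker_id`, `text`); `format_timeline` is a hypothetical helper, not part of the API.

```python
def format_timeline(timeline):
    """Render speech segments as '[start-end] speaker: text' lines."""
    lines = []
    for segment in timeline:
        if segment.get("type") != "speech":
            continue  # skip non-speech segments such as silence and music
        lines.append(
            f"[{segment['start_time']:.2f}-{segment['end_time']:.2f}] "
            f"{segment['speaker_id']}: {segment['text']}"
        )
    return lines
```

Applied to the example response, this would produce one line per speech segment, for instance a line starting with `[2.14-10.54] speaker0:`.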
🟠 401 Unauthorized
The client token is missing or invalid.

🟠 404 Not Found
The batch with the given ID does not exist.

Transcribe Usage Report

GET https://api.lelapa.ai/v1/transcribe/usage

This endpoint retrieves the usage report for the transcribe service.

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

PARAMS

start_date_time string required

ISO formatted date string (YYYY-MM-DDTHH:MM:SSZ). The start date of the report.

end_date_time string required

ISO formatted date string (YYYY-MM-DDTHH:MM:SSZ). The end date of the report.

interval string required

The interval to group the usage report by. Valid values are minute, hour, and day.

RESPONSE

🟢 200 OK
The request was successful.

RESPONSE example (200)
{
  "usage": [
    {
      "period": "2025-02-16T00:00:00",
      "invocations": 0,
      "successful_invocations": 0,
      "failed_invocations": 0,
      "pending_invocations": 0,
      "seconds_transcribed": 0.0,
      "tokens": 0
    }
  ],
  "total_invocations": 0,
  "total_successful_invocations": 0,
  "total_failed_invocations": 0,
  "total_pending_invocations": 0,
  "total_seconds_transcribed": 0,
  "total_tokens": 0,
  "start_date_time": "2025-02-03T12:00:00+00:00",
  "end_date_time": "2025-02-15T14:00:01+00:00",
  "interval": "day"
}
🟠 401 Unauthorized
The client token is missing or invalid.
🟠 400 Bad Request
Invalid date format. Use ISO format (YYYY-MM-DDTHH:MM)
Start date must be before end date
Invalid interval. Use one of: 'minute', 'hour', 'day'
Minute interval only supports a range of 2 hours or less
Hour interval only supports a range of 2 days or less
Day interval only supports a range of 60 days or less
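
The range limits listed in the 400 responses can be checked on the client before a request is sent. The following is a minimal sketch under the assumption that the caller passes UTC datetimes; `build_usage_url` is a hypothetical helper, and the limits and parameter names are taken from this section.

```python
import urllib.parse
from datetime import datetime, timedelta

USAGE_URL = "https://api.lelapa.ai/v1/transcribe/usage"

# Maximum report range per interval, from the 400 error messages above.
MAX_RANGE = {
    "minute": timedelta(hours=2),
    "hour": timedelta(days=2),
    "day": timedelta(days=60),
}


def build_usage_url(start, end, interval):
    """Validate the arguments locally and build the usage report URL.

    Assumption: `start` and `end` are datetimes already expressed in UTC.
    """
    if interval not in MAX_RANGE:
        raise ValueError("Invalid interval. Use one of: 'minute', 'hour', 'day'")
    if start >= end:
        raise ValueError("Start date must be before end date")
    if end - start > MAX_RANGE[interval]:
        raise ValueError(f"Range too large for the {interval!r} interval")
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # YYYY-MM-DDTHH:MM:SSZ, as documented above
    query = urllib.parse.urlencode({
        "start_date_time": start.strftime(fmt),
        "end_date_time": end.strftime(fmt),
        "interval": interval,
    })
    return f"{USAGE_URL}?{query}"
```

For instance, `build_usage_url(datetime(2025, 2, 3, 12), datetime(2025, 2, 15, 14), "day")` passes validation, whereas the same range with `"hour"` raises `ValueError` because it exceeds two days.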