Speech To Text (STT)

Batch Transcribe

Transcribe V1 has been deprecated!

The Transcribe V1 endpoint has been deprecated and will be removed in a future release. This documentation now references the newer v2alpha API, which includes improvements and ongoing support. If you are still using V1, we recommend migrating to v2alpha as soon as possible to ensure continued functionality. See the API Reference.

Create Batch

POST https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/batch?lang_code=[INSERT-LANGUAGE-CODE]

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

PARAMS

lang_code string optional

The language code of the model used to transcribe the uploaded audio. The following language codes are valid:

Afrikaans - "afr"
isiZulu - "zul"
Sesotho - "sot"
South African English - "eng"
African French - "fra"

If no language code is specified, our built-in language ID will select the most probable language.

curl -X 'POST' \
    'https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/batch?lang_code=[INSERT-LANGUAGE-CODE]' \
    -H 'X-CLIENT-TOKEN: <INSERT_TOKEN>'

RESPONSE

🟢 201 Created
The request was successful.

RESPONSE example (201)
{
     "batch_id": "[BATCH_ID]",
     "blob_endpoint": "[BLOB_ENDPOINT]",
     "sas_token": "[SAS_TOKEN]"
 }
🟠 401 Unauthorized
The client token is missing or invalid.
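
The Create Batch call can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_create_batch_request` and `parse_create_batch_response` are illustrative and not part of the Vulavula API, and the response fields are assumed to match the example above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/batch"


def build_create_batch_request(token, lang_code=None):
    """Build (but do not send) the POST request that creates a batch.

    Hypothetical helper: the name and signature are illustrative only.
    """
    url = BASE_URL
    if lang_code is not None:
        # lang_code is optional; omit it to let language ID pick the language.
        url += "?" + urllib.parse.urlencode({"lang_code": lang_code})
    return urllib.request.Request(
        url, method="POST", headers={"X-CLIENT-TOKEN": token}
    )


def parse_create_batch_response(body):
    """Return (batch_id, blob_endpoint, sas_token) from the response body."""
    data = json.loads(body)
    return data["batch_id"], data["blob_endpoint"], data["sas_token"]
```

To send the request, pass the result of `build_create_batch_request(token, "zul")` to `urllib.request.urlopen` and hand the response body to `parse_create_batch_response`. A missing or invalid token raises `urllib.error.HTTPError` with code 401.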

Upload Files

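Upload the audio files with the Azure CLI, using the blob_endpoint and sas_token returned by Create Batch. The command below uploads every file in the current folder ("-s .") to the destination container ("-d .").

az storage blob upload-batch -s . -d . \
    --blob-endpoint "<blob_endpoint>" \
    --sas-token "<sas_token>"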
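
When the upload is scripted, the same command can be assembled from the Create Batch response. This is a rough sketch that assumes the Azure CLI (`az`) is installed and on the PATH; `build_upload_command` and `upload_batch` are hypothetical helper names, and only the flags shown in the command above are used.

```python
import subprocess


def build_upload_command(source_dir, blob_endpoint, sas_token, destination="."):
    """Return the az CLI argument list for a batch upload.

    Hypothetical helper; it mirrors the flags of the documented command.
    """
    return [
        "az", "storage", "blob", "upload-batch",
        "-s", source_dir,
        "-d", destination,
        "--blob-endpoint", blob_endpoint,
        "--sas-token", sas_token,
    ]


def upload_batch(source_dir, blob_endpoint, sas_token):
    """Run the upload; raises CalledProcessError if az exits non-zero."""
    subprocess.run(
        build_upload_command(source_dir, blob_endpoint, sas_token), check=True
    )
```

Passing the arguments as a list rather than a shell string avoids quoting problems, since SAS tokens typically contain characters such as `&` and `=`.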

Process Batch

POST https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/batch/{batch_id}/process?diarise=1

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

PARAMS

diarise int optional

Boolean flag that enables diarisation. Set to 1 to enable diarisation; set to 0 or omit the parameter to disable it.

curl -X 'POST' \
    'https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/batch/{batch_id}/process?diarise=1' \
    -H 'X-CLIENT-TOKEN: <INSERT_TOKEN>'

RESPONSE

🟢 202 Accepted
The request was successful.

🟢 204 No Content
The batch has already been processed.

🟠 401 Unauthorized
The client token is missing or invalid.
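
A client usually needs to tell the three documented outcomes apart. The sketch below shows one possible way to do that in Python; the function names and the short outcome labels are illustrative assumptions, and only the status codes listed above are interpreted.

```python
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/batch"


def build_process_url(batch_id, diarise=False):
    """Build the Process Batch URL, adding diarise=1 only when requested."""
    path = f"{BASE_URL}/{urllib.parse.quote(batch_id, safe='')}/process"
    return f"{path}?diarise=1" if diarise else path


def describe_process_status(status):
    """Map a documented status code to a short label (labels are illustrative)."""
    labels = {202: "accepted", 204: "already processed", 401: "unauthorized"}
    return labels.get(status, "unexpected")


def process_batch(token, batch_id, diarise=False):
    """Send the request and return the outcome label."""
    req = urllib.request.Request(
        build_process_url(batch_id, diarise),
        method="POST",
        headers={"X-CLIENT-TOKEN": token},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return describe_process_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses, so 401 arrives here.
        return describe_process_status(err.code)
```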

Get All Customer Batch Transcriptions

GET https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/batch

This endpoint retrieves all batch transcriptions for a customer.

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

RESPONSE

🟢 200 OK
The request was successful.

RESPONSE example (200)
[
    {
        "id": "[batch_id]",
        "customer_id": 5,
        "project_id": null,
        "keychain_id": null,
        "status": "PROCESSING",
        "language_code": null,
        "created_at": "2025-02-10T09:47:31.716555",
        "updated_at": "2025-02-10T09:50:20.596171"
    },
    {
        "id": "[batch_id]",
        "customer_id": 24,
        "project_id": null,
        "keychain_id": null,
        "status": "COMPLETE",
        "language_code": null,
        "created_at": "2025-02-07T15:57:53.059573",
        "updated_at": "2025-02-07T16:00:42.100059"
    }
]
🟠 401 Unauthorized
The client token is missing or invalid.
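
The list response lends itself to simple filtering, for example to find batches that have finished. The following sketch works on the already-decoded JSON list; the helper names are hypothetical, and `PROCESSING` and `COMPLETE` are the only status values taken from the example above, so other values may exist.

```python
from datetime import datetime


def batches_with_status(batches, status):
    """Return the ids of all batches whose status equals the given value."""
    return [batch["id"] for batch in batches if batch["status"] == status]


def newest_first(batches):
    """Sort batches by created_at, most recent first."""
    return sorted(
        batches,
        key=lambda batch: datetime.fromisoformat(batch["created_at"]),
        reverse=True,
    )
```

For example, `batches_with_status(batches, "COMPLETE")` yields the ids that can be passed to the transcription results endpoint.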

Get Batch Transcription Results

GET https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/batch/{batch_id}/transcriptions

This endpoint retrieves the transcriptions for a batch.

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

PARAMS

batch_id string required

The id of the batch to retrieve transcriptions for.

RESPONSE

🟢 200 OK
The request was successful.

RESPONSE example (200)
[
  {
    "id": "5f15e81b-53c2-4c5c-a779-1f6776100543",
    "upload_file_size": null,
    "audio_length_seconds": 20.0,
    "sample_rate": null,
    "channels": null,
    "frame_rate": null,
    "mime_type": null,
    "language_code": "eng",
    "warnings": null,
    "diarisation_result": {
      "timeline": [
        {
          "start_time": 0.0,
          "end_time": 2.14,
          "type": "silence"
        },
        {
          "start_time": 2.14,
          "end_time": 10.54,
          "type": "speech",
          "speaker_id": "speaker0",
          "text": "Thank you so much for calling. You're through to Thandi, how can help you?"
        },
        {
          "start_time": 10.54,
          "end_time": 15.98,
          "type": "speech",
          "speaker_id": "speaker1",
          "text": "Hi, Thandi, I would like to settle my bill for yesterday."
        },
        {
          "start_time": 15.98,
          "end_time": 20.0,
          "type": "music"
        }
      ],
      "words": [
        {
          "word": "Thank",
          "start_time": 2.14,
          "end_time": 2.45,
          "confidence": 0.94,
          "weight": null,
          "word_intensity": null,
          "best_path": true,
          "speaker_id": "speaker0"
        },
        {
          "word": "you",
          "start_time": 2.45,
          "end_time": 2.60,
          "confidence": 0.91,
          "weight": null,
          "word_intensity": null,
          "best_path": true,
          "speaker_id": "speaker0"
        }
        // More words would follow here...
      ]
    },
    "transcription_text": "Thank you so much for calling. You're through to Thandi, how can help you? Hi, Thandi, I would like to settle my bill for yesterday.",
    "transcription_status": "COMPLETED",
    "error_message": null,
    "status_datetime": "2025-04-16T10:45:00Z",
    "upload_datetime": "2025-04-16T10:43:12Z"
  }
]
🟠 401 Unauthorized
The client token is missing or invalid.

🟠 404 Not Found
The batch with the given ID does not exist.
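
The `timeline` array in the response example mixes speech, silence, and music segments. As a sketch of how a client might pull out just the speaker turns, the code below walks one decoded result object. The function names are illustrative, and treating a null `diarisation_result` as "no turns" is an assumption about responses where diarisation was not enabled.

```python
def speaker_turns(result):
    """Return (speaker_id, start_time, end_time, text) for each speech segment."""
    # Assumption: diarisation_result may be null when diarisation is disabled.
    diarisation = result.get("diarisation_result") or {}
    turns = []
    for segment in diarisation.get("timeline", []):
        # Silence and music segments carry no speaker_id or text.
        if segment.get("type") == "speech":
            turns.append(
                (
                    segment["speaker_id"],
                    segment["start_time"],
                    segment["end_time"],
                    segment["text"],
                )
            )
    return turns


def speech_seconds(result):
    """Total duration, in seconds, of all speech segments."""
    return sum(end - start for _, start, end, _ in speaker_turns(result))
```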

Transcribe Usage Report

GET https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/usage

This endpoint retrieves the usage report for the transcribe service.

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

PARAMS

start_date_time string required

ISO formatted date string (YYYY-MM-DDTHH:MM:SSZ). The start date of the report.

end_date_time string required

ISO formatted date string (YYYY-MM-DDTHH:MM:SSZ). The end date of the report.

interval string required

The interval to group the usage report by. Valid values are minute, hour, and day.

RESPONSE

🟢 200 OK
The request was successful.

RESPONSE example (200)
{
    "usage": [
        {
            "period": "2025-02-16T00:00:00",
            "invocations": 0,
            "successful_invocations": 0,
            "failed_invocations": 0,
            "pending_invocations": 0,
            "seconds_transcribed": 0.0,
            "tokens": 0
        }
    ],
    "total_invocations": 0,
    "total_successful_invocations": 0,
    "total_failed_invocations": 0,
    "total_pending_invocations": 0,
    "total_seconds_transcribed": 0,
    "total_tokens": 0,
    "start_date_time": "2025-02-03T12:00:00+00:00",
    "end_date_time": "2025-02-15T14:00:01+00:00",
    "interval": "day"
}
🟠 401 Unauthorized
The client token is missing or invalid.

🟠 400 Bad Request
Possible causes:
Invalid date format. Use ISO format (YYYY-MM-DDTHH:MM)
Start date must be before end date
Invalid interval. Use one of: 'minute', 'hour', 'day'
Minute interval only supports a range of 2 hours or less
Hour interval only supports a range of 2 days or less
Day interval only supports a range of 60 days or less
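
The range limits listed under 400 Bad Request can be checked on the client before the request is sent. The sketch below is one way to do that, assuming timezone-aware `datetime` inputs and the YYYY-MM-DDTHH:MM:SSZ format given for the parameters; `build_usage_url` and `MAX_RANGE` are illustrative names, and the server remains the authority on what it accepts.

```python
import urllib.parse
from datetime import datetime, timedelta, timezone

USAGE_URL = "https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/usage"

# Maximum report range per interval, taken from the 400 error messages.
MAX_RANGE = {
    "minute": timedelta(hours=2),
    "hour": timedelta(days=2),
    "day": timedelta(days=60),
}


def build_usage_url(start, end, interval):
    """Validate the arguments locally and return the usage report URL."""
    if interval not in MAX_RANGE:
        raise ValueError("Invalid interval. Use one of: 'minute', 'hour', 'day'")
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware datetimes")
    if start >= end:
        raise ValueError("Start date must be before end date")
    if end - start > MAX_RANGE[interval]:
        raise ValueError(f"Range too long for interval '{interval}'")
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # rendered in UTC, hence the trailing Z
    query = urllib.parse.urlencode(
        {
            "start_date_time": start.astimezone(timezone.utc).strftime(fmt),
            "end_date_time": end.astimezone(timezone.utc).strftime(fmt),
            "interval": interval,
        }
    )
    return f"{USAGE_URL}?{query}"
```

The returned URL is then requested with `GET` and the `X-CLIENT-TOKEN` header, as for the other endpoints.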