Vulavula Transcribe

Sync Transcription

POST https://vulavula-services.lelapa.ai/api/v1/transcribe/sync

The sync endpoint enables users to run transcription and receive results immediately. You submit an audio file as a blob and wait synchronously for the result, which is returned once the HTTP request completes.

BODY PARAMS

file_name string required

The name of the file. Note: this parameter is expected to change soon.

audio_blob Blob/string of bytes required

Blob or string of bytes of the file. The string of bytes should be base64 encoded.

file_size int64 required

The size of the file. The file size should not exceed 1GB.

This parameter is no longer used but is still required. You can pass a default value such as 0.

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

import base64

import requests

# File path
FILE_TO_TRANSCRIBE = "<<AUDIO FILE PATH>>"

# Open the file in binary mode and read its content
with open(FILE_TO_TRANSCRIBE, "rb") as file:
    file_content = file.read()

# Base64-encode the content, then decode the resulting bytes to a string
encoded_string = base64.b64encode(file_content).decode("utf-8")

request_body_json = {
    "file_name": FILE_TO_TRANSCRIBE,
    "audio_blob": encoded_string,
    "file_size": 0,  # no longer used, but still required
}

headers = {
    "X-CLIENT-TOKEN": "<INSERT_TOKEN>",
}

response = requests.post(
    "https://vulavula-services.lelapa.ai/api/v1/transcribe/sync",
    json=request_body_json,
    headers=headers,
)

# Handling the response

# Get the status code
print(f"Status Code: {response.status_code}")

# If the response is in JSON format, you can get it as a dictionary
try:
    response_json = response.json()
    print(f"Response JSON: {response_json}")
except ValueError:
    print("Response is not in JSON format")

RESPONSES

🟢 200 OK
The request was successful.

RESPONSE example (200)
{
    "message": "success",
    "text": "sikubingelele esikomkene ndabenzehoranokqale mini sibingelelenakuwe ska holele ni lotholhlalo zohlakazi at itithi lezindabani",
    "language_id": "sot"
}

RESPONSE BODY PARAMS

Object
message string required.
A status message indicating the success of the transcription request.
text string required.
The transcribed text of the provided audio file.
language_id string required.
The language code representing the detected language of the transcription, in this case, "sot" for Sesotho.
🔴 400 Bad Request
The request was malformed or contained invalid data.
🟠 401 Unauthorized
The client token is missing or invalid.
🔴 500 Internal Server Error
An unexpected error occurred on the server.
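
The status codes above can be folded into one small helper so that calling code handles each case explicitly. This is a sketch only: the function name `describe_sync_response` and the fallback message for undocumented codes are assumptions, not part of the Vulavula API. The status codes, their descriptions, and the `text` field come from the response documentation above.

```python
from typing import Optional

# Status codes documented for the sync endpoint, with their documented meanings.
SYNC_ERRORS = {
    400: "The request was malformed or contained invalid data.",
    401: "The client token is missing or invalid.",
    500: "An unexpected error occurred on the server.",
}


def describe_sync_response(status_code: int, body: Optional[dict]) -> str:
    """Return the transcript on 200, otherwise a readable error description.

    Hypothetical helper: the name and the fallback text are illustrative.
    """
    if status_code == 200 and body is not None:
        return body.get("text", "")
    # Fall back to a generic message for codes the documentation does not list.
    reason = SYNC_ERRORS.get(status_code, "Unexpected status code.")
    return f"Error {status_code}: {reason}"
```

With the `requests` example above, this could be called as `describe_sync_response(response.status_code, response_json)`.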

Fast Transcribe endpoint

POST https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/fast

FORM DATA

upload string of bytes required

String of bytes of the file.

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

CONTENT-TYPE string required

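Set Content-Type to multipart/form-data. HTTP clients such as Python's requests set this header automatically, including the boundary, when the file is passed through the files argument.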

PARAMS

lang_code string optional

The language code used to transcribe the uploaded audio.

Optionally, specify a language code to select which model transcribes your audio. The following language codes are valid:

Afrikaans - "afr"
isiZulu - "zul"
Sesotho - "sot"
South African English - "eng"
African French - "fra"

If no language code is specified, our built-in language ID will select the most probable language.
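
Checking the language code on the client before sending avoids a round trip for a mistyped value. The sketch below is a hypothetical helper (`build_lang_params` is not part of the API); the set of codes is copied from the list above, and normalising case and whitespace on the client is an assumption about what is convenient, not documented server behaviour.

```python
from typing import Optional

# Language codes listed above for the fast transcribe endpoint.
VALID_LANG_CODES = {"afr", "zul", "sot", "eng", "fra"}


def build_lang_params(lang_code: Optional[str] = None) -> dict:
    """Build the query parameters for the request.

    Returns an empty dict when no code is given, so that the service's
    built-in language ID selects the language.
    """
    if lang_code is None:
        return {}
    # Assumption: trimming and lowercasing is done client-side for convenience.
    code = lang_code.strip().lower()
    if code not in VALID_LANG_CODES:
        raise ValueError(f"Unsupported language code: {lang_code!r}")
    return {"lang_code": code}
```

The returned dict can be passed directly as `params` to `requests.post`.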

from pprint import pprint

import requests

# Set these
API_KEY = "<INSERT-TOKEN>"
WAV_FILE = "<INSERT-PATH-TO-AUDIO>"


def file_data(path: str) -> bytes:
    """Read the audio file as raw bytes."""
    with open(path, "rb") as f:
        return f.read()


url = "https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/fast"

headers = {
    "X-CLIENT-TOKEN": API_KEY,
}

# requests sets the multipart/form-data Content-Type header for `files`
files = {
    "upload": ("fra_colours.wav", file_data(WAV_FILE), "audio/wav"),
}

# If the language code is known, set it
params = {
    "lang_code": "<INSERT-LANGUAGE-CODE>",
}

resp = requests.post(url, headers=headers, files=files, params=params)

pprint(resp.json())

RESPONSE

🟢 200 OK
The request was successful.

RESPONSE example (200)
{
    "text": "sikubingelele esikomkene ndabenzehoranokqale mini sibingelelenakuwe ska holele ni lotholhlalo zohlakazi at itithi lezindabani",
    "lang_code": "sot",
    "diarisation": {}
}

RESPONSE BODY PARAMS

Object
text string required.
The transcribed text of the provided audio file.
lang_code string required.
The language code representing the detected/provided language of the transcription, in this case, "sot" for Sesotho.
diarisation dict required.
Diarisation data when available.
🟠 401 Unauthorized
The client token is missing or invalid.
🔴 415 Unsupported Media Type
Not a valid WAV file.
🔴 500 Internal Server Error
An unexpected error occurred on the server.
🔴 503 Service Unavailable
The request timed out. Please try again.
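
Because a 503 response means the request timed out and can be retried, a small retry wrapper is one way to handle it. The sketch below is an assumption-laden illustration: `transcribe_with_retry`, its parameters, and the default attempt count and delay are all hypothetical and not part of the Vulavula API. The `send` callable stands in for whatever performs the actual POST and returns the status code together with the parsed body.

```python
import time
from typing import Callable, Tuple


def transcribe_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 3,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call `send` until it returns a status other than 503 or attempts run out.

    Hypothetical helper: the retry policy (fixed delay, three attempts) is an
    illustrative choice, not a documented recommendation.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = send()
    for _ in range(max_attempts - 1):
        if status != 503:
            break
        # Wait before retrying the timed-out request.
        sleep(delay_seconds)
        status, body = send()
    return status, body
```

Here `send` would wrap the `requests.post` call shown above. Only 503 is retried; 401 and 415 indicate problems with the request itself, so repeating them unchanged would not help.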

Transcribe SDK method

Transcribe an audio file by specifying the file path, with an optional webhook URL for asynchronous result delivery. The maximum audio length supported is 30 minutes.

This method simplifies the process by combining both the upload and transcription steps into a single, easy-to-use SDK method.

pdm add vulavula

from vulavula import VulavulaClient

client = VulavulaClient("<INSERT_TOKEN>")

upload_id, transcription_result = client.transcribe(
    "path/to/your/audio/file.wav",
    webhook="<INSERT_URL>",
)
# A success message; the transcription data is sent to the webhook
print("Transcription Submit Success:", transcription_result)

Get Transcribed Text

This method polls the server for the transcribed text. It is useful if you do not want to use a webhook.

pdm add vulavula

import time

# Reuses `client` and `upload_id` from the transcribe example above
while client.get_transcribed_text(upload_id).get("message") == "Item has not been processed.":
    time.sleep(5)
    print("Waiting for transcribe to complete...")

print(client.get_transcribed_text(upload_id))
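
The polling loop above runs forever if processing never completes. A bounded variant is sketched below under stated assumptions: `wait_for_transcript`, its parameters, and the default interval and timeout are hypothetical; only the pending message string is taken from the example above. The shape of the completed response is not documented here, so the helper returns it unchanged.

```python
import time
from typing import Callable

# Message returned while the item is still queued, as used in the loop above.
PENDING_MESSAGE = "Item has not been processed."


def wait_for_transcript(
    fetch: Callable[[], dict],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll `fetch` until the result is no longer pending or the timeout passes.

    Hypothetical helper: the timeout and interval defaults are illustrative.
    """
    deadline = clock() + timeout_seconds
    while True:
        result = fetch()
        if result.get("message") != PENDING_MESSAGE:
            return result
        if clock() >= deadline:
            raise TimeoutError("Transcription did not finish before the timeout")
        sleep(interval_seconds)
```

With the SDK client, this could be called as `wait_for_transcript(lambda: client.get_transcribed_text(upload_id))`.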

File upload

POST https://vulavula-services.lelapa.ai/api/v1/transport/file-upload

This endpoint uploads the recording file to our Azure storage. It currently accepts one file at a time and does not support bulk uploads; this will be updated in the near future.

BODY PARAMS

file_name string required

The name of the file. Note: this parameter is expected to change soon.

audio_blob Blob/string of bytes required

Blob or string of bytes of the file. The string of bytes should be base64 encoded.

file_size int64 required

The size of the file. The file size should not exceed 1GB.

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

pdm add vulavula

from vulavula import VulavulaClient

client = VulavulaClient("<INSERT_TOKEN>")

upload_result = client.upload_file("path/to/your/file")
print(upload_result)

RESPONSE example (200)

{
    "upload_id": "240a180d-553f-456f-912e-14cdc628869c",
    "customer_id": 5,
    "project_id": 5
}

RESPONSES

OBJECT Response from uploading

200 OK

upload_id string This is the file's upload id

400 Bad request

status_code int64 Response status code

detail string Details about the error

401 Unauthorized
404 Not found
500 Internal Server Error
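
For callers who use the HTTP endpoint directly rather than the SDK, the body parameters above can be assembled as follows. This is a sketch under assumptions: `build_upload_body` is a hypothetical helper, using the base name of the path as `file_name` is an illustrative choice, and interpreting the 1GB limit as 2**30 bytes is a guess at the intended unit.

```python
import base64
import os

UPLOAD_URL = "https://vulavula-services.lelapa.ai/api/v1/transport/file-upload"

# Assumption: the documented 1GB limit is read here as 2**30 bytes.
MAX_FILE_SIZE = 1024 ** 3


def build_upload_body(path: str) -> dict:
    """Build the JSON body for the file-upload endpoint from a local file."""
    size = os.path.getsize(path)
    if size > MAX_FILE_SIZE:
        raise ValueError(f"File is {size} bytes, which exceeds the 1GB limit")
    with open(path, "rb") as f:
        # The endpoint expects the file content as a base64-encoded string.
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return {
        "file_name": os.path.basename(path),
        "audio_blob": encoded,
        "file_size": size,
    }
```

The body would then be posted as JSON to `UPLOAD_URL` with the `X-CLIENT-TOKEN` header, for example with `requests.post(UPLOAD_URL, json=build_upload_body(path), headers=headers)`.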

Process

POST https://vulavula-services.lelapa.ai/api/v1/transcribe/process/{upload_id}

This endpoint triggers the transcription process for the uploaded file identified by upload_id.

BODY PARAMS

webhook string optional

The webhook URL to which transcription results and errors are sent.

language_code string optional

The language code used to transcribe the uploaded audio.

Optionally, specify a language code to select which model transcribes your audio. The following language codes are valid:

Afrikaans - "afr"
isiZulu - "zul"
Sesotho - "sot"
South African English - "eng"

If no language code is specified, our built-in language ID will select the most probable language.

HEADERS

X-CLIENT-TOKEN string required

API token generated for the project.

pdm add vulavula

from vulavula import VulavulaClient

client = VulavulaClient("<INSERT_TOKEN>")

transcription_result = client.transcribe_process("<upload_id>", "<webhook>")
print(transcription_result)  # the transcribed message is sent to the webhook

RESPONSE example (200)

{"message": "Message sent to process queue."}

RESPONSES

OBJECT Response from processing

200 OK

message string Message to indicate that process was triggered successfully

401 Unauthorized
500 Internal Server Error
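
When calling the process endpoint over HTTP instead of through the SDK, the URL and body can be built as sketched below. `build_process_request` is a hypothetical helper, and sending the optional parameters as a JSON body with unset values omitted is an assumption based on the BODY PARAMS listing above.

```python
from typing import Optional, Tuple
from urllib.parse import quote

PROCESS_URL = "https://vulavula-services.lelapa.ai/api/v1/transcribe/process/{upload_id}"


def build_process_request(
    upload_id: str,
    webhook: Optional[str] = None,
    language_code: Optional[str] = None,
) -> Tuple[str, dict]:
    """Return the endpoint URL and request body for a given upload."""
    if not upload_id:
        raise ValueError("upload_id is required")
    # Percent-encode the ID so it is safe to place in the URL path.
    url = PROCESS_URL.format(upload_id=quote(upload_id, safe=""))
    body = {}
    # Both body parameters are optional; leave out the ones that are not set.
    if webhook is not None:
        body["webhook"] = webhook
    if language_code is not None:
        body["language_code"] = language_code
    return url, body
```

The pair can then be sent with `requests.post(url, json=body, headers=headers)`, where `headers` carries the `X-CLIENT-TOKEN`.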

Webhook response

The webhook response object has the following fields:

upload_id string File upload id

postprocessed_text list[dict] (Optional) The audio transcription.

status_code int64 Response status code.

language_id string (Optional) The language detected from audio.
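
A webhook receiver can validate the incoming object before using it. The sketch below assumes the payload has already been decoded from JSON into a dict. `WebhookResult` and `parse_webhook_payload` are hypothetical names, and treating `upload_id` and `status_code` as always present is an assumption drawn from the fact that only the other two fields are marked optional above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WebhookResult:
    upload_id: str
    status_code: int
    postprocessed_text: Optional[List[dict]] = None  # the audio transcription
    language_id: Optional[str] = None  # the language detected from audio


def parse_webhook_payload(payload: dict) -> WebhookResult:
    """Validate a decoded webhook payload and convert it to a typed object."""
    # Assumption: fields not marked optional are always sent.
    missing = [key for key in ("upload_id", "status_code") if key not in payload]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    return WebhookResult(
        upload_id=str(payload["upload_id"]),
        status_code=int(payload["status_code"]),
        postprocessed_text=payload.get("postprocessed_text"),
        language_id=payload.get("language_id"),
    )
```

Since the webhook also receives errors, callers would typically check `status_code` before reading `postprocessed_text`.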