Vulavula Transcribe
Sync Transcription
POST
https://vulavula-services.lelapa.ai/api/v1/transcribe/sync
The sync endpoint lets you run a transcription and receive the result immediately. You submit an audio file as a blob and wait synchronously for the result, which is returned once the HTTP request completes.
BODY PARAMS
file_name string required
The name of the file. (Note: this parameter is expected to change soon.)
audio_blob Blob/string of bytes required
Blob or string of bytes of the file. The string of bytes should be base64 encoded.
file_size int64 required
The size of the file, which should not exceed 1GB.
This parameter is no longer used, but it is still required. You can pass a default value such as 0.
HEADERS
X-CLIENT-TOKEN string required
API token generated for the project.
- Python HTTP
import base64

import requests

# File path
FILE_TO_TRANSCRIBE = "<<AUDIO FILE PATH>>"

# Open file in binary mode and read its content
with open(FILE_TO_TRANSCRIBE, "rb") as file:
    file_content = file.read()

# Base64-encode the content, then decode the bytes to a string
encoded_string = base64.b64encode(file_content).decode("utf-8")

request_body_json = {
    "file_name": FILE_TO_TRANSCRIBE,
    "audio_blob": encoded_string,
    "file_size": 0,  # this parameter is no longer used, but is still required
}
headers = {
    "X-CLIENT-TOKEN": "<INSERT_TOKEN>",
}
response = requests.post(
    "https://vulavula-services.lelapa.ai/api/v1/transcribe/sync",
    json=request_body_json,
    headers=headers,
)

# Handling the response
# Get the status code
print(f"Status Code: {response.status_code}")
try:
    # If the response is in JSON format, parse it into a dictionary
    response_json = response.json()
    print(f"Response JSON: {response_json}")
except ValueError:
    print("Response is not in JSON format")
RESPONSES
200 OK
{
"message": "success",
"text": "sikubingelele esikomkene ndabenzehoranokqale mini sibingelelenakuwe ska holele ni lotholhlalo zohlakazi at itithi lezindabani",
"language_id": "sot"
}
RESPONSE BODY PARAMS
Object |
---|
message string required. A status message indicating the success of the transcription request. |
text string required. The transcribed text of the provided audio file. |
language_id string required. The language code representing the detected language of the transcription, in this case, "sot" for Sesotho. |
400 Bad Request
401 Unauthorized
500 Internal Server Error
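As a minimal sketch of consuming the 200 response described above, the helper below checks that the three documented fields are present before returning the text and language. The function name `extract_transcription` and the placeholder sample body are illustrative assumptions, not part of the Vulavula API.

```python
def extract_transcription(response_json: dict) -> tuple:
    """Return (text, language_id) from a sync transcription response body."""
    # The three fields documented as required in the 200 response
    required = ("message", "text", "language_id")
    missing = [key for key in required if key not in response_json]
    if missing:
        raise KeyError(f"Response is missing fields: {missing}")
    return response_json["text"], response_json["language_id"]


# Example with a body shaped like the documented 200 response (placeholder text)
sample = {"message": "success", "text": "<transcribed text>", "language_id": "sot"}
text, language_id = extract_transcription(sample)
print(language_id, text)
```

In the Python HTTP example, this would be called as `extract_transcription(response.json())` after checking the status code.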
Fast Transcribe endpoint
POST
https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/fast
FORM DATA
upload string of bytes required
String of bytes of the file.
HEADERS
X-CLIENT-TOKEN string required
API token generated for the project.
CONTENT-TYPE string required
Set content-type to multipart/form-data.
PARAMS
lang_code string optional
The language code used to transcribe the uploaded audio.
Optionally, you can specify a language code to select the model for the language you're speaking. The following language codes are valid:
Afrikaans - "afr"
isiZulu - "zul"
Sesotho - "sot"
South African English - "eng"
African French - "fra"
If no language code is specified, our built-in language ID will select the most probable language.
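One way to guard against typos in `lang_code` is to validate it client-side before sending the request. The sketch below is an illustration only: `VALID_LANG_CODES` and `build_params` are hypothetical names, and the set simply mirrors the codes listed above for the fast endpoint.

```python
# Language codes listed above for the fast endpoint
VALID_LANG_CODES = {"afr", "zul", "sot", "eng", "fra"}


def build_params(lang_code=None):
    """Return query params for the fast endpoint (hypothetical helper)."""
    if lang_code is None:
        # No code given: omit the parameter so the built-in language ID is used
        return {}
    if lang_code not in VALID_LANG_CODES:
        raise ValueError(f"Unsupported language code: {lang_code!r}")
    return {"lang_code": lang_code}


print(build_params("zul"))  # {'lang_code': 'zul'}
print(build_params())       # {}
```

The returned dictionary can be passed as `params=` to `requests.post`.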
- Python HTTP
- Curl
from pprint import pprint

import requests

# Set these
API_KEY = "<INSERT-TOKEN>"
WAV_FILE = "<INSERT-PATH-TO-AUDIO>"


def file_data(path: str):
    """Read the audio file as raw bytes."""
    with open(path, "rb") as f:
        return f.read()


url = "https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/fast"
headers = {
    "X-CLIENT-TOKEN": API_KEY,
}
# requests sets the multipart/form-data content type when `files` is used
files = {
    "upload": ("fra_colours.wav", file_data(WAV_FILE), "audio/wav")
}
# If the language code is known, set it
params = {
    "lang_code": "<INSERT-LANGUAGE-CODE>"
}
resp = requests.post(url, headers=headers, files=files, params=params)
pprint(resp.json())
curl -X 'POST' \
  'https://vulavula-services.lelapa.ai/api/v2alpha/transcribe/fast?lang_code=<INSERT-LANGUAGE-CODE>' \
  -H 'content-type: multipart/form-data' \
  -H 'X-CLIENT-TOKEN: <INSERT_TOKEN>' \
  -F 'upload=@recording.wav'
RESPONSE
200 OK
{
"text": "sikubingelele esikomkene ndabenzehoranokqale mini sibingelelenakuwe ska holele ni lotholhlalo zohlakazi at itithi lezindabani",
"lang_code": "sot",
"diarisation": {}
}
RESPONSE BODY PARAMS
Object |
---|
text string required. The transcribed text of the provided audio file. |
lang_code string required. The language code representing the detected/provided language of the transcription, in this case, "sot" for Sesotho. |
diarisation dict required. Diarisation data when available. |
401 Unauthorized
415 Unsupported Media Type
500 Internal Server Error
503 The request timed out
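The status codes listed above can be turned into readable messages on the client side. The following is a rough sketch: `STATUS_HINTS` and `describe_status` are hypothetical names, and the hint wording is our own reading of the status names rather than text returned by the API.

```python
# Status codes documented above for the fast endpoint.
# The hint text is an assumption, not an official error message.
STATUS_HINTS = {
    401: "Unauthorized: check the X-CLIENT-TOKEN header.",
    415: "Unsupported media type: check the format of the uploaded file.",
    500: "Internal server error.",
    503: "The request timed out.",
}


def describe_status(status_code: int) -> str:
    """Map a fast-endpoint status code to a short description."""
    if status_code == 200:
        return "OK"
    return STATUS_HINTS.get(status_code, f"Unexpected status code: {status_code}")


print(describe_status(415))
```

With the Python HTTP example, `describe_status(resp.status_code)` could be checked before calling `resp.json()`.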
Transcribe SDK method
Transcribe an audio file by specifying the file path, with an optional webhook URL for asynchronous result delivery. The maximum audio length supported is 30 minutes.
This method simplifies the process by combining both the upload and transcription steps into a single, easy-to-use SDK method.
- Python SDK
pdm add vulavula
from vulavula import VulavulaClient

client = VulavulaClient("<INSERT_TOKEN>")
upload_id, transcription_result = client.transcribe(
    "path/to/your/audio/file.wav",
    webhook="<INSERT_URL>",
)
# A success message; the transcription data is sent to the webhook
print("Transcription Submit Success:", transcription_result)
Get Transcribed Text
This method lets you poll the server for the transcribed text, which is useful if you do not want to use a webhook.
- Python SDK
pdm add vulavula
import time

# `client` and `upload_id` come from the transcribe example above
while client.get_transcribed_text(upload_id)['message'] == "Item has not been processed.":
    time.sleep(5)
    print("Waiting for transcribe to complete...")
client.get_transcribed_text(upload_id)
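The loop above polls indefinitely. As a hedged sketch, the helper below adds an upper bound on the waiting time. `wait_for_transcription` is a hypothetical function, not part of the SDK; it takes the fetch function as an argument and only assumes that the pending message is the one shown in the example above. The 1800-second default is an arbitrary choice.

```python
import time


def wait_for_transcription(fetch, upload_id, interval=5.0, timeout=1800.0):
    """Poll fetch(upload_id) until the item is processed or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(upload_id)
        # Same pending message as in the polling loop above
        if result.get("message") != "Item has not been processed.":
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Transcription {upload_id} not ready after {timeout} seconds")
        time.sleep(interval)


# Usage with the SDK client from the examples above:
# result = wait_for_transcription(client.get_transcribed_text, upload_id)
```

Passing the fetch function in keeps the helper independent of the client object, which also makes it easy to exercise with a stub.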
File upload
POST
https://vulavula-services.lelapa.ai/api/v1/transport/file-upload
This API uploads the recording file to our Azure storage. It currently accepts one file at a time and does not support bulk uploads; this will be updated in the near future.
BODY PARAMS
file_name string required
The name of the file. (Note: this parameter is expected to change soon.)
audio_blob Blob/string of bytes required
Blob or string of bytes of the file. The string of bytes should be base64 encoded.
file_size int64 required
The size of the file. The file size should not exceed 1GB.
HEADERS
X-CLIENT-TOKEN string required
API token generated for the project.
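Because the documented limit is 1GB, it can be worth checking the size before building the request body. The sketch below is illustrative: `build_upload_body` and `MAX_FILE_SIZE` are hypothetical names, and reading "1GB" as 1024³ bytes is an assumption, since the exact threshold is not specified here.

```python
import base64

MAX_FILE_SIZE = 1024 ** 3  # assumption: "1GB" interpreted as 1 GiB


def build_upload_body(file_name: str, content: bytes) -> dict:
    """Build the JSON body for the file-upload endpoint (hypothetical helper)."""
    if len(content) > MAX_FILE_SIZE:
        raise ValueError(f"File is {len(content)} bytes; the limit is {MAX_FILE_SIZE}")
    return {
        "file_name": file_name,
        # The string of bytes should be base64 encoded
        "audio_blob": base64.b64encode(content).decode("utf-8"),
        "file_size": len(content),
    }


body = build_upload_body("example.wav", b"abc")
print(body["file_size"], body["audio_blob"])  # 3 YWJj
```

The resulting dictionary can be sent with `requests.post(..., json=body, headers=headers)`.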
- Python SDK
- Python HTTP
pdm add vulavula
from vulavula import VulavulaClient
client = VulavulaClient("<INSERT_TOKEN>")
upload_result = client.upload_file('path/to/your/file')
print(upload_result)
import base64
import os

import requests

# File path
FILE_TO_TRANSCRIBE = "<<AUDIO FILE PATH>>"

# Get file size
file_size = os.path.getsize(FILE_TO_TRANSCRIBE)

# Open file in binary mode and read its content
with open(FILE_TO_TRANSCRIBE, "rb") as file:
    file_content = file.read()

# Base64-encode the content, then decode the bytes to a string
encoded_string = base64.b64encode(file_content).decode()

transport_request_body = {
    "file_name": FILE_TO_TRANSCRIBE,
    "audio_blob": encoded_string,
    "file_size": file_size,
}
headers = {
    "X-CLIENT-TOKEN": "<INSERT_TOKEN>",
}
resp = requests.post(
    "https://vulavula-services.lelapa.ai/api/v1/transport/file-upload",
    json=transport_request_body,
    headers=headers,
)
RESPONSE example (200)
{
"upload_id": "240a180d-553f-456f-912e-14cdc628869c",
"customer_id": 5,
"project_id": 5
}
RESPONSES
OBJECT Response from uploading
upload_id string This is the file's upload id
status_code int64 Response status code
detail string Details about the error
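A caller typically needs only the `upload_id` from this response. The sketch below assumes, without confirmation from this page, that a failed upload returns a body carrying `status_code` and `detail` instead of `upload_id`; `get_upload_id` is a hypothetical helper.

```python
def get_upload_id(body: dict) -> str:
    """Return the upload id, or raise using the error fields if it is absent."""
    if "upload_id" in body:
        return body["upload_id"]
    # Assumption: error bodies carry status_code and detail
    raise RuntimeError(
        f"Upload failed (status {body.get('status_code')}): {body.get('detail')}"
    )


# Example with the documented 200 response
ok_body = {
    "upload_id": "240a180d-553f-456f-912e-14cdc628869c",
    "customer_id": 5,
    "project_id": 5,
}
print(get_upload_id(ok_body))
```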
Process
POST
https://vulavula-services.lelapa.ai/api/v1/transcribe/process/{upload_id}
This endpoint triggers the transcribe process.
BODY PARAMS
webhook string optional
The webhook URL to which transcription results and errors are sent.
language_code string optional
The language code used to transcribe the uploaded audio.
Optionally, you can specify a language code to select the model for the language you're speaking. The following language codes are valid:
Afrikaans - "afr"
isiZulu - "zul"
Sesotho - "sot"
South African English - "eng"
If no language code is specified, our built-in language ID will select the most probable language.
HEADERS
X-CLIENT-TOKEN string required
API token generated for the project.
- Python SDK
- Python HTTP
pdm add vulavula
from vulavula import VulavulaClient
client = VulavulaClient("<INSERT_TOKEN>")
transcription_result = client.transcribe_process('<upload_id>', '<webhook>')
print(transcription_result) #transcribed message is sent to webhook
# Continues from the file-upload example: `resp` is the upload response
upload_id = resp.json()["upload_id"]
headers = {
    "X-CLIENT-TOKEN": "<INSERT_TOKEN>",
}
process = requests.post(
    f"https://vulavula-services.lelapa.ai/api/v1/transcribe/process/{upload_id}",
    json={
        "webhook": "<INSERT_URL>",
        "language_code": "zul",
    },
    headers=headers,
)
process.json()
RESPONSE example (200)
{"message": "Message sent to process queue."}
RESPONSES
OBJECT Response from processing
message string Message to indicate that process was triggered successfully
Webhook response
The webhook response object will look like this:
upload_id string File upload id
postprocessed_text list[dict] (Optional) The audio transcription.
status_code int64 Response status code.
language_id string (Optional) The language detected from audio.
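A webhook handler might map the incoming payload onto a small typed object. The sketch below is an assumption-laden illustration: `WebhookResult` and `parse_webhook_payload` are hypothetical names, the sample values are placeholders, and the dictionaries inside `postprocessed_text` are passed through untouched because their keys are not documented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WebhookResult:
    upload_id: str
    status_code: int
    postprocessed_text: Optional[list] = None  # list of dicts when present
    language_id: Optional[str] = None


def parse_webhook_payload(payload: dict) -> WebhookResult:
    """Build a WebhookResult from the webhook JSON body."""
    return WebhookResult(
        upload_id=payload["upload_id"],
        status_code=payload["status_code"],
        # Both fields are documented as optional, so use .get()
        postprocessed_text=payload.get("postprocessed_text"),
        language_id=payload.get("language_id"),
    )


result = parse_webhook_payload({"upload_id": "<upload_id>", "status_code": 200})
print(result.language_id)  # None
```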