Vulavula Transcribe
Transcribe SDK method
Transcribe an audio file by specifying the file path, with an optional webhook URL for asynchronous result delivery. The maximum audio length supported is 30 minutes.
This method combines the upload and transcription steps into a single SDK call.
- Python SDK
pdm add vulavula
from vulavula import VulavulaClient
client = VulavulaClient("<INSERT_TOKEN>")
upload_id, transcription_result = client.transcribe(
    "path/to/your/audio/file.wav",
    webhook="<INSERT_URL>"
)
print("Transcription Submit Success:", transcription_result)  # A success message; the data is sent to the webhook
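Because audio longer than 30 minutes is not supported, it can be useful to check a file's duration locally before submitting it. The sketch below does this for uncompressed WAV files using Python's standard `wave` module. The helper names `wav_duration_seconds` and `is_within_limit` are illustrative only and are not part of the Vulavula SDK.

```python
import wave

MAX_AUDIO_SECONDS = 30 * 60  # 30-minute limit stated above


def wav_duration_seconds(path):
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
    return frames / float(rate)


def is_within_limit(path, max_seconds=MAX_AUDIO_SECONDS):
    """Hypothetical helper: True if the WAV file is short enough to submit."""
    return wav_duration_seconds(path) <= max_seconds
```

Other audio formats would need a different way of reading the duration; this sketch covers WAV only.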
Get Transcribed Text
This method polls the server for the transcribed text. It is useful if you do not want to use a webhook.
- Python SDK
pdm add vulavula
import time
while client.get_transcribed_text(upload_id)['message'] == "Item has not been processed.":
    time.sleep(5)
    print("Waiting for transcribe to complete...")
client.get_transcribed_text(upload_id)
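The polling loop has no upper bound on how long it waits. A minimal sketch of the same idea with a timeout follows. `wait_for_transcription` is a hypothetical helper, and it assumes that `get_transcribed_text` returns a dictionary whose `message` field equals "Item has not been processed." while the job is pending.

```python
import time

PENDING_MESSAGE = "Item has not been processed."


def wait_for_transcription(fetch, upload_id, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch(upload_id) until the result is no longer pending.

    fetch is any callable with the shape of client.get_transcribed_text.
    Raises TimeoutError if the result is still pending after `timeout` seconds.
    """
    waited = 0.0
    while True:
        result = fetch(upload_id)
        # Anything other than the pending message is treated as the final result
        if result.get("message") != PENDING_MESSAGE:
            return result
        if waited >= timeout:
            raise TimeoutError(f"No transcription for {upload_id} after {timeout} seconds")
        sleep(interval)
        waited += interval
```

With the SDK client from the earlier example, this could be called as `wait_for_transcription(client.get_transcribed_text, upload_id)`.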
File upload
POST
https://vulavula-services.lelapa.ai/api/v1/transport/file-upload
This API uploads the recording file to our Azure storage. It currently accepts one file per request and does not support bulk uploads; this will be updated in the near future.
BODY PARAMS
file_name string required
The name of the file. (Note: this parameter is expected to change soon.)
audio_blob Blob/string of bytes required
Blob or string of bytes of the file. The string of bytes should be base64 encoded.
file_size int64 required
The size of the file. The file size should not exceed 1GB.
HEADERS
X-CLIENT-TOKEN string required
API token generated for the project.
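The body parameters above can be assembled and checked before the request is sent. This is a sketch only: `build_upload_body` is a hypothetical helper, and reading the 1GB limit as 1024³ bytes is an assumption.

```python
import base64

MAX_FILE_SIZE_BYTES = 1024 ** 3  # assumption: "1GB" interpreted as 1 GiB


def build_upload_body(file_name, file_content):
    """Build the JSON body for the file-upload endpoint from raw bytes."""
    file_size = len(file_content)
    if file_size > MAX_FILE_SIZE_BYTES:
        raise ValueError(f"{file_name} is {file_size} bytes, which exceeds the 1GB limit")
    return {
        "file_name": file_name,
        # base64-encode the bytes, then decode to str so the body is JSON-serializable
        "audio_blob": base64.b64encode(file_content).decode("ascii"),
        "file_size": file_size,
    }
```

Note that `file_size` here is the size of the original file, not of the base64-encoded string.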
- Python SDK
pdm add vulavula
from vulavula import VulavulaClient
client = VulavulaClient("<INSERT_TOKEN>")
upload_result = client.upload_file('path/to/your/file')
print(upload_result)
- Python HTTP
import base64
import os

import requests

FILE_TO_TRANSCRIBE = "path/to/your/file"

# Get file size
file_size = os.path.getsize(FILE_TO_TRANSCRIBE)
# Open file in binary mode
with open(FILE_TO_TRANSCRIBE, 'rb') as file:
    # Read file
    file_content = file.read()
# Encode file content
encoded_content = base64.b64encode(file_content)
# Decode bytes to string
encoded_string = encoded_content.decode()
transport_request_body = {
    "file_name": FILE_TO_TRANSCRIBE,
    "audio_blob": encoded_string,
    "file_size": file_size,
}
headers = {
    "X-CLIENT-TOKEN": "<INSERT_TOKEN>",
}
resp = requests.post(
    "https://vulavula-services.lelapa.ai/api/v1/transport/file-upload",
    json=transport_request_body,
    headers=headers,
)
RESPONSE example (200)
{
    "upload_id": "240a180d-553f-456f-912e-14cdc628869c",
    "customer_id": 5,
    "project_id": 5
}
RESPONSES
OBJECT Response from uploading
upload_id string This is the file's upload id
status_code int64 Response status code
detail string Details about the error
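A caller typically needs the `upload_id` on success and the `detail` text on failure. The following sketch shows one way to separate the two cases. `UploadError` and `extract_upload_id` are hypothetical names, and treating a body without `upload_id` as an error is an assumption based on the fields listed above.

```python
class UploadError(Exception):
    """Raised when the upload response does not contain an upload_id."""


def extract_upload_id(payload):
    """Return upload_id from a decoded upload response, or raise UploadError."""
    upload_id = payload.get("upload_id")
    if upload_id:
        return upload_id
    # Fall back to the documented error fields when no upload_id is present
    status = payload.get("status_code", "unknown")
    detail = payload.get("detail", "no detail provided")
    raise UploadError(f"Upload failed (status {status}): {detail}")
```

With the HTTP example, this could be used as `upload_id = extract_upload_id(resp.json())`.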
Process
POST
https://vulavula-services.lelapa.ai/api/v1/transcribe/process/{upload_id}
This endpoint triggers the transcription process for a previously uploaded file.
BODY PARAMS
webhook string optional
The webhook URL to which transcription results and errors will be sent.
language_code string optional
The language code used to transcribe the uploaded audio.
Specifying a language code selects the model for the language spoken in the audio. The following language codes are valid:
Afrikaans - "afr"
isiZulu - "zul"
Sesotho - "sot"
South African English - "eng"
If no language code is specified, our built-in language ID will select the most probable language.
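Since both body parameters are optional, the request body can be built from only the values that are set. The sketch below also rejects codes outside the list above before any request is made. `LANGUAGE_CODES` and `build_process_body` are hypothetical names, not part of the SDK.

```python
LANGUAGE_CODES = {
    "Afrikaans": "afr",
    "isiZulu": "zul",
    "Sesotho": "sot",
    "South African English": "eng",
}


def build_process_body(webhook=None, language_code=None):
    """Build the optional JSON body for the process endpoint."""
    body = {}
    if webhook is not None:
        body["webhook"] = webhook
    if language_code is not None:
        if language_code not in LANGUAGE_CODES.values():
            raise ValueError(f"Unsupported language code: {language_code!r}")
        body["language_code"] = language_code
    # An empty body leaves language selection to the built-in language ID
    return body
```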
HEADERS
X-CLIENT-TOKEN string required
API token generated for the project.
- Python SDK
pdm add vulavula
from vulavula import VulavulaClient
client = VulavulaClient("<INSERT_TOKEN>")
transcription_result = client.transcribe_process('<upload_id>', '<webhook>')
print(transcription_result)  # the transcribed message is sent to the webhook
- Python HTTP
import requests

# resp is the response returned by the file-upload request
upload_id = resp.json()["upload_id"]
headers = {
    "X-CLIENT-TOKEN": "<INSERT_TOKEN>",
}
process = requests.post(
    f"https://vulavula-services.lelapa.ai/api/v1/transcribe/process/{upload_id}",
    json={
        "webhook": "<INSERT_URL>",
        "language_code": "zul",
    },
    headers=headers,
)
process.json()
RESPONSE example (200)
{"message": "Message sent to process queue."}
RESPONSES
OBJECT Response from processing
message string Message to indicate that process was triggered successfully
Webhook response
The webhook response object contains the following fields:
upload_id string File upload id
postprocessed_text list[dict] (Optional) The audio transcription.
status_code int64 Response status code.
language_id string (Optional) The language detected from audio.
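On the receiving side, the fields above can be decoded into a small typed object. The sketch below uses only the Python standard library. `WebhookResult`, `parse_webhook_payload` and `WebhookHandler` are hypothetical names, and the handler assumes that results arrive as an HTTP POST with a JSON body, which is not confirmed here.

```python
import json
from dataclasses import dataclass
from http.server import BaseHTTPRequestHandler
from typing import Optional


@dataclass
class WebhookResult:
    upload_id: str
    status_code: int
    postprocessed_text: Optional[list] = None  # optional field
    language_id: Optional[str] = None  # optional field


def parse_webhook_payload(raw):
    """Decode a JSON webhook body (bytes or str) into a WebhookResult."""
    data = json.loads(raw)
    return WebhookResult(
        upload_id=data["upload_id"],
        status_code=data["status_code"],
        postprocessed_text=data.get("postprocessed_text"),
        language_id=data.get("language_id"),
    )


class WebhookHandler(BaseHTTPRequestHandler):
    """Assumption: results are delivered as an HTTP POST with a JSON body."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        result = parse_webhook_payload(self.rfile.read(length))
        print("Received result for", result.upload_id, "status", result.status_code)
        # Acknowledge receipt with an empty 200 response
        self.send_response(200)
        self.end_headers()
```

To try it locally, the handler can be served with `http.server.HTTPServer(("", 8000), WebhookHandler).serve_forever()`.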