Release Notes

Here, you’ll find the latest updates, features, and fixes to our API.

For questions or support, contact our team at support@lelapa.ai.

transcribe.sync-2025.04.16-1

  • Update: Improved diarisation logic for more accurate speaker identification, especially in overlapping or ambiguous segments.
  • Fix: Addressed minor internal edge cases to better handle uncommon failure modes and improve overall robustness.
  • Feature: Reintroduced word-level results in the response schema under diarisation_result.words, enabling more granular analysis. Updated schema below.
{
  "id": "string",                      // Unique identifier
  "upload_file_size": "number",        // Size of uploaded file in bytes
  "audio_length_seconds": "number",    // Duration of audio in seconds
  "sample_rate": "number or null",     // Audio sample rate
  "channels": "number or null",        // Number of audio channels
  "frame_rate": "number or null",      // Frame rate
  "mime_type": "string or null",       // MIME type of audio file
  "language_code": "string",           // Language code (e.g., "eng")
  
  "diarisation_result": {              // Speaker identification results
    "timeline": [                      // Sequential timeline of audio segments
      {
        "start_time": "number",        // Segment start time in seconds
        "end_time": "number",          // Segment end time in seconds
        "type": "string",              // Type of segment ("speech" or "silence")
        "speaker_id": "string",        // Speaker identifier (if type is "speech")
        "text": "string"               // Transcribed text (if type is "speech")
      }
    ],
    
    "words": [                         // Detailed word-level data
      {
        "word": "string",              // Transcribed word
        "start_time": "number",        // Word start time in seconds
        "end_time": "number",          // Word end time in seconds
        "confidence": "number",        // Confidence score (0-1)
        "weight": "number or null",    // Weight value
        "word_intensity": "number or null", // Word intensity
        "best_path": "boolean or null", // Best path indicator
        "speaker_id": "string"         // Speaker identifier
      }
    ]
  },
  
  "transcription_text": "string",      // Complete transcription text
  "transcription_status": "string",    // Status (e.g., "COMPLETED")
  "error_message": "string or null",   // Error message if any
  "status_datetime": "string",         // Status timestamp
  "upload_datetime": "string",         // Upload timestamp
  "warnings": ["string"]               // Array of warning messages
}
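
As an illustration of the more granular analysis that diarisation_result.words allows, the following minimal sketch groups words by speaker and drops low-confidence words. The helper name words_by_speaker, the min_confidence threshold, and the sample values (including the speaker IDs "S1" and "S2") are assumptions made for this example and are not part of the API.

```python
from collections import defaultdict


def words_by_speaker(response, min_confidence=0.0):
    """Group word-level results by speaker_id, preserving word order.

    `response` is a parsed JSON response shaped like the schema above.
    Words with a confidence below `min_confidence` are skipped.
    """
    grouped = defaultdict(list)
    words = (response.get("diarisation_result") or {}).get("words") or []
    for w in words:
        if w["confidence"] >= min_confidence:
            grouped[w["speaker_id"]].append(w["word"])
    return {speaker: " ".join(ws) for speaker, ws in grouped.items()}


# Hypothetical response fragment; all values are invented for illustration.
sample = {
    "diarisation_result": {
        "timeline": [],
        "words": [
            {"word": "hello", "start_time": 0.0, "end_time": 0.4,
             "confidence": 0.95, "speaker_id": "S1"},
            {"word": "there", "start_time": 0.4, "end_time": 0.8,
             "confidence": 0.40, "speaker_id": "S1"},
            {"word": "hi", "start_time": 1.0, "end_time": 1.2,
             "confidence": 0.90, "speaker_id": "S2"},
        ],
    }
}

print(words_by_speaker(sample, min_confidence=0.5))
```

Because weight, word_intensity, and best_path may be null, the sketch deliberately reads only the fields the schema marks as always present.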

transcribe.sync-2025.04.09-1

  • Feature: Introduced diarisation_result.timeline, providing a clear, turn-by-turn view of diarisation events including speech, music, and silence.
  • Update: Refined transcription schema by removing deprecated fields from the output:
    • container_name, blob_name, customer_id, project_id, storage_url, keychain_id, batch_transcription_id
    • diarisation_result.starts, ends, words, sentences, audio_segments
  • Fix: Resolved several edge cases that were causing 500 Internal Server Errors. Occasional server errors may still occur under conditions that are not yet handled; we’re actively monitoring and working to resolve these.
  • Fix: Error messaging has been improved for clarity.
    • Invalid language codes now trigger a 400 Bad Request with the message “Invalid language code”.
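
To show how the turn-by-turn diarisation_result.timeline might be consumed, here is a small sketch that totals the duration of each segment type. The function name duration_by_type and the sample segments are hypothetical; the field names start_time, end_time, and type come from the published response schema.

```python
def duration_by_type(timeline):
    """Sum segment durations (in seconds) for each segment type."""
    totals = {}
    for seg in timeline:
        length = seg["end_time"] - seg["start_time"]
        totals[seg["type"]] = totals.get(seg["type"], 0.0) + length
    return totals


# Hypothetical timeline; values are invented for illustration.
timeline = [
    {"start_time": 0.0, "end_time": 2.5, "type": "speech",
     "speaker_id": "S1", "text": "hello"},
    {"start_time": 2.5, "end_time": 3.0, "type": "silence"},
    {"start_time": 3.0, "end_time": 5.0, "type": "speech",
     "speaker_id": "S2", "text": "hi there"},
]

for segment_type, seconds in duration_by_type(timeline).items():
    print(f"{segment_type}: {seconds:.1f}s")
```

The sketch does not hard-code the set of segment types, so any additional type string the API returns is totalled in the same way.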

transcribe.sync-2025.03.28-1

  • Performance: Adjusted baseline batch size for model inference, resulting in a major speed-up.
  • Performance: Refactored the sentence-level results algorithm, providing a ~30% latency improvement.
  • Feature: Deployed and integrated our new ASR model, improving Word Error Rate.
  • Fix: Added a Signal-to-Noise Ratio (SNR) warning for audio files that may produce inaccurate diarisation results.
  • Deprecation: Transcribe v1 endpoints have been deprecated.
  • BETA: Music/silence detection is now integrated and available for use. Some calls currently fail, possibly because of audio length; we are investigating.
  • Feature: API release version details are now available here.
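  • Note: Minor change to the release naming scheme: the build suffix is now separated by "-" instead of "#" (for example, transcribe.sync-2025.03.28-1).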
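
Release names in these notes appear in two forms, with the build number separated by either "-" (newer releases) or "#" (older releases). The sketch below parses both forms; the regular expression is inferred from the names listed on this page and is an assumption, not an official specification of the naming scheme.

```python
import re
from datetime import date

# Pattern inferred from names such as "transcribe.sync-2025.04.16-1"
# and "transcribe.sync-2025.03.20#1"; not an official format definition.
RELEASE_RE = re.compile(
    r"^(?P<service>[a-z.]+)-"
    r"(?P<y>\d{4})\.(?P<m>\d{2})\.(?P<d>\d{2})"
    r"[-#](?P<build>\d+)$"
)


def parse_release(name):
    """Split a release name into (service, release date, build number)."""
    m = RELEASE_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised release name: {name!r}")
    released = date(int(m["y"]), int(m["m"]), int(m["d"]))
    return m["service"], released, int(m["build"])


print(parse_release("transcribe.sync-2025.04.16-1"))
print(parse_release("transcribe.sync-2025.03.20#1"))
```

Parsing the date into a date object makes it straightforward to compare or sort releases regardless of which separator they use.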

transcribe.sync-2025.03.20#1

  • Fix: Addressed file upload issues affecting new API key owners.
  • Performance: Moved the diarisation model to new machines for improved performance.
  • Fix: Resolved keychain creation bug for new platform users.

transcribe.sync-2025.03.19#1

  • Fix: Resolved HTTP 500 errors caused by SQL connection timeouts.

transcribe.sync-2025.03.13#1

  • Feature: Added warning messages for files with a low signal-to-noise ratio (noisy audio).
  • Feature: Introduced sentence-level diarisation results.
  • Fix: Resolved issue where null speaker IDs appeared in some results.

transcribe.sync-2025.03.10#1

  • Performance: System stability improvements, including:
    • Enhancements to diarisation model configuration.
    • Increased timeout for synchronous HTTP calls.
    • Expanded capacity to meet demand with 10x concurrency.

transcribe.sync-2025.03.05#1

  • Fix: Resolved transcoding failures for certain files; compressed WAV files are now handled correctly.
  • Fix: Sanitized word-level diarisation results by removing extraneous spaces and other formatting issues.

transcribe.sync-2025.03.04#1

  • Feature: Basic API usage reporting is now available via a new API endpoint.
  • Feature: Custom HTTP 503 response on the transcribe/sync/file endpoint indicates when the system is not yet ready to serve requests.
  • Fix: Resolved an issue where an incorrect language ID was returned even when clients specified lang_code.
  • Performance: Various stability and performance improvements made to the Transcribe Sync pipeline.
  • Ops: Enhanced observability signals to support troubleshooting and issue resolution.
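
Since the transcribe/sync/file endpoint can return HTTP 503 while the system is not yet ready, a client may want to retry with a growing delay. The following is a minimal sketch of that idea, not a prescribed client implementation: send is a hypothetical zero-argument callable that performs the request and returns (status_code, body), and the attempt count and delays are arbitrary choices for illustration.

```python
import time


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 503.

    `send` is a hypothetical callable returning (status_code, body).
    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts
    and returns the last result if every attempt yields 503.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body


# Demonstration with a fake sender: two 503 responses, then success.
responses = iter([(503, "not ready"), (503, "not ready"), (200, "ok")])
result = call_with_retry(lambda: next(responses), sleep=lambda seconds: None)
print(result)
```

Only 503 is retried here; other statuses, including 400 and 500, are returned to the caller immediately so that they can be handled separately.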