Lelapa-X-GLOT
Model Details
Basic information about the model: Review section 4.1 of the model cards paper.
| Organization | Lelapa AI |
|---|---|
| Product | Vulavula |
| Model date | 30 October 2024 |
| Feature | Translation |
| Language | Multilingual |
| Domain | News, Religion, General |
| Model Name | Lelapa-X-GLOT |
| Model version | 1.0.0 |
| Model Type | Fine-Tuned Proprietary Multi-Architectural Model |
Information about training algorithms, parameters, fairness constraints or other applied approaches, and features: Proprietary fine-tuning of a base model on text data.
License: Proprietary
Contact: info@lelapa.ai
Intended use
Use cases that were envisioned during development: Review section 4.2 of the model cards paper.
Primary intended uses
This multi-architectural model is designed for machine translation across low-resource African languages, including Afrikaans, English, Sesotho, Sepedi, Setswana, isiXhosa, isiZulu, Swahili, French, Hausa, and Yoruba. It aims to support cross-linguistic communication, digital content accessibility, and language preservation by improving text translation quality in both formal and informal contexts. The model is particularly useful for academic research, governmental communication, and multilingual content generation in African linguistic settings.
Primary intended users
The translation model can be used by:
- Machine Translation community
- Researchers
Out-of-scope use cases
This model is a multilingual neural machine translation model designed to support a diverse range of African languages, including Afrikaans, English, Sesotho, Sepedi, Setswana, isiXhosa, isiZulu, Swahili, French, Hausa, and Yoruba. While the model provides high-quality translations for these languages, certain limitations must be considered:
- Not optimised for full-document translation: The model was trained on sequences not exceeding 512 tokens, meaning that longer texts may suffer from coherence and quality degradation (a chunking workaround is sketched after this list).
- Limited handling of highly specialised or technical content: The model may struggle with domain-specific terminology in areas such as legal, medical, and scientific texts, particularly for languages with less high-quality training data.
- Challenges with informal and code-switched text: African languages are often spoken in multilingual settings, where code-switching (mixing of languages in a single sentence) is common. The model’s ability to accurately translate code-switched content is limited.
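Where the 512-token limit matters, a common workaround is to segment long documents into sentence-sized chunks before translation and rejoin the outputs afterwards. The minimal sketch below illustrates the idea; `translate_fn` is a hypothetical stand-in for whatever translation call is available, and whitespace token counts are only a rough proxy for the model's actual tokenizer.

```python
import re

MAX_TOKENS = 512  # assumed sequence limit, per the training setup described above

def translate_long_text(text: str, translate_fn, max_tokens: int = MAX_TOKENS) -> str:
    """Split text into sentence-sized chunks under the sequence limit,
    translate each chunk, and rejoin the outputs."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], []
    for sentence in sentences:
        candidate = current + [sentence]
        # Flush the running chunk before it exceeds the (approximate) limit.
        # A single over-long sentence still becomes its own chunk.
        if len(" ".join(candidate).split()) > max_tokens and current:
            chunks.append(" ".join(current))
            current = [sentence]
        else:
            current = candidate
    if current:
        chunks.append(" ".join(current))
    return " ".join(translate_fn(chunk) for chunk in chunks)
```

Note that chunking trades away cross-sentence context, so coherence can still degrade at chunk boundaries; keeping related sentences in the same chunk mitigates this somewhat.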
Factors
Factors include linguistic, cultural, and contextual variables as well as technical aspects that could influence the performance and utility of the translation model. Refer to Section 4.3 of the model cards paper for further guidance on relevant factors.
Relevant Factors
Several linguistic, cultural, and technical factors influence the performance and utility of the model across the supported African languages.
1. Languages and Dialects
The model is trained on multiple African languages, each with unique grammatical structures, vocabularies, and syntactic complexities. While it supports Afrikaans, English, Sesotho, Sepedi, Setswana, isiXhosa, isiZulu, Swahili, French, Hausa, and Yoruba, translation accuracy may vary due to:
- Dialectal variations: Many African languages exhibit significant regional and dialectal differences, which may not be fully captured by the model. For instance, Yoruba spoken in Nigeria may differ from Yoruba spoken in Benin, and Zulu dialects may vary across regions in South Africa.
- Scarcity of high-quality parallel data: Some languages, such as Sesotho, Sepedi, and Hausa, have fewer available high-quality training corpora, potentially affecting translation accuracy.
- Linguistic overlap and borrowing: Certain languages, such as Setswana, Sesotho, and Sepedi, share structural similarities, which can lead to cross-language influences in translations.
2. Cultural Context
- Idiomatic and figurative language: Many African languages are rich in proverbs, idioms, and metaphors, which do not always have direct equivalents in other languages, leading to loss of meaning in translation.
- Sensitivity to sociocultural nuances: The model may misinterpret or oversimplify culturally significant expressions or terms, particularly when translating between languages with different cultural frames of reference.
3. Technical Attributes and Instrumentation
- Computational efficiency: Translation performance depends on available computational resources, including GPU acceleration and memory capacity.
- Sentence segmentation and formatting: In certain cases, the model may struggle with long or unstructured text, particularly if the source text lacks clear punctuation or paragraph breaks.
- Training biases and generalisation: The model’s performance may vary across different text domains (e.g., news articles vs. social media text), depending on the diversity of data used during training.
Evaluation Factors
Model performance is evaluated using the CHF1 score.
Metrics
Model performance measures
The model is evaluated using Character F1 Score (CHF1), an automatic metric that provides a balanced measure of precision and recall at the character level. As with the standard F1 score, the CHF1 is the harmonic mean of character-level Precision and Recall.
- Precision at the character level measures how well the model avoids including incorrect or extraneous characters in its translations. It reflects the model's ability to produce accurate and clean outputs that match the reference translation as closely as possible.
- Recall at the character level indicates how well the model captures all the correct characters from the reference translation, ensuring completeness and accuracy in character representation.
A higher Character F1 Score means the model is effective in maintaining precise and complete character sequences, indicating a good balance between avoiding unnecessary character additions and capturing all relevant characters.
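To make the metric concrete, here is a minimal sketch that computes a simplified character-level F1 from the definition above, using bag-of-characters overlap between a hypothesis and a reference. This is an illustrative simplification (the widely used chrF metric additionally averages over character n-grams), and the example strings are hypothetical.

```python
from collections import Counter

def char_f1(hypothesis: str, reference: str) -> float:
    """Harmonic mean of character-level precision and recall,
    computed over bags of characters (whitespace ignored)."""
    hyp_counts = Counter(hypothesis.replace(" ", ""))
    ref_counts = Counter(reference.replace(" ", ""))
    # Multiset overlap: characters that appear in both strings.
    overlap = sum((hyp_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())  # avoids extraneous characters
    recall = overlap / sum(ref_counts.values())     # captures all reference characters
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: a truncated hypothesis scores perfect precision
# but imperfect recall, giving an F1 of roughly 0.96.
print(char_f1("translation", "translations"))
```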
CHF1 scores were computed on test sets from various domains in isiZulu; see the Quantitative analyses section below.
Decision thresholds
No decision thresholds have been specified
Evaluation data
All referenced datasets would ideally point to any set of documents that provide visibility into the source and composition of the dataset. Evaluation datasets should include datasets that are publicly available for third-party use. These could be existing datasets or new ones provided alongside the model card analyses to enable further benchmarking.
Datasets
Multiple datasets were used to create test sets across various domains, including Government, News, Broadcast, and Call Center.
Motivation
These datasets were selected because they are high quality and cover the targeted languages. They capture cultural and linguistic aspects that are crucial to the development process and to better performance.
Training data
The model was trained on parallel multilingual data from a variety of open-source and proprietary sources.
Quantitative analyses
Quantitative analyses should be disaggregated, that is, broken down by the chosen factors. Quantitative analyses should provide the results of evaluating the model according to the chosen metrics, providing confidence interval values when possible.
Review section 4.7 of the model cards paper.
Unitary results (isiZulu)
| Domain | Lelapa-X-GLOT (isiZulu->English) CHF1 | Lelapa-X-GLOT (English->isiZulu) CHF1 |
|---|---|---|
| Broadcast | 52.29 | 26.02 |
| Synthetic call center | 66.58 | 54.33 |
| Real-world call center | 70.18 | 54.13 |
Ethical considerations
This section is intended to demonstrate the ethical considerations that went into model development, surfacing ethical challenges and solutions to stakeholders. The ethical analysis does not always lead to precise solutions, but the process of ethical contemplation is worthwhile to inform on responsible practices and next steps in future work: Review section 4.8 of the model cards paper.
This multilingual model, while enhancing machine translation for low-resource African languages, presents ethical challenges related to bias, cultural sensitivity, and potential misuse. Data imbalances may lead to lower translation quality for underrepresented languages such as Sesotho, Sepedi, and Hausa, while dialectal variations in languages like Yoruba and isiZulu could result in biased translations. Additionally, figurative speech, honorifics, and cultural expressions may be mistranslated, distorting meaning and social nuance. The model also poses risks in legal, medical, and governmental applications, where translation errors could have serious consequences. Furthermore, improper use in disinformation or propaganda raises ethical concerns. To mitigate these risks, human oversight should be prioritised in high-stakes applications, datasets should be expanded to improve linguistic diversity, and culturally aware evaluation metrics should be integrated to ensure fair and accurate translations.
Caveats and recommendations
This section should list additional concerns that were not covered in the previous sections. Review section 4.9 of the model cards paper.
Additional caveats are outlined extensively in our Terms and Conditions.