# marianMT_hin_eng_cs
This model is a fine-tuned version of Helsinki-NLP/opus-mt-mul-en on the ar5entum/hindi-english-code-mixed dataset. It achieves the following results on the evaluation set:
- Loss: 0.1450
- Bleu: 77.8649
- Gen Len: 74.8945
## Model description

The model translates Hindi text written in Devanagari script into a code-switched format: native Hindi words are retained in Devanagari, while English loanwords spelled phonetically in Devanagari are converted back to Roman script. Word order and meaning are preserved; only the script of the English words changes.

Examples:
Hindi | Hindi + English CS |
---|---|
तो वो टोटली मेरे घर के प्लान पे डिपेंड करता है | to वो totally मेरे घर के plan पे depend करता है |
मांग लो भाई बहुत नेसेसरी है | मांग लो भाई बहुत necessary है |
टेलीविज़न में क्या प्रोग्राम चल रहा है? | television में क्या program चल रहा है? |
```python
from transformers import MarianMTModel, MarianTokenizer

class HinEngCS:
    def __init__(self, model_name='ar5entum/marianMT_hin_eng_cs'):
        self.model_name = model_name
        self.tokenizer = MarianTokenizer.from_pretrained(model_name)
        self.model = MarianMTModel.from_pretrained(model_name)

    def predict(self, input_text):
        # Tokenize the Devanagari input, generate, and decode the
        # code-switched output.
        tokenized_text = self.tokenizer(input_text, return_tensors='pt')
        translated = self.model.generate(**tokenized_text)
        translated_text = self.tokenizer.decode(translated[0], skip_special_tokens=True)
        return translated_text

model = HinEngCS()
input_text = "आज मैं नानयांग टेक्नोलॉजिकल यूनिवर्सिटी में अनेक समझौते होते हुए देखूंगा जो कि उच्च शिक्षा साइंस टेक्नोलॉजी और इनोवेशन में हमारे सहयोग को और बढ़ाएंगे।"
model.predict(input_text)
# आज मैं नानयांग technological university में अनेक समझौते होते हुए देखूंगा जो कि उच्च शिक्षा science technology और innovation में हमारे सहयोग को और बढ़ाएंगे।
```
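The `predict` method above translates one sentence at a time. For multiple sentences, batched decoding is usually faster; the `predict_batch` helper below is an illustrative sketch (not part of the original card) using standard `generate` arguments:

```python
from transformers import MarianMTModel, MarianTokenizer

tokenizer = MarianTokenizer.from_pretrained('ar5entum/marianMT_hin_eng_cs')
model = MarianMTModel.from_pretrained('ar5entum/marianMT_hin_eng_cs')

def predict_batch(texts, num_beams=4, max_length=512):
    # Pad the batch so sentences of different lengths can be decoded together.
    batch = tokenizer(texts, return_tensors='pt', padding=True, truncation=True)
    outputs = model.generate(**batch, num_beams=num_beams, max_length=max_length)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

print(predict_batch([
    "मांग लो भाई बहुत नेसेसरी है",
    "टेलीविज़न में क्या प्रोग्राम चल रहा है?",
]))
```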
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 50
- eval_batch_size: 50
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 100
- total_eval_batch_size: 100
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30.0
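As a reference point, these values map onto a standard `transformers` `Seq2SeqTrainingArguments` configuration roughly as follows. This is a reconstruction for illustration only; the actual training script is not included in the card and may differ in details:

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative mapping of the hyperparameters above onto the standard API.
training_args = Seq2SeqTrainingArguments(
    output_dir="marianMT_hin_eng_cs",
    learning_rate=5e-5,
    per_device_train_batch_size=50,  # 2 GPUs -> total train batch size 100
    per_device_eval_batch_size=50,   # 2 GPUs -> total eval batch size 100
    seed=42,
    num_train_epochs=30.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,      # needed to report BLEU / Gen Len
)
```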
### Training results
Training Loss | Epoch | Step | Bleu | Gen Len | Validation Loss |
---|---|---|---|---|---|
1.5823 | 1.0 | 1118 | 11.6257 | 77.1622 | 1.1778 |
0.921 | 2.0 | 2236 | 33.2917 | 76.1459 | 0.6357 |
0.6472 | 3.0 | 3354 | 47.3533 | 75.9194 | 0.4504 |
0.5246 | 4.0 | 4472 | 55.2169 | 75.6871 | 0.3579 |
0.4228 | 5.0 | 5590 | 60.8262 | 75.5777 | 0.3041 |
0.3745 | 6.0 | 6708 | 64.8987 | 75.4424 | 0.2693 |
0.3552 | 7.0 | 7826 | 67.7607 | 75.2438 | 0.2455 |
0.3324 | 8.0 | 8944 | 69.635 | 75.1036 | 0.2274 |
0.2912 | 9.0 | 10062 | 71.3086 | 75.0326 | 0.2117 |
0.2591 | 10.0 | 11180 | 72.392 | 74.9607 | 0.2001 |
0.2471 | 11.0 | 12298 | 73.4758 | 74.9251 | 0.1899 |
0.236 | 12.0 | 13416 | 74.4219 | 74.833 | 0.1822 |
0.2265 | 13.0 | 14534 | 75.1435 | 74.9069 | 0.1745 |
0.2152 | 14.0 | 15652 | 75.7614 | 74.7409 | 0.1695 |
0.2078 | 15.0 | 16770 | 76.2353 | 74.7092 | 0.1641 |
0.2048 | 16.0 | 17888 | 76.7381 | 74.7274 | 0.1593 |
0.1975 | 17.0 | 19006 | 76.9954 | 74.7217 | 0.1559 |
0.1943 | 18.0 | 20124 | 77.421 | 74.6641 | 0.1524 |
0.1987 | 19.0 | 21242 | 77.8231 | 74.6833 | 0.1495 |
0.1855 | 20.0 | 22360 | 78.0784 | 74.6804 | 0.1472 |
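For context on how the Bleu and Gen Len columns are typically produced, here is a hedged string-level sketch using the `evaluate` library's sacrebleu metric. The card does not publish its exact `compute_metrics`; in `Seq2SeqTrainer` setups, Gen Len is usually measured in model tokens rather than the whitespace approximation used here:

```python
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(predictions, references):
    # predictions: list[str] of decoded model outputs
    # references:  list[str] with one reference per example
    score = sacrebleu.compute(predictions=predictions,
                              references=[[ref] for ref in references])
    # Approximate generation length by whitespace tokens; the reported
    # Gen Len is typically the mean length in model tokens.
    gen_len = sum(len(p.split()) for p in predictions) / len(predictions)
    return {"bleu": score["score"], "gen_len": gen_len}
```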
### Framework versions
- Transformers 4.45.0.dev0
- Pytorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1