pints_paged_adamw_32bit_warmup0.02

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 7.4135

Model description

More information needed. The published checkpoint has 604M parameters and is stored as BF16 safetensors.

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 2048
  • total_train_batch_size: 2048
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 1
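
For orientation, the list above maps onto Transformers' `TrainingArguments` roughly as below. This is a minimal sketch, not the actual training script: `output_dir` is a placeholder, and `optim="paged_adamw_32bit"` is inferred from the run name only (the card itself records plain Adam with the stated betas and epsilon).

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters above; not the original training script.
training_args = TrainingArguments(
    output_dir="pints_paged_adamw_32bit_warmup0.02",  # placeholder output path
    learning_rate=5e-05,
    per_device_train_batch_size=1,     # train_batch_size: 1
    per_device_eval_batch_size=1,      # eval_batch_size: 1
    seed=42,
    gradient_accumulation_steps=2048,  # 1 x 2048 = total_train_batch_size 2048
    lr_scheduler_type="linear",
    warmup_ratio=0.01,
    num_train_epochs=1,
    # ASSUMPTION: inferred from the run name; the card only records
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08.
    optim="paged_adamw_32bit",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)
```

With a per-device batch size of 1 and 2048 accumulation steps, the effective batch size is 1 × 2048 = 2048, matching total_train_batch_size. A warmup ratio of 0.01 over the roughly 3,250 optimizer steps implied by the table below (step 3120 at epoch 0.9604) corresponds to on the order of 30 warmup steps.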

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 10.8665       | 0.0020 | 208  | 10.5405         |
| 9.7710        | 0.0040 | 416  | 9.3305          |
| 8.9825        | 0.0060 | 624  | 8.5321          |
| 8.2362        | 0.0080 | 832  | 7.9837          |
| 7.8471        | 0.0100 | 1040 | 7.6779          |
| 7.5529        | 0.0120 | 1248 | 7.4253          |
| 7.3361        | 0.0140 | 1456 | 7.2233          |
| 7.1370        | 0.0160 | 1664 | 7.0466          |
| 7.0123        | 0.0180 | 1872 | 6.9768          |
| 6.9564        | 0.0200 | 2080 | 6.9193          |
| 6.9615        | 0.0220 | 2288 | 6.9234          |
| 6.9531        | 0.0240 | 2496 | 6.9235          |
| 6.9675        | 0.0260 | 2704 | 6.9571          |
| 6.9392        | 0.0280 | 2912 | 6.9076          |
| 7.3212        | 0.9604 | 3120 | 7.4135          |
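
The curves are easier to read plotted. A minimal sketch, with the values copied verbatim from the table (matplotlib is an assumption here, not part of the training stack):

```python
# Minimal sketch: plot the logged training/validation losses from the table above.
import matplotlib.pyplot as plt

steps = [208, 416, 624, 832, 1040, 1248, 1456, 1664, 1872,
         2080, 2288, 2496, 2704, 2912, 3120]
train_loss = [10.8665, 9.771, 8.9825, 8.2362, 7.8471, 7.5529, 7.3361,
              7.137, 7.0123, 6.9564, 6.9615, 6.9531, 6.9675, 6.9392, 7.3212]
val_loss = [10.5405, 9.3305, 8.5321, 7.9837, 7.6779, 7.4253, 7.2233,
            7.0466, 6.9768, 6.9193, 6.9234, 6.9235, 6.9571, 6.9076, 7.4135]

plt.plot(steps, train_loss, label="training loss")
plt.plot(steps, val_loss, label="validation loss")
plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.savefig("loss_curve.png")
```

Note that both losses plateau near 6.92 by epoch 0.03, and the final evaluation at epoch 0.9604 is higher (7.4135), which matches the headline loss reported at the top of the card.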

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
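
The checkpoint can be loaded locally with the versions listed above. A minimal sketch, assuming a causal-LM architecture and a placeholder Hub repo id (the card states neither):

```python
# Minimal sketch: load the checkpoint locally in BF16.
# ASSUMPTIONS: the causal-LM architecture and the repo id below are
# placeholders; the card does not state either.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/pints_paged_adamw_32bit_warmup0.02"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
```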