---
license: apache-2.0
base_model: distilgpt2
tags:
- generated_from_trainer
model-index:
- name: judge_JuDe
  results: []
---


# judge_JuDe

This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unspecified dataset (the Trainer did not record a dataset name).
It achieves the following results on the evaluation set:
- Loss: 2.4628
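
The loss can be easier to interpret as perplexity. The sketch below assumes the reported value is the mean per-token cross-entropy in nats, which is what the Trainer typically reports for causal language models; this card does not confirm it. The helper name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Final evaluation loss reported in this card.
final_eval_loss = 2.4628

if __name__ == "__main__":
    print(f"perplexity ~ {perplexity(final_eval_loss):.2f}")
```

Under that assumption, the evaluation perplexity is roughly 11.7.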

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
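
To illustrate what `lr_scheduler_type: linear` implies, here is a minimal pure-Python sketch of a linear decay from the initial learning rate to zero. The total step count is taken from the last row of the training results table. `WARMUP_STEPS = 0` is an assumption, since no warmup setting is listed in this card, and `linear_lr` is a hypothetical helper, not a library function.

```python
BASE_LR = 2e-05       # learning_rate listed in this card
TOTAL_STEPS = 32912   # final step in the training results table
WARMUP_STEPS = 0      # assumption: warmup is not listed in this card


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Linear decay from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 8228, 16456, 24684, 32912):
        print(step, linear_lr(step))
```

With these assumptions the learning rate is halved at the end of epoch 2 (step 16456) and reaches zero at the final step.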

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 2.9881        | 0.25  | 2057  | 2.7543          |
| 2.803         | 0.5   | 4114  | 2.6652          |
| 2.7298        | 0.75  | 6171  | 2.6124          |
| 2.687         | 1.0   | 8228  | 2.5772          |
| 2.6374        | 1.25  | 10285 | 2.5535          |
| 2.6161        | 1.5   | 12342 | 2.5332          |
| 2.598         | 1.75  | 14399 | 2.5171          |
| 2.5773        | 2.0   | 16456 | 2.5050          |
| 2.5578        | 2.25  | 18513 | 2.4943          |
| 2.5468        | 2.5   | 20570 | 2.4868          |
| 2.5385        | 2.75  | 22627 | 2.4783          |
| 2.5322        | 3.0   | 24684 | 2.4712          |
| 2.5182        | 3.25  | 26741 | 2.4697          |
| 2.5188        | 3.5   | 28798 | 2.4657          |
| 2.513         | 3.75  | 30855 | 2.4630          |
| 2.5123        | 4.0   | 32912 | 2.4628          |
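
Since the dataset is not documented, the step counts above give a rough bound on its size. The sketch below assumes a single training device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these are stated in this card, so the result is an estimate only. The function name is illustrative.

```python
def examples_per_epoch_bounds(total_steps: int, num_epochs: int, batch_size: int):
    """Return (steps_per_epoch, min_examples, max_examples) for one epoch.

    Assumes one optimizer step per batch and that only the final batch
    of an epoch may be smaller than batch_size.
    """
    steps_per_epoch = total_steps // num_epochs
    max_examples = steps_per_epoch * batch_size
    min_examples = (steps_per_epoch - 1) * batch_size + 1
    return steps_per_epoch, min_examples, max_examples


if __name__ == "__main__":
    # 32912 total steps over 4 epochs with train_batch_size 16.
    print(examples_per_epoch_bounds(32912, 4, 16))
```

Under these assumptions, each epoch has 8228 steps, which corresponds to between 131,633 and 131,648 training examples.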


### Framework versions

- Transformers 4.34.1
- PyTorch 1.12.1+cu113
- Datasets 2.8.0
- Tokenizers 0.14.1