|
2023-10-18 20:56:03,815 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:03,816 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-18 20:56:03,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:03,816 MultiCorpus: 7936 train + 992 dev + 992 test sentences |
|
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr |
|
2023-10-18 20:56:03,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:03,816 Train: 7936 sentences |
|
2023-10-18 20:56:03,816 (train_with_dev=False, train_with_test=False) |
|
2023-10-18 20:56:03,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:03,816 Training Params: |
|
2023-10-18 20:56:03,816 - learning_rate: "5e-05" |
|
2023-10-18 20:56:03,816 - mini_batch_size: "8" |
|
2023-10-18 20:56:03,816 - max_epochs: "10" |
|
2023-10-18 20:56:03,816 - shuffle: "True" |
|
2023-10-18 20:56:03,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:03,816 Plugins: |
|
2023-10-18 20:56:03,816 - TensorboardLogger |
|
2023-10-18 20:56:03,816 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-18 20:56:03,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:03,816 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-18 20:56:03,816 - metric: "('micro avg', 'f1-score')" |
|
2023-10-18 20:56:03,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:03,816 Computation: |
|
2023-10-18 20:56:03,816 - compute on device: cuda:0 |
|
2023-10-18 20:56:03,816 - embedding storage: none |
|
2023-10-18 20:56:03,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:03,816 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-18 20:56:03,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:03,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:03,817 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-18 20:56:06,181 epoch 1 - iter 99/992 - loss 2.37417653 - time (sec): 2.36 - samples/sec: 6970.36 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 20:56:08,505 epoch 1 - iter 198/992 - loss 2.04547181 - time (sec): 4.69 - samples/sec: 6978.26 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 20:56:10,702 epoch 1 - iter 297/992 - loss 1.63644393 - time (sec): 6.88 - samples/sec: 7234.43 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 20:56:12,678 epoch 1 - iter 396/992 - loss 1.35011565 - time (sec): 8.86 - samples/sec: 7467.10 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 20:56:14,632 epoch 1 - iter 495/992 - loss 1.18194660 - time (sec): 10.82 - samples/sec: 7693.85 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 20:56:16,785 epoch 1 - iter 594/992 - loss 1.05850892 - time (sec): 12.97 - samples/sec: 7664.55 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 20:56:18,922 epoch 1 - iter 693/992 - loss 0.96209046 - time (sec): 15.11 - samples/sec: 7664.05 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-18 20:56:21,060 epoch 1 - iter 792/992 - loss 0.88404985 - time (sec): 17.24 - samples/sec: 7634.19 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-18 20:56:23,256 epoch 1 - iter 891/992 - loss 0.82833250 - time (sec): 19.44 - samples/sec: 7576.98 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-18 20:56:25,516 epoch 1 - iter 990/992 - loss 0.77917868 - time (sec): 21.70 - samples/sec: 7543.24 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-18 20:56:25,565 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:25,565 EPOCH 1 done: loss 0.7778 - lr: 0.000050 |
|
2023-10-18 20:56:27,108 DEV : loss 0.2251289188861847 - f1-score (micro avg) 0.2771 |
|
2023-10-18 20:56:27,126 saving best model |
|
2023-10-18 20:56:27,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:29,368 epoch 2 - iter 99/992 - loss 0.32167945 - time (sec): 2.21 - samples/sec: 7301.80 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-18 20:56:31,577 epoch 2 - iter 198/992 - loss 0.29255043 - time (sec): 4.42 - samples/sec: 7535.62 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-18 20:56:33,795 epoch 2 - iter 297/992 - loss 0.29067708 - time (sec): 6.64 - samples/sec: 7386.21 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-18 20:56:36,015 epoch 2 - iter 396/992 - loss 0.29025565 - time (sec): 8.86 - samples/sec: 7363.21 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-18 20:56:38,278 epoch 2 - iter 495/992 - loss 0.28143687 - time (sec): 11.12 - samples/sec: 7310.90 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-18 20:56:40,527 epoch 2 - iter 594/992 - loss 0.27945389 - time (sec): 13.37 - samples/sec: 7270.36 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-18 20:56:42,728 epoch 2 - iter 693/992 - loss 0.27975877 - time (sec): 15.57 - samples/sec: 7261.00 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-18 20:56:45,070 epoch 2 - iter 792/992 - loss 0.27735348 - time (sec): 17.91 - samples/sec: 7256.35 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-18 20:56:47,388 epoch 2 - iter 891/992 - loss 0.27681776 - time (sec): 20.23 - samples/sec: 7272.50 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-18 20:56:49,647 epoch 2 - iter 990/992 - loss 0.27381910 - time (sec): 22.49 - samples/sec: 7277.31 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-18 20:56:49,690 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:49,691 EPOCH 2 done: loss 0.2739 - lr: 0.000044 |
|
2023-10-18 20:56:51,906 DEV : loss 0.18277420103549957 - f1-score (micro avg) 0.3656 |
|
2023-10-18 20:56:51,924 saving best model |
|
2023-10-18 20:56:51,958 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:56:54,200 epoch 3 - iter 99/992 - loss 0.22769296 - time (sec): 2.24 - samples/sec: 7466.59 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-18 20:56:56,425 epoch 3 - iter 198/992 - loss 0.23820508 - time (sec): 4.47 - samples/sec: 7317.55 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-18 20:56:58,657 epoch 3 - iter 297/992 - loss 0.24041108 - time (sec): 6.70 - samples/sec: 7300.94 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-18 20:57:00,926 epoch 3 - iter 396/992 - loss 0.23852730 - time (sec): 8.97 - samples/sec: 7228.49 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-18 20:57:03,244 epoch 3 - iter 495/992 - loss 0.23174392 - time (sec): 11.29 - samples/sec: 7235.37 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-18 20:57:05,436 epoch 3 - iter 594/992 - loss 0.23397501 - time (sec): 13.48 - samples/sec: 7202.93 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-18 20:57:07,666 epoch 3 - iter 693/992 - loss 0.23953902 - time (sec): 15.71 - samples/sec: 7229.79 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-18 20:57:09,905 epoch 3 - iter 792/992 - loss 0.23601306 - time (sec): 17.95 - samples/sec: 7263.88 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-18 20:57:12,148 epoch 3 - iter 891/992 - loss 0.23207443 - time (sec): 20.19 - samples/sec: 7280.65 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-18 20:57:14,478 epoch 3 - iter 990/992 - loss 0.23020989 - time (sec): 22.52 - samples/sec: 7262.19 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-18 20:57:14,524 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:57:14,525 EPOCH 3 done: loss 0.2301 - lr: 0.000039 |
|
2023-10-18 20:57:16,353 DEV : loss 0.1740075945854187 - f1-score (micro avg) 0.3899 |
|
2023-10-18 20:57:16,372 saving best model |
|
2023-10-18 20:57:16,407 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:57:18,696 epoch 4 - iter 99/992 - loss 0.21972604 - time (sec): 2.29 - samples/sec: 6880.22 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-18 20:57:20,907 epoch 4 - iter 198/992 - loss 0.21771922 - time (sec): 4.50 - samples/sec: 7255.82 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-18 20:57:23,071 epoch 4 - iter 297/992 - loss 0.21406813 - time (sec): 6.66 - samples/sec: 7184.34 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-18 20:57:25,280 epoch 4 - iter 396/992 - loss 0.21159074 - time (sec): 8.87 - samples/sec: 7161.50 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-18 20:57:27,521 epoch 4 - iter 495/992 - loss 0.21280319 - time (sec): 11.11 - samples/sec: 7252.30 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-18 20:57:29,763 epoch 4 - iter 594/992 - loss 0.21029475 - time (sec): 13.36 - samples/sec: 7261.19 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-18 20:57:31,997 epoch 4 - iter 693/992 - loss 0.20859482 - time (sec): 15.59 - samples/sec: 7267.30 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-18 20:57:34,226 epoch 4 - iter 792/992 - loss 0.20972295 - time (sec): 17.82 - samples/sec: 7290.54 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-18 20:57:36,517 epoch 4 - iter 891/992 - loss 0.20628428 - time (sec): 20.11 - samples/sec: 7298.49 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-18 20:57:38,726 epoch 4 - iter 990/992 - loss 0.20626360 - time (sec): 22.32 - samples/sec: 7330.91 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-18 20:57:38,772 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:57:38,772 EPOCH 4 done: loss 0.2061 - lr: 0.000033 |
|
2023-10-18 20:57:40,612 DEV : loss 0.16479910910129547 - f1-score (micro avg) 0.4101 |
|
2023-10-18 20:57:40,630 saving best model |
|
2023-10-18 20:57:40,666 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:57:42,943 epoch 5 - iter 99/992 - loss 0.17177104 - time (sec): 2.28 - samples/sec: 7353.67 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-18 20:57:45,125 epoch 5 - iter 198/992 - loss 0.18004855 - time (sec): 4.46 - samples/sec: 7278.40 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-18 20:57:47,298 epoch 5 - iter 297/992 - loss 0.18559241 - time (sec): 6.63 - samples/sec: 7192.47 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-18 20:57:49,624 epoch 5 - iter 396/992 - loss 0.18522087 - time (sec): 8.96 - samples/sec: 7199.80 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-18 20:57:51,819 epoch 5 - iter 495/992 - loss 0.18653077 - time (sec): 11.15 - samples/sec: 7216.72 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-18 20:57:54,038 epoch 5 - iter 594/992 - loss 0.18784304 - time (sec): 13.37 - samples/sec: 7224.68 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 20:57:56,273 epoch 5 - iter 693/992 - loss 0.18868299 - time (sec): 15.61 - samples/sec: 7255.64 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 20:57:58,607 epoch 5 - iter 792/992 - loss 0.18764218 - time (sec): 17.94 - samples/sec: 7244.21 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 20:58:00,866 epoch 5 - iter 891/992 - loss 0.18960836 - time (sec): 20.20 - samples/sec: 7275.43 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 20:58:03,136 epoch 5 - iter 990/992 - loss 0.19225274 - time (sec): 22.47 - samples/sec: 7281.35 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 20:58:03,191 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:58:03,191 EPOCH 5 done: loss 0.1923 - lr: 0.000028 |
|
2023-10-18 20:58:05,000 DEV : loss 0.15571151673793793 - f1-score (micro avg) 0.429 |
|
2023-10-18 20:58:05,018 saving best model |
|
2023-10-18 20:58:05,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:58:07,306 epoch 6 - iter 99/992 - loss 0.17941164 - time (sec): 2.25 - samples/sec: 7240.83 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 20:58:09,525 epoch 6 - iter 198/992 - loss 0.17285384 - time (sec): 4.47 - samples/sec: 7150.39 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 20:58:11,791 epoch 6 - iter 297/992 - loss 0.17178395 - time (sec): 6.74 - samples/sec: 7272.32 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 20:58:14,001 epoch 6 - iter 396/992 - loss 0.17127824 - time (sec): 8.95 - samples/sec: 7269.39 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 20:58:16,238 epoch 6 - iter 495/992 - loss 0.17366195 - time (sec): 11.19 - samples/sec: 7241.92 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 20:58:18,505 epoch 6 - iter 594/992 - loss 0.17202885 - time (sec): 13.45 - samples/sec: 7246.43 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 20:58:20,637 epoch 6 - iter 693/992 - loss 0.17315227 - time (sec): 15.59 - samples/sec: 7274.96 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 20:58:22,852 epoch 6 - iter 792/992 - loss 0.17320411 - time (sec): 17.80 - samples/sec: 7271.31 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 20:58:25,041 epoch 6 - iter 891/992 - loss 0.17490211 - time (sec): 19.99 - samples/sec: 7285.57 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 20:58:27,314 epoch 6 - iter 990/992 - loss 0.17714492 - time (sec): 22.26 - samples/sec: 7351.83 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 20:58:27,362 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:58:27,363 EPOCH 6 done: loss 0.1770 - lr: 0.000022 |
|
2023-10-18 20:58:29,203 DEV : loss 0.15208245813846588 - f1-score (micro avg) 0.4522 |
|
2023-10-18 20:58:29,221 saving best model |
|
2023-10-18 20:58:29,259 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:58:31,472 epoch 7 - iter 99/992 - loss 0.18307759 - time (sec): 2.21 - samples/sec: 7285.65 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 20:58:33,665 epoch 7 - iter 198/992 - loss 0.17513144 - time (sec): 4.41 - samples/sec: 7284.45 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 20:58:35,924 epoch 7 - iter 297/992 - loss 0.17363848 - time (sec): 6.66 - samples/sec: 7347.88 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 20:58:38,152 epoch 7 - iter 396/992 - loss 0.17849995 - time (sec): 8.89 - samples/sec: 7356.97 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 20:58:40,363 epoch 7 - iter 495/992 - loss 0.17302936 - time (sec): 11.10 - samples/sec: 7373.04 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 20:58:42,608 epoch 7 - iter 594/992 - loss 0.17173804 - time (sec): 13.35 - samples/sec: 7350.59 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 20:58:44,842 epoch 7 - iter 693/992 - loss 0.17176977 - time (sec): 15.58 - samples/sec: 7321.56 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 20:58:47,119 epoch 7 - iter 792/992 - loss 0.17139044 - time (sec): 17.86 - samples/sec: 7282.96 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 20:58:49,409 epoch 7 - iter 891/992 - loss 0.16779114 - time (sec): 20.15 - samples/sec: 7332.95 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 20:58:51,632 epoch 7 - iter 990/992 - loss 0.16865105 - time (sec): 22.37 - samples/sec: 7315.20 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 20:58:51,674 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:58:51,675 EPOCH 7 done: loss 0.1684 - lr: 0.000017 |
|
2023-10-18 20:58:53,875 DEV : loss 0.14835909008979797 - f1-score (micro avg) 0.4563 |
|
2023-10-18 20:58:53,893 saving best model |
|
2023-10-18 20:58:53,930 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:58:56,137 epoch 8 - iter 99/992 - loss 0.16559554 - time (sec): 2.21 - samples/sec: 7584.76 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 20:58:58,402 epoch 8 - iter 198/992 - loss 0.15713025 - time (sec): 4.47 - samples/sec: 7401.59 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 20:59:00,698 epoch 8 - iter 297/992 - loss 0.15886284 - time (sec): 6.77 - samples/sec: 7505.02 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 20:59:02,941 epoch 8 - iter 396/992 - loss 0.16091614 - time (sec): 9.01 - samples/sec: 7381.27 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 20:59:05,171 epoch 8 - iter 495/992 - loss 0.15847892 - time (sec): 11.24 - samples/sec: 7290.06 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 20:59:07,413 epoch 8 - iter 594/992 - loss 0.16400434 - time (sec): 13.48 - samples/sec: 7279.96 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 20:59:09,638 epoch 8 - iter 693/992 - loss 0.16260905 - time (sec): 15.71 - samples/sec: 7267.37 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 20:59:11,837 epoch 8 - iter 792/992 - loss 0.16132712 - time (sec): 17.91 - samples/sec: 7275.69 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 20:59:14,069 epoch 8 - iter 891/992 - loss 0.16199039 - time (sec): 20.14 - samples/sec: 7290.01 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 20:59:16,290 epoch 8 - iter 990/992 - loss 0.16199479 - time (sec): 22.36 - samples/sec: 7317.37 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 20:59:16,333 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:59:16,334 EPOCH 8 done: loss 0.1618 - lr: 0.000011 |
|
2023-10-18 20:59:18,140 DEV : loss 0.14870117604732513 - f1-score (micro avg) 0.4715 |
|
2023-10-18 20:59:18,158 saving best model |
|
2023-10-18 20:59:18,192 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:59:20,399 epoch 9 - iter 99/992 - loss 0.16182854 - time (sec): 2.21 - samples/sec: 7607.41 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 20:59:22,613 epoch 9 - iter 198/992 - loss 0.16350312 - time (sec): 4.42 - samples/sec: 7814.63 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 20:59:24,553 epoch 9 - iter 297/992 - loss 0.15746054 - time (sec): 6.36 - samples/sec: 7922.20 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 20:59:26,606 epoch 9 - iter 396/992 - loss 0.15577802 - time (sec): 8.41 - samples/sec: 7832.63 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 20:59:28,852 epoch 9 - iter 495/992 - loss 0.15518974 - time (sec): 10.66 - samples/sec: 7730.86 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 20:59:31,075 epoch 9 - iter 594/992 - loss 0.16021182 - time (sec): 12.88 - samples/sec: 7660.07 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 20:59:33,419 epoch 9 - iter 693/992 - loss 0.16018544 - time (sec): 15.23 - samples/sec: 7495.81 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 20:59:35,680 epoch 9 - iter 792/992 - loss 0.15779920 - time (sec): 17.49 - samples/sec: 7456.85 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 20:59:37,888 epoch 9 - iter 891/992 - loss 0.15767464 - time (sec): 19.70 - samples/sec: 7472.73 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 20:59:40,148 epoch 9 - iter 990/992 - loss 0.15624567 - time (sec): 21.96 - samples/sec: 7453.85 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 20:59:40,194 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:59:40,194 EPOCH 9 done: loss 0.1560 - lr: 0.000006 |
|
2023-10-18 20:59:42,035 DEV : loss 0.14903637766838074 - f1-score (micro avg) 0.4845 |
|
2023-10-18 20:59:42,054 saving best model |
|
2023-10-18 20:59:42,087 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 20:59:44,366 epoch 10 - iter 99/992 - loss 0.16504639 - time (sec): 2.28 - samples/sec: 6883.04 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 20:59:46,843 epoch 10 - iter 198/992 - loss 0.15098993 - time (sec): 4.76 - samples/sec: 6839.12 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 20:59:49,054 epoch 10 - iter 297/992 - loss 0.15199686 - time (sec): 6.97 - samples/sec: 7033.05 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 20:59:51,283 epoch 10 - iter 396/992 - loss 0.15271679 - time (sec): 9.19 - samples/sec: 7085.91 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 20:59:53,521 epoch 10 - iter 495/992 - loss 0.15018940 - time (sec): 11.43 - samples/sec: 7162.47 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 20:59:55,709 epoch 10 - iter 594/992 - loss 0.15324209 - time (sec): 13.62 - samples/sec: 7237.12 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 20:59:57,949 epoch 10 - iter 693/992 - loss 0.15234940 - time (sec): 15.86 - samples/sec: 7258.78 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 21:00:00,169 epoch 10 - iter 792/992 - loss 0.15171088 - time (sec): 18.08 - samples/sec: 7283.86 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 21:00:02,422 epoch 10 - iter 891/992 - loss 0.15226960 - time (sec): 20.33 - samples/sec: 7245.68 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 21:00:04,665 epoch 10 - iter 990/992 - loss 0.15307673 - time (sec): 22.58 - samples/sec: 7252.76 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-18 21:00:04,718 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 21:00:04,718 EPOCH 10 done: loss 0.1532 - lr: 0.000000 |
|
2023-10-18 21:00:06,578 DEV : loss 0.1486775279045105 - f1-score (micro avg) 0.4917 |
|
2023-10-18 21:00:06,597 saving best model |
|
2023-10-18 21:00:06,663 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 21:00:06,664 Loading model from best epoch ... |
|
2023-10-18 21:00:06,735 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-18 21:00:08,257 |
|
Results: |
|
- F-score (micro) 0.545 |
|
- F-score (macro) 0.3587 |
|
- Accuracy 0.4189 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.6909    0.6824    0.6866       655 |
|
         PER     0.2964    0.5157    0.3764       223 |
|
         ORG     0.0385    0.0079    0.0131       127 |
|
|
|
   micro avg     0.5306    0.5602    0.5450      1005 |
|
   macro avg     0.3419    0.4020    0.3587      1005 |
|
weighted avg     0.5209    0.5602    0.5327      1005 |
|
|
|
2023-10-18 21:00:08,257 ---------------------------------------------------------------------------------------------------- |
|
|
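
The averaged rows of the final report can be cross-checked from the per-class rows. The sketch below is not part of the training run: it copies the numbers from the log by hand and assumes the usual definitions (macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean, micro F1 as the harmonic mean of the micro-averaged precision and recall). The names `report`, `harmonic_f1`, `macro_f1`, `weighted_f1`, and `micro_f1` are illustrative only.

```python
# Per-class rows copied from the "By class" report in the log:
# label -> (precision, recall, f1-score, support)
report = {
    "LOC": (0.6909, 0.6824, 0.6866, 655),
    "PER": (0.2964, 0.5157, 0.3764, 223),
    "ORG": (0.0385, 0.0079, 0.0131, 127),
}


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in report.values()) / len(report)

# Weighted average: per-class F1 weighted by support.
total_support = sum(row[3] for row in report.values())
weighted_f1 = sum(row[2] * row[3] for row in report.values()) / total_support

# Micro average: F1 of the micro-averaged precision and recall from the log.
micro_f1 = harmonic_f1(0.5306, 0.5602)

print(f"macro F1    = {macro_f1:.4f}")
print(f"weighted F1 = {weighted_f1:.4f}")
print(f"micro F1    = {micro_f1:.4f}")
```

Because the inputs are already rounded to four decimals, the recomputed values should agree with the logged ones only up to rounding. The low macro score relative to the micro score reflects the near-zero ORG F1, which counts equally with LOC and PER in the macro mean.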