2023-10-18 23:50:09,558 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,558 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 23:50:09,558 ----------------------------------------------------------------------------------------------------
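The module shapes in the printout above fully determine the parameter count of this tiny backbone (2 layers, hidden size 128, vocabulary 32001, 13 output tags). A minimal tally, as a sketch; the helper names are illustrative and the totals are arithmetic read off the log, not measured from a checkpoint:

```python
# Tally the trainable parameters implied by the SequenceTagger printout above.
HIDDEN, VOCAB, MAX_POS, TYPES, FFN, TAGS = 128, 32001, 512, 2, 512, 13

def linear(n_in, n_out):          # weight matrix + bias vector
    return n_in * n_out + n_out

def layer_norm(n):                # elementwise affine: gamma + beta
    return 2 * n

embeddings = (VOCAB * HIDDEN          # word_embeddings
              + MAX_POS * HIDDEN      # position_embeddings
              + TYPES * HIDDEN        # token_type_embeddings
              + layer_norm(HIDDEN))

per_layer = (3 * linear(HIDDEN, HIDDEN)   # query, key, value
             + linear(HIDDEN, HIDDEN)     # attention output dense
             + layer_norm(HIDDEN)         # attention output LayerNorm
             + linear(HIDDEN, FFN)        # intermediate dense
             + linear(FFN, HIDDEN)        # output dense
             + layer_norm(HIDDEN))        # output LayerNorm

pooler = linear(HIDDEN, HIDDEN)
head = linear(HIDDEN, TAGS)               # the final (linear) tag projection

total = embeddings + 2 * per_layer + pooler + head
print(total)  # 4576909 -> roughly 4.6M parameters
```

Dropout and the activations add no parameters, so the whole tagger is under 5M weights, dominated by the 32001 x 128 word-embedding table.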
|
2023-10-18 23:50:09,558 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-18 23:50:09,558 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,558 Train: 14465 sentences
2023-10-18 23:50:09,558 (train_with_dev=False, train_with_test=False)
2023-10-18 23:50:09,558 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,558 Training Params:
2023-10-18 23:50:09,558 - learning_rate: "5e-05"
2023-10-18 23:50:09,559 - mini_batch_size: "8"
2023-10-18 23:50:09,559 - max_epochs: "10"
2023-10-18 23:50:09,559 - shuffle: "True"
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,559 Plugins:
2023-10-18 23:50:09,559 - TensorboardLogger
2023-10-18 23:50:09,559 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
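These settings explain the lr column printed at every iteration below: with 1809 batches per epoch and 10 epochs there are 18090 optimizer steps, the scheduler ramps the learning rate linearly from 0 to 5e-05 over the first 10% of them (warmup_fraction 0.1), then decays it linearly back to 0. A sketch of that schedule, assuming the standard warmup-then-decay form; `linear_lr` is an illustrative reimplementation, not Flair's own code:

```python
# Linear warmup + linear decay, matching the lr values logged during training.
PEAK_LR = 5e-05
STEPS_PER_EPOCH = 1809          # iterations per epoch at mini_batch_size 8
TOTAL = STEPS_PER_EPOCH * 10    # max_epochs 10 -> 18090 steps
WARMUP = int(0.1 * TOTAL)       # warmup_fraction 0.1 -> 1809 steps

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps."""
    if step < WARMUP:
        return PEAK_LR * step / WARMUP
    return PEAK_LR * (TOTAL - step) / (TOTAL - WARMUP)
```

This reproduces the logged values: about 0.000005 at epoch 1 / iter 180, the 0.00005 peak at the end of epoch 1, about 0.000044 by the end of epoch 2, and effectively 0 on the last iterations of epoch 10.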
|
2023-10-18 23:50:09,559 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 23:50:09,559 - metric: "('micro avg', 'f1-score')"
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,559 Computation:
2023-10-18 23:50:09,559 - compute on device: cuda:0
2023-10-18 23:50:09,559 - embedding storage: none
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,559 Model training base path: "hmbench-letemps/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,559 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-18 23:50:13,652 epoch 1 - iter 180/1809 - loss 3.08996031 - time (sec): 4.09 - samples/sec: 9215.23 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:50:17,785 epoch 1 - iter 360/1809 - loss 2.41550781 - time (sec): 8.23 - samples/sec: 9242.53 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:50:22,005 epoch 1 - iter 540/1809 - loss 1.76237783 - time (sec): 12.45 - samples/sec: 9280.74 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:50:25,998 epoch 1 - iter 720/1809 - loss 1.40258020 - time (sec): 16.44 - samples/sec: 9357.12 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:50:29,969 epoch 1 - iter 900/1809 - loss 1.18204419 - time (sec): 20.41 - samples/sec: 9436.48 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:50:34,138 epoch 1 - iter 1080/1809 - loss 1.04359449 - time (sec): 24.58 - samples/sec: 9313.63 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:50:38,296 epoch 1 - iter 1260/1809 - loss 0.93320987 - time (sec): 28.74 - samples/sec: 9251.00 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:50:42,424 epoch 1 - iter 1440/1809 - loss 0.84898279 - time (sec): 32.86 - samples/sec: 9201.97 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:50:46,577 epoch 1 - iter 1620/1809 - loss 0.77865271 - time (sec): 37.02 - samples/sec: 9161.07 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:50:50,699 epoch 1 - iter 1800/1809 - loss 0.71875392 - time (sec): 41.14 - samples/sec: 9194.58 - lr: 0.000050 - momentum: 0.000000
2023-10-18 23:50:50,904 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:50,905 EPOCH 1 done: loss 0.7161 - lr: 0.000050
2023-10-18 23:50:53,255 DEV : loss 0.17210480570793152 - f1-score (micro avg) 0.2591
2023-10-18 23:50:53,282 saving best model
2023-10-18 23:50:53,312 ----------------------------------------------------------------------------------------------------
|
2023-10-18 23:50:57,438 epoch 2 - iter 180/1809 - loss 0.19350008 - time (sec): 4.13 - samples/sec: 9119.45 - lr: 0.000049 - momentum: 0.000000
2023-10-18 23:51:01,657 epoch 2 - iter 360/1809 - loss 0.18530881 - time (sec): 8.34 - samples/sec: 9136.68 - lr: 0.000049 - momentum: 0.000000
2023-10-18 23:51:05,850 epoch 2 - iter 540/1809 - loss 0.18831855 - time (sec): 12.54 - samples/sec: 9025.27 - lr: 0.000048 - momentum: 0.000000
2023-10-18 23:51:10,021 epoch 2 - iter 720/1809 - loss 0.18353362 - time (sec): 16.71 - samples/sec: 9010.09 - lr: 0.000048 - momentum: 0.000000
2023-10-18 23:51:13,824 epoch 2 - iter 900/1809 - loss 0.18236110 - time (sec): 20.51 - samples/sec: 9211.87 - lr: 0.000047 - momentum: 0.000000
2023-10-18 23:51:17,983 epoch 2 - iter 1080/1809 - loss 0.17912246 - time (sec): 24.67 - samples/sec: 9237.95 - lr: 0.000047 - momentum: 0.000000
2023-10-18 23:51:22,214 epoch 2 - iter 1260/1809 - loss 0.17835039 - time (sec): 28.90 - samples/sec: 9189.83 - lr: 0.000046 - momentum: 0.000000
2023-10-18 23:51:26,461 epoch 2 - iter 1440/1809 - loss 0.17614405 - time (sec): 33.15 - samples/sec: 9124.36 - lr: 0.000046 - momentum: 0.000000
2023-10-18 23:51:30,784 epoch 2 - iter 1620/1809 - loss 0.17446410 - time (sec): 37.47 - samples/sec: 9087.49 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:51:35,022 epoch 2 - iter 1800/1809 - loss 0.17193393 - time (sec): 41.71 - samples/sec: 9063.26 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:51:35,223 ----------------------------------------------------------------------------------------------------
2023-10-18 23:51:35,223 EPOCH 2 done: loss 0.1718 - lr: 0.000044
2023-10-18 23:51:39,047 DEV : loss 0.15255987644195557 - f1-score (micro avg) 0.3569
2023-10-18 23:51:39,075 saving best model
2023-10-18 23:51:39,114 ----------------------------------------------------------------------------------------------------
|
2023-10-18 23:51:43,333 epoch 3 - iter 180/1809 - loss 0.17150139 - time (sec): 4.22 - samples/sec: 8967.83 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:51:47,648 epoch 3 - iter 360/1809 - loss 0.16038839 - time (sec): 8.53 - samples/sec: 8745.93 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:51:51,903 epoch 3 - iter 540/1809 - loss 0.15736877 - time (sec): 12.79 - samples/sec: 8943.81 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:51:56,136 epoch 3 - iter 720/1809 - loss 0.15451625 - time (sec): 17.02 - samples/sec: 8899.58 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:52:00,284 epoch 3 - iter 900/1809 - loss 0.14915798 - time (sec): 21.17 - samples/sec: 9008.40 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:52:04,655 epoch 3 - iter 1080/1809 - loss 0.14517457 - time (sec): 25.54 - samples/sec: 8988.62 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:52:08,944 epoch 3 - iter 1260/1809 - loss 0.14398672 - time (sec): 29.83 - samples/sec: 8930.99 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:52:13,152 epoch 3 - iter 1440/1809 - loss 0.14423275 - time (sec): 34.04 - samples/sec: 8931.25 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:52:17,308 epoch 3 - iter 1620/1809 - loss 0.14312020 - time (sec): 38.19 - samples/sec: 8940.45 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:52:21,455 epoch 3 - iter 1800/1809 - loss 0.14319625 - time (sec): 42.34 - samples/sec: 8935.40 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:52:21,639 ----------------------------------------------------------------------------------------------------
2023-10-18 23:52:21,640 EPOCH 3 done: loss 0.1434 - lr: 0.000039
2023-10-18 23:52:24,829 DEV : loss 0.15280331671237946 - f1-score (micro avg) 0.4243
2023-10-18 23:52:24,856 saving best model
2023-10-18 23:52:24,888 ----------------------------------------------------------------------------------------------------
|
2023-10-18 23:52:29,169 epoch 4 - iter 180/1809 - loss 0.14125343 - time (sec): 4.28 - samples/sec: 8441.69 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:52:33,498 epoch 4 - iter 360/1809 - loss 0.13670460 - time (sec): 8.61 - samples/sec: 8764.35 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:52:37,835 epoch 4 - iter 540/1809 - loss 0.13288048 - time (sec): 12.95 - samples/sec: 8765.16 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:52:42,046 epoch 4 - iter 720/1809 - loss 0.13346511 - time (sec): 17.16 - samples/sec: 8749.72 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:52:46,264 epoch 4 - iter 900/1809 - loss 0.13052521 - time (sec): 21.38 - samples/sec: 8780.95 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:52:50,533 epoch 4 - iter 1080/1809 - loss 0.12782764 - time (sec): 25.64 - samples/sec: 8824.36 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:52:54,871 epoch 4 - iter 1260/1809 - loss 0.12904407 - time (sec): 29.98 - samples/sec: 8857.53 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:52:59,036 epoch 4 - iter 1440/1809 - loss 0.13011623 - time (sec): 34.15 - samples/sec: 8855.12 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:53:03,358 epoch 4 - iter 1620/1809 - loss 0.12871829 - time (sec): 38.47 - samples/sec: 8866.49 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:53:07,567 epoch 4 - iter 1800/1809 - loss 0.12782507 - time (sec): 42.68 - samples/sec: 8855.25 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:53:07,776 ----------------------------------------------------------------------------------------------------
2023-10-18 23:53:07,776 EPOCH 4 done: loss 0.1277 - lr: 0.000033
2023-10-18 23:53:11,633 DEV : loss 0.15063098073005676 - f1-score (micro avg) 0.4476
2023-10-18 23:53:11,661 saving best model
2023-10-18 23:53:11,695 ----------------------------------------------------------------------------------------------------
|
2023-10-18 23:53:15,995 epoch 5 - iter 180/1809 - loss 0.11386799 - time (sec): 4.30 - samples/sec: 9079.20 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:53:20,165 epoch 5 - iter 360/1809 - loss 0.12060812 - time (sec): 8.47 - samples/sec: 9140.56 - lr: 0.000032 - momentum: 0.000000
2023-10-18 23:53:24,510 epoch 5 - iter 540/1809 - loss 0.11439752 - time (sec): 12.81 - samples/sec: 9043.00 - lr: 0.000032 - momentum: 0.000000
2023-10-18 23:53:28,764 epoch 5 - iter 720/1809 - loss 0.11352025 - time (sec): 17.07 - samples/sec: 9017.96 - lr: 0.000031 - momentum: 0.000000
2023-10-18 23:53:33,019 epoch 5 - iter 900/1809 - loss 0.11380639 - time (sec): 21.32 - samples/sec: 8942.74 - lr: 0.000031 - momentum: 0.000000
2023-10-18 23:53:37,217 epoch 5 - iter 1080/1809 - loss 0.11497194 - time (sec): 25.52 - samples/sec: 8905.41 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:53:41,448 epoch 5 - iter 1260/1809 - loss 0.11425627 - time (sec): 29.75 - samples/sec: 8930.11 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:53:45,607 epoch 5 - iter 1440/1809 - loss 0.11301762 - time (sec): 33.91 - samples/sec: 8951.27 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:53:49,741 epoch 5 - iter 1620/1809 - loss 0.11251591 - time (sec): 38.05 - samples/sec: 8960.67 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:53:54,016 epoch 5 - iter 1800/1809 - loss 0.11238350 - time (sec): 42.32 - samples/sec: 8932.57 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:53:54,227 ----------------------------------------------------------------------------------------------------
2023-10-18 23:53:54,227 EPOCH 5 done: loss 0.1123 - lr: 0.000028
2023-10-18 23:53:57,446 DEV : loss 0.1514587551355362 - f1-score (micro avg) 0.4644
2023-10-18 23:53:57,474 saving best model
2023-10-18 23:53:57,514 ----------------------------------------------------------------------------------------------------
|
2023-10-18 23:54:01,829 epoch 6 - iter 180/1809 - loss 0.11537357 - time (sec): 4.32 - samples/sec: 9104.98 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:54:06,005 epoch 6 - iter 360/1809 - loss 0.11309621 - time (sec): 8.49 - samples/sec: 8967.79 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:54:10,175 epoch 6 - iter 540/1809 - loss 0.11364534 - time (sec): 12.66 - samples/sec: 8824.88 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:54:14,470 epoch 6 - iter 720/1809 - loss 0.11100439 - time (sec): 16.96 - samples/sec: 8875.97 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:54:18,822 epoch 6 - iter 900/1809 - loss 0.10958062 - time (sec): 21.31 - samples/sec: 8821.53 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:54:23,117 epoch 6 - iter 1080/1809 - loss 0.10709390 - time (sec): 25.60 - samples/sec: 8851.08 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:54:27,381 epoch 6 - iter 1260/1809 - loss 0.10377290 - time (sec): 29.87 - samples/sec: 8834.76 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:54:31,567 epoch 6 - iter 1440/1809 - loss 0.10361018 - time (sec): 34.05 - samples/sec: 8874.50 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:54:35,589 epoch 6 - iter 1620/1809 - loss 0.10410030 - time (sec): 38.07 - samples/sec: 8948.10 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:54:39,866 epoch 6 - iter 1800/1809 - loss 0.10567918 - time (sec): 42.35 - samples/sec: 8930.20 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:54:40,070 ----------------------------------------------------------------------------------------------------
2023-10-18 23:54:40,070 EPOCH 6 done: loss 0.1056 - lr: 0.000022
2023-10-18 23:54:43,949 DEV : loss 0.1575939804315567 - f1-score (micro avg) 0.4871
2023-10-18 23:54:43,977 saving best model
2023-10-18 23:54:44,013 ----------------------------------------------------------------------------------------------------
|
2023-10-18 23:54:48,217 epoch 7 - iter 180/1809 - loss 0.10172919 - time (sec): 4.20 - samples/sec: 9184.15 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:54:52,501 epoch 7 - iter 360/1809 - loss 0.10051170 - time (sec): 8.49 - samples/sec: 9162.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:54:56,661 epoch 7 - iter 540/1809 - loss 0.10165841 - time (sec): 12.65 - samples/sec: 9083.25 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:55:00,929 epoch 7 - iter 720/1809 - loss 0.10132993 - time (sec): 16.92 - samples/sec: 9003.67 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:55:05,261 epoch 7 - iter 900/1809 - loss 0.10000428 - time (sec): 21.25 - samples/sec: 8962.92 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:55:09,564 epoch 7 - iter 1080/1809 - loss 0.09762103 - time (sec): 25.55 - samples/sec: 8964.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:55:13,714 epoch 7 - iter 1260/1809 - loss 0.09765739 - time (sec): 29.70 - samples/sec: 8975.24 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:55:17,857 epoch 7 - iter 1440/1809 - loss 0.09694399 - time (sec): 33.84 - samples/sec: 8975.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:55:22,063 epoch 7 - iter 1620/1809 - loss 0.09829466 - time (sec): 38.05 - samples/sec: 8972.65 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:55:26,296 epoch 7 - iter 1800/1809 - loss 0.09838081 - time (sec): 42.28 - samples/sec: 8954.08 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:55:26,493 ----------------------------------------------------------------------------------------------------
2023-10-18 23:55:26,493 EPOCH 7 done: loss 0.0984 - lr: 0.000017
2023-10-18 23:55:29,683 DEV : loss 0.16171115636825562 - f1-score (micro avg) 0.5
2023-10-18 23:55:29,711 saving best model
2023-10-18 23:55:29,744 ----------------------------------------------------------------------------------------------------
|
2023-10-18 23:55:33,953 epoch 8 - iter 180/1809 - loss 0.08931629 - time (sec): 4.21 - samples/sec: 8929.32 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:55:38,202 epoch 8 - iter 360/1809 - loss 0.08518400 - time (sec): 8.46 - samples/sec: 8894.96 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:55:42,469 epoch 8 - iter 540/1809 - loss 0.08586313 - time (sec): 12.72 - samples/sec: 8976.80 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:55:46,803 epoch 8 - iter 720/1809 - loss 0.08639375 - time (sec): 17.06 - samples/sec: 8883.44 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:55:51,066 epoch 8 - iter 900/1809 - loss 0.08971534 - time (sec): 21.32 - samples/sec: 8920.93 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:55:55,425 epoch 8 - iter 1080/1809 - loss 0.09223836 - time (sec): 25.68 - samples/sec: 8906.09 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:55:59,663 epoch 8 - iter 1260/1809 - loss 0.09291100 - time (sec): 29.92 - samples/sec: 8930.10 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:56:03,864 epoch 8 - iter 1440/1809 - loss 0.09324636 - time (sec): 34.12 - samples/sec: 8935.33 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:56:08,013 epoch 8 - iter 1620/1809 - loss 0.09268125 - time (sec): 38.27 - samples/sec: 8950.24 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:56:12,070 epoch 8 - iter 1800/1809 - loss 0.09179737 - time (sec): 42.33 - samples/sec: 8936.09 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:56:12,234 ----------------------------------------------------------------------------------------------------
2023-10-18 23:56:12,234 EPOCH 8 done: loss 0.0916 - lr: 0.000011
2023-10-18 23:56:16,143 DEV : loss 0.16610997915267944 - f1-score (micro avg) 0.5027
2023-10-18 23:56:16,171 saving best model
2023-10-18 23:56:16,209 ----------------------------------------------------------------------------------------------------
|
2023-10-18 23:56:20,393 epoch 9 - iter 180/1809 - loss 0.08056805 - time (sec): 4.18 - samples/sec: 9199.47 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:56:24,613 epoch 9 - iter 360/1809 - loss 0.08449942 - time (sec): 8.40 - samples/sec: 9098.24 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:56:28,795 epoch 9 - iter 540/1809 - loss 0.08566893 - time (sec): 12.59 - samples/sec: 9061.30 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:56:32,952 epoch 9 - iter 720/1809 - loss 0.08869739 - time (sec): 16.74 - samples/sec: 9158.64 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:56:37,078 epoch 9 - iter 900/1809 - loss 0.08905400 - time (sec): 20.87 - samples/sec: 9052.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:56:41,233 epoch 9 - iter 1080/1809 - loss 0.08725953 - time (sec): 25.02 - samples/sec: 9055.30 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:56:45,461 epoch 9 - iter 1260/1809 - loss 0.08791601 - time (sec): 29.25 - samples/sec: 9071.52 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:56:49,729 epoch 9 - iter 1440/1809 - loss 0.08850259 - time (sec): 33.52 - samples/sec: 9049.58 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:56:53,950 epoch 9 - iter 1620/1809 - loss 0.08846222 - time (sec): 37.74 - samples/sec: 9038.44 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:56:58,114 epoch 9 - iter 1800/1809 - loss 0.08900226 - time (sec): 41.90 - samples/sec: 9024.65 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:56:58,333 ----------------------------------------------------------------------------------------------------
2023-10-18 23:56:58,334 EPOCH 9 done: loss 0.0893 - lr: 0.000006
2023-10-18 23:57:01,548 DEV : loss 0.17448826134204865 - f1-score (micro avg) 0.4978
2023-10-18 23:57:01,576 ----------------------------------------------------------------------------------------------------
|
2023-10-18 23:57:05,851 epoch 10 - iter 180/1809 - loss 0.09229581 - time (sec): 4.27 - samples/sec: 8613.74 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:57:10,005 epoch 10 - iter 360/1809 - loss 0.08443788 - time (sec): 8.43 - samples/sec: 8795.85 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:57:14,339 epoch 10 - iter 540/1809 - loss 0.08175246 - time (sec): 12.76 - samples/sec: 8736.35 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:57:18,553 epoch 10 - iter 720/1809 - loss 0.08413821 - time (sec): 16.98 - samples/sec: 8780.50 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:57:22,844 epoch 10 - iter 900/1809 - loss 0.08309562 - time (sec): 21.27 - samples/sec: 8807.69 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:57:27,042 epoch 10 - iter 1080/1809 - loss 0.08624691 - time (sec): 25.47 - samples/sec: 8916.19 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:57:31,177 epoch 10 - iter 1260/1809 - loss 0.08479139 - time (sec): 29.60 - samples/sec: 8926.24 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:57:35,461 epoch 10 - iter 1440/1809 - loss 0.08563822 - time (sec): 33.88 - samples/sec: 8907.81 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:57:39,697 epoch 10 - iter 1620/1809 - loss 0.08500765 - time (sec): 38.12 - samples/sec: 8960.52 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:57:43,843 epoch 10 - iter 1800/1809 - loss 0.08584855 - time (sec): 42.27 - samples/sec: 8952.52 - lr: 0.000000 - momentum: 0.000000
2023-10-18 23:57:44,033 ----------------------------------------------------------------------------------------------------
2023-10-18 23:57:44,033 EPOCH 10 done: loss 0.0862 - lr: 0.000000
2023-10-18 23:57:47,913 DEV : loss 0.17676953971385956 - f1-score (micro avg) 0.5019
2023-10-18 23:57:47,972 ----------------------------------------------------------------------------------------------------
|
2023-10-18 23:57:47,973 Loading model from best epoch ...
2023-10-18 23:57:48,060 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-18 23:57:51,479
Results:
- F-score (micro) 0.5103
- F-score (macro) 0.3397
- Accuracy 0.3558

By class:
              precision    recall  f1-score   support

         loc     0.5186    0.6599    0.5808       591
        pers     0.4167    0.4622    0.4382       357
         org     0.0000    0.0000    0.0000        79

   micro avg     0.4834    0.5404    0.5103      1027
   macro avg     0.3118    0.3740    0.3397      1027
weighted avg     0.4433    0.5404    0.4866      1027

2023-10-18 23:57:51,479 ----------------------------------------------------------------------------------------------------
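The summary rows of the test report above follow from the per-class counts. Working backwards from the rounded precision/recall figures and the supports, the log is consistent with 390 true positives out of 752 predictions for loc, 165 out of 396 for pers, and 0 out of 0 for org; these counts are reconstructed, not taken from the log. A sketch recomputing the micro and macro averages from them:

```python
# Recompute the micro/macro rows of the final report from per-class counts.
# (tp, predicted, support) per label are reconstructed from the rounded
# precision/recall values above -- an assumption, not logged data.
counts = {
    "loc":  (390, 752, 591),
    "pers": (165, 396, 357),
    "org":  (0,   0,   79),
}

def f1(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

tp = sum(t for t, _, _ in counts.values())
pred = sum(g for _, g, _ in counts.values())
gold = sum(s for _, _, s in counts.values())

micro_p, micro_r = tp / pred, tp / gold      # pool counts over classes
micro_f1 = f1(micro_p, micro_r)

macro_f1 = sum(                              # unweighted mean of class F1s
    f1(t / g if g else 0.0, t / s) for t, g, s in counts.values()
) / len(counts)

print(round(micro_f1, 4), round(macro_f1, 4))  # 0.5103 0.3397
```

The gap between the two averages is driven by org: with zero correct predictions on 79 gold spans it contributes an F1 of 0 to the macro mean, while the micro average is dominated by the much larger loc class.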
|