2023-10-18 22:59:23,988 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,988 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:59:23,988 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Train: 5777 sentences
2023-10-18 22:59:23,989 (train_with_dev=False, train_with_test=False)
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Training Params:
2023-10-18 22:59:23,989 - learning_rate: "5e-05"
2023-10-18 22:59:23,989 - mini_batch_size: "8"
2023-10-18 22:59:23,989 - max_epochs: "10"
2023-10-18 22:59:23,989 - shuffle: "True"
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Plugins:
2023-10-18 22:59:23,989 - TensorboardLogger
2023-10-18 22:59:23,989 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:59:23,989 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Computation:
2023-10-18 22:59:23,989 - compute on device: cuda:0
2023-10-18 22:59:23,989 - embedding storage: none
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Logging anything other than scalars to TensorBoard is currently not supported.
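The lr values in the per-iteration lines below are consistent with the LinearScheduler listed under Plugins (warmup_fraction '0.1'): with 723 iterations per epoch over 10 epochs (7,230 steps total), the first 723 steps ramp linearly from 0 to the peak of 5e-05, and the remaining steps decay linearly back to 0. A minimal sketch of that schedule (the step counts here are read off this log, not taken from the training code):

```python
def linear_schedule_lr(step, total_steps=7230, warmup_frac=0.1, peak_lr=5e-05):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_frac)  # 723 steps = exactly one epoch here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(f"{linear_schedule_lr(72):.6f}")    # step 72  (epoch 1, iter 72):  0.000005
print(f"{linear_schedule_lr(723):.6f}")   # step 723 (end of epoch 1):    0.000050
print(f"{linear_schedule_lr(1446):.6f}")  # step 1446 (end of epoch 2):   0.000044
```

These reproduce the lr column at the corresponding points in the log; the momentum column stays 0.000000 throughout because the optimizer used here has no momentum term.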
2023-10-18 22:59:25,412 epoch 1 - iter 72/723 - loss 3.38114564 - time (sec): 1.42 - samples/sec: 11401.32 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:59:27,014 epoch 1 - iter 144/723 - loss 3.08418588 - time (sec): 3.02 - samples/sec: 11101.50 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:59:28,798 epoch 1 - iter 216/723 - loss 2.63293986 - time (sec): 4.81 - samples/sec: 10690.89 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:59:30,586 epoch 1 - iter 288/723 - loss 2.14162158 - time (sec): 6.60 - samples/sec: 10566.42 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:59:32,321 epoch 1 - iter 360/723 - loss 1.79554300 - time (sec): 8.33 - samples/sec: 10390.76 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:59:34,037 epoch 1 - iter 432/723 - loss 1.55698836 - time (sec): 10.05 - samples/sec: 10298.46 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:59:35,788 epoch 1 - iter 504/723 - loss 1.38373364 - time (sec): 11.80 - samples/sec: 10204.98 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:59:37,646 epoch 1 - iter 576/723 - loss 1.24627337 - time (sec): 13.66 - samples/sec: 10189.95 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:59:39,478 epoch 1 - iter 648/723 - loss 1.13025666 - time (sec): 15.49 - samples/sec: 10203.66 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:59:41,318 epoch 1 - iter 720/723 - loss 1.04619335 - time (sec): 17.33 - samples/sec: 10138.40 - lr: 0.000050 - momentum: 0.000000
2023-10-18 22:59:41,378 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:41,379 EPOCH 1 done: loss 1.0440 - lr: 0.000050
2023-10-18 22:59:42,611 DEV : loss 0.31867048144340515 - f1-score (micro avg) 0.0021
2023-10-18 22:59:42,625 saving best model
2023-10-18 22:59:42,654 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:44,461 epoch 2 - iter 72/723 - loss 0.22991776 - time (sec): 1.81 - samples/sec: 9834.17 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:59:46,199 epoch 2 - iter 144/723 - loss 0.22949664 - time (sec): 3.54 - samples/sec: 9757.79 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:59:48,000 epoch 2 - iter 216/723 - loss 0.22651336 - time (sec): 5.35 - samples/sec: 9587.61 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:59:49,941 epoch 2 - iter 288/723 - loss 0.21724318 - time (sec): 7.29 - samples/sec: 9671.08 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:59:51,701 epoch 2 - iter 360/723 - loss 0.21247069 - time (sec): 9.05 - samples/sec: 9646.56 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:59:53,530 epoch 2 - iter 432/723 - loss 0.20857178 - time (sec): 10.88 - samples/sec: 9680.15 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:59:55,292 epoch 2 - iter 504/723 - loss 0.20684646 - time (sec): 12.64 - samples/sec: 9747.09 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:59:57,119 epoch 2 - iter 576/723 - loss 0.21024836 - time (sec): 14.46 - samples/sec: 9679.43 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:59:58,924 epoch 2 - iter 648/723 - loss 0.20759568 - time (sec): 16.27 - samples/sec: 9711.98 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:00:00,696 epoch 2 - iter 720/723 - loss 0.20794809 - time (sec): 18.04 - samples/sec: 9724.81 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:00:00,764 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:00,765 EPOCH 2 done: loss 0.2078 - lr: 0.000044
2023-10-18 23:00:02,850 DEV : loss 0.22826793789863586 - f1-score (micro avg) 0.254
2023-10-18 23:00:02,866 saving best model
2023-10-18 23:00:02,905 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:04,764 epoch 3 - iter 72/723 - loss 0.18294966 - time (sec): 1.86 - samples/sec: 10092.27 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:00:06,600 epoch 3 - iter 144/723 - loss 0.18974998 - time (sec): 3.69 - samples/sec: 10188.71 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:00:08,426 epoch 3 - iter 216/723 - loss 0.18483913 - time (sec): 5.52 - samples/sec: 9994.19 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:00:10,230 epoch 3 - iter 288/723 - loss 0.18075862 - time (sec): 7.32 - samples/sec: 9979.24 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:00:12,058 epoch 3 - iter 360/723 - loss 0.18072422 - time (sec): 9.15 - samples/sec: 9873.89 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:00:13,801 epoch 3 - iter 432/723 - loss 0.18042225 - time (sec): 10.89 - samples/sec: 9776.93 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:00:15,624 epoch 3 - iter 504/723 - loss 0.18161063 - time (sec): 12.72 - samples/sec: 9786.06 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:00:17,382 epoch 3 - iter 576/723 - loss 0.18110465 - time (sec): 14.48 - samples/sec: 9758.02 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:00:19,118 epoch 3 - iter 648/723 - loss 0.17753532 - time (sec): 16.21 - samples/sec: 9816.52 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:00:20,684 epoch 3 - iter 720/723 - loss 0.17701445 - time (sec): 17.78 - samples/sec: 9870.27 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:00:20,748 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:20,748 EPOCH 3 done: loss 0.1767 - lr: 0.000039
2023-10-18 23:00:22,506 DEV : loss 0.20792421698570251 - f1-score (micro avg) 0.4092
2023-10-18 23:00:22,520 saving best model
2023-10-18 23:00:22,555 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:24,297 epoch 4 - iter 72/723 - loss 0.15954370 - time (sec): 1.74 - samples/sec: 9774.94 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:00:26,086 epoch 4 - iter 144/723 - loss 0.15517464 - time (sec): 3.53 - samples/sec: 9353.01 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:00:27,920 epoch 4 - iter 216/723 - loss 0.15993206 - time (sec): 5.36 - samples/sec: 9370.16 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:00:29,776 epoch 4 - iter 288/723 - loss 0.15940359 - time (sec): 7.22 - samples/sec: 9411.20 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:00:31,614 epoch 4 - iter 360/723 - loss 0.16001585 - time (sec): 9.06 - samples/sec: 9419.71 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:00:33,458 epoch 4 - iter 432/723 - loss 0.16046789 - time (sec): 10.90 - samples/sec: 9414.93 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:00:35,341 epoch 4 - iter 504/723 - loss 0.15834988 - time (sec): 12.79 - samples/sec: 9421.93 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:00:37,679 epoch 4 - iter 576/723 - loss 0.15929391 - time (sec): 15.12 - samples/sec: 9286.74 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:00:39,475 epoch 4 - iter 648/723 - loss 0.15804445 - time (sec): 16.92 - samples/sec: 9305.11 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:00:41,347 epoch 4 - iter 720/723 - loss 0.15830661 - time (sec): 18.79 - samples/sec: 9349.83 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:00:41,406 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:41,406 EPOCH 4 done: loss 0.1583 - lr: 0.000033
2023-10-18 23:00:43,178 DEV : loss 0.19937574863433838 - f1-score (micro avg) 0.417
2023-10-18 23:00:43,192 saving best model
2023-10-18 23:00:43,227 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:45,071 epoch 5 - iter 72/723 - loss 0.15411846 - time (sec): 1.84 - samples/sec: 9679.89 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:00:46,902 epoch 5 - iter 144/723 - loss 0.15553448 - time (sec): 3.67 - samples/sec: 9814.22 - lr: 0.000032 - momentum: 0.000000
2023-10-18 23:00:48,727 epoch 5 - iter 216/723 - loss 0.15503779 - time (sec): 5.50 - samples/sec: 9767.31 - lr: 0.000032 - momentum: 0.000000
2023-10-18 23:00:50,522 epoch 5 - iter 288/723 - loss 0.15444028 - time (sec): 7.29 - samples/sec: 9661.38 - lr: 0.000031 - momentum: 0.000000
2023-10-18 23:00:52,245 epoch 5 - iter 360/723 - loss 0.15017229 - time (sec): 9.02 - samples/sec: 9613.87 - lr: 0.000031 - momentum: 0.000000
2023-10-18 23:00:54,061 epoch 5 - iter 432/723 - loss 0.15134564 - time (sec): 10.83 - samples/sec: 9604.42 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:00:55,855 epoch 5 - iter 504/723 - loss 0.15022735 - time (sec): 12.63 - samples/sec: 9622.77 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:00:57,809 epoch 5 - iter 576/723 - loss 0.15001693 - time (sec): 14.58 - samples/sec: 9606.86 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:00:59,625 epoch 5 - iter 648/723 - loss 0.14939047 - time (sec): 16.40 - samples/sec: 9604.35 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:01:01,395 epoch 5 - iter 720/723 - loss 0.14893618 - time (sec): 18.17 - samples/sec: 9682.65 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:01:01,446 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:01,447 EPOCH 5 done: loss 0.1491 - lr: 0.000028
2023-10-18 23:01:03,204 DEV : loss 0.19236673414707184 - f1-score (micro avg) 0.4651
2023-10-18 23:01:03,218 saving best model
2023-10-18 23:01:03,253 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:04,976 epoch 6 - iter 72/723 - loss 0.13995242 - time (sec): 1.72 - samples/sec: 9598.69 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:01:06,717 epoch 6 - iter 144/723 - loss 0.13162756 - time (sec): 3.46 - samples/sec: 9777.22 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:01:08,513 epoch 6 - iter 216/723 - loss 0.13349888 - time (sec): 5.26 - samples/sec: 9945.16 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:01:10,614 epoch 6 - iter 288/723 - loss 0.13416199 - time (sec): 7.36 - samples/sec: 9598.95 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:01:12,389 epoch 6 - iter 360/723 - loss 0.13680730 - time (sec): 9.14 - samples/sec: 9575.16 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:01:14,126 epoch 6 - iter 432/723 - loss 0.13369462 - time (sec): 10.87 - samples/sec: 9651.47 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:01:15,887 epoch 6 - iter 504/723 - loss 0.13682229 - time (sec): 12.63 - samples/sec: 9616.52 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:01:17,637 epoch 6 - iter 576/723 - loss 0.13447545 - time (sec): 14.38 - samples/sec: 9676.31 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:01:19,433 epoch 6 - iter 648/723 - loss 0.13738547 - time (sec): 16.18 - samples/sec: 9691.19 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:01:21,327 epoch 6 - iter 720/723 - loss 0.13843992 - time (sec): 18.07 - samples/sec: 9722.45 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:01:21,392 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:21,392 EPOCH 6 done: loss 0.1384 - lr: 0.000022
2023-10-18 23:01:23,168 DEV : loss 0.18509748578071594 - f1-score (micro avg) 0.518
2023-10-18 23:01:23,182 saving best model
2023-10-18 23:01:23,221 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:25,034 epoch 7 - iter 72/723 - loss 0.12980522 - time (sec): 1.81 - samples/sec: 9087.83 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:01:26,869 epoch 7 - iter 144/723 - loss 0.13216005 - time (sec): 3.65 - samples/sec: 9411.47 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:01:28,651 epoch 7 - iter 216/723 - loss 0.13522464 - time (sec): 5.43 - samples/sec: 9421.00 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:01:30,521 epoch 7 - iter 288/723 - loss 0.13547392 - time (sec): 7.30 - samples/sec: 9526.68 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:01:32,314 epoch 7 - iter 360/723 - loss 0.13328086 - time (sec): 9.09 - samples/sec: 9519.95 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:01:34,075 epoch 7 - iter 432/723 - loss 0.13286486 - time (sec): 10.85 - samples/sec: 9590.25 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:01:35,905 epoch 7 - iter 504/723 - loss 0.13294580 - time (sec): 12.68 - samples/sec: 9647.07 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:01:37,763 epoch 7 - iter 576/723 - loss 0.13280059 - time (sec): 14.54 - samples/sec: 9628.22 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:01:39,546 epoch 7 - iter 648/723 - loss 0.13385121 - time (sec): 16.32 - samples/sec: 9678.50 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:01:41,376 epoch 7 - iter 720/723 - loss 0.13277021 - time (sec): 18.15 - samples/sec: 9673.98 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:01:41,439 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:41,439 EPOCH 7 done: loss 0.1327 - lr: 0.000017
2023-10-18 23:01:43,565 DEV : loss 0.1816086322069168 - f1-score (micro avg) 0.5211
2023-10-18 23:01:43,580 saving best model
2023-10-18 23:01:43,616 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:45,389 epoch 8 - iter 72/723 - loss 0.14663446 - time (sec): 1.77 - samples/sec: 9305.44 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:01:47,220 epoch 8 - iter 144/723 - loss 0.13839322 - time (sec): 3.60 - samples/sec: 9639.06 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:01:49,017 epoch 8 - iter 216/723 - loss 0.13301673 - time (sec): 5.40 - samples/sec: 9821.06 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:01:50,787 epoch 8 - iter 288/723 - loss 0.13305023 - time (sec): 7.17 - samples/sec: 9734.21 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:01:52,565 epoch 8 - iter 360/723 - loss 0.12937088 - time (sec): 8.95 - samples/sec: 9819.91 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:01:54,376 epoch 8 - iter 432/723 - loss 0.12649252 - time (sec): 10.76 - samples/sec: 9823.98 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:01:56,134 epoch 8 - iter 504/723 - loss 0.12645102 - time (sec): 12.52 - samples/sec: 9875.58 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:01:57,874 epoch 8 - iter 576/723 - loss 0.12467083 - time (sec): 14.26 - samples/sec: 9833.41 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:01:59,659 epoch 8 - iter 648/723 - loss 0.12570604 - time (sec): 16.04 - samples/sec: 9830.77 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:02:01,618 epoch 8 - iter 720/723 - loss 0.12654477 - time (sec): 18.00 - samples/sec: 9764.67 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:02:01,675 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:01,675 EPOCH 8 done: loss 0.1265 - lr: 0.000011
2023-10-18 23:02:03,469 DEV : loss 0.18727229535579681 - f1-score (micro avg) 0.5216
2023-10-18 23:02:03,485 saving best model
2023-10-18 23:02:03,521 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:05,356 epoch 9 - iter 72/723 - loss 0.13100301 - time (sec): 1.83 - samples/sec: 9805.15 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:02:07,228 epoch 9 - iter 144/723 - loss 0.13098805 - time (sec): 3.71 - samples/sec: 9707.02 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:02:09,057 epoch 9 - iter 216/723 - loss 0.12293263 - time (sec): 5.54 - samples/sec: 9524.65 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:02:10,869 epoch 9 - iter 288/723 - loss 0.12134590 - time (sec): 7.35 - samples/sec: 9569.42 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:02:12,730 epoch 9 - iter 360/723 - loss 0.12289732 - time (sec): 9.21 - samples/sec: 9589.05 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:02:14,593 epoch 9 - iter 432/723 - loss 0.12316734 - time (sec): 11.07 - samples/sec: 9634.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:02:16,409 epoch 9 - iter 504/723 - loss 0.12430855 - time (sec): 12.89 - samples/sec: 9657.90 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:02:18,214 epoch 9 - iter 576/723 - loss 0.12555226 - time (sec): 14.69 - samples/sec: 9614.98 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:02:19,968 epoch 9 - iter 648/723 - loss 0.12562317 - time (sec): 16.45 - samples/sec: 9657.55 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:02:21,802 epoch 9 - iter 720/723 - loss 0.12504301 - time (sec): 18.28 - samples/sec: 9611.69 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:02:21,864 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:21,864 EPOCH 9 done: loss 0.1251 - lr: 0.000006
2023-10-18 23:02:24,005 DEV : loss 0.1787503957748413 - f1-score (micro avg) 0.5396
2023-10-18 23:02:24,019 saving best model
2023-10-18 23:02:24,055 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:25,778 epoch 10 - iter 72/723 - loss 0.12760560 - time (sec): 1.72 - samples/sec: 10007.55 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:02:27,518 epoch 10 - iter 144/723 - loss 0.13169396 - time (sec): 3.46 - samples/sec: 9862.38 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:02:29,301 epoch 10 - iter 216/723 - loss 0.13152802 - time (sec): 5.24 - samples/sec: 9903.98 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:02:31,175 epoch 10 - iter 288/723 - loss 0.12743923 - time (sec): 7.12 - samples/sec: 9988.53 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:02:33,058 epoch 10 - iter 360/723 - loss 0.12276207 - time (sec): 9.00 - samples/sec: 9888.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:02:34,873 epoch 10 - iter 432/723 - loss 0.12383487 - time (sec): 10.82 - samples/sec: 9764.32 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:02:36,659 epoch 10 - iter 504/723 - loss 0.12055966 - time (sec): 12.60 - samples/sec: 9782.10 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:02:38,526 epoch 10 - iter 576/723 - loss 0.11942674 - time (sec): 14.47 - samples/sec: 9798.52 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:02:40,315 epoch 10 - iter 648/723 - loss 0.12086038 - time (sec): 16.26 - samples/sec: 9737.31 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:02:42,102 epoch 10 - iter 720/723 - loss 0.12182104 - time (sec): 18.05 - samples/sec: 9731.88 - lr: 0.000000 - momentum: 0.000000
2023-10-18 23:02:42,163 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:42,163 EPOCH 10 done: loss 0.1219 - lr: 0.000000
2023-10-18 23:02:43,955 DEV : loss 0.18162186443805695 - f1-score (micro avg) 0.5438
2023-10-18 23:02:43,970 saving best model
2023-10-18 23:02:44,037 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:44,037 Loading model from best epoch ...
2023-10-18 23:02:44,117 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
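The 13-entry tag dictionary follows directly from the BIOES tagging scheme over the three entity types (LOC, PER, ORG) plus the outside tag: 3 types × 4 span positions + O = 13, which also explains `out_features=13` in the model's final linear layer. A quick sketch of how that dictionary is laid out:

```python
# BIOES scheme: S- = single-token span, B-/I-/E- = begin/inside/end of a multi-token span.
entity_types = ["LOC", "PER", "ORG"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]

print(len(tags))  # 13
print(", ".join(tags))  # same order as the dictionary logged above
```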
2023-10-18 23:02:45,458
Results:
- F-score (micro) 0.547
- F-score (macro) 0.3859
- Accuracy 0.3854

By class:
              precision    recall  f1-score   support

         LOC     0.6234    0.6288    0.6261       458
         PER     0.5989    0.4336    0.5030       482
         ORG     1.0000    0.0145    0.0286        69

   micro avg     0.6133    0.4936    0.5470      1009
   macro avg     0.7407    0.3590    0.3859      1009
weighted avg     0.6374    0.4936    0.5264      1009

2023-10-18 23:02:45,458 ----------------------------------------------------------------------------------------------------
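The micro and macro averages in the table above can be re-derived from the per-class rows: true positives per class are recall × support, predicted span counts are TP / precision, and the micro scores pool those counts across classes. A small sanity check (the integer counts are reconstructed from the logged precision/recall/support, not read from the evaluation code):

```python
# (precision, recall, support) per class, copied from the "By class" table
per_class = {
    "LOC": (0.6234, 0.6288, 458),
    "PER": (0.5989, 0.4336, 482),
    "ORG": (1.0000, 0.0145, 69),
}

tp = pred = gold = 0
for p, r, n in per_class.values():
    c_tp = round(r * n)      # true positives: recall * support
    tp += c_tp
    pred += round(c_tp / p)  # predicted spans: TP / precision
    gold += n                # gold spans: support

micro_p = tp / pred
micro_r = tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_p, 4), round(micro_r, 4), round(micro_f1, 4))
# 0.6133 0.4936 0.547 -- matching the "micro avg" row and the final F-score (micro)
```

The pooled counts (498 TP out of 812 predicted and 1009 gold spans) also make the ORG row's story visible: the model predicted only a single ORG span (which was correct, hence precision 1.0000 but recall 0.0145).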