2023-10-18 22:59:23,988 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,988 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:59:23,988 ----------------------------------------------------------------------------------------------------
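The model above is a 2-layer "tiny" BERT with hidden size 128 and intermediate size 512. As a sanity check, its encoder parameter count can be tallied directly from the shapes in the dump (a sketch; all numbers are read off the module printout above):

```python
# Tally BertModel parameters from the shapes printed in the dump above.
hidden, vocab, max_pos, interm, layers = 128, 32001, 512, 512, 2

embeddings = (vocab * hidden        # word_embeddings: Embedding(32001, 128)
              + max_pos * hidden    # position_embeddings: Embedding(512, 128)
              + 2 * hidden          # token_type_embeddings: Embedding(2, 128)
              + 2 * hidden)         # LayerNorm weight + bias

per_layer = (4 * (hidden * hidden + hidden)  # query, key, value, attn output (+ biases)
             + 2 * hidden                    # attention-output LayerNorm
             + hidden * interm + interm      # intermediate dense
             + interm * hidden + hidden      # output dense
             + 2 * hidden)                   # output LayerNorm

pooler = hidden * hidden + hidden
total = embeddings + layers * per_layer + pooler
print(total)  # ~4.6M parameters, consistent with a "bert-tiny" checkpoint
```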
2023-10-18 22:59:23,989 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Train: 5777 sentences
2023-10-18 22:59:23,989 (train_with_dev=False, train_with_test=False)
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Training Params:
2023-10-18 22:59:23,989 - learning_rate: "5e-05"
2023-10-18 22:59:23,989 - mini_batch_size: "8"
2023-10-18 22:59:23,989 - max_epochs: "10"
2023-10-18 22:59:23,989 - shuffle: "True"
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Plugins:
2023-10-18 22:59:23,989 - TensorboardLogger
2023-10-18 22:59:23,989 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
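The LinearScheduler with warmup_fraction 0.1 explains the lr column in the per-iteration lines below: the rate climbs to the peak of 5e-05 over the first 10% of all steps (exactly epoch 1 here, since 723 of 7230 batches) and then decays linearly to zero. A minimal sketch of such a schedule (illustrative only, not Flair's actual implementation):

```python
def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    Sketch of a warmup-then-decay schedule; function name and signature
    are illustrative, not Flair's API.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps          # ramp up
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay

# 723 batches/epoch * 10 epochs = 7230 total steps; at iter 72 of epoch 1
# this gives ~0.000005, matching the first logged lr value below.
```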
2023-10-18 22:59:23,989 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:59:23,989 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Computation:
2023-10-18 22:59:23,989 - compute on device: cuda:0
2023-10-18 22:59:23,989 - embedding storage: none
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:59:25,412 epoch 1 - iter 72/723 - loss 3.38114564 - time (sec): 1.42 - samples/sec: 11401.32 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:59:27,014 epoch 1 - iter 144/723 - loss 3.08418588 - time (sec): 3.02 - samples/sec: 11101.50 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:59:28,798 epoch 1 - iter 216/723 - loss 2.63293986 - time (sec): 4.81 - samples/sec: 10690.89 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:59:30,586 epoch 1 - iter 288/723 - loss 2.14162158 - time (sec): 6.60 - samples/sec: 10566.42 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:59:32,321 epoch 1 - iter 360/723 - loss 1.79554300 - time (sec): 8.33 - samples/sec: 10390.76 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:59:34,037 epoch 1 - iter 432/723 - loss 1.55698836 - time (sec): 10.05 - samples/sec: 10298.46 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:59:35,788 epoch 1 - iter 504/723 - loss 1.38373364 - time (sec): 11.80 - samples/sec: 10204.98 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:59:37,646 epoch 1 - iter 576/723 - loss 1.24627337 - time (sec): 13.66 - samples/sec: 10189.95 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:59:39,478 epoch 1 - iter 648/723 - loss 1.13025666 - time (sec): 15.49 - samples/sec: 10203.66 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:59:41,318 epoch 1 - iter 720/723 - loss 1.04619335 - time (sec): 17.33 - samples/sec: 10138.40 - lr: 0.000050 - momentum: 0.000000
2023-10-18 22:59:41,378 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:41,379 EPOCH 1 done: loss 1.0440 - lr: 0.000050
2023-10-18 22:59:42,611 DEV : loss 0.31867048144340515 - f1-score (micro avg) 0.0021
2023-10-18 22:59:42,625 saving best model
2023-10-18 22:59:42,654 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:44,461 epoch 2 - iter 72/723 - loss 0.22991776 - time (sec): 1.81 - samples/sec: 9834.17 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:59:46,199 epoch 2 - iter 144/723 - loss 0.22949664 - time (sec): 3.54 - samples/sec: 9757.79 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:59:48,000 epoch 2 - iter 216/723 - loss 0.22651336 - time (sec): 5.35 - samples/sec: 9587.61 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:59:49,941 epoch 2 - iter 288/723 - loss 0.21724318 - time (sec): 7.29 - samples/sec: 9671.08 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:59:51,701 epoch 2 - iter 360/723 - loss 0.21247069 - time (sec): 9.05 - samples/sec: 9646.56 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:59:53,530 epoch 2 - iter 432/723 - loss 0.20857178 - time (sec): 10.88 - samples/sec: 9680.15 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:59:55,292 epoch 2 - iter 504/723 - loss 0.20684646 - time (sec): 12.64 - samples/sec: 9747.09 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:59:57,119 epoch 2 - iter 576/723 - loss 0.21024836 - time (sec): 14.46 - samples/sec: 9679.43 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:59:58,924 epoch 2 - iter 648/723 - loss 0.20759568 - time (sec): 16.27 - samples/sec: 9711.98 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:00:00,696 epoch 2 - iter 720/723 - loss 0.20794809 - time (sec): 18.04 - samples/sec: 9724.81 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:00:00,764 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:00,765 EPOCH 2 done: loss 0.2078 - lr: 0.000044
2023-10-18 23:00:02,850 DEV : loss 0.22826793789863586 - f1-score (micro avg) 0.254
2023-10-18 23:00:02,866 saving best model
2023-10-18 23:00:02,905 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:04,764 epoch 3 - iter 72/723 - loss 0.18294966 - time (sec): 1.86 - samples/sec: 10092.27 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:00:06,600 epoch 3 - iter 144/723 - loss 0.18974998 - time (sec): 3.69 - samples/sec: 10188.71 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:00:08,426 epoch 3 - iter 216/723 - loss 0.18483913 - time (sec): 5.52 - samples/sec: 9994.19 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:00:10,230 epoch 3 - iter 288/723 - loss 0.18075862 - time (sec): 7.32 - samples/sec: 9979.24 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:00:12,058 epoch 3 - iter 360/723 - loss 0.18072422 - time (sec): 9.15 - samples/sec: 9873.89 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:00:13,801 epoch 3 - iter 432/723 - loss 0.18042225 - time (sec): 10.89 - samples/sec: 9776.93 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:00:15,624 epoch 3 - iter 504/723 - loss 0.18161063 - time (sec): 12.72 - samples/sec: 9786.06 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:00:17,382 epoch 3 - iter 576/723 - loss 0.18110465 - time (sec): 14.48 - samples/sec: 9758.02 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:00:19,118 epoch 3 - iter 648/723 - loss 0.17753532 - time (sec): 16.21 - samples/sec: 9816.52 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:00:20,684 epoch 3 - iter 720/723 - loss 0.17701445 - time (sec): 17.78 - samples/sec: 9870.27 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:00:20,748 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:20,748 EPOCH 3 done: loss 0.1767 - lr: 0.000039
2023-10-18 23:00:22,506 DEV : loss 0.20792421698570251 - f1-score (micro avg) 0.4092
2023-10-18 23:00:22,520 saving best model
2023-10-18 23:00:22,555 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:24,297 epoch 4 - iter 72/723 - loss 0.15954370 - time (sec): 1.74 - samples/sec: 9774.94 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:00:26,086 epoch 4 - iter 144/723 - loss 0.15517464 - time (sec): 3.53 - samples/sec: 9353.01 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:00:27,920 epoch 4 - iter 216/723 - loss 0.15993206 - time (sec): 5.36 - samples/sec: 9370.16 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:00:29,776 epoch 4 - iter 288/723 - loss 0.15940359 - time (sec): 7.22 - samples/sec: 9411.20 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:00:31,614 epoch 4 - iter 360/723 - loss 0.16001585 - time (sec): 9.06 - samples/sec: 9419.71 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:00:33,458 epoch 4 - iter 432/723 - loss 0.16046789 - time (sec): 10.90 - samples/sec: 9414.93 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:00:35,341 epoch 4 - iter 504/723 - loss 0.15834988 - time (sec): 12.79 - samples/sec: 9421.93 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:00:37,679 epoch 4 - iter 576/723 - loss 0.15929391 - time (sec): 15.12 - samples/sec: 9286.74 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:00:39,475 epoch 4 - iter 648/723 - loss 0.15804445 - time (sec): 16.92 - samples/sec: 9305.11 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:00:41,347 epoch 4 - iter 720/723 - loss 0.15830661 - time (sec): 18.79 - samples/sec: 9349.83 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:00:41,406 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:41,406 EPOCH 4 done: loss 0.1583 - lr: 0.000033
2023-10-18 23:00:43,178 DEV : loss 0.19937574863433838 - f1-score (micro avg) 0.417
2023-10-18 23:00:43,192 saving best model
2023-10-18 23:00:43,227 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:45,071 epoch 5 - iter 72/723 - loss 0.15411846 - time (sec): 1.84 - samples/sec: 9679.89 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:00:46,902 epoch 5 - iter 144/723 - loss 0.15553448 - time (sec): 3.67 - samples/sec: 9814.22 - lr: 0.000032 - momentum: 0.000000
2023-10-18 23:00:48,727 epoch 5 - iter 216/723 - loss 0.15503779 - time (sec): 5.50 - samples/sec: 9767.31 - lr: 0.000032 - momentum: 0.000000
2023-10-18 23:00:50,522 epoch 5 - iter 288/723 - loss 0.15444028 - time (sec): 7.29 - samples/sec: 9661.38 - lr: 0.000031 - momentum: 0.000000
2023-10-18 23:00:52,245 epoch 5 - iter 360/723 - loss 0.15017229 - time (sec): 9.02 - samples/sec: 9613.87 - lr: 0.000031 - momentum: 0.000000
2023-10-18 23:00:54,061 epoch 5 - iter 432/723 - loss 0.15134564 - time (sec): 10.83 - samples/sec: 9604.42 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:00:55,855 epoch 5 - iter 504/723 - loss 0.15022735 - time (sec): 12.63 - samples/sec: 9622.77 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:00:57,809 epoch 5 - iter 576/723 - loss 0.15001693 - time (sec): 14.58 - samples/sec: 9606.86 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:00:59,625 epoch 5 - iter 648/723 - loss 0.14939047 - time (sec): 16.40 - samples/sec: 9604.35 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:01:01,395 epoch 5 - iter 720/723 - loss 0.14893618 - time (sec): 18.17 - samples/sec: 9682.65 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:01:01,446 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:01,447 EPOCH 5 done: loss 0.1491 - lr: 0.000028
2023-10-18 23:01:03,204 DEV : loss 0.19236673414707184 - f1-score (micro avg) 0.4651
2023-10-18 23:01:03,218 saving best model
2023-10-18 23:01:03,253 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:04,976 epoch 6 - iter 72/723 - loss 0.13995242 - time (sec): 1.72 - samples/sec: 9598.69 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:01:06,717 epoch 6 - iter 144/723 - loss 0.13162756 - time (sec): 3.46 - samples/sec: 9777.22 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:01:08,513 epoch 6 - iter 216/723 - loss 0.13349888 - time (sec): 5.26 - samples/sec: 9945.16 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:01:10,614 epoch 6 - iter 288/723 - loss 0.13416199 - time (sec): 7.36 - samples/sec: 9598.95 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:01:12,389 epoch 6 - iter 360/723 - loss 0.13680730 - time (sec): 9.14 - samples/sec: 9575.16 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:01:14,126 epoch 6 - iter 432/723 - loss 0.13369462 - time (sec): 10.87 - samples/sec: 9651.47 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:01:15,887 epoch 6 - iter 504/723 - loss 0.13682229 - time (sec): 12.63 - samples/sec: 9616.52 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:01:17,637 epoch 6 - iter 576/723 - loss 0.13447545 - time (sec): 14.38 - samples/sec: 9676.31 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:01:19,433 epoch 6 - iter 648/723 - loss 0.13738547 - time (sec): 16.18 - samples/sec: 9691.19 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:01:21,327 epoch 6 - iter 720/723 - loss 0.13843992 - time (sec): 18.07 - samples/sec: 9722.45 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:01:21,392 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:21,392 EPOCH 6 done: loss 0.1384 - lr: 0.000022
2023-10-18 23:01:23,168 DEV : loss 0.18509748578071594 - f1-score (micro avg) 0.518
2023-10-18 23:01:23,182 saving best model
2023-10-18 23:01:23,221 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:25,034 epoch 7 - iter 72/723 - loss 0.12980522 - time (sec): 1.81 - samples/sec: 9087.83 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:01:26,869 epoch 7 - iter 144/723 - loss 0.13216005 - time (sec): 3.65 - samples/sec: 9411.47 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:01:28,651 epoch 7 - iter 216/723 - loss 0.13522464 - time (sec): 5.43 - samples/sec: 9421.00 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:01:30,521 epoch 7 - iter 288/723 - loss 0.13547392 - time (sec): 7.30 - samples/sec: 9526.68 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:01:32,314 epoch 7 - iter 360/723 - loss 0.13328086 - time (sec): 9.09 - samples/sec: 9519.95 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:01:34,075 epoch 7 - iter 432/723 - loss 0.13286486 - time (sec): 10.85 - samples/sec: 9590.25 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:01:35,905 epoch 7 - iter 504/723 - loss 0.13294580 - time (sec): 12.68 - samples/sec: 9647.07 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:01:37,763 epoch 7 - iter 576/723 - loss 0.13280059 - time (sec): 14.54 - samples/sec: 9628.22 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:01:39,546 epoch 7 - iter 648/723 - loss 0.13385121 - time (sec): 16.32 - samples/sec: 9678.50 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:01:41,376 epoch 7 - iter 720/723 - loss 0.13277021 - time (sec): 18.15 - samples/sec: 9673.98 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:01:41,439 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:41,439 EPOCH 7 done: loss 0.1327 - lr: 0.000017
2023-10-18 23:01:43,565 DEV : loss 0.1816086322069168 - f1-score (micro avg) 0.5211
2023-10-18 23:01:43,580 saving best model
2023-10-18 23:01:43,616 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:45,389 epoch 8 - iter 72/723 - loss 0.14663446 - time (sec): 1.77 - samples/sec: 9305.44 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:01:47,220 epoch 8 - iter 144/723 - loss 0.13839322 - time (sec): 3.60 - samples/sec: 9639.06 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:01:49,017 epoch 8 - iter 216/723 - loss 0.13301673 - time (sec): 5.40 - samples/sec: 9821.06 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:01:50,787 epoch 8 - iter 288/723 - loss 0.13305023 - time (sec): 7.17 - samples/sec: 9734.21 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:01:52,565 epoch 8 - iter 360/723 - loss 0.12937088 - time (sec): 8.95 - samples/sec: 9819.91 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:01:54,376 epoch 8 - iter 432/723 - loss 0.12649252 - time (sec): 10.76 - samples/sec: 9823.98 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:01:56,134 epoch 8 - iter 504/723 - loss 0.12645102 - time (sec): 12.52 - samples/sec: 9875.58 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:01:57,874 epoch 8 - iter 576/723 - loss 0.12467083 - time (sec): 14.26 - samples/sec: 9833.41 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:01:59,659 epoch 8 - iter 648/723 - loss 0.12570604 - time (sec): 16.04 - samples/sec: 9830.77 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:02:01,618 epoch 8 - iter 720/723 - loss 0.12654477 - time (sec): 18.00 - samples/sec: 9764.67 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:02:01,675 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:01,675 EPOCH 8 done: loss 0.1265 - lr: 0.000011
2023-10-18 23:02:03,469 DEV : loss 0.18727229535579681 - f1-score (micro avg) 0.5216
2023-10-18 23:02:03,485 saving best model
2023-10-18 23:02:03,521 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:05,356 epoch 9 - iter 72/723 - loss 0.13100301 - time (sec): 1.83 - samples/sec: 9805.15 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:02:07,228 epoch 9 - iter 144/723 - loss 0.13098805 - time (sec): 3.71 - samples/sec: 9707.02 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:02:09,057 epoch 9 - iter 216/723 - loss 0.12293263 - time (sec): 5.54 - samples/sec: 9524.65 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:02:10,869 epoch 9 - iter 288/723 - loss 0.12134590 - time (sec): 7.35 - samples/sec: 9569.42 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:02:12,730 epoch 9 - iter 360/723 - loss 0.12289732 - time (sec): 9.21 - samples/sec: 9589.05 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:02:14,593 epoch 9 - iter 432/723 - loss 0.12316734 - time (sec): 11.07 - samples/sec: 9634.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:02:16,409 epoch 9 - iter 504/723 - loss 0.12430855 - time (sec): 12.89 - samples/sec: 9657.90 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:02:18,214 epoch 9 - iter 576/723 - loss 0.12555226 - time (sec): 14.69 - samples/sec: 9614.98 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:02:19,968 epoch 9 - iter 648/723 - loss 0.12562317 - time (sec): 16.45 - samples/sec: 9657.55 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:02:21,802 epoch 9 - iter 720/723 - loss 0.12504301 - time (sec): 18.28 - samples/sec: 9611.69 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:02:21,864 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:21,864 EPOCH 9 done: loss 0.1251 - lr: 0.000006
2023-10-18 23:02:24,005 DEV : loss 0.1787503957748413 - f1-score (micro avg) 0.5396
2023-10-18 23:02:24,019 saving best model
2023-10-18 23:02:24,055 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:25,778 epoch 10 - iter 72/723 - loss 0.12760560 - time (sec): 1.72 - samples/sec: 10007.55 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:02:27,518 epoch 10 - iter 144/723 - loss 0.13169396 - time (sec): 3.46 - samples/sec: 9862.38 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:02:29,301 epoch 10 - iter 216/723 - loss 0.13152802 - time (sec): 5.24 - samples/sec: 9903.98 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:02:31,175 epoch 10 - iter 288/723 - loss 0.12743923 - time (sec): 7.12 - samples/sec: 9988.53 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:02:33,058 epoch 10 - iter 360/723 - loss 0.12276207 - time (sec): 9.00 - samples/sec: 9888.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:02:34,873 epoch 10 - iter 432/723 - loss 0.12383487 - time (sec): 10.82 - samples/sec: 9764.32 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:02:36,659 epoch 10 - iter 504/723 - loss 0.12055966 - time (sec): 12.60 - samples/sec: 9782.10 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:02:38,526 epoch 10 - iter 576/723 - loss 0.11942674 - time (sec): 14.47 - samples/sec: 9798.52 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:02:40,315 epoch 10 - iter 648/723 - loss 0.12086038 - time (sec): 16.26 - samples/sec: 9737.31 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:02:42,102 epoch 10 - iter 720/723 - loss 0.12182104 - time (sec): 18.05 - samples/sec: 9731.88 - lr: 0.000000 - momentum: 0.000000
2023-10-18 23:02:42,163 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:42,163 EPOCH 10 done: loss 0.1219 - lr: 0.000000
2023-10-18 23:02:43,955 DEV : loss 0.18162186443805695 - f1-score (micro avg) 0.5438
2023-10-18 23:02:43,970 saving best model
2023-10-18 23:02:44,037 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:44,037 Loading model from best epoch ...
2023-10-18 23:02:44,117 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
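The 13 tags follow the BIOES scheme: O plus S(ingle)/B(egin)/I(nside)/E(nd) markers for each of LOC, PER and ORG (1 + 4 × 3 = 13). A minimal sketch of how such a tag sequence decodes into entity spans (an illustrative helper, not Flair code; assumes a well-formed sequence):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            continue
        prefix, label = tag.split("-")
        if prefix == "S":                 # single-token entity
            spans.append((label, i, i + 1))
        elif prefix == "B":               # entity opens
            start = i
        elif prefix == "E":               # entity closes
            spans.append((label, start, i + 1))
            start = None
    return spans

# bioes_to_spans(["S-LOC", "O", "B-PER", "I-PER", "E-PER"])
# -> [("LOC", 0, 1), ("PER", 2, 5)]
```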
2023-10-18 23:02:45,458
Results:
- F-score (micro) 0.547
- F-score (macro) 0.3859
- Accuracy 0.3854
By class:
              precision    recall  f1-score   support

         LOC     0.6234    0.6288    0.6261       458
         PER     0.5989    0.4336    0.5030       482
         ORG     1.0000    0.0145    0.0286        69

   micro avg     0.6133    0.4936    0.5470      1009
   macro avg     0.7407    0.3590    0.3859      1009
weighted avg     0.6374    0.4936    0.5264      1009
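The averages are consistent with the per-class rows: micro-F1 pools true positives, predictions and gold spans across classes (so the frequent LOC/PER classes dominate), while macro-F1 is the unweighted mean of per-class F1 (so the near-zero ORG recall drags it down). A sketch of the check, with per-class (TP, predicted, gold) counts reconstructed from the table by rounding recall × support to whole spans:

```python
def f1(tp, pred, gold):
    # F1 = 2*TP / (predicted + gold), i.e. the harmonic mean of P and R.
    return 2 * tp / (pred + gold)

# (TP, predicted, gold) per class, reconstructed from the report above.
counts = {"LOC": (288, 462, 458), "PER": (209, 349, 482), "ORG": (1, 1, 69)}

micro = f1(sum(c[0] for c in counts.values()),
           sum(c[1] for c in counts.values()),
           sum(c[2] for c in counts.values()))
macro = sum(f1(*c) for c in counts.values()) / len(counts)
# round(micro, 4) -> 0.547 and round(macro, 4) -> 0.3859, matching the log.
```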
2023-10-18 23:02:45,458 ----------------------------------------------------------------------------------------------------