2023-10-18 22:59:23,988 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,988 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:59:23,988 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Train: 5777 sentences
2023-10-18 22:59:23,989 (train_with_dev=False, train_with_test=False)
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Training Params:
2023-10-18 22:59:23,989  - learning_rate: "5e-05"
2023-10-18 22:59:23,989  - mini_batch_size: "8"
2023-10-18 22:59:23,989  - max_epochs: "10"
2023-10-18 22:59:23,989  - shuffle: "True"
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Plugins:
2023-10-18 22:59:23,989  - TensorboardLogger
2023-10-18 22:59:23,989  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:59:23,989  - metric: "('micro avg', 'f1-score')"
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Computation:
2023-10-18 22:59:23,989  - compute on device: cuda:0
2023-10-18 22:59:23,989  - embedding storage: none
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:23,989 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:59:25,412 epoch 1 - iter 72/723 - loss 3.38114564 - time (sec): 1.42 - samples/sec: 11401.32 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:59:27,014 epoch 1 - iter 144/723 - loss 3.08418588 - time (sec): 3.02 - samples/sec: 11101.50 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:59:28,798 epoch 1 - iter 216/723 - loss 2.63293986 - time (sec): 4.81 - samples/sec: 10690.89 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:59:30,586 epoch 1 - iter 288/723 - loss 2.14162158 - time (sec): 6.60 - samples/sec: 10566.42 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:59:32,321 epoch 1 - iter 360/723 - loss 1.79554300 - time (sec): 8.33 - samples/sec: 10390.76 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:59:34,037 epoch 1 - iter 432/723 - loss 1.55698836 - time (sec): 10.05 - samples/sec: 10298.46 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:59:35,788 epoch 1 - iter 504/723 - loss 1.38373364 - time (sec): 11.80 - samples/sec: 10204.98 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:59:37,646 epoch 1 - iter 576/723 - loss 1.24627337 - time (sec): 13.66 - samples/sec: 10189.95 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:59:39,478 epoch 1 - iter 648/723 - loss 1.13025666 - time (sec): 15.49 - samples/sec: 10203.66 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:59:41,318 epoch 1 - iter 720/723 - loss 1.04619335 - time (sec): 17.33 - samples/sec: 10138.40 - lr: 0.000050 - momentum: 0.000000
2023-10-18 22:59:41,378 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:41,379 EPOCH 1 done: loss 1.0440 - lr: 0.000050
2023-10-18 22:59:42,611 DEV : loss 0.31867048144340515 - f1-score (micro avg) 0.0021
2023-10-18 22:59:42,625 saving best model
2023-10-18 22:59:42,654 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:44,461 epoch 2 - iter 72/723 - loss 0.22991776 - time (sec): 1.81 - samples/sec: 9834.17 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:59:46,199 epoch 2 - iter 144/723 - loss 0.22949664 - time (sec): 3.54 - samples/sec: 9757.79 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:59:48,000 epoch 2 - iter 216/723 - loss 0.22651336 - time (sec): 5.35 - samples/sec: 9587.61 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:59:49,941 epoch 2 - iter 288/723 - loss 0.21724318 - time (sec): 7.29 - samples/sec: 9671.08 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:59:51,701 epoch 2 - iter 360/723 - loss 0.21247069 - time (sec): 9.05 - samples/sec: 9646.56 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:59:53,530 epoch 2 - iter 432/723 - loss 0.20857178 - time (sec): 10.88 - samples/sec: 9680.15 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:59:55,292 epoch 2 - iter 504/723 - loss 0.20684646 - time (sec): 12.64 - samples/sec: 9747.09 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:59:57,119 epoch 2 - iter 576/723 - loss 0.21024836 - time (sec): 14.46 - samples/sec: 9679.43 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:59:58,924 epoch 2 - iter 648/723 - loss 0.20759568 - time (sec): 16.27 - samples/sec: 9711.98 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:00:00,696 epoch 2 - iter 720/723 - loss 0.20794809 - time (sec): 18.04 - samples/sec: 9724.81 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:00:00,764 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:00,765 EPOCH 2 done: loss 0.2078 - lr: 0.000044
2023-10-18 23:00:02,850 DEV : loss 0.22826793789863586 - f1-score (micro avg) 0.254
2023-10-18 23:00:02,866 saving best model
2023-10-18 23:00:02,905 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:04,764 epoch 3 - iter 72/723 - loss 0.18294966 - time (sec): 1.86 - samples/sec: 10092.27 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:00:06,600 epoch 3 - iter 144/723 - loss 0.18974998 - time (sec): 3.69 - samples/sec: 10188.71 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:00:08,426 epoch 3 - iter 216/723 - loss 0.18483913 - time (sec): 5.52 - samples/sec: 9994.19 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:00:10,230 epoch 3 - iter 288/723 - loss 0.18075862 - time (sec): 7.32 - samples/sec: 9979.24 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:00:12,058 epoch 3 - iter 360/723 - loss 0.18072422 - time (sec): 9.15 - samples/sec: 9873.89 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:00:13,801 epoch 3 - iter 432/723 - loss 0.18042225 - time (sec): 10.89 - samples/sec: 9776.93 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:00:15,624 epoch 3 - iter 504/723 - loss 0.18161063 - time (sec): 12.72 - samples/sec: 9786.06 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:00:17,382 epoch 3 - iter 576/723 - loss 0.18110465 - time (sec): 14.48 - samples/sec: 9758.02 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:00:19,118 epoch 3 - iter 648/723 - loss 0.17753532 - time (sec): 16.21 - samples/sec: 9816.52 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:00:20,684 epoch 3 - iter 720/723 - loss 0.17701445 - time (sec): 17.78 - samples/sec: 9870.27 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:00:20,748 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:20,748 EPOCH 3 done: loss 0.1767 - lr: 0.000039
2023-10-18 23:00:22,506 DEV : loss 0.20792421698570251 - f1-score (micro avg) 0.4092
2023-10-18 23:00:22,520 saving best model
2023-10-18 23:00:22,555 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:24,297 epoch 4 - iter 72/723 - loss 0.15954370 - time (sec): 1.74 - samples/sec: 9774.94 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:00:26,086 epoch 4 - iter 144/723 - loss 0.15517464 - time (sec): 3.53 - samples/sec: 9353.01 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:00:27,920 epoch 4 - iter 216/723 - loss 0.15993206 - time (sec): 5.36 - samples/sec: 9370.16 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:00:29,776 epoch 4 - iter 288/723 - loss 0.15940359 - time (sec): 7.22 - samples/sec: 9411.20 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:00:31,614 epoch 4 - iter 360/723 - loss 0.16001585 - time (sec): 9.06 - samples/sec: 9419.71 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:00:33,458 epoch 4 - iter 432/723 - loss 0.16046789 - time (sec): 10.90 - samples/sec: 9414.93 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:00:35,341 epoch 4 - iter 504/723 - loss 0.15834988 - time (sec): 12.79 - samples/sec: 9421.93 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:00:37,679 epoch 4 - iter 576/723 - loss 0.15929391 - time (sec): 15.12 - samples/sec: 9286.74 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:00:39,475 epoch 4 - iter 648/723 - loss 0.15804445 - time (sec): 16.92 - samples/sec: 9305.11 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:00:41,347 epoch 4 - iter 720/723 - loss 0.15830661 - time (sec): 18.79 - samples/sec: 9349.83 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:00:41,406 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:41,406 EPOCH 4 done: loss 0.1583 - lr: 0.000033
2023-10-18 23:00:43,178 DEV : loss 0.19937574863433838 - f1-score (micro avg) 0.417
2023-10-18 23:00:43,192 saving best model
2023-10-18 23:00:43,227 ----------------------------------------------------------------------------------------------------
2023-10-18 23:00:45,071 epoch 5 - iter 72/723 - loss 0.15411846 - time (sec): 1.84 - samples/sec: 9679.89 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:00:46,902 epoch 5 - iter 144/723 - loss 0.15553448 - time (sec): 3.67 - samples/sec: 9814.22 - lr: 0.000032 - momentum: 0.000000
2023-10-18 23:00:48,727 epoch 5 - iter 216/723 - loss 0.15503779 - time (sec): 5.50 - samples/sec: 9767.31 - lr: 0.000032 - momentum: 0.000000
2023-10-18 23:00:50,522 epoch 5 - iter 288/723 - loss 0.15444028 - time (sec): 7.29 - samples/sec: 9661.38 - lr: 0.000031 - momentum: 0.000000
2023-10-18 23:00:52,245 epoch 5 - iter 360/723 - loss 0.15017229 - time (sec): 9.02 - samples/sec: 9613.87 - lr: 0.000031 - momentum: 0.000000
2023-10-18 23:00:54,061 epoch 5 - iter 432/723 - loss 0.15134564 - time (sec): 10.83 - samples/sec: 9604.42 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:00:55,855 epoch 5 - iter 504/723 - loss 0.15022735 - time (sec): 12.63 - samples/sec: 9622.77 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:00:57,809 epoch 5 - iter 576/723 - loss 0.15001693 - time (sec): 14.58 - samples/sec: 9606.86 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:00:59,625 epoch 5 - iter 648/723 - loss 0.14939047 - time (sec): 16.40 - samples/sec: 9604.35 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:01:01,395 epoch 5 - iter 720/723 - loss 0.14893618 - time (sec): 18.17 - samples/sec: 9682.65 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:01:01,446 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:01,447 EPOCH 5 done: loss 0.1491 - lr: 0.000028
2023-10-18 23:01:03,204 DEV : loss 0.19236673414707184 - f1-score (micro avg) 0.4651
2023-10-18 23:01:03,218 saving best model
2023-10-18 23:01:03,253 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:04,976 epoch 6 - iter 72/723 - loss 0.13995242 - time (sec): 1.72 - samples/sec: 9598.69 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:01:06,717 epoch 6 - iter 144/723 - loss 0.13162756 - time (sec): 3.46 - samples/sec: 9777.22 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:01:08,513 epoch 6 - iter 216/723 - loss 0.13349888 - time (sec): 5.26 - samples/sec: 9945.16 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:01:10,614 epoch 6 - iter 288/723 - loss 0.13416199 - time (sec): 7.36 - samples/sec: 9598.95 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:01:12,389 epoch 6 - iter 360/723 - loss 0.13680730 - time (sec): 9.14 - samples/sec: 9575.16 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:01:14,126 epoch 6 - iter 432/723 - loss 0.13369462 - time (sec): 10.87 - samples/sec: 9651.47 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:01:15,887 epoch 6 - iter 504/723 - loss 0.13682229 - time (sec): 12.63 - samples/sec: 9616.52 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:01:17,637 epoch 6 - iter 576/723 - loss 0.13447545 - time (sec): 14.38 - samples/sec: 9676.31 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:01:19,433 epoch 6 - iter 648/723 - loss 0.13738547 - time (sec): 16.18 - samples/sec: 9691.19 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:01:21,327 epoch 6 - iter 720/723 - loss 0.13843992 - time (sec): 18.07 - samples/sec: 9722.45 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:01:21,392 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:21,392 EPOCH 6 done: loss 0.1384 - lr: 0.000022
2023-10-18 23:01:23,168 DEV : loss 0.18509748578071594 - f1-score (micro avg) 0.518
2023-10-18 23:01:23,182 saving best model
2023-10-18 23:01:23,221 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:25,034 epoch 7 - iter 72/723 - loss 0.12980522 - time (sec): 1.81 - samples/sec: 9087.83 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:01:26,869 epoch 7 - iter 144/723 - loss 0.13216005 - time (sec): 3.65 - samples/sec: 9411.47 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:01:28,651 epoch 7 - iter 216/723 - loss 0.13522464 - time (sec): 5.43 - samples/sec: 9421.00 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:01:30,521 epoch 7 - iter 288/723 - loss 0.13547392 - time (sec): 7.30 - samples/sec: 9526.68 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:01:32,314 epoch 7 - iter 360/723 - loss 0.13328086 - time (sec): 9.09 - samples/sec: 9519.95 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:01:34,075 epoch 7 - iter 432/723 - loss 0.13286486 - time (sec): 10.85 - samples/sec: 9590.25 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:01:35,905 epoch 7 - iter 504/723 - loss 0.13294580 - time (sec): 12.68 - samples/sec: 9647.07 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:01:37,763 epoch 7 - iter 576/723 - loss 0.13280059 - time (sec): 14.54 - samples/sec: 9628.22 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:01:39,546 epoch 7 - iter 648/723 - loss 0.13385121 - time (sec): 16.32 - samples/sec: 9678.50 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:01:41,376 epoch 7 - iter 720/723 - loss 0.13277021 - time (sec): 18.15 - samples/sec: 9673.98 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:01:41,439 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:41,439 EPOCH 7 done: loss 0.1327 - lr: 0.000017
2023-10-18 23:01:43,565 DEV : loss 0.1816086322069168 - f1-score (micro avg) 0.5211
2023-10-18 23:01:43,580 saving best model
2023-10-18 23:01:43,616 ----------------------------------------------------------------------------------------------------
2023-10-18 23:01:45,389 epoch 8 - iter 72/723 - loss 0.14663446 - time (sec): 1.77 - samples/sec: 9305.44 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:01:47,220 epoch 8 - iter 144/723 - loss 0.13839322 - time (sec): 3.60 - samples/sec: 9639.06 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:01:49,017 epoch 8 - iter 216/723 - loss 0.13301673 - time (sec): 5.40 - samples/sec: 9821.06 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:01:50,787 epoch 8 - iter 288/723 - loss 0.13305023 - time (sec): 7.17 - samples/sec: 9734.21 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:01:52,565 epoch 8 - iter 360/723 - loss 0.12937088 - time (sec): 8.95 - samples/sec: 9819.91 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:01:54,376 epoch 8 - iter 432/723 - loss 0.12649252 - time (sec): 10.76 - samples/sec: 9823.98 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:01:56,134 epoch 8 - iter 504/723 - loss 0.12645102 - time (sec): 12.52 - samples/sec: 9875.58 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:01:57,874 epoch 8 - iter 576/723 - loss 0.12467083 - time (sec): 14.26 - samples/sec: 9833.41 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:01:59,659 epoch 8 - iter 648/723 - loss 0.12570604 - time (sec): 16.04 - samples/sec: 9830.77 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:02:01,618 epoch 8 - iter 720/723 - loss 0.12654477 - time (sec): 18.00 - samples/sec: 9764.67 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:02:01,675 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:01,675 EPOCH 8 done: loss 0.1265 - lr: 0.000011
2023-10-18 23:02:03,469 DEV : loss 0.18727229535579681 - f1-score (micro avg) 0.5216
2023-10-18 23:02:03,485 saving best model
2023-10-18 23:02:03,521 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:05,356 epoch 9 - iter 72/723 - loss 0.13100301 - time (sec): 1.83 - samples/sec: 9805.15 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:02:07,228 epoch 9 - iter 144/723 - loss 0.13098805 - time (sec): 3.71 - samples/sec: 9707.02 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:02:09,057 epoch 9 - iter 216/723 - loss 0.12293263 - time (sec): 5.54 - samples/sec: 9524.65 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:02:10,869 epoch 9 - iter 288/723 - loss 0.12134590 - time (sec): 7.35 - samples/sec: 9569.42 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:02:12,730 epoch 9 - iter 360/723 - loss 0.12289732 - time (sec): 9.21 - samples/sec: 9589.05 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:02:14,593 epoch 9 - iter 432/723 - loss 0.12316734 - time (sec): 11.07 - samples/sec: 9634.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:02:16,409 epoch 9 - iter 504/723 - loss 0.12430855 - time (sec): 12.89 - samples/sec: 9657.90 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:02:18,214 epoch 9 - iter 576/723 - loss 0.12555226 - time (sec): 14.69 - samples/sec: 9614.98 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:02:19,968 epoch 9 - iter 648/723 - loss 0.12562317 - time (sec): 16.45 - samples/sec: 9657.55 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:02:21,802 epoch 9 - iter 720/723 - loss 0.12504301 - time (sec): 18.28 - samples/sec: 9611.69 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:02:21,864 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:21,864 EPOCH 9 done: loss 0.1251 - lr: 0.000006
2023-10-18 23:02:24,005 DEV : loss 0.1787503957748413 - f1-score (micro avg) 0.5396
2023-10-18 23:02:24,019 saving best model
2023-10-18 23:02:24,055 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:25,778 epoch 10 - iter 72/723 - loss 0.12760560 - time (sec): 1.72 - samples/sec: 10007.55 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:02:27,518 epoch 10 - iter 144/723 - loss 0.13169396 - time (sec): 3.46 - samples/sec: 9862.38 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:02:29,301 epoch 10 - iter 216/723 - loss 0.13152802 - time (sec): 5.24 - samples/sec: 9903.98 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:02:31,175 epoch 10 - iter 288/723 - loss 0.12743923 - time (sec): 7.12 - samples/sec: 9988.53 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:02:33,058 epoch 10 - iter 360/723 - loss 0.12276207 - time (sec): 9.00 - samples/sec: 9888.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:02:34,873 epoch 10 - iter 432/723 - loss 0.12383487 - time (sec): 10.82 - samples/sec: 9764.32 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:02:36,659 epoch 10 - iter 504/723 - loss 0.12055966 - time (sec): 12.60 - samples/sec: 9782.10 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:02:38,526 epoch 10 - iter 576/723 - loss 0.11942674 - time (sec): 14.47 - samples/sec: 9798.52 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:02:40,315 epoch 10 - iter 648/723 - loss 0.12086038 - time (sec): 16.26 - samples/sec: 9737.31 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:02:42,102 epoch 10 - iter 720/723 - loss 0.12182104 - time (sec): 18.05 - samples/sec: 9731.88 - lr: 0.000000 - momentum: 0.000000
2023-10-18 23:02:42,163 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:42,163 EPOCH 10 done: loss 0.1219 - lr: 0.000000
2023-10-18 23:02:43,955 DEV : loss 0.18162186443805695 - f1-score (micro avg) 0.5438
2023-10-18 23:02:43,970 saving best model
2023-10-18 23:02:44,037 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:44,037 Loading model from best epoch ...
2023-10-18 23:02:44,117 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 23:02:45,458 Results:
- F-score (micro) 0.547
- F-score (macro) 0.3859
- Accuracy 0.3854

By class:
              precision    recall  f1-score   support

         LOC     0.6234    0.6288    0.6261       458
         PER     0.5989    0.4336    0.5030       482
         ORG     1.0000    0.0145    0.0286        69

   micro avg     0.6133    0.4936    0.5470      1009
   macro avg     0.7407    0.3590    0.3859      1009
weighted avg     0.6374    0.4936    0.5264      1009

2023-10-18 23:02:45,458 ----------------------------------------------------------------------------------------------------
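The micro and macro averages in the final "By class" table can be checked by hand. A minimal pure-Python sketch follows; the TP/FP/FN counts are reconstructed from the support, precision, and recall columns (e.g. LOC TP ≈ 0.6288 × 458 ≈ 288), so they are an inference from the table, not values taken directly from the log:

```python
# Approximate per-class counts inferred from the final evaluation table.
# tp = recall * support; predicted = tp / precision; fp = predicted - tp.
counts = {                     # label: (tp, fp, fn)
    "LOC": (288, 174, 170),    # support 458, ~462 predicted
    "PER": (209, 140, 273),    # support 482, ~349 predicted
    "ORG": (1, 0, 68),         # support 69,  1 predicted
}

def f1(p: float, r: float) -> float:
    return 2 * p * r / (p + r) if p + r else 0.0

# Per-class F1; the macro average is their unweighted mean.
per_class_f1 = {
    label: f1(tp / (tp + fp), tp / (tp + fn))
    for label, (tp, fp, fn) in counts.items()
}
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Micro average: pool TP/FP/FN across classes before computing P/R.
tp = sum(c[0] for c in counts.values())
fp = sum(c[1] for c in counts.values())
fn = sum(c[2] for c in counts.values())
micro_f1 = f1(tp / (tp + fp), tp / (tp + fn))
```

With these reconstructed counts, `micro_f1` and `macro_f1` land on the logged 0.5470 and 0.3859, and the gap between them reflects the near-zero ORG recall (1 of 69 entities found) dragging the macro score down while barely affecting the pooled micro score.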
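The lr column traces the LinearScheduler with `warmup_fraction: '0.1'`: the rate ramps linearly from 0 up to the peak 5e-05 over the first 10% of the 7230 total steps (10 epochs × 723 batches), then decays linearly back to 0. A small sketch of that schedule (the function name and step bookkeeping are illustrative, not Flair's internal API):

```python
PEAK_LR = 5e-05
STEPS_PER_EPOCH = 723                    # 5777 sentences / batch size 8, rounded up
TOTAL_STEPS = 10 * STEPS_PER_EPOCH       # max_epochs: 10
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # warmup_fraction: 0.1 -> 723 steps

def linear_warmup_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates (illustrative helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp-up: 0 -> PEAK_LR across the warmup window.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay: PEAK_LR -> 0 across the remaining steps.
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```

This matches the log: the warmup window is exactly one epoch, so lr climbs through epoch 1 (0.000005 at iter 72, 0.000050 at iter 720) and then drifts down by roughly 0.000005–0.000006 per epoch until it reaches 0.000000 at the end of epoch 10.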