2023-10-18 22:11:46,686 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:46,687 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:11:46,687 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:46,687 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:11:46,687 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:46,687 Train: 5777 sentences
2023-10-18 22:11:46,687 (train_with_dev=False, train_with_test=False)
2023-10-18 22:11:46,687 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:46,687 Training Params:
2023-10-18 22:11:46,687 - learning_rate: "5e-05"
2023-10-18 22:11:46,687 - mini_batch_size: "8"
2023-10-18 22:11:46,687 - max_epochs: "10"
2023-10-18 22:11:46,687 - shuffle: "True"
2023-10-18 22:11:46,687 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:46,687 Plugins:
2023-10-18 22:11:46,687 - TensorboardLogger
2023-10-18 22:11:46,687 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:11:46,687 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:46,687 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:11:46,687 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:11:46,687 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:46,688 Computation:
2023-10-18 22:11:46,688 - compute on device: cuda:0
2023-10-18 22:11:46,688 - embedding storage: none
2023-10-18 22:11:46,688 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:46,688 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 22:11:46,688 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:46,688 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:46,688 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:11:48,521 epoch 1 - iter 72/723 - loss 3.15369193 - time (sec): 1.83 - samples/sec: 9418.43 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:11:50,353 epoch 1 - iter 144/723 - loss 2.82531096 - time (sec): 3.66 - samples/sec: 9750.79 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:11:52,174 epoch 1 - iter 216/723 - loss 2.35058830 - time (sec): 5.49 - samples/sec: 9736.98 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:11:53,972 epoch 1 - iter 288/723 - loss 1.91572462 - time (sec): 7.28 - samples/sec: 9734.87 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:11:55,779 epoch 1 - iter 360/723 - loss 1.59356684 - time (sec): 9.09 - samples/sec: 9795.92 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:11:57,603 epoch 1 - iter 432/723 - loss 1.37380455 - time (sec): 10.91 - samples/sec: 9818.38 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:11:59,430 epoch 1 - iter 504/723 - loss 1.22307055 - time (sec): 12.74 - samples/sec: 9797.58 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:12:01,273 epoch 1 - iter 576/723 - loss 1.11055697 - time (sec): 14.58 - samples/sec: 9758.58 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:12:03,063 epoch 1 - iter 648/723 - loss 1.02493184 - time (sec): 16.38 - samples/sec: 9685.83 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:12:04,793 epoch 1 - iter 720/723 - loss 0.95066192 - time (sec): 18.11 - samples/sec: 9693.53 - lr: 0.000050 - momentum: 0.000000
2023-10-18 22:12:04,877 ----------------------------------------------------------------------------------------------------
2023-10-18 22:12:04,877 EPOCH 1 done: loss 0.9486 - lr: 0.000050
2023-10-18 22:12:06,089 DEV : loss 0.31063002347946167 - f1-score (micro avg) 0.0
2023-10-18 22:12:06,102 ----------------------------------------------------------------------------------------------------
2023-10-18 22:12:07,989 epoch 2 - iter 72/723 - loss 0.23255061 - time (sec): 1.89 - samples/sec: 9861.83 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:12:09,742 epoch 2 - iter 144/723 - loss 0.23804088 - time (sec): 3.64 - samples/sec: 9834.13 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:12:11,573 epoch 2 - iter 216/723 - loss 0.24106924 - time (sec): 5.47 - samples/sec: 9842.86 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:12:13,330 epoch 2 - iter 288/723 - loss 0.23333713 - time (sec): 7.23 - samples/sec: 9913.68 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:12:15,083 epoch 2 - iter 360/723 - loss 0.22213309 - time (sec): 8.98 - samples/sec: 9890.07 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:12:16,869 epoch 2 - iter 432/723 - loss 0.21724811 - time (sec): 10.77 - samples/sec: 9985.45 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:12:18,591 epoch 2 - iter 504/723 - loss 0.21698959 - time (sec): 12.49 - samples/sec: 9901.44 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:12:20,283 epoch 2 - iter 576/723 - loss 0.21535393 - time (sec): 14.18 - samples/sec: 9886.96 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:12:22,061 epoch 2 - iter 648/723 - loss 0.21374976 - time (sec): 15.96 - samples/sec: 9885.22 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:12:23,798 epoch 2 - iter 720/723 - loss 0.20898244 - time (sec): 17.70 - samples/sec: 9933.82 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:12:23,856 ----------------------------------------------------------------------------------------------------
2023-10-18 22:12:23,856 EPOCH 2 done: loss 0.2092 - lr: 0.000044
2023-10-18 22:12:25,964 DEV : loss 0.22788439691066742 - f1-score (micro avg) 0.3239
2023-10-18 22:12:25,979 saving best model
2023-10-18 22:12:26,010 ----------------------------------------------------------------------------------------------------
2023-10-18 22:12:27,797 epoch 3 - iter 72/723 - loss 0.19078244 - time (sec): 1.79 - samples/sec: 10043.51 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:12:29,543 epoch 3 - iter 144/723 - loss 0.19123649 - time (sec): 3.53 - samples/sec: 9837.27 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:12:31,289 epoch 3 - iter 216/723 - loss 0.18622492 - time (sec): 5.28 - samples/sec: 9893.55 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:12:33,145 epoch 3 - iter 288/723 - loss 0.17538802 - time (sec): 7.13 - samples/sec: 9922.72 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:12:34,879 epoch 3 - iter 360/723 - loss 0.17611275 - time (sec): 8.87 - samples/sec: 9917.65 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:12:36,750 epoch 3 - iter 432/723 - loss 0.17600510 - time (sec): 10.74 - samples/sec: 9821.66 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:12:38,475 epoch 3 - iter 504/723 - loss 0.17653894 - time (sec): 12.46 - samples/sec: 9798.39 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:12:40,261 epoch 3 - iter 576/723 - loss 0.17866614 - time (sec): 14.25 - samples/sec: 9817.94 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:12:42,033 epoch 3 - iter 648/723 - loss 0.17625735 - time (sec): 16.02 - samples/sec: 9858.26 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:12:43,814 epoch 3 - iter 720/723 - loss 0.17698859 - time (sec): 17.80 - samples/sec: 9872.51 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:12:43,871 ----------------------------------------------------------------------------------------------------
2023-10-18 22:12:43,871 EPOCH 3 done: loss 0.1768 - lr: 0.000039
2023-10-18 22:12:45,631 DEV : loss 0.21366006135940552 - f1-score (micro avg) 0.3814
2023-10-18 22:12:45,646 saving best model
2023-10-18 22:12:45,684 ----------------------------------------------------------------------------------------------------
2023-10-18 22:12:47,421 epoch 4 - iter 72/723 - loss 0.15319449 - time (sec): 1.74 - samples/sec: 10153.35 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:12:49,178 epoch 4 - iter 144/723 - loss 0.15219794 - time (sec): 3.49 - samples/sec: 9871.73 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:12:50,904 epoch 4 - iter 216/723 - loss 0.15958723 - time (sec): 5.22 - samples/sec: 10037.56 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:12:52,729 epoch 4 - iter 288/723 - loss 0.15641675 - time (sec): 7.04 - samples/sec: 10141.55 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:12:54,430 epoch 4 - iter 360/723 - loss 0.15539065 - time (sec): 8.74 - samples/sec: 10131.07 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:12:56,173 epoch 4 - iter 432/723 - loss 0.15963557 - time (sec): 10.49 - samples/sec: 10188.50 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:12:58,321 epoch 4 - iter 504/723 - loss 0.15900354 - time (sec): 12.64 - samples/sec: 9892.63 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:13:00,068 epoch 4 - iter 576/723 - loss 0.15775232 - time (sec): 14.38 - samples/sec: 9897.82 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:13:01,795 epoch 4 - iter 648/723 - loss 0.15793963 - time (sec): 16.11 - samples/sec: 9854.22 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:13:03,655 epoch 4 - iter 720/723 - loss 0.16172244 - time (sec): 17.97 - samples/sec: 9776.09 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:13:03,711 ----------------------------------------------------------------------------------------------------
2023-10-18 22:13:03,711 EPOCH 4 done: loss 0.1617 - lr: 0.000033
2023-10-18 22:13:05,482 DEV : loss 0.19289655983448029 - f1-score (micro avg) 0.4815
2023-10-18 22:13:05,497 saving best model
2023-10-18 22:13:05,533 ----------------------------------------------------------------------------------------------------
2023-10-18 22:13:07,441 epoch 5 - iter 72/723 - loss 0.16811055 - time (sec): 1.91 - samples/sec: 9536.49 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:13:09,201 epoch 5 - iter 144/723 - loss 0.15925998 - time (sec): 3.67 - samples/sec: 9799.25 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:13:10,943 epoch 5 - iter 216/723 - loss 0.15623823 - time (sec): 5.41 - samples/sec: 9557.30 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:13:12,686 epoch 5 - iter 288/723 - loss 0.15588053 - time (sec): 7.15 - samples/sec: 9547.60 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:13:14,425 epoch 5 - iter 360/723 - loss 0.15281731 - time (sec): 8.89 - samples/sec: 9559.96 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:13:16,265 epoch 5 - iter 432/723 - loss 0.15073615 - time (sec): 10.73 - samples/sec: 9668.65 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:13:17,926 epoch 5 - iter 504/723 - loss 0.14950718 - time (sec): 12.39 - samples/sec: 9809.91 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:13:19,596 epoch 5 - iter 576/723 - loss 0.14885470 - time (sec): 14.06 - samples/sec: 9898.68 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:13:21,379 epoch 5 - iter 648/723 - loss 0.15131920 - time (sec): 15.84 - samples/sec: 9902.45 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:13:23,193 epoch 5 - iter 720/723 - loss 0.15022435 - time (sec): 17.66 - samples/sec: 9943.96 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:13:23,253 ----------------------------------------------------------------------------------------------------
2023-10-18 22:13:23,253 EPOCH 5 done: loss 0.1504 - lr: 0.000028
2023-10-18 22:13:25,007 DEV : loss 0.1961345225572586 - f1-score (micro avg) 0.4696
2023-10-18 22:13:25,022 ----------------------------------------------------------------------------------------------------
2023-10-18 22:13:26,804 epoch 6 - iter 72/723 - loss 0.13617864 - time (sec): 1.78 - samples/sec: 9559.29 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:13:28,589 epoch 6 - iter 144/723 - loss 0.14005984 - time (sec): 3.57 - samples/sec: 9628.21 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:13:30,437 epoch 6 - iter 216/723 - loss 0.15059229 - time (sec): 5.42 - samples/sec: 9626.66 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:13:32,163 epoch 6 - iter 288/723 - loss 0.15216103 - time (sec): 7.14 - samples/sec: 9579.75 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:13:34,396 epoch 6 - iter 360/723 - loss 0.14557675 - time (sec): 9.37 - samples/sec: 9231.32 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:13:36,183 epoch 6 - iter 432/723 - loss 0.14252551 - time (sec): 11.16 - samples/sec: 9279.64 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:13:37,996 epoch 6 - iter 504/723 - loss 0.14416986 - time (sec): 12.97 - samples/sec: 9456.22 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:13:39,763 epoch 6 - iter 576/723 - loss 0.14274090 - time (sec): 14.74 - samples/sec: 9548.14 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:13:41,578 epoch 6 - iter 648/723 - loss 0.14378379 - time (sec): 16.56 - samples/sec: 9609.62 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:13:43,291 epoch 6 - iter 720/723 - loss 0.14185352 - time (sec): 18.27 - samples/sec: 9606.77 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:13:43,355 ----------------------------------------------------------------------------------------------------
2023-10-18 22:13:43,356 EPOCH 6 done: loss 0.1414 - lr: 0.000022
2023-10-18 22:13:45,122 DEV : loss 0.19116047024726868 - f1-score (micro avg) 0.4857
2023-10-18 22:13:45,137 saving best model
2023-10-18 22:13:45,174 ----------------------------------------------------------------------------------------------------
2023-10-18 22:13:46,912 epoch 7 - iter 72/723 - loss 0.13529263 - time (sec): 1.74 - samples/sec: 9722.45 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:13:48,713 epoch 7 - iter 144/723 - loss 0.13613335 - time (sec): 3.54 - samples/sec: 9958.96 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:13:50,482 epoch 7 - iter 216/723 - loss 0.13441868 - time (sec): 5.31 - samples/sec: 9917.67 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:13:52,251 epoch 7 - iter 288/723 - loss 0.13760997 - time (sec): 7.08 - samples/sec: 9848.22 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:13:53,993 epoch 7 - iter 360/723 - loss 0.13573573 - time (sec): 8.82 - samples/sec: 9819.71 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:13:55,820 epoch 7 - iter 432/723 - loss 0.13560010 - time (sec): 10.65 - samples/sec: 9890.16 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:13:57,634 epoch 7 - iter 504/723 - loss 0.13364674 - time (sec): 12.46 - samples/sec: 9840.57 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:13:59,388 epoch 7 - iter 576/723 - loss 0.13450202 - time (sec): 14.21 - samples/sec: 9772.70 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:14:01,207 epoch 7 - iter 648/723 - loss 0.13529263 - time (sec): 16.03 - samples/sec: 9801.38 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:14:03,065 epoch 7 - iter 720/723 - loss 0.13363141 - time (sec): 17.89 - samples/sec: 9814.24 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:14:03,126 ----------------------------------------------------------------------------------------------------
2023-10-18 22:14:03,127 EPOCH 7 done: loss 0.1334 - lr: 0.000017
2023-10-18 22:14:04,891 DEV : loss 0.1891321837902069 - f1-score (micro avg) 0.4815
2023-10-18 22:14:04,906 ----------------------------------------------------------------------------------------------------
2023-10-18 22:14:06,635 epoch 8 - iter 72/723 - loss 0.12504995 - time (sec): 1.73 - samples/sec: 9327.36 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:14:08,831 epoch 8 - iter 144/723 - loss 0.14497774 - time (sec): 3.92 - samples/sec: 8703.68 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:14:10,623 epoch 8 - iter 216/723 - loss 0.13421076 - time (sec): 5.72 - samples/sec: 9270.31 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:14:12,411 epoch 8 - iter 288/723 - loss 0.12980263 - time (sec): 7.50 - samples/sec: 9387.38 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:14:14,168 epoch 8 - iter 360/723 - loss 0.12913939 - time (sec): 9.26 - samples/sec: 9523.64 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:14:16,043 epoch 8 - iter 432/723 - loss 0.12544526 - time (sec): 11.14 - samples/sec: 9576.42 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:14:17,799 epoch 8 - iter 504/723 - loss 0.12520676 - time (sec): 12.89 - samples/sec: 9556.09 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:14:19,682 epoch 8 - iter 576/723 - loss 0.12476139 - time (sec): 14.78 - samples/sec: 9558.26 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:14:21,488 epoch 8 - iter 648/723 - loss 0.12586790 - time (sec): 16.58 - samples/sec: 9535.81 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:14:23,247 epoch 8 - iter 720/723 - loss 0.12811874 - time (sec): 18.34 - samples/sec: 9586.01 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:14:23,303 ----------------------------------------------------------------------------------------------------
2023-10-18 22:14:23,303 EPOCH 8 done: loss 0.1279 - lr: 0.000011
2023-10-18 22:14:25,072 DEV : loss 0.18201254308223724 - f1-score (micro avg) 0.5072
2023-10-18 22:14:25,087 saving best model
2023-10-18 22:14:25,123 ----------------------------------------------------------------------------------------------------
2023-10-18 22:14:26,935 epoch 9 - iter 72/723 - loss 0.11411652 - time (sec): 1.81 - samples/sec: 10766.66 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:14:28,686 epoch 9 - iter 144/723 - loss 0.10939075 - time (sec): 3.56 - samples/sec: 10321.51 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:14:30,443 epoch 9 - iter 216/723 - loss 0.11384251 - time (sec): 5.32 - samples/sec: 10057.20 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:14:32,214 epoch 9 - iter 288/723 - loss 0.11951860 - time (sec): 7.09 - samples/sec: 9978.23 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:14:34,027 epoch 9 - iter 360/723 - loss 0.12179869 - time (sec): 8.90 - samples/sec: 9931.67 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:14:35,758 epoch 9 - iter 432/723 - loss 0.12431945 - time (sec): 10.63 - samples/sec: 9850.04 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:14:37,499 epoch 9 - iter 504/723 - loss 0.12584957 - time (sec): 12.38 - samples/sec: 9780.76 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:14:39,356 epoch 9 - iter 576/723 - loss 0.12515093 - time (sec): 14.23 - samples/sec: 9893.29 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:14:41,122 epoch 9 - iter 648/723 - loss 0.12556555 - time (sec): 16.00 - samples/sec: 9903.79 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:14:42,893 epoch 9 - iter 720/723 - loss 0.12475240 - time (sec): 17.77 - samples/sec: 9882.46 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:14:42,962 ----------------------------------------------------------------------------------------------------
2023-10-18 22:14:42,962 EPOCH 9 done: loss 0.1247 - lr: 0.000006
2023-10-18 22:14:45,097 DEV : loss 0.1882598102092743 - f1-score (micro avg) 0.4959
2023-10-18 22:14:45,112 ----------------------------------------------------------------------------------------------------
2023-10-18 22:14:46,869 epoch 10 - iter 72/723 - loss 0.10866994 - time (sec): 1.76 - samples/sec: 9731.03 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:14:48,671 epoch 10 - iter 144/723 - loss 0.12562583 - time (sec): 3.56 - samples/sec: 9529.64 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:14:50,455 epoch 10 - iter 216/723 - loss 0.12325727 - time (sec): 5.34 - samples/sec: 9768.34 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:14:52,279 epoch 10 - iter 288/723 - loss 0.12513734 - time (sec): 7.17 - samples/sec: 9689.81 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:14:54,208 epoch 10 - iter 360/723 - loss 0.13109443 - time (sec): 9.10 - samples/sec: 9713.73 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:14:55,958 epoch 10 - iter 432/723 - loss 0.12975067 - time (sec): 10.85 - samples/sec: 9720.21 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:14:57,713 epoch 10 - iter 504/723 - loss 0.12653515 - time (sec): 12.60 - samples/sec: 9811.00 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:14:59,552 epoch 10 - iter 576/723 - loss 0.12518131 - time (sec): 14.44 - samples/sec: 9762.73 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:15:01,283 epoch 10 - iter 648/723 - loss 0.12318594 - time (sec): 16.17 - samples/sec: 9754.98 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:15:03,005 epoch 10 - iter 720/723 - loss 0.12424305 - time (sec): 17.89 - samples/sec: 9814.44 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:15:03,068 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:03,068 EPOCH 10 done: loss 0.1244 - lr: 0.000000
2023-10-18 22:15:04,871 DEV : loss 0.1862431764602661 - f1-score (micro avg) 0.4991
2023-10-18 22:15:04,918 ----------------------------------------------------------------------------------------------------
2023-10-18 22:15:04,918 Loading model from best epoch ...
2023-10-18 22:15:05,001 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 22:15:06,339 Results:
- F-score (micro) 0.5138
- F-score (macro) 0.3616
- Accuracy 0.3566

By class:
              precision    recall  f1-score   support

         LOC     0.5583    0.6485    0.6000       458
         PER     0.4975    0.4212    0.4562       482
         ORG     1.0000    0.0145    0.0286        69

   micro avg     0.5324    0.4965    0.5138      1009
   macro avg     0.6853    0.3614    0.3616      1009
weighted avg     0.5595    0.4965    0.4922      1009

2023-10-18 22:15:06,339 ----------------------------------------------------------------------------------------------------