2023-10-20 10:11:23,618 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:23,618 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 10:11:23,618 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:23,618 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-20 10:11:23,618 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:23,618 Train:  6183 sentences
2023-10-20 10:11:23,618         (train_with_dev=False, train_with_test=False)
2023-10-20 10:11:23,618 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:23,618 Training Params:
2023-10-20 10:11:23,618  - learning_rate: "5e-05"
2023-10-20 10:11:23,619  - mini_batch_size: "4"
2023-10-20 10:11:23,619  - max_epochs: "10"
2023-10-20 10:11:23,619  - shuffle: "True"
2023-10-20 10:11:23,619 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:23,619 Plugins:
2023-10-20 10:11:23,619  - TensorboardLogger
2023-10-20 10:11:23,619  - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 10:11:23,619 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:23,619 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 10:11:23,619  - metric: "('micro avg', 'f1-score')"
2023-10-20 10:11:23,619 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:23,619 Computation:
2023-10-20 10:11:23,619  - compute on device: cuda:0
2023-10-20 10:11:23,619  - embedding storage: none
2023-10-20 10:11:23,619 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:23,619 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-20 10:11:23,619 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:23,619 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:23,619 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 10:11:25,776 epoch 1 - iter 154/1546 - loss 2.72059880 - time (sec): 2.16 - samples/sec: 5540.38 - lr: 0.000005 - momentum: 0.000000
2023-10-20 10:11:28,105 epoch 1 - iter 308/1546 - loss 2.29041797 - time (sec): 4.49 - samples/sec: 5471.93 - lr: 0.000010 - momentum: 0.000000
2023-10-20 10:11:30,400 epoch 1 - iter 462/1546 - loss 1.72643001 - time (sec): 6.78 - samples/sec: 5478.88 - lr: 0.000015 - momentum: 0.000000
2023-10-20 10:11:32,590 epoch 1 - iter 616/1546 - loss 1.37951698 - time (sec): 8.97 - samples/sec: 5500.37 - lr: 0.000020 - momentum: 0.000000
2023-10-20 10:11:34,716 epoch 1 - iter 770/1546 - loss 1.15728832 - time (sec): 11.10 - samples/sec: 5570.30 - lr: 0.000025 - momentum: 0.000000
2023-10-20 10:11:37,071 epoch 1 - iter 924/1546 - loss 0.99808405 - time (sec): 13.45 - samples/sec: 5536.42 - lr: 0.000030 - momentum: 0.000000
2023-10-20 10:11:39,079 epoch 1 - iter 1078/1546 - loss 0.88634732 - time (sec): 15.46 - samples/sec: 5641.38 - lr: 0.000035 - momentum: 0.000000
2023-10-20 10:11:41,395 epoch 1 - iter 1232/1546 - loss 0.80646342 - time (sec): 17.78 - samples/sec: 5584.58 - lr: 0.000040 - momentum: 0.000000
2023-10-20 10:11:43,811 epoch 1 - iter 1386/1546 - loss 0.73926131 - time (sec): 20.19 - samples/sec: 5561.38 - lr: 0.000045 - momentum: 0.000000
2023-10-20 10:11:46,197 epoch 1 - iter 1540/1546 - loss 0.69229608 - time (sec): 22.58 - samples/sec: 5480.56 - lr: 0.000050 - momentum: 0.000000
2023-10-20 10:11:46,294 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:46,294 EPOCH 1 done: loss 0.6905 - lr: 0.000050
2023-10-20 10:11:47,290 DEV : loss 0.11603401601314545 - f1-score (micro avg)  0.0
2023-10-20 10:11:47,301 ----------------------------------------------------------------------------------------------------
2023-10-20 10:11:49,689 epoch 2 - iter 154/1546 - loss 0.19907014 - time (sec): 2.39 - samples/sec: 5475.50 - lr: 0.000049 - momentum: 0.000000
2023-10-20 10:11:51,968 epoch 2 - iter 308/1546 - loss 0.20449276 - time (sec): 4.67 - samples/sec: 5574.04 - lr: 0.000049 - momentum: 0.000000
2023-10-20 10:11:54,080 epoch 2 - iter 462/1546 - loss 0.19279187 - time (sec): 6.78 - samples/sec: 5707.82 - lr: 0.000048 - momentum: 0.000000
2023-10-20 10:11:55,859 epoch 2 - iter 616/1546 - loss 0.18431996 - time (sec): 8.56 - samples/sec: 5976.19 - lr: 0.000048 - momentum: 0.000000
2023-10-20 10:11:57,964 epoch 2 - iter 770/1546 - loss 0.18240699 - time (sec): 10.66 - samples/sec: 5941.45 - lr: 0.000047 - momentum: 0.000000
2023-10-20 10:12:00,413 epoch 2 - iter 924/1546 - loss 0.18267141 - time (sec): 13.11 - samples/sec: 5750.00 - lr: 0.000047 - momentum: 0.000000
2023-10-20 10:12:02,759 epoch 2 - iter 1078/1546 - loss 0.18118887 - time (sec): 15.46 - samples/sec: 5684.13 - lr: 0.000046 - momentum: 0.000000
2023-10-20 10:12:04,940 epoch 2 - iter 1232/1546 - loss 0.17978302 - time (sec): 17.64 - samples/sec: 5693.03 - lr: 0.000046 - momentum: 0.000000
2023-10-20 10:12:07,104 epoch 2 - iter 1386/1546 - loss 0.17838777 - time (sec): 19.80 - samples/sec: 5667.36 - lr: 0.000045 - momentum: 0.000000
2023-10-20 10:12:09,457 epoch 2 - iter 1540/1546 - loss 0.17702935 - time (sec): 22.16 - samples/sec: 5591.85 - lr: 0.000044 - momentum: 0.000000
2023-10-20 10:12:09,533 ----------------------------------------------------------------------------------------------------
2023-10-20 10:12:09,533 EPOCH 2 done: loss 0.1771 - lr: 0.000044
2023-10-20 10:12:10,606 DEV : loss 0.09703084826469421 - f1-score (micro avg)  0.4852
2023-10-20 10:12:10,617 saving best model
2023-10-20 10:12:10,646 ----------------------------------------------------------------------------------------------------
2023-10-20 10:12:12,991 epoch 3 - iter 154/1546 - loss 0.12051920 - time (sec): 2.34 - samples/sec: 5147.42 - lr: 0.000044 - momentum: 0.000000
2023-10-20 10:12:15,409 epoch 3 - iter 308/1546 - loss 0.12883184 - time (sec): 4.76 - samples/sec: 5117.80 - lr: 0.000043 - momentum: 0.000000
2023-10-20 10:12:17,780 epoch 3 - iter 462/1546 - loss 0.13669424 - time (sec): 7.13 - samples/sec: 5271.49 - lr: 0.000043 - momentum: 0.000000
2023-10-20 10:12:20,236 epoch 3 - iter 616/1546 - loss 0.14202212 - time (sec): 9.59 - samples/sec: 5254.06 - lr: 0.000042 - momentum: 0.000000
2023-10-20 10:12:22,565 epoch 3 - iter 770/1546 - loss 0.14046282 - time (sec): 11.92 - samples/sec: 5291.48 - lr: 0.000042 - momentum: 0.000000
2023-10-20 10:12:24,784 epoch 3 - iter 924/1546 - loss 0.14566944 - time (sec): 14.14 - samples/sec: 5281.71 - lr: 0.000041 - momentum: 0.000000
2023-10-20 10:12:26,940 epoch 3 - iter 1078/1546 - loss 0.14394182 - time (sec): 16.29 - samples/sec: 5340.92 - lr: 0.000041 - momentum: 0.000000
2023-10-20 10:12:29,118 epoch 3 - iter 1232/1546 - loss 0.14435536 - time (sec): 18.47 - samples/sec: 5363.39 - lr: 0.000040 - momentum: 0.000000
2023-10-20 10:12:31,419 epoch 3 - iter 1386/1546 - loss 0.14728902 - time (sec): 20.77 - samples/sec: 5371.12 - lr: 0.000039 - momentum: 0.000000
2023-10-20 10:12:33,453 epoch 3 - iter 1540/1546 - loss 0.14866474 - time (sec): 22.81 - samples/sec: 5424.77 - lr: 0.000039 - momentum: 0.000000
2023-10-20 10:12:33,519 ----------------------------------------------------------------------------------------------------
2023-10-20 10:12:33,519 EPOCH 3 done: loss 0.1485 - lr: 0.000039
2023-10-20 10:12:34,674 DEV : loss 0.08332959562540054 - f1-score (micro avg)  0.5828
2023-10-20 10:12:34,685 saving best model
2023-10-20 10:12:34,721 ----------------------------------------------------------------------------------------------------
2023-10-20 10:12:36,938 epoch 4 - iter 154/1546 - loss 0.11336206 - time (sec): 2.22 - samples/sec: 5811.48 - lr: 0.000038 - momentum: 0.000000
2023-10-20 10:12:39,184 epoch 4 - iter 308/1546 - loss 0.10644679 - time (sec): 4.46 - samples/sec: 5640.31 - lr: 0.000038 - momentum: 0.000000
2023-10-20 10:12:41,379 epoch 4 - iter 462/1546 - loss 0.11293290 - time (sec): 6.66 - samples/sec: 5586.60 - lr: 0.000037 - momentum: 0.000000
2023-10-20 10:12:43,939 epoch 4 - iter 616/1546 - loss 0.12093545 - time (sec): 9.22 - samples/sec: 5433.62 - lr: 0.000037 - momentum: 0.000000
2023-10-20 10:12:46,139 epoch 4 - iter 770/1546 - loss 0.12265145 - time (sec): 11.42 - samples/sec: 5490.74 - lr: 0.000036 - momentum: 0.000000
2023-10-20 10:12:48,362 epoch 4 - iter 924/1546 - loss 0.12517304 - time (sec): 13.64 - samples/sec: 5485.77 - lr: 0.000036 - momentum: 0.000000
2023-10-20 10:12:50,534 epoch 4 - iter 1078/1546 - loss 0.12895114 - time (sec): 15.81 - samples/sec: 5460.39 - lr: 0.000035 - momentum: 0.000000
2023-10-20 10:12:52,790 epoch 4 - iter 1232/1546 - loss 0.13213860 - time (sec): 18.07 - samples/sec: 5463.07 - lr: 0.000034 - momentum: 0.000000
2023-10-20 10:12:54,879 epoch 4 - iter 1386/1546 - loss 0.13131054 - time (sec): 20.16 - samples/sec: 5490.54 - lr: 0.000034 - momentum: 0.000000
2023-10-20 10:12:56,938 epoch 4 - iter 1540/1546 - loss 0.13260358 - time (sec): 22.22 - samples/sec: 5572.26 - lr: 0.000033 - momentum: 0.000000
2023-10-20 10:12:57,027 ----------------------------------------------------------------------------------------------------
2023-10-20 10:12:57,027 EPOCH 4 done: loss 0.1325 - lr: 0.000033
2023-10-20 10:12:58,133 DEV : loss 0.0857335776090622 - f1-score (micro avg)  0.5851
2023-10-20 10:12:58,145 saving best model
2023-10-20 10:12:58,177 ----------------------------------------------------------------------------------------------------
2023-10-20 10:13:00,491 epoch 5 - iter 154/1546 - loss 0.11580687 - time (sec): 2.31 - samples/sec: 5281.28 - lr: 0.000033 - momentum: 0.000000
2023-10-20 10:13:02,744 epoch 5 - iter 308/1546 - loss 0.11624256 - time (sec): 4.57 - samples/sec: 5506.77 - lr: 0.000032 - momentum: 0.000000
2023-10-20 10:13:05,103 epoch 5 - iter 462/1546 - loss 0.11580313 - time (sec): 6.93 - samples/sec: 5359.13 - lr: 0.000032 - momentum: 0.000000
2023-10-20 10:13:07,517 epoch 5 - iter 616/1546 - loss 0.11296953 - time (sec): 9.34 - samples/sec: 5287.80 - lr: 0.000031 - momentum: 0.000000
2023-10-20 10:13:09,886 epoch 5 - iter 770/1546 - loss 0.11530584 - time (sec): 11.71 - samples/sec: 5332.82 - lr: 0.000031 - momentum: 0.000000
2023-10-20 10:13:12,024 epoch 5 - iter 924/1546 - loss 0.11516890 - time (sec): 13.85 - samples/sec: 5392.23 - lr: 0.000030 - momentum: 0.000000
2023-10-20 10:13:14,189 epoch 5 - iter 1078/1546 - loss 0.11778867 - time (sec): 16.01 - samples/sec: 5450.86 - lr: 0.000029 - momentum: 0.000000
2023-10-20 10:13:16,430 epoch 5 - iter 1232/1546 - loss 0.11835084 - time (sec): 18.25 - samples/sec: 5469.73 - lr: 0.000029 - momentum: 0.000000
2023-10-20 10:13:18,397 epoch 5 - iter 1386/1546 - loss 0.11935535 - time (sec): 20.22 - samples/sec: 5539.74 - lr: 0.000028 - momentum: 0.000000
2023-10-20 10:13:20,866 epoch 5 - iter 1540/1546 - loss 0.12128787 - time (sec): 22.69 - samples/sec: 5460.02 - lr: 0.000028 - momentum: 0.000000
2023-10-20 10:13:20,927 ----------------------------------------------------------------------------------------------------
2023-10-20 10:13:20,927 EPOCH 5 done: loss 0.1217 - lr: 0.000028
2023-10-20 10:13:22,018 DEV : loss 0.08949781209230423 - f1-score (micro avg)  0.595
2023-10-20 10:13:22,029 saving best model
2023-10-20 10:13:22,062 ----------------------------------------------------------------------------------------------------
2023-10-20 10:13:24,393 epoch 6 - iter 154/1546 - loss 0.11226266 - time (sec): 2.33 - samples/sec: 5387.68 - lr: 0.000027 - momentum: 0.000000
2023-10-20 10:13:26,532 epoch 6 - iter 308/1546 - loss 0.10265943 - time (sec): 4.47 - samples/sec: 5555.45 - lr: 0.000027 - momentum: 0.000000
2023-10-20 10:13:28,613 epoch 6 - iter 462/1546 - loss 0.10424379 - time (sec): 6.55 - samples/sec: 5716.90 - lr: 0.000026 - momentum: 0.000000
2023-10-20 10:13:30,641 epoch 6 - iter 616/1546 - loss 0.10623558 - time (sec): 8.58 - samples/sec: 5725.20 - lr: 0.000026 - momentum: 0.000000
2023-10-20 10:13:32,828 epoch 6 - iter 770/1546 - loss 0.11100787 - time (sec): 10.77 - samples/sec: 5724.91 - lr: 0.000025 - momentum: 0.000000
2023-10-20 10:13:35,180 epoch 6 - iter 924/1546 - loss 0.11211571 - time (sec): 13.12 - samples/sec: 5676.76 - lr: 0.000024 - momentum: 0.000000
2023-10-20 10:13:37,345 epoch 6 - iter 1078/1546 - loss 0.11177970 - time (sec): 15.28 - samples/sec: 5718.75 - lr: 0.000024 - momentum: 0.000000
2023-10-20 10:13:39,582 epoch 6 - iter 1232/1546 - loss 0.11266747 - time (sec): 17.52 - samples/sec: 5691.91 - lr: 0.000023 - momentum: 0.000000
2023-10-20 10:13:41,568 epoch 6 - iter 1386/1546 - loss 0.11257646 - time (sec): 19.50 - samples/sec: 5743.01 - lr: 0.000023 - momentum: 0.000000
2023-10-20 10:13:43,620 epoch 6 - iter 1540/1546 - loss 0.11445026 - time (sec): 21.56 - samples/sec: 5740.72 - lr: 0.000022 - momentum: 0.000000
2023-10-20 10:13:43,693 ----------------------------------------------------------------------------------------------------
2023-10-20 10:13:43,693 EPOCH 6 done: loss 0.1141 - lr: 0.000022
2023-10-20 10:13:44,795 DEV : loss 0.0894583985209465 - f1-score (micro avg)  0.6044
2023-10-20 10:13:44,807 saving best model
2023-10-20 10:13:44,844 ----------------------------------------------------------------------------------------------------
2023-10-20 10:13:46,840 epoch 7 - iter 154/1546 - loss 0.09218541 - time (sec): 2.00 - samples/sec: 6246.33 - lr: 0.000022 - momentum: 0.000000
2023-10-20 10:13:48,873 epoch 7 - iter 308/1546 - loss 0.10539156 - time (sec): 4.03 - samples/sec: 5993.97 - lr: 0.000021 - momentum: 0.000000
2023-10-20 10:13:51,066 epoch 7 - iter 462/1546 - loss 0.10587781 - time (sec): 6.22 - samples/sec: 5844.78 - lr: 0.000021 - momentum: 0.000000
2023-10-20 10:13:53,381 epoch 7 - iter 616/1546 - loss 0.10732477 - time (sec): 8.54 - samples/sec: 5699.55 - lr: 0.000020 - momentum: 0.000000
2023-10-20 10:13:55,775 epoch 7 - iter 770/1546 - loss 0.10725162 - time (sec): 10.93 - samples/sec: 5609.63 - lr: 0.000019 - momentum: 0.000000
2023-10-20 10:13:57,910 epoch 7 - iter 924/1546 - loss 0.10584773 - time (sec): 13.07 - samples/sec: 5644.57 - lr: 0.000019 - momentum: 0.000000
2023-10-20 10:14:00,190 epoch 7 - iter 1078/1546 - loss 0.10739739 - time (sec): 15.35 - samples/sec: 5638.74 - lr: 0.000018 - momentum: 0.000000
2023-10-20 10:14:02,578 epoch 7 - iter 1232/1546 - loss 0.10850295 - time (sec): 17.73 - samples/sec: 5613.09 - lr: 0.000018 - momentum: 0.000000
2023-10-20 10:14:04,840 epoch 7 - iter 1386/1546 - loss 0.10686550 - time (sec): 20.00 - samples/sec: 5605.27 - lr: 0.000017 - momentum: 0.000000
2023-10-20 10:14:06,965 epoch 7 - iter 1540/1546 - loss 0.10810778 - time (sec): 22.12 - samples/sec: 5602.39 - lr: 0.000017 - momentum: 0.000000
2023-10-20 10:14:07,047 ----------------------------------------------------------------------------------------------------
2023-10-20 10:14:07,048 EPOCH 7 done: loss 0.1080 - lr: 0.000017
2023-10-20 10:14:08,133 DEV : loss 0.09173186868429184 - f1-score (micro avg)  0.64
2023-10-20 10:14:08,144 saving best model
2023-10-20 10:14:08,176 ----------------------------------------------------------------------------------------------------
2023-10-20 10:14:10,311 epoch 8 - iter 154/1546 - loss 0.08739069 - time (sec): 2.13 - samples/sec: 5922.14 - lr: 0.000016 - momentum: 0.000000
2023-10-20 10:14:12,436 epoch 8 - iter 308/1546 - loss 0.09407382 - time (sec): 4.26 - samples/sec: 5614.68 - lr: 0.000016 - momentum: 0.000000
2023-10-20 10:14:14,578 epoch 8 - iter 462/1546 - loss 0.09395776 - time (sec): 6.40 - samples/sec: 5661.18 - lr: 0.000015 - momentum: 0.000000
2023-10-20 10:14:16,748 epoch 8 - iter 616/1546 - loss 0.09701201 - time (sec): 8.57 - samples/sec: 5736.51 - lr: 0.000014 - momentum: 0.000000
2023-10-20 10:14:18,964 epoch 8 - iter 770/1546 - loss 0.09510309 - time (sec): 10.79 - samples/sec: 5712.16 - lr: 0.000014 - momentum: 0.000000
2023-10-20 10:14:21,121 epoch 8 - iter 924/1546 - loss 0.09516088 - time (sec): 12.95 - samples/sec: 5715.24 - lr: 0.000013 - momentum: 0.000000
2023-10-20 10:14:23,323 epoch 8 - iter 1078/1546 - loss 0.09572100 - time (sec): 15.15 - samples/sec: 5632.13 - lr: 0.000013 - momentum: 0.000000
2023-10-20 10:14:25,812 epoch 8 - iter 1232/1546 - loss 0.09502015 - time (sec): 17.64 - samples/sec: 5593.97 - lr: 0.000012 - momentum: 0.000000
2023-10-20 10:14:28,120 epoch 8 - iter 1386/1546 - loss 0.09739280 - time (sec): 19.94 - samples/sec: 5575.44 - lr: 0.000012 - momentum: 0.000000
2023-10-20 10:14:30,422 epoch 8 - iter 1540/1546 - loss 0.10129209 - time (sec): 22.25 - samples/sec: 5570.21 - lr: 0.000011 - momentum: 0.000000
2023-10-20 10:14:30,512 ----------------------------------------------------------------------------------------------------
2023-10-20 10:14:30,512 EPOCH 8 done: loss 0.1011 - lr: 0.000011
2023-10-20 10:14:31,600 DEV : loss 0.0943157821893692 - f1-score (micro avg)  0.6389
2023-10-20 10:14:31,613 ----------------------------------------------------------------------------------------------------
2023-10-20 10:14:33,990 epoch 9 - iter 154/1546 - loss 0.10381016 - time (sec): 2.38 - samples/sec: 5710.02 - lr: 0.000011 - momentum: 0.000000
2023-10-20 10:14:36,251 epoch 9 - iter 308/1546 - loss 0.09899822 - time (sec): 4.64 - samples/sec: 5612.75 - lr: 0.000010 - momentum: 0.000000
2023-10-20 10:14:38,651 epoch 9 - iter 462/1546 - loss 0.09638670 - time (sec): 7.04 - samples/sec: 5497.79 - lr: 0.000009 - momentum: 0.000000
2023-10-20 10:14:40,844 epoch 9 - iter 616/1546 - loss 0.09667721 - time (sec): 9.23 - samples/sec: 5513.33 - lr: 0.000009 - momentum: 0.000000
2023-10-20 10:14:43,057 epoch 9 - iter 770/1546 - loss 0.09476298 - time (sec): 11.44 - samples/sec: 5548.85 - lr: 0.000008 - momentum: 0.000000
2023-10-20 10:14:45,150 epoch 9 - iter 924/1546 - loss 0.09903898 - time (sec): 13.54 - samples/sec: 5578.79 - lr: 0.000008 - momentum: 0.000000
2023-10-20 10:14:47,158 epoch 9 - iter 1078/1546 - loss 0.09644769 - time (sec): 15.54 - samples/sec: 5638.81 - lr: 0.000007 - momentum: 0.000000
2023-10-20 10:14:49,412 epoch 9 - iter 1232/1546 - loss 0.09654711 - time (sec): 17.80 - samples/sec: 5558.24 - lr: 0.000007 - momentum: 0.000000
2023-10-20 10:14:51,703 epoch 9 - iter 1386/1546 - loss 0.09722965 - time (sec): 20.09 - samples/sec: 5570.51 - lr: 0.000006 - momentum: 0.000000
2023-10-20 10:14:53,798 epoch 9 - iter 1540/1546 - loss 0.09793060 - time (sec): 22.19 - samples/sec: 5578.88 - lr: 0.000006 - momentum: 0.000000
2023-10-20 10:14:53,879 ----------------------------------------------------------------------------------------------------
2023-10-20 10:14:53,879 EPOCH 9 done: loss 0.0978 - lr: 0.000006
2023-10-20 10:14:54,978 DEV : loss 0.09980098158121109 - f1-score (micro avg)  0.6467
2023-10-20 10:14:54,990 saving best model
2023-10-20 10:14:55,022 ----------------------------------------------------------------------------------------------------
2023-10-20 10:14:56,893 epoch 10 - iter 154/1546 - loss 0.09696259 - time (sec): 1.87 - samples/sec: 6547.47 - lr: 0.000005 - momentum: 0.000000
2023-10-20 10:14:58,970 epoch 10 - iter 308/1546 - loss 0.10050118 - time (sec): 3.95 - samples/sec: 6367.38 - lr: 0.000004 - momentum: 0.000000
2023-10-20 10:15:00,733 epoch 10 - iter 462/1546 - loss 0.09022987 - time (sec): 5.71 - samples/sec: 6443.56 - lr: 0.000004 - momentum: 0.000000
2023-10-20 10:15:02,925 epoch 10 - iter 616/1546 - loss 0.09032734 - time (sec): 7.90 - samples/sec: 6108.71 - lr: 0.000003 - momentum: 0.000000
2023-10-20 10:15:05,155 epoch 10 - iter 770/1546 - loss 0.09382315 - time (sec): 10.13 - samples/sec: 6024.21 - lr: 0.000003 - momentum: 0.000000
2023-10-20 10:15:07,178 epoch 10 - iter 924/1546 - loss 0.09467083 - time (sec): 12.16 - samples/sec: 6040.01 - lr: 0.000002 - momentum: 0.000000
2023-10-20 10:15:09,265 epoch 10 - iter 1078/1546 - loss 0.09386195 - time (sec): 14.24 - samples/sec: 6070.68 - lr: 0.000002 - momentum: 0.000000
2023-10-20 10:15:11,315 epoch 10 - iter 1232/1546 - loss 0.09326761 - time (sec): 16.29 - samples/sec: 6039.42 - lr: 0.000001 - momentum: 0.000000
2023-10-20 10:15:13,332 epoch 10 - iter 1386/1546 - loss 0.09334859 - time (sec): 18.31 - samples/sec: 6096.90 - lr: 0.000001 - momentum: 0.000000
2023-10-20 10:15:15,718 epoch 10 - iter 1540/1546 - loss 0.09651352 - time (sec): 20.70 - samples/sec: 5985.19 - lr: 0.000000 - momentum: 0.000000
2023-10-20 10:15:15,808 ----------------------------------------------------------------------------------------------------
2023-10-20 10:15:15,808 EPOCH 10 done: loss 0.0966 - lr: 0.000000
2023-10-20 10:15:16,905 DEV : loss 0.09902973473072052 - f1-score (micro avg)  0.658
2023-10-20 10:15:16,918 saving best model
2023-10-20 10:15:16,992 ----------------------------------------------------------------------------------------------------
2023-10-20 10:15:16,992 Loading model from best epoch ...
2023-10-20 10:15:17,065 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-20 10:15:19,991 Results:
- F-score (micro) 0.5767
- F-score (macro) 0.3607
- Accuracy 0.4198

By class:
              precision    recall  f1-score   support

         LOC     0.6340    0.6628    0.6481       946
    BUILDING     0.1875    0.0649    0.0964       185
      STREET     0.6190    0.2321    0.3377        56

   micro avg     0.6071    0.5493    0.5767      1187
   macro avg     0.4802    0.3199    0.3607      1187
weighted avg     0.5637    0.5493    0.5474      1187

2023-10-20 10:15:19,991 ----------------------------------------------------------------------------------------------------
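The micro average row in the final table is computed over pooled span counts, not by averaging the per-class rows. As a sanity check, the raw counts can be recovered from each class row (true positives ≈ recall × support, predicted spans ≈ TP / precision) and then pooled. A minimal sketch; the variable names are ours, and the counts are reconstructed from the rounded table values rather than taken from the evaluator:

```python
# Per-class rows from the final evaluation table: (precision, recall, support)
classes = {
    "LOC":      (0.6340, 0.6628, 946),
    "BUILDING": (0.1875, 0.0649, 185),
    "STREET":   (0.6190, 0.2321, 56),
}

tp = pred = supp = 0
for p, r, s in classes.values():
    c_tp = round(r * s)       # true positives for this class (recall * support)
    tp += c_tp
    pred += round(c_tp / p)   # spans the model predicted for this class
    supp += s                 # gold spans for this class

micro_p = tp / pred
micro_r = tp / supp
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_p, 4), round(micro_r, 4), round(micro_f1, 4))
# → 0.6071 0.5493 0.5767
```

The pooled counts reproduce the logged micro avg row exactly, which also shows why the micro F-score (0.5767) sits far above the macro one (0.3607): LOC dominates the support, while the rare BUILDING class drags the unweighted macro mean down.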
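The lr column throughout the log traces the LinearScheduler plugin with warmup_fraction 0.1: the rate ramps linearly from 0 to the peak 5e-05 over the first 10% of the 15,460 total mini-batches (1,546 per epoch × 10 epochs), then decays linearly back to 0. A minimal sketch of that schedule, checked against the logged values; the function name is ours, not Flair's:

```python
def linear_warmup_lr(step, total_steps, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (hypothetical helper)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1546 * 10  # 1546 mini-batches per epoch, 10 epochs

# Mid-warmup (epoch 1, iter 770): about half the peak, as logged (lr: 0.000025)
print(round(linear_warmup_lr(770, total), 6))
# End of warmup (epoch 1, iter 1546): the full peak rate, as logged (lr: 0.000050)
print(round(linear_warmup_lr(1546, total), 6))
# Last logged batch (epoch 10, iter 1540): essentially zero (lr: 0.000000)
print(round(linear_warmup_lr(9 * 1546 + 1540, total), 6))
```

This explains why the learning-rate column rises through epoch 1 and only then begins its slow descent toward zero by epoch 10.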