2023-10-20 09:21:43,922 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:43,922 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 09:21:43,922 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:43,922 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-20 09:21:43,922 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:43,922 Train: 6183 sentences
2023-10-20 09:21:43,922 (train_with_dev=False, train_with_test=False)
2023-10-20 09:21:43,922 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:43,922 Training Params:
2023-10-20 09:21:43,922 - learning_rate: "3e-05"
2023-10-20 09:21:43,922 - mini_batch_size: "4"
2023-10-20 09:21:43,922 - max_epochs: "10"
2023-10-20 09:21:43,923 - shuffle: "True"
2023-10-20 09:21:43,923 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:43,923 Plugins:
2023-10-20 09:21:43,923 - TensorboardLogger
2023-10-20 09:21:43,923 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 09:21:43,923 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:43,923 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 09:21:43,923 - metric: "('micro avg', 'f1-score')"
2023-10-20 09:21:43,923 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:43,923 Computation:
2023-10-20 09:21:43,923 - compute on device: cuda:0
2023-10-20 09:21:43,923 - embedding storage: none
2023-10-20 09:21:43,923 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:43,923 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-20 09:21:43,923 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:43,923 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:43,923 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 09:21:46,109 epoch 1 - iter 154/1546 - loss 2.48451865 - time (sec): 2.19 - samples/sec: 5594.24 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:21:48,505 epoch 1 - iter 308/1546 - loss 2.20980853 - time (sec): 4.58 - samples/sec: 5428.83 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:21:50,962 epoch 1 - iter 462/1546 - loss 1.82421682 - time (sec): 7.04 - samples/sec: 5172.98 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:21:53,386 epoch 1 - iter 616/1546 - loss 1.48474101 - time (sec): 9.46 - samples/sec: 5109.95 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:21:55,723 epoch 1 - iter 770/1546 - loss 1.25106887 - time (sec): 11.80 - samples/sec: 5093.22 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:21:58,079 epoch 1 - iter 924/1546 - loss 1.08823913 - time (sec): 14.16 - samples/sec: 5114.37 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:22:00,454 epoch 1 - iter 1078/1546 - loss 0.95790324 - time (sec): 16.53 - samples/sec: 5192.58 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:22:02,788 epoch 1 - iter 1232/1546 - loss 0.87119296 - time (sec): 18.86 - samples/sec: 5187.38 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:22:05,208 epoch 1 - iter 1386/1546 - loss 0.79808934 - time (sec): 21.28 - samples/sec: 5193.97 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:22:07,613 epoch 1 - iter 1540/1546 - loss 0.73692287 - time (sec): 23.69 - samples/sec: 5231.99 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:22:07,698 ----------------------------------------------------------------------------------------------------
2023-10-20 09:22:07,698 EPOCH 1 done: loss 0.7355 - lr: 0.000030
2023-10-20 09:22:08,681 DEV : loss 0.1223587766289711 - f1-score (micro avg) 0.0
2023-10-20 09:22:08,694 ----------------------------------------------------------------------------------------------------
2023-10-20 09:22:11,055 epoch 2 - iter 154/1546 - loss 0.23461614 - time (sec): 2.36 - samples/sec: 5266.33 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:22:13,484 epoch 2 - iter 308/1546 - loss 0.22531020 - time (sec): 4.79 - samples/sec: 5142.15 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:22:15,947 epoch 2 - iter 462/1546 - loss 0.20946648 - time (sec): 7.25 - samples/sec: 5161.02 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:22:18,269 epoch 2 - iter 616/1546 - loss 0.20751780 - time (sec): 9.57 - samples/sec: 5222.26 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:22:20,610 epoch 2 - iter 770/1546 - loss 0.19961546 - time (sec): 11.91 - samples/sec: 5227.20 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:22:22,995 epoch 2 - iter 924/1546 - loss 0.19714712 - time (sec): 14.30 - samples/sec: 5174.61 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:22:25,327 epoch 2 - iter 1078/1546 - loss 0.19697459 - time (sec): 16.63 - samples/sec: 5174.54 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:22:27,676 epoch 2 - iter 1232/1546 - loss 0.19565618 - time (sec): 18.98 - samples/sec: 5183.60 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:22:30,107 epoch 2 - iter 1386/1546 - loss 0.19351339 - time (sec): 21.41 - samples/sec: 5184.12 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:22:32,820 epoch 2 - iter 1540/1546 - loss 0.19609873 - time (sec): 24.13 - samples/sec: 5131.95 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:22:32,914 ----------------------------------------------------------------------------------------------------
2023-10-20 09:22:32,915 EPOCH 2 done: loss 0.1957 - lr: 0.000027
2023-10-20 09:22:34,291 DEV : loss 0.10185166448354721 - f1-score (micro avg) 0.4131
2023-10-20 09:22:34,302 saving best model
2023-10-20 09:22:34,336 ----------------------------------------------------------------------------------------------------
2023-10-20 09:22:36,711 epoch 3 - iter 154/1546 - loss 0.17535207 - time (sec): 2.37 - samples/sec: 5156.66 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:22:39,106 epoch 3 - iter 308/1546 - loss 0.18337472 - time (sec): 4.77 - samples/sec: 5116.39 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:22:41,362 epoch 3 - iter 462/1546 - loss 0.17696955 - time (sec): 7.03 - samples/sec: 5168.68 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:22:43,760 epoch 3 - iter 616/1546 - loss 0.17228177 - time (sec): 9.42 - samples/sec: 5189.50 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:22:46,138 epoch 3 - iter 770/1546 - loss 0.16767539 - time (sec): 11.80 - samples/sec: 5203.89 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:22:48,495 epoch 3 - iter 924/1546 - loss 0.16675286 - time (sec): 14.16 - samples/sec: 5206.07 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:22:50,831 epoch 3 - iter 1078/1546 - loss 0.16667390 - time (sec): 16.49 - samples/sec: 5214.07 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:22:53,279 epoch 3 - iter 1232/1546 - loss 0.16837749 - time (sec): 18.94 - samples/sec: 5220.26 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:22:55,532 epoch 3 - iter 1386/1546 - loss 0.16869676 - time (sec): 21.20 - samples/sec: 5269.06 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:22:57,935 epoch 3 - iter 1540/1546 - loss 0.17071969 - time (sec): 23.60 - samples/sec: 5249.18 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:22:58,025 ----------------------------------------------------------------------------------------------------
2023-10-20 09:22:58,025 EPOCH 3 done: loss 0.1705 - lr: 0.000023
2023-10-20 09:22:59,119 DEV : loss 0.09305855631828308 - f1-score (micro avg) 0.4828
2023-10-20 09:22:59,130 saving best model
2023-10-20 09:22:59,169 ----------------------------------------------------------------------------------------------------
2023-10-20 09:23:01,533 epoch 4 - iter 154/1546 - loss 0.18946194 - time (sec): 2.36 - samples/sec: 5331.47 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:23:03,986 epoch 4 - iter 308/1546 - loss 0.15783265 - time (sec): 4.82 - samples/sec: 5292.11 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:23:06,399 epoch 4 - iter 462/1546 - loss 0.15915683 - time (sec): 7.23 - samples/sec: 5311.71 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:23:08,748 epoch 4 - iter 616/1546 - loss 0.15503094 - time (sec): 9.58 - samples/sec: 5278.97 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:23:11,091 epoch 4 - iter 770/1546 - loss 0.15482979 - time (sec): 11.92 - samples/sec: 5251.65 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:23:13,444 epoch 4 - iter 924/1546 - loss 0.15581642 - time (sec): 14.27 - samples/sec: 5225.36 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:23:15,828 epoch 4 - iter 1078/1546 - loss 0.15477132 - time (sec): 16.66 - samples/sec: 5195.01 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:23:18,190 epoch 4 - iter 1232/1546 - loss 0.15524974 - time (sec): 19.02 - samples/sec: 5213.98 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:23:20,514 epoch 4 - iter 1386/1546 - loss 0.15589324 - time (sec): 21.34 - samples/sec: 5239.09 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:23:22,914 epoch 4 - iter 1540/1546 - loss 0.15641620 - time (sec): 23.74 - samples/sec: 5201.94 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:23:23,014 ----------------------------------------------------------------------------------------------------
2023-10-20 09:23:23,014 EPOCH 4 done: loss 0.1558 - lr: 0.000020
2023-10-20 09:23:24,084 DEV : loss 0.09403558075428009 - f1-score (micro avg) 0.5223
2023-10-20 09:23:24,095 saving best model
2023-10-20 09:23:24,133 ----------------------------------------------------------------------------------------------------
2023-10-20 09:23:26,643 epoch 5 - iter 154/1546 - loss 0.12664182 - time (sec): 2.51 - samples/sec: 5226.08 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:23:29,036 epoch 5 - iter 308/1546 - loss 0.13916018 - time (sec): 4.90 - samples/sec: 5224.75 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:23:31,417 epoch 5 - iter 462/1546 - loss 0.14351088 - time (sec): 7.28 - samples/sec: 5221.91 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:23:33,823 epoch 5 - iter 616/1546 - loss 0.14577952 - time (sec): 9.69 - samples/sec: 5198.34 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:23:36,213 epoch 5 - iter 770/1546 - loss 0.14852475 - time (sec): 12.08 - samples/sec: 5162.44 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:23:38,571 epoch 5 - iter 924/1546 - loss 0.14575396 - time (sec): 14.44 - samples/sec: 5178.76 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:23:41,071 epoch 5 - iter 1078/1546 - loss 0.14710420 - time (sec): 16.94 - samples/sec: 5184.88 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:23:43,431 epoch 5 - iter 1232/1546 - loss 0.14875112 - time (sec): 19.30 - samples/sec: 5162.22 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:23:45,866 epoch 5 - iter 1386/1546 - loss 0.15024416 - time (sec): 21.73 - samples/sec: 5150.08 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:23:48,241 epoch 5 - iter 1540/1546 - loss 0.14764070 - time (sec): 24.11 - samples/sec: 5134.71 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:23:48,330 ----------------------------------------------------------------------------------------------------
2023-10-20 09:23:48,330 EPOCH 5 done: loss 0.1474 - lr: 0.000017
2023-10-20 09:23:49,403 DEV : loss 0.0924314334988594 - f1-score (micro avg) 0.543
2023-10-20 09:23:49,414 saving best model
2023-10-20 09:23:49,448 ----------------------------------------------------------------------------------------------------
2023-10-20 09:23:51,812 epoch 6 - iter 154/1546 - loss 0.11599985 - time (sec): 2.36 - samples/sec: 5319.11 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:23:54,241 epoch 6 - iter 308/1546 - loss 0.12632791 - time (sec): 4.79 - samples/sec: 5163.22 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:23:56,387 epoch 6 - iter 462/1546 - loss 0.13301169 - time (sec): 6.94 - samples/sec: 5434.71 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:23:58,510 epoch 6 - iter 616/1546 - loss 0.13389940 - time (sec): 9.06 - samples/sec: 5547.70 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:24:00,637 epoch 6 - iter 770/1546 - loss 0.13455641 - time (sec): 11.19 - samples/sec: 5572.60 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:24:02,964 epoch 6 - iter 924/1546 - loss 0.13548871 - time (sec): 13.52 - samples/sec: 5506.63 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:24:05,387 epoch 6 - iter 1078/1546 - loss 0.13801768 - time (sec): 15.94 - samples/sec: 5515.90 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:24:07,751 epoch 6 - iter 1232/1546 - loss 0.13738306 - time (sec): 18.30 - samples/sec: 5434.50 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:24:10,100 epoch 6 - iter 1386/1546 - loss 0.13781890 - time (sec): 20.65 - samples/sec: 5409.32 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:24:12,462 epoch 6 - iter 1540/1546 - loss 0.13895651 - time (sec): 23.01 - samples/sec: 5385.41 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:24:12,553 ----------------------------------------------------------------------------------------------------
2023-10-20 09:24:12,553 EPOCH 6 done: loss 0.1391 - lr: 0.000013
2023-10-20 09:24:13,648 DEV : loss 0.09394099563360214 - f1-score (micro avg) 0.5667
2023-10-20 09:24:13,660 saving best model
2023-10-20 09:24:13,698 ----------------------------------------------------------------------------------------------------
2023-10-20 09:24:16,052 epoch 7 - iter 154/1546 - loss 0.12165657 - time (sec): 2.35 - samples/sec: 4933.80 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:24:18,408 epoch 7 - iter 308/1546 - loss 0.12673301 - time (sec): 4.71 - samples/sec: 4986.08 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:24:20,752 epoch 7 - iter 462/1546 - loss 0.12724155 - time (sec): 7.05 - samples/sec: 5074.12 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:24:23,143 epoch 7 - iter 616/1546 - loss 0.13164398 - time (sec): 9.44 - samples/sec: 5141.76 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:24:25,490 epoch 7 - iter 770/1546 - loss 0.12961218 - time (sec): 11.79 - samples/sec: 5191.62 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:24:27,806 epoch 7 - iter 924/1546 - loss 0.12903599 - time (sec): 14.11 - samples/sec: 5172.23 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:24:30,111 epoch 7 - iter 1078/1546 - loss 0.13147164 - time (sec): 16.41 - samples/sec: 5204.56 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:24:32,430 epoch 7 - iter 1232/1546 - loss 0.13360333 - time (sec): 18.73 - samples/sec: 5237.46 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:24:34,865 epoch 7 - iter 1386/1546 - loss 0.13374572 - time (sec): 21.17 - samples/sec: 5228.21 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:24:37,242 epoch 7 - iter 1540/1546 - loss 0.13407398 - time (sec): 23.54 - samples/sec: 5262.00 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:24:37,328 ----------------------------------------------------------------------------------------------------
2023-10-20 09:24:37,329 EPOCH 7 done: loss 0.1339 - lr: 0.000010
2023-10-20 09:24:38,393 DEV : loss 0.0905207097530365 - f1-score (micro avg) 0.5603
2023-10-20 09:24:38,405 ----------------------------------------------------------------------------------------------------
2023-10-20 09:24:40,567 epoch 8 - iter 154/1546 - loss 0.13063320 - time (sec): 2.16 - samples/sec: 5847.04 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:24:42,906 epoch 8 - iter 308/1546 - loss 0.14726772 - time (sec): 4.50 - samples/sec: 5529.76 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:24:45,318 epoch 8 - iter 462/1546 - loss 0.14619992 - time (sec): 6.91 - samples/sec: 5476.37 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:24:47,668 epoch 8 - iter 616/1546 - loss 0.14164336 - time (sec): 9.26 - samples/sec: 5352.41 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:24:50,048 epoch 8 - iter 770/1546 - loss 0.13670213 - time (sec): 11.64 - samples/sec: 5370.75 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:24:52,428 epoch 8 - iter 924/1546 - loss 0.13211631 - time (sec): 14.02 - samples/sec: 5397.13 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:24:54,766 epoch 8 - iter 1078/1546 - loss 0.12689569 - time (sec): 16.36 - samples/sec: 5364.07 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:24:57,169 epoch 8 - iter 1232/1546 - loss 0.12935143 - time (sec): 18.76 - samples/sec: 5300.15 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:24:59,555 epoch 8 - iter 1386/1546 - loss 0.12672051 - time (sec): 21.15 - samples/sec: 5295.12 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:25:01,921 epoch 8 - iter 1540/1546 - loss 0.12817406 - time (sec): 23.52 - samples/sec: 5259.15 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:25:02,014 ----------------------------------------------------------------------------------------------------
2023-10-20 09:25:02,014 EPOCH 8 done: loss 0.1286 - lr: 0.000007
2023-10-20 09:25:03,129 DEV : loss 0.09427973628044128 - f1-score (micro avg) 0.5679
2023-10-20 09:25:03,142 saving best model
2023-10-20 09:25:03,183 ----------------------------------------------------------------------------------------------------
2023-10-20 09:25:05,584 epoch 9 - iter 154/1546 - loss 0.14790471 - time (sec): 2.40 - samples/sec: 5017.34 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:25:07,975 epoch 9 - iter 308/1546 - loss 0.13523772 - time (sec): 4.79 - samples/sec: 5137.52 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:25:10,387 epoch 9 - iter 462/1546 - loss 0.12483047 - time (sec): 7.20 - samples/sec: 5245.29 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:25:12,786 epoch 9 - iter 616/1546 - loss 0.12594618 - time (sec): 9.60 - samples/sec: 5261.51 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:25:15,158 epoch 9 - iter 770/1546 - loss 0.12504595 - time (sec): 11.97 - samples/sec: 5265.70 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:25:17,637 epoch 9 - iter 924/1546 - loss 0.12727744 - time (sec): 14.45 - samples/sec: 5207.78 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:25:19,958 epoch 9 - iter 1078/1546 - loss 0.12477542 - time (sec): 16.77 - samples/sec: 5179.25 - lr: 0.000004 - momentum: 0.000000
2023-10-20 09:25:22,309 epoch 9 - iter 1232/1546 - loss 0.12399062 - time (sec): 19.13 - samples/sec: 5194.17 - lr: 0.000004 - momentum: 0.000000
2023-10-20 09:25:24,716 epoch 9 - iter 1386/1546 - loss 0.12526472 - time (sec): 21.53 - samples/sec: 5190.83 - lr: 0.000004 - momentum: 0.000000
2023-10-20 09:25:27,063 epoch 9 - iter 1540/1546 - loss 0.12731966 - time (sec): 23.88 - samples/sec: 5184.05 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:25:27,157 ----------------------------------------------------------------------------------------------------
2023-10-20 09:25:27,157 EPOCH 9 done: loss 0.1271 - lr: 0.000003
2023-10-20 09:25:28,255 DEV : loss 0.0943358764052391 - f1-score (micro avg) 0.5782
2023-10-20 09:25:28,267 saving best model
2023-10-20 09:25:28,306 ----------------------------------------------------------------------------------------------------
2023-10-20 09:25:30,676 epoch 10 - iter 154/1546 - loss 0.14121008 - time (sec): 2.37 - samples/sec: 5065.46 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:25:33,055 epoch 10 - iter 308/1546 - loss 0.12929657 - time (sec): 4.75 - samples/sec: 5177.61 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:25:35,475 epoch 10 - iter 462/1546 - loss 0.12661390 - time (sec): 7.17 - samples/sec: 5166.63 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:25:37,890 epoch 10 - iter 616/1546 - loss 0.12903656 - time (sec): 9.58 - samples/sec: 5216.30 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:25:40,295 epoch 10 - iter 770/1546 - loss 0.12930312 - time (sec): 11.99 - samples/sec: 5208.12 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:25:42,742 epoch 10 - iter 924/1546 - loss 0.12539197 - time (sec): 14.44 - samples/sec: 5263.47 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:25:44,959 epoch 10 - iter 1078/1546 - loss 0.12288240 - time (sec): 16.65 - samples/sec: 5331.84 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:25:47,142 epoch 10 - iter 1232/1546 - loss 0.12366919 - time (sec): 18.84 - samples/sec: 5327.15 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:25:49,497 epoch 10 - iter 1386/1546 - loss 0.12424106 - time (sec): 21.19 - samples/sec: 5294.21 - lr: 0.000000 - momentum: 0.000000
2023-10-20 09:25:51,836 epoch 10 - iter 1540/1546 - loss 0.12454482 - time (sec): 23.53 - samples/sec: 5268.60 - lr: 0.000000 - momentum: 0.000000
2023-10-20 09:25:51,918 ----------------------------------------------------------------------------------------------------
2023-10-20 09:25:51,918 EPOCH 10 done: loss 0.1244 - lr: 0.000000
2023-10-20 09:25:53,010 DEV : loss 0.09444588422775269 - f1-score (micro avg) 0.5751
2023-10-20 09:25:53,052 ----------------------------------------------------------------------------------------------------
2023-10-20 09:25:53,052 Loading model from best epoch ...
2023-10-20 09:25:53,136 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-20 09:25:56,005
Results:
- F-score (micro) 0.5346
- F-score (macro) 0.2224
- Accuracy 0.3719

By class:
              precision    recall  f1-score   support

         LOC     0.6030    0.6004    0.6017       946
    BUILDING     0.3333    0.0162    0.0309       185
      STREET     0.5000    0.0179    0.0345        56

   micro avg     0.6002    0.4819    0.5346      1187
   macro avg     0.4788    0.2115    0.2224      1187
weighted avg     0.5561    0.4819    0.4860      1187

2023-10-20 09:25:56,005 ----------------------------------------------------------------------------------------------------