2023-10-25 10:37:21,901 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,902 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:37:21,902 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
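The classifier head's `Linear(in_features=768, out_features=13)` matches a BIOES tag scheme over the corpus's three entity types (LOC, BUILDING, STREET, as listed in the final evaluation below) plus the outside tag O. A minimal sketch reconstructing that tag inventory — `bioes_tags` is an illustrative helper, not part of the Flair API:

```python
# Reconstruct the BIOES tag inventory implied by the 13-way linear layer.
# Entity types are those reported in the final per-class evaluation of this log.
entity_types = ["LOC", "BUILDING", "STREET"]

def bioes_tags(types):
    """One O tag plus S-/B-/E-/I- variants for each entity type."""
    tags = ["O"]
    for t in types:
        tags.extend(f"{prefix}-{t}" for prefix in ("S", "B", "E", "I"))
    return tags

tags = bioes_tags(entity_types)
print(len(tags))  # 13, matching out_features=13 and the 13-tag dictionary below
```

The 13 tags produced here are exactly those the loaded `SequenceTagger` reports predicting at the end of the run.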
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Train: 6183 sentences
2023-10-25 10:37:21,903 (train_with_dev=False, train_with_test=False)
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Training Params:
2023-10-25 10:37:21,903  - learning_rate: "3e-05"
2023-10-25 10:37:21,903  - mini_batch_size: "8"
2023-10-25 10:37:21,903  - max_epochs: "10"
2023-10-25 10:37:21,903  - shuffle: "True"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Plugins:
2023-10-25 10:37:21,903  - TensorboardLogger
2023-10-25 10:37:21,903  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:37:21,903  - metric: "('micro avg', 'f1-score')"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Computation:
2023-10-25 10:37:21,903  - compute on device: cuda:0
2023-10-25 10:37:21,903  - embedding storage: none
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:37:26,605 epoch 1 - iter 77/773 - loss 2.00342293 - time (sec): 4.70 - samples/sec: 2712.01 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:37:31,262 epoch 1 - iter 154/773 - loss 1.14055389 - time (sec): 9.36 - samples/sec: 2662.74 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:37:35,889 epoch 1 - iter 231/773 - loss 0.82625202 - time (sec): 13.98 - samples/sec: 2642.41 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:37:40,575 epoch 1 - iter 308/773 - loss 0.64493288 - time (sec): 18.67 - samples/sec: 2668.34 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:37:45,208 epoch 1 - iter 385/773 - loss 0.53761507 - time (sec): 23.30 - samples/sec: 2669.10 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:37:49,742 epoch 1 - iter 462/773 - loss 0.47184545 - time (sec): 27.84 - samples/sec: 2656.78 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:37:54,392 epoch 1 - iter 539/773 - loss 0.41778036 - time (sec): 32.49 - samples/sec: 2672.72 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:37:59,140 epoch 1 - iter 616/773 - loss 0.37526149 - time (sec): 37.24 - samples/sec: 2675.01 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:38:03,677 epoch 1 - iter 693/773 - loss 0.34438623 - time (sec): 41.77 - samples/sec: 2678.19 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:08,219 epoch 1 - iter 770/773 - loss 0.32016433 - time (sec): 46.31 - samples/sec: 2673.41 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:38:08,404 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:08,404 EPOCH 1 done: loss 0.3193 - lr: 0.000030
2023-10-25 10:38:11,901 DEV : loss 0.05575157329440117 - f1-score (micro avg) 0.7258
2023-10-25 10:38:11,920 saving best model
2023-10-25 10:38:12,442 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:17,190 epoch 2 - iter 77/773 - loss 0.06917782 - time (sec): 4.75 - samples/sec: 2589.73 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:38:21,845 epoch 2 - iter 154/773 - loss 0.07295557 - time (sec): 9.40 - samples/sec: 2635.98 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:26,432 epoch 2 - iter 231/773 - loss 0.07445178 - time (sec): 13.99 - samples/sec: 2577.67 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:31,123 epoch 2 - iter 308/773 - loss 0.07546608 - time (sec): 18.68 - samples/sec: 2620.82 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:35,870 epoch 2 - iter 385/773 - loss 0.07298737 - time (sec): 23.43 - samples/sec: 2630.85 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:40,555 epoch 2 - iter 462/773 - loss 0.07131937 - time (sec): 28.11 - samples/sec: 2642.70 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:45,260 epoch 2 - iter 539/773 - loss 0.07085615 - time (sec): 32.82 - samples/sec: 2652.90 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:50,000 epoch 2 - iter 616/773 - loss 0.07015877 - time (sec): 37.56 - samples/sec: 2628.79 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:54,537 epoch 2 - iter 693/773 - loss 0.06882789 - time (sec): 42.09 - samples/sec: 2624.41 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:59,395 epoch 2 - iter 770/773 - loss 0.06928794 - time (sec): 46.95 - samples/sec: 2635.97 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:59,585 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:59,586 EPOCH 2 done: loss 0.0691 - lr: 0.000027
2023-10-25 10:39:02,404 DEV : loss 0.049297548830509186 - f1-score (micro avg) 0.8142
2023-10-25 10:39:02,422 saving best model
2023-10-25 10:39:03,089 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:07,755 epoch 3 - iter 77/773 - loss 0.04741193 - time (sec): 4.66 - samples/sec: 2618.48 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:12,662 epoch 3 - iter 154/773 - loss 0.04548219 - time (sec): 9.57 - samples/sec: 2493.23 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:17,185 epoch 3 - iter 231/773 - loss 0.04459085 - time (sec): 14.09 - samples/sec: 2538.42 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:21,796 epoch 3 - iter 308/773 - loss 0.04386816 - time (sec): 18.70 - samples/sec: 2585.25 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:26,414 epoch 3 - iter 385/773 - loss 0.04428707 - time (sec): 23.32 - samples/sec: 2595.54 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:31,080 epoch 3 - iter 462/773 - loss 0.04386941 - time (sec): 27.99 - samples/sec: 2628.33 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:35,843 epoch 3 - iter 539/773 - loss 0.04425161 - time (sec): 32.75 - samples/sec: 2629.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:40,653 epoch 3 - iter 616/773 - loss 0.04565354 - time (sec): 37.56 - samples/sec: 2628.00 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:45,336 epoch 3 - iter 693/773 - loss 0.04638286 - time (sec): 42.24 - samples/sec: 2640.83 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:50,069 epoch 3 - iter 770/773 - loss 0.04580281 - time (sec): 46.98 - samples/sec: 2637.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:39:50,249 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:50,249 EPOCH 3 done: loss 0.0457 - lr: 0.000023
2023-10-25 10:39:53,011 DEV : loss 0.07478724420070648 - f1-score (micro avg) 0.7705
2023-10-25 10:39:53,029 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:57,751 epoch 4 - iter 77/773 - loss 0.02381115 - time (sec): 4.72 - samples/sec: 2644.65 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:40:02,403 epoch 4 - iter 154/773 - loss 0.02272232 - time (sec): 9.37 - samples/sec: 2696.05 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:40:07,048 epoch 4 - iter 231/773 - loss 0.02306162 - time (sec): 14.02 - samples/sec: 2694.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:11,691 epoch 4 - iter 308/773 - loss 0.02500263 - time (sec): 18.66 - samples/sec: 2695.98 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:16,351 epoch 4 - iter 385/773 - loss 0.02577652 - time (sec): 23.32 - samples/sec: 2669.27 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:21,109 epoch 4 - iter 462/773 - loss 0.02834569 - time (sec): 28.08 - samples/sec: 2640.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:26,055 epoch 4 - iter 539/773 - loss 0.02860700 - time (sec): 33.02 - samples/sec: 2629.75 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:30,766 epoch 4 - iter 616/773 - loss 0.02820990 - time (sec): 37.73 - samples/sec: 2639.20 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:35,442 epoch 4 - iter 693/773 - loss 0.02789789 - time (sec): 42.41 - samples/sec: 2648.82 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:39,951 epoch 4 - iter 770/773 - loss 0.02986146 - time (sec): 46.92 - samples/sec: 2638.85 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:40,133 ----------------------------------------------------------------------------------------------------
2023-10-25 10:40:40,133 EPOCH 4 done: loss 0.0299 - lr: 0.000020
2023-10-25 10:40:42,825 DEV : loss 0.08224356174468994 - f1-score (micro avg) 0.7658
2023-10-25 10:40:42,842 ----------------------------------------------------------------------------------------------------
2023-10-25 10:40:47,542 epoch 5 - iter 77/773 - loss 0.02345774 - time (sec): 4.70 - samples/sec: 2619.37 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:52,241 epoch 5 - iter 154/773 - loss 0.02122391 - time (sec): 9.40 - samples/sec: 2625.76 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:40:56,970 epoch 5 - iter 231/773 - loss 0.01984080 - time (sec): 14.13 - samples/sec: 2661.60 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:41:01,364 epoch 5 - iter 308/773 - loss 0.02263347 - time (sec): 18.52 - samples/sec: 2686.02 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:41:06,019 epoch 5 - iter 385/773 - loss 0.02194096 - time (sec): 23.18 - samples/sec: 2706.00 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:10,705 epoch 5 - iter 462/773 - loss 0.02179624 - time (sec): 27.86 - samples/sec: 2708.53 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:15,249 epoch 5 - iter 539/773 - loss 0.02057243 - time (sec): 32.41 - samples/sec: 2722.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:19,743 epoch 5 - iter 616/773 - loss 0.02059981 - time (sec): 36.90 - samples/sec: 2702.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:24,238 epoch 5 - iter 693/773 - loss 0.02048126 - time (sec): 41.39 - samples/sec: 2712.66 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:28,644 epoch 5 - iter 770/773 - loss 0.02099495 - time (sec): 45.80 - samples/sec: 2703.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:28,839 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:28,840 EPOCH 5 done: loss 0.0212 - lr: 0.000017
2023-10-25 10:41:31,552 DEV : loss 0.09945573657751083 - f1-score (micro avg) 0.781
2023-10-25 10:41:31,572 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:36,179 epoch 6 - iter 77/773 - loss 0.01534591 - time (sec): 4.60 - samples/sec: 2745.09 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:40,809 epoch 6 - iter 154/773 - loss 0.01519805 - time (sec): 9.24 - samples/sec: 2722.23 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:45,397 epoch 6 - iter 231/773 - loss 0.01464584 - time (sec): 13.82 - samples/sec: 2671.47 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:50,157 epoch 6 - iter 308/773 - loss 0.01414850 - time (sec): 18.58 - samples/sec: 2679.03 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:41:54,914 epoch 6 - iter 385/773 - loss 0.01433663 - time (sec): 23.34 - samples/sec: 2701.85 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:41:59,690 epoch 6 - iter 462/773 - loss 0.01277122 - time (sec): 28.12 - samples/sec: 2696.50 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:42:04,329 epoch 6 - iter 539/773 - loss 0.01360655 - time (sec): 32.76 - samples/sec: 2682.46 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:09,181 epoch 6 - iter 616/773 - loss 0.01363450 - time (sec): 37.61 - samples/sec: 2648.33 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:14,051 epoch 6 - iter 693/773 - loss 0.01353720 - time (sec): 42.48 - samples/sec: 2630.62 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:18,779 epoch 6 - iter 770/773 - loss 0.01363499 - time (sec): 47.21 - samples/sec: 2624.71 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:18,961 ----------------------------------------------------------------------------------------------------
2023-10-25 10:42:18,962 EPOCH 6 done: loss 0.0140 - lr: 0.000013
2023-10-25 10:42:22,522 DEV : loss 0.11278796941041946 - f1-score (micro avg) 0.7753
2023-10-25 10:42:22,540 ----------------------------------------------------------------------------------------------------
2023-10-25 10:42:27,293 epoch 7 - iter 77/773 - loss 0.00960456 - time (sec): 4.75 - samples/sec: 2680.96 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:32,004 epoch 7 - iter 154/773 - loss 0.00936374 - time (sec): 9.46 - samples/sec: 2626.52 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:36,826 epoch 7 - iter 231/773 - loss 0.00747500 - time (sec): 14.28 - samples/sec: 2713.22 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:41,284 epoch 7 - iter 308/773 - loss 0.00789143 - time (sec): 18.74 - samples/sec: 2647.02 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:45,927 epoch 7 - iter 385/773 - loss 0.00801181 - time (sec): 23.39 - samples/sec: 2644.58 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:50,560 epoch 7 - iter 462/773 - loss 0.00730589 - time (sec): 28.02 - samples/sec: 2658.91 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:42:55,182 epoch 7 - iter 539/773 - loss 0.00808199 - time (sec): 32.64 - samples/sec: 2631.74 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:42:59,804 epoch 7 - iter 616/773 - loss 0.00863132 - time (sec): 37.26 - samples/sec: 2626.61 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:43:04,706 epoch 7 - iter 693/773 - loss 0.00876967 - time (sec): 42.16 - samples/sec: 2631.43 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:09,406 epoch 7 - iter 770/773 - loss 0.00915212 - time (sec): 46.86 - samples/sec: 2639.82 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:09,584 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:09,584 EPOCH 7 done: loss 0.0091 - lr: 0.000010
2023-10-25 10:43:12,629 DEV : loss 0.11861388385295868 - f1-score (micro avg) 0.7724
2023-10-25 10:43:12,647 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:17,325 epoch 8 - iter 77/773 - loss 0.00461380 - time (sec): 4.68 - samples/sec: 2636.24 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:21,940 epoch 8 - iter 154/773 - loss 0.00702621 - time (sec): 9.29 - samples/sec: 2672.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:26,655 epoch 8 - iter 231/773 - loss 0.00812508 - time (sec): 14.01 - samples/sec: 2598.23 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:31,362 epoch 8 - iter 308/773 - loss 0.00661780 - time (sec): 18.71 - samples/sec: 2595.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:35,901 epoch 8 - iter 385/773 - loss 0.00640004 - time (sec): 23.25 - samples/sec: 2629.17 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:40,469 epoch 8 - iter 462/773 - loss 0.00633717 - time (sec): 27.82 - samples/sec: 2675.90 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:45,129 epoch 8 - iter 539/773 - loss 0.00653842 - time (sec): 32.48 - samples/sec: 2677.43 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:49,745 epoch 8 - iter 616/773 - loss 0.00732939 - time (sec): 37.10 - samples/sec: 2673.52 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:54,376 epoch 8 - iter 693/773 - loss 0.00709463 - time (sec): 41.73 - samples/sec: 2665.60 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:59,079 epoch 8 - iter 770/773 - loss 0.00668988 - time (sec): 46.43 - samples/sec: 2665.10 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:59,266 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:59,266 EPOCH 8 done: loss 0.0067 - lr: 0.000007
2023-10-25 10:44:02,424 DEV : loss 0.10935225337743759 - f1-score (micro avg) 0.7901
2023-10-25 10:44:02,442 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:07,211 epoch 9 - iter 77/773 - loss 0.00280222 - time (sec): 4.77 - samples/sec: 2641.93 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:11,756 epoch 9 - iter 154/773 - loss 0.00290731 - time (sec): 9.31 - samples/sec: 2687.34 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:16,394 epoch 9 - iter 231/773 - loss 0.00405151 - time (sec): 13.95 - samples/sec: 2681.13 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:21,122 epoch 9 - iter 308/773 - loss 0.00435854 - time (sec): 18.68 - samples/sec: 2697.32 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:25,805 epoch 9 - iter 385/773 - loss 0.00434929 - time (sec): 23.36 - samples/sec: 2682.52 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:30,607 epoch 9 - iter 462/773 - loss 0.00405141 - time (sec): 28.16 - samples/sec: 2661.17 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:35,355 epoch 9 - iter 539/773 - loss 0.00398165 - time (sec): 32.91 - samples/sec: 2641.67 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:40,206 epoch 9 - iter 616/773 - loss 0.00428787 - time (sec): 37.76 - samples/sec: 2622.47 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:44,909 epoch 9 - iter 693/773 - loss 0.00416455 - time (sec): 42.47 - samples/sec: 2642.13 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:49,527 epoch 9 - iter 770/773 - loss 0.00393684 - time (sec): 47.08 - samples/sec: 2633.23 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:44:49,707 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:49,707 EPOCH 9 done: loss 0.0039 - lr: 0.000003
2023-10-25 10:44:52,325 DEV : loss 0.11163745075464249 - f1-score (micro avg) 0.7942
2023-10-25 10:44:52,342 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:57,023 epoch 10 - iter 77/773 - loss 0.00160620 - time (sec): 4.68 - samples/sec: 2530.57 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:45:01,740 epoch 10 - iter 154/773 - loss 0.00218583 - time (sec): 9.40 - samples/sec: 2501.82 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:45:06,385 epoch 10 - iter 231/773 - loss 0.00253136 - time (sec): 14.04 - samples/sec: 2524.83 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:10,823 epoch 10 - iter 308/773 - loss 0.00321432 - time (sec): 18.48 - samples/sec: 2583.04 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:15,327 epoch 10 - iter 385/773 - loss 0.00292953 - time (sec): 22.98 - samples/sec: 2578.23 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:20,048 epoch 10 - iter 462/773 - loss 0.00283533 - time (sec): 27.70 - samples/sec: 2604.20 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:24,719 epoch 10 - iter 539/773 - loss 0.00275001 - time (sec): 32.37 - samples/sec: 2628.38 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:29,618 epoch 10 - iter 616/773 - loss 0.00267312 - time (sec): 37.27 - samples/sec: 2636.23 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:34,364 epoch 10 - iter 693/773 - loss 0.00242572 - time (sec): 42.02 - samples/sec: 2648.91 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:45:39,113 epoch 10 - iter 770/773 - loss 0.00279658 - time (sec): 46.77 - samples/sec: 2642.08 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:45:39,310 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:39,310 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-25 10:45:42,349 DEV : loss 0.11523404717445374 - f1-score (micro avg) 0.7884
2023-10-25 10:45:43,297 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:43,299 Loading model from best epoch ...
2023-10-25 10:45:45,437 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 10:45:55,712 Results:
- F-score (micro) 0.7656
- F-score (macro) 0.6513
- Accuracy 0.641

By class:
              precision    recall  f1-score   support

         LOC     0.8262    0.8140    0.8200       946
    BUILDING     0.5258    0.5514    0.5383       185
      STREET     0.7368    0.5000    0.5957        56

   micro avg     0.7732    0.7582    0.7656      1187
   macro avg     0.6963    0.6218    0.6513      1187
weighted avg     0.7751    0.7582    0.7655      1187
2023-10-25 10:45:55,712 ----------------------------------------------------------------------------------------------------
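The macro and weighted averages in the final evaluation follow directly from the per-class rows: the macro average is the unweighted mean over classes, the weighted average weights each class by its support. A quick arithmetic check, with the numbers copied from the log's final table:

```python
# Per-class (precision, recall, f1, support) from the final evaluation table.
per_class = {
    "LOC":      (0.8262, 0.8140, 0.8200, 946),
    "BUILDING": (0.5258, 0.5514, 0.5383, 185),
    "STREET":   (0.7368, 0.5000, 0.5957, 56),
}

# Macro average: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by support (number of gold entities).
total_support = sum(s for _, _, _, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

print(round(macro_f1, 4))     # 0.6513, matching "F-score (macro)"
print(round(weighted_f1, 4))  # 0.7655, matching the "weighted avg" row
```

The micro average cannot be recomputed from F1 scores alone, since it pools true/false positives and negatives across classes before computing precision and recall.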