2023-10-25 11:32:17,897 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,898 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 11:32:17,898 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Train: 6183 sentences
2023-10-25 11:32:17,899 (train_with_dev=False, train_with_test=False)
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Training Params:
2023-10-25 11:32:17,899 - learning_rate: "5e-05"
2023-10-25 11:32:17,899 - mini_batch_size: "8"
2023-10-25 11:32:17,899 - max_epochs: "10"
2023-10-25 11:32:17,899 - shuffle: "True"
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Plugins:
2023-10-25 11:32:17,899 - TensorboardLogger
2023-10-25 11:32:17,899 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 11:32:17,899 - metric: "('micro avg', 'f1-score')"
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Computation:
2023-10-25 11:32:17,899 - compute on device: cuda:0
2023-10-25 11:32:17,899 - embedding storage: none
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,900 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 11:32:22,467 epoch 1 - iter 77/773 - loss 1.55928995 - time (sec): 4.57 - samples/sec: 3117.25 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:32:26,899 epoch 1 - iter 154/773 - loss 0.94444348 - time (sec): 9.00 - samples/sec: 2895.74 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:32:31,544 epoch 1 - iter 231/773 - loss 0.68472071 - time (sec): 13.64 - samples/sec: 2835.90 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:32:36,101 epoch 1 - iter 308/773 - loss 0.54838258 - time (sec): 18.20 - samples/sec: 2792.56 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:32:40,729 epoch 1 - iter 385/773 - loss 0.46334242 - time (sec): 22.83 - samples/sec: 2770.44 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:32:45,076 epoch 1 - iter 462/773 - loss 0.41308200 - time (sec): 27.18 - samples/sec: 2740.33 - lr: 0.000030 - momentum: 0.000000
2023-10-25 11:32:49,556 epoch 1 - iter 539/773 - loss 0.37076947 - time (sec): 31.66 - samples/sec: 2735.35 - lr: 0.000035 - momentum: 0.000000
2023-10-25 11:32:53,999 epoch 1 - iter 616/773 - loss 0.33516167 - time (sec): 36.10 - samples/sec: 2746.05 - lr: 0.000040 - momentum: 0.000000
2023-10-25 11:32:58,447 epoch 1 - iter 693/773 - loss 0.30900288 - time (sec): 40.55 - samples/sec: 2748.87 - lr: 0.000045 - momentum: 0.000000
2023-10-25 11:33:03,589 epoch 1 - iter 770/773 - loss 0.28804151 - time (sec): 45.69 - samples/sec: 2708.60 - lr: 0.000050 - momentum: 0.000000
2023-10-25 11:33:03,765 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:03,766 EPOCH 1 done: loss 0.2873 - lr: 0.000050
2023-10-25 11:33:06,404 DEV : loss 0.06803968548774719 - f1-score (micro avg)  0.7209
2023-10-25 11:33:06,427 saving best model
2023-10-25 11:33:06,908 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:11,327 epoch 2 - iter 77/773 - loss 0.07623849 - time (sec): 4.42 - samples/sec: 2667.47 - lr: 0.000049 - momentum: 0.000000
2023-10-25 11:33:15,659 epoch 2 - iter 154/773 - loss 0.07951236 - time (sec): 8.75 - samples/sec: 2814.00 - lr: 0.000049 - momentum: 0.000000
2023-10-25 11:33:19,994 epoch 2 - iter 231/773 - loss 0.08070316 - time (sec): 13.08 - samples/sec: 2905.85 - lr: 0.000048 - momentum: 0.000000
2023-10-25 11:33:24,314 epoch 2 - iter 308/773 - loss 0.07668770 - time (sec): 17.40 - samples/sec: 2938.42 - lr: 0.000048 - momentum: 0.000000
2023-10-25 11:33:28,562 epoch 2 - iter 385/773 - loss 0.07601583 - time (sec): 21.65 - samples/sec: 2931.54 - lr: 0.000047 - momentum: 0.000000
2023-10-25 11:33:32,947 epoch 2 - iter 462/773 - loss 0.07553467 - time (sec): 26.04 - samples/sec: 2877.34 - lr: 0.000047 - momentum: 0.000000
2023-10-25 11:33:37,394 epoch 2 - iter 539/773 - loss 0.07726320 - time (sec): 30.48 - samples/sec: 2852.33 - lr: 0.000046 - momentum: 0.000000
2023-10-25 11:33:41,685 epoch 2 - iter 616/773 - loss 0.07687578 - time (sec): 34.77 - samples/sec: 2862.30 - lr: 0.000046 - momentum: 0.000000
2023-10-25 11:33:45,897 epoch 2 - iter 693/773 - loss 0.07690697 - time (sec): 38.99 - samples/sec: 2853.35 - lr: 0.000045 - momentum: 0.000000
2023-10-25 11:33:50,273 epoch 2 - iter 770/773 - loss 0.07652112 - time (sec): 43.36 - samples/sec: 2853.67 - lr: 0.000044 - momentum: 0.000000
2023-10-25 11:33:50,458 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:50,458 EPOCH 2 done: loss 0.0764 - lr: 0.000044
2023-10-25 11:33:54,305 DEV : loss 0.05765606090426445 - f1-score (micro avg)  0.6577
2023-10-25 11:33:54,328 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:58,799 epoch 3 - iter 77/773 - loss 0.04624094 - time (sec): 4.47 - samples/sec: 2773.78 - lr: 0.000044 - momentum: 0.000000
2023-10-25 11:34:03,308 epoch 3 - iter 154/773 - loss 0.04639348 - time (sec): 8.98 - samples/sec: 2763.86 - lr: 0.000043 - momentum: 0.000000
2023-10-25 11:34:07,837 epoch 3 - iter 231/773 - loss 0.04638161 - time (sec): 13.51 - samples/sec: 2848.46 - lr: 0.000043 - momentum: 0.000000
2023-10-25 11:34:12,399 epoch 3 - iter 308/773 - loss 0.04881426 - time (sec): 18.07 - samples/sec: 2807.68 - lr: 0.000042 - momentum: 0.000000
2023-10-25 11:34:17,014 epoch 3 - iter 385/773 - loss 0.05206470 - time (sec): 22.68 - samples/sec: 2748.41 - lr: 0.000042 - momentum: 0.000000
2023-10-25 11:34:21,488 epoch 3 - iter 462/773 - loss 0.05275003 - time (sec): 27.16 - samples/sec: 2704.18 - lr: 0.000041 - momentum: 0.000000
2023-10-25 11:34:25,906 epoch 3 - iter 539/773 - loss 0.05152024 - time (sec): 31.58 - samples/sec: 2736.60 - lr: 0.000041 - momentum: 0.000000
2023-10-25 11:34:30,464 epoch 3 - iter 616/773 - loss 0.05087172 - time (sec): 36.13 - samples/sec: 2732.63 - lr: 0.000040 - momentum: 0.000000
2023-10-25 11:34:35,078 epoch 3 - iter 693/773 - loss 0.05319417 - time (sec): 40.75 - samples/sec: 2731.67 - lr: 0.000039 - momentum: 0.000000
2023-10-25 11:34:39,720 epoch 3 - iter 770/773 - loss 0.05805865 - time (sec): 45.39 - samples/sec: 2731.10 - lr: 0.000039 - momentum: 0.000000
2023-10-25 11:34:39,889 ----------------------------------------------------------------------------------------------------
2023-10-25 11:34:39,889 EPOCH 3 done: loss 0.0581 - lr: 0.000039
2023-10-25 11:34:42,593 DEV : loss 0.09656655043363571 - f1-score (micro avg)  0.7287
2023-10-25 11:34:42,615 saving best model
2023-10-25 11:34:43,315 ----------------------------------------------------------------------------------------------------
2023-10-25 11:34:47,956 epoch 4 - iter 77/773 - loss 0.04468020 - time (sec): 4.64 - samples/sec: 2770.59 - lr: 0.000038 - momentum: 0.000000
2023-10-25 11:34:52,542 epoch 4 - iter 154/773 - loss 0.05357280 - time (sec): 9.22 - samples/sec: 2790.55 - lr: 0.000038 - momentum: 0.000000
2023-10-25 11:34:56,989 epoch 4 - iter 231/773 - loss 0.05948461 - time (sec): 13.67 - samples/sec: 2812.21 - lr: 0.000037 - momentum: 0.000000
2023-10-25 11:35:01,205 epoch 4 - iter 308/773 - loss 0.06530804 - time (sec): 17.89 - samples/sec: 2838.76 - lr: 0.000037 - momentum: 0.000000
2023-10-25 11:35:05,460 epoch 4 - iter 385/773 - loss 0.06132855 - time (sec): 22.14 - samples/sec: 2860.87 - lr: 0.000036 - momentum: 0.000000
2023-10-25 11:35:09,680 epoch 4 - iter 462/773 - loss 0.05751702 - time (sec): 26.36 - samples/sec: 2876.95 - lr: 0.000036 - momentum: 0.000000
2023-10-25 11:35:13,858 epoch 4 - iter 539/773 - loss 0.06019540 - time (sec): 30.54 - samples/sec: 2865.69 - lr: 0.000035 - momentum: 0.000000
2023-10-25 11:35:17,970 epoch 4 - iter 616/773 - loss 0.06214758 - time (sec): 34.65 - samples/sec: 2860.96 - lr: 0.000034 - momentum: 0.000000
2023-10-25 11:35:22,136 epoch 4 - iter 693/773 - loss 0.06374761 - time (sec): 38.82 - samples/sec: 2850.03 - lr: 0.000034 - momentum: 0.000000
2023-10-25 11:35:26,302 epoch 4 - iter 770/773 - loss 0.06105828 - time (sec): 42.98 - samples/sec: 2878.38 - lr: 0.000033 - momentum: 0.000000
2023-10-25 11:35:26,475 ----------------------------------------------------------------------------------------------------
2023-10-25 11:35:26,476 EPOCH 4 done: loss 0.0609 - lr: 0.000033
2023-10-25 11:35:29,209 DEV : loss 0.10266965627670288 - f1-score (micro avg)  0.7418
2023-10-25 11:35:29,226 saving best model
2023-10-25 11:35:29,827 ----------------------------------------------------------------------------------------------------
2023-10-25 11:35:34,167 epoch 5 - iter 77/773 - loss 0.07354317 - time (sec): 4.34 - samples/sec: 2703.66 - lr: 0.000033 - momentum: 0.000000
2023-10-25 11:35:38,533 epoch 5 - iter 154/773 - loss 0.05656659 - time (sec): 8.70 - samples/sec: 2754.18 - lr: 0.000032 - momentum: 0.000000
2023-10-25 11:35:43,064 epoch 5 - iter 231/773 - loss 0.06643580 - time (sec): 13.24 - samples/sec: 2772.49 - lr: 0.000032 - momentum: 0.000000
2023-10-25 11:35:47,551 epoch 5 - iter 308/773 - loss 0.06782891 - time (sec): 17.72 - samples/sec: 2771.55 - lr: 0.000031 - momentum: 0.000000
2023-10-25 11:35:52,054 epoch 5 - iter 385/773 - loss 0.06184014 - time (sec): 22.23 - samples/sec: 2727.18 - lr: 0.000031 - momentum: 0.000000
2023-10-25 11:35:56,527 epoch 5 - iter 462/773 - loss 0.05866160 - time (sec): 26.70 - samples/sec: 2749.72 - lr: 0.000030 - momentum: 0.000000
2023-10-25 11:36:00,938 epoch 5 - iter 539/773 - loss 0.06250747 - time (sec): 31.11 - samples/sec: 2755.48 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:36:05,448 epoch 5 - iter 616/773 - loss 0.06026997 - time (sec): 35.62 - samples/sec: 2756.11 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:36:09,984 epoch 5 - iter 693/773 - loss 0.05814121 - time (sec): 40.16 - samples/sec: 2749.97 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:36:14,741 epoch 5 - iter 770/773 - loss 0.05494553 - time (sec): 44.91 - samples/sec: 2759.99 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:36:14,907 ----------------------------------------------------------------------------------------------------
2023-10-25 11:36:14,907 EPOCH 5 done: loss 0.0549 - lr: 0.000028
2023-10-25 11:36:17,604 DEV : loss 0.1113104522228241 - f1-score (micro avg)  0.7619
2023-10-25 11:36:17,624 saving best model
2023-10-25 11:36:18,324 ----------------------------------------------------------------------------------------------------
2023-10-25 11:36:22,987 epoch 6 - iter 77/773 - loss 0.04865046 - time (sec): 4.66 - samples/sec: 2616.38 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:36:27,557 epoch 6 - iter 154/773 - loss 0.03475110 - time (sec): 9.23 - samples/sec: 2676.91 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:36:32,097 epoch 6 - iter 231/773 - loss 0.02956246 - time (sec): 13.77 - samples/sec: 2710.98 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:36:36,609 epoch 6 - iter 308/773 - loss 0.03026117 - time (sec): 18.28 - samples/sec: 2695.78 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:36:41,206 epoch 6 - iter 385/773 - loss 0.03316285 - time (sec): 22.88 - samples/sec: 2765.88 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:36:45,639 epoch 6 - iter 462/773 - loss 0.03323264 - time (sec): 27.31 - samples/sec: 2774.22 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:36:50,098 epoch 6 - iter 539/773 - loss 0.03456321 - time (sec): 31.77 - samples/sec: 2750.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:36:54,524 epoch 6 - iter 616/773 - loss 0.03464651 - time (sec): 36.20 - samples/sec: 2751.08 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:36:58,914 epoch 6 - iter 693/773 - loss 0.03425099 - time (sec): 40.59 - samples/sec: 2748.59 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:37:03,254 epoch 6 - iter 770/773 - loss 0.03297911 - time (sec): 44.93 - samples/sec: 2756.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:37:03,415 ----------------------------------------------------------------------------------------------------
2023-10-25 11:37:03,416 EPOCH 6 done: loss 0.0329 - lr: 0.000022
2023-10-25 11:37:06,006 DEV : loss 0.09328915923833847 - f1-score (micro avg)  0.7741
2023-10-25 11:37:06,024 saving best model
2023-10-25 11:37:06,667 ----------------------------------------------------------------------------------------------------
2023-10-25 11:37:11,201 epoch 7 - iter 77/773 - loss 0.02826965 - time (sec): 4.53 - samples/sec: 2689.74 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:37:16,301 epoch 7 - iter 154/773 - loss 0.02916549 - time (sec): 9.63 - samples/sec: 2579.06 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:37:20,597 epoch 7 - iter 231/773 - loss 0.02694630 - time (sec): 13.93 - samples/sec: 2688.77 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:37:25,060 epoch 7 - iter 308/773 - loss 0.02382137 - time (sec): 18.39 - samples/sec: 2700.08 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:37:29,623 epoch 7 - iter 385/773 - loss 0.02400061 - time (sec): 22.95 - samples/sec: 2674.46 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:37:34,323 epoch 7 - iter 462/773 - loss 0.02420007 - time (sec): 27.65 - samples/sec: 2692.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:37:39,239 epoch 7 - iter 539/773 - loss 0.02350486 - time (sec): 32.57 - samples/sec: 2653.80 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:37:44,444 epoch 7 - iter 616/773 - loss 0.02263543 - time (sec): 37.77 - samples/sec: 2600.25 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:37:48,997 epoch 7 - iter 693/773 - loss 0.02248745 - time (sec): 42.33 - samples/sec: 2624.42 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:37:53,293 epoch 7 - iter 770/773 - loss 0.02276205 - time (sec): 46.62 - samples/sec: 2656.63 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:37:53,458 ----------------------------------------------------------------------------------------------------
2023-10-25 11:37:53,458 EPOCH 7 done: loss 0.0227 - lr: 0.000017
2023-10-25 11:37:56,498 DEV : loss 0.09706912189722061 - f1-score (micro avg)  0.786
2023-10-25 11:37:56,520 saving best model
2023-10-25 11:37:57,194 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:01,852 epoch 8 - iter 77/773 - loss 0.00963689 - time (sec): 4.65 - samples/sec: 2569.03 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:38:06,649 epoch 8 - iter 154/773 - loss 0.01048478 - time (sec): 9.45 - samples/sec: 2646.74 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:38:11,176 epoch 8 - iter 231/773 - loss 0.01023261 - time (sec): 13.98 - samples/sec: 2696.04 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:38:15,527 epoch 8 - iter 308/773 - loss 0.01078646 - time (sec): 18.33 - samples/sec: 2741.57 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:38:19,868 epoch 8 - iter 385/773 - loss 0.01293255 - time (sec): 22.67 - samples/sec: 2762.41 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:38:24,519 epoch 8 - iter 462/773 - loss 0.01414220 - time (sec): 27.32 - samples/sec: 2713.20 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:38:29,076 epoch 8 - iter 539/773 - loss 0.01443726 - time (sec): 31.88 - samples/sec: 2720.03 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:38:33,590 epoch 8 - iter 616/773 - loss 0.01493887 - time (sec): 36.39 - samples/sec: 2719.16 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:38:38,269 epoch 8 - iter 693/773 - loss 0.01475114 - time (sec): 41.07 - samples/sec: 2706.28 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:38:42,783 epoch 8 - iter 770/773 - loss 0.01430847 - time (sec): 45.58 - samples/sec: 2712.30 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:38:42,957 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:42,958 EPOCH 8 done: loss 0.0143 - lr: 0.000011
2023-10-25 11:38:45,598 DEV : loss 0.10801413655281067 - f1-score (micro avg)  0.7515
2023-10-25 11:38:45,615 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:50,161 epoch 9 - iter 77/773 - loss 0.00466196 - time (sec): 4.54 - samples/sec: 2698.86 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:38:54,650 epoch 9 - iter 154/773 - loss 0.00740543 - time (sec): 9.03 - samples/sec: 2702.12 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:38:59,263 epoch 9 - iter 231/773 - loss 0.00858285 - time (sec): 13.65 - samples/sec: 2678.11 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:39:03,513 epoch 9 - iter 308/773 - loss 0.00944576 - time (sec): 17.90 - samples/sec: 2731.89 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:39:08,048 epoch 9 - iter 385/773 - loss 0.00881598 - time (sec): 22.43 - samples/sec: 2778.28 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:39:12,207 epoch 9 - iter 462/773 - loss 0.00893348 - time (sec): 26.59 - samples/sec: 2804.33 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:39:16,419 epoch 9 - iter 539/773 - loss 0.00944343 - time (sec): 30.80 - samples/sec: 2818.54 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:39:20,930 epoch 9 - iter 616/773 - loss 0.00960352 - time (sec): 35.31 - samples/sec: 2824.99 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:39:25,174 epoch 9 - iter 693/773 - loss 0.00940583 - time (sec): 39.56 - samples/sec: 2841.88 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:39:29,510 epoch 9 - iter 770/773 - loss 0.00955294 - time (sec): 43.89 - samples/sec: 2821.72 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:39:29,686 ----------------------------------------------------------------------------------------------------
2023-10-25 11:39:29,687 EPOCH 9 done: loss 0.0095 - lr: 0.000006
2023-10-25 11:39:32,704 DEV : loss 0.11337698251008987 - f1-score (micro avg)  0.7683
2023-10-25 11:39:32,724 ----------------------------------------------------------------------------------------------------
2023-10-25 11:39:37,694 epoch 10 - iter 77/773 - loss 0.00238185 - time (sec): 4.97 - samples/sec: 2756.02 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:39:42,114 epoch 10 - iter 154/773 - loss 0.00380767 - time (sec): 9.39 - samples/sec: 2716.28 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:39:46,400 epoch 10 - iter 231/773 - loss 0.00374973 - time (sec): 13.68 - samples/sec: 2800.44 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:39:50,769 epoch 10 - iter 308/773 - loss 0.00342233 - time (sec): 18.04 - samples/sec: 2807.72 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:39:54,989 epoch 10 - iter 385/773 - loss 0.00446767 - time (sec): 22.26 - samples/sec: 2856.40 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:39:59,798 epoch 10 - iter 462/773 - loss 0.00465055 - time (sec): 27.07 - samples/sec: 2805.78 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:40:03,966 epoch 10 - iter 539/773 - loss 0.00454917 - time (sec): 31.24 - samples/sec: 2810.46 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:40:08,229 epoch 10 - iter 616/773 - loss 0.00492510 - time (sec): 35.50 - samples/sec: 2808.44 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:40:12,467 epoch 10 - iter 693/773 - loss 0.00477115 - time (sec): 39.74 - samples/sec: 2816.91 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:40:16,634 epoch 10 - iter 770/773 - loss 0.00517975 - time (sec): 43.91 - samples/sec: 2821.68 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:40:16,791 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:16,791 EPOCH 10 done: loss 0.0052 - lr: 0.000000
2023-10-25 11:40:20,311 DEV : loss 0.12045777589082718 - f1-score (micro avg)  0.7648
2023-10-25 11:40:20,827 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:20,828 Loading model from best epoch ...
2023-10-25 11:40:22,671 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 11:40:33,409 Results:
- F-score (micro) 0.7908
- F-score (macro) 0.6692
- Accuracy 0.6765

By class:
              precision    recall  f1-score   support

         LOC     0.8141    0.8562    0.8346       946
    BUILDING     0.6667    0.5622    0.6100       185
      STREET     0.6170    0.5179    0.5631        56

   micro avg     0.7871    0.7944    0.7908      1187
   macro avg     0.6993    0.6454    0.6692      1187
weighted avg     0.7818    0.7944    0.7868      1187

2023-10-25 11:40:33,410 ----------------------------------------------------------------------------------------------------
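As a sanity check on the final evaluation table, the per-class rows are sufficient to reproduce the macro and weighted averages, and the pooled micro precision/recall reproduce the micro F-score. The sketch below is plain Python with no Flair dependency; the numbers are copied verbatim from the table above (small last-digit differences can appear because the logged precision/recall values are themselves rounded to four decimals):

```python
# Per-class results copied from the final evaluation table in the log:
# label -> (precision, recall, f1-score, support)
per_class = {
    "LOC":      (0.8141, 0.8562, 0.8346, 946),
    "BUILDING": (0.6667, 0.5622, 0.6100, 185),
    "STREET":   (0.6170, 0.5179, 0.5631,  56),
}

def f1(p, r):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

n = len(per_class)
total = sum(s for *_, s in per_class.values())  # 1187 gold spans

# Macro average: unweighted mean over classes.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / n

# Weighted average: mean over classes, weighted by support.
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total

# Micro average: harmonic mean of the pooled precision/recall from the log.
micro_p, micro_r = 0.7871, 0.7944
micro_f1 = f1(micro_p, micro_r)

print(f"macro f1:    {macro_f1:.4f}")     # log reports 0.6692
print(f"weighted f1: {weighted_f1:.4f}")  # log reports 0.7868
print(f"micro f1:    {micro_f1:.4f}")     # log reports 0.7908 (unrounded inputs)
```

The micro average dominates here because LOC accounts for 946 of 1187 gold spans, which is also why the headline micro F-score (0.7908) sits well above the macro F-score (0.6692) dragged down by the small BUILDING and STREET classes.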