2023-10-20 09:48:42,765 ----------------------------------------------------------------------------------------------------
2023-10-20 09:48:42,765 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 09:48:42,765 ----------------------------------------------------------------------------------------------------
2023-10-20 09:48:42,765 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-20 09:48:42,765 ----------------------------------------------------------------------------------------------------
2023-10-20 09:48:42,765 Train: 6183 sentences
2023-10-20 09:48:42,765 (train_with_dev=False, train_with_test=False)
2023-10-20 09:48:42,765 ----------------------------------------------------------------------------------------------------
2023-10-20 09:48:42,765 Training Params:
2023-10-20 09:48:42,765 - learning_rate: "5e-05"
2023-10-20 09:48:42,765 - mini_batch_size: "8"
2023-10-20 09:48:42,765 - max_epochs: "10"
2023-10-20 09:48:42,765 - shuffle: "True"
2023-10-20 09:48:42,765 ----------------------------------------------------------------------------------------------------
2023-10-20 09:48:42,766 Plugins:
2023-10-20 09:48:42,766 - TensorboardLogger
2023-10-20 09:48:42,766 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 09:48:42,766 ----------------------------------------------------------------------------------------------------
2023-10-20 09:48:42,766 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 09:48:42,766 - metric: "('micro avg', 'f1-score')"
2023-10-20 09:48:42,766 ----------------------------------------------------------------------------------------------------
2023-10-20 09:48:42,766 Computation:
2023-10-20 09:48:42,766 - compute on device: cuda:0
2023-10-20 09:48:42,766 - embedding storage: none
2023-10-20 09:48:42,766 ----------------------------------------------------------------------------------------------------
2023-10-20 09:48:42,766 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-20 09:48:42,766 ----------------------------------------------------------------------------------------------------
2023-10-20 09:48:42,766 ----------------------------------------------------------------------------------------------------
2023-10-20 09:48:42,766 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 09:48:44,557 epoch 1 - iter 77/773 - loss 2.87926452 - time (sec): 1.79 - samples/sec: 6926.45 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:48:46,323 epoch 1 - iter 154/773 - loss 2.58090791 - time (sec): 3.56 - samples/sec: 7242.62 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:48:48,037 epoch 1 - iter 231/773 - loss 2.19347159 - time (sec): 5.27 - samples/sec: 7066.32 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:48:49,762 epoch 1 - iter 308/773 - loss 1.79530466 - time (sec): 7.00 - samples/sec: 6939.16 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:48:51,483 epoch 1 - iter 385/773 - loss 1.49005800 - time (sec): 8.72 - samples/sec: 6982.79 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:48:53,275 epoch 1 - iter 462/773 - loss 1.28696614 - time (sec): 10.51 - samples/sec: 6966.72 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:48:55,026 epoch 1 - iter 539/773 - loss 1.12674983 - time (sec): 12.26 - samples/sec: 7093.68 - lr: 0.000035 - momentum: 0.000000
2023-10-20 09:48:56,741 epoch 1 - iter 616/773 - loss 1.02283949 - time (sec): 13.97 - samples/sec: 7070.20 - lr: 0.000040 - momentum: 0.000000
2023-10-20 09:48:58,434 epoch 1 - iter 693/773 - loss 0.93498457 - time (sec): 15.67 - samples/sec: 7095.90 - lr: 0.000045 - momentum: 0.000000
2023-10-20 09:49:00,161 epoch 1 - iter 770/773 - loss 0.86157976 - time (sec): 17.39 - samples/sec: 7118.20 - lr: 0.000050 - momentum: 0.000000
2023-10-20 09:49:00,225 ----------------------------------------------------------------------------------------------------
2023-10-20 09:49:00,225 EPOCH 1 done: loss 0.8589 - lr: 0.000050
2023-10-20 09:49:00,920 DEV : loss 0.13765180110931396 - f1-score (micro avg) 0.0
2023-10-20 09:49:00,931 ----------------------------------------------------------------------------------------------------
2023-10-20 09:49:02,640 epoch 2 - iter 77/773 - loss 0.21698829 - time (sec): 1.71 - samples/sec: 7968.37 - lr: 0.000049 - momentum: 0.000000
2023-10-20 09:49:04,280 epoch 2 - iter 154/773 - loss 0.20811745 - time (sec): 3.35 - samples/sec: 7749.56 - lr: 0.000049 - momentum: 0.000000
2023-10-20 09:49:06,063 epoch 2 - iter 231/773 - loss 0.20796981 - time (sec): 5.13 - samples/sec: 7473.80 - lr: 0.000048 - momentum: 0.000000
2023-10-20 09:49:08,149 epoch 2 - iter 308/773 - loss 0.20289547 - time (sec): 7.22 - samples/sec: 6999.97 - lr: 0.000048 - momentum: 0.000000
2023-10-20 09:49:09,929 epoch 2 - iter 385/773 - loss 0.20413434 - time (sec): 9.00 - samples/sec: 6985.69 - lr: 0.000047 - momentum: 0.000000
2023-10-20 09:49:11,712 epoch 2 - iter 462/773 - loss 0.19808681 - time (sec): 10.78 - samples/sec: 7019.02 - lr: 0.000047 - momentum: 0.000000
2023-10-20 09:49:13,398 epoch 2 - iter 539/773 - loss 0.19621468 - time (sec): 12.47 - samples/sec: 7043.99 - lr: 0.000046 - momentum: 0.000000
2023-10-20 09:49:15,069 epoch 2 - iter 616/773 - loss 0.19508540 - time (sec): 14.14 - samples/sec: 7028.74 - lr: 0.000046 - momentum: 0.000000
2023-10-20 09:49:16,802 epoch 2 - iter 693/773 - loss 0.19324075 - time (sec): 15.87 - samples/sec: 7033.26 - lr: 0.000045 - momentum: 0.000000
2023-10-20 09:49:18,554 epoch 2 - iter 770/773 - loss 0.18967567 - time (sec): 17.62 - samples/sec: 7025.86 - lr: 0.000044 - momentum: 0.000000
2023-10-20 09:49:18,621 ----------------------------------------------------------------------------------------------------
2023-10-20 09:49:18,621 EPOCH 2 done: loss 0.1896 - lr: 0.000044
2023-10-20 09:49:19,698 DEV : loss 0.09595523029565811 - f1-score (micro avg) 0.3657
2023-10-20 09:49:19,709 saving best model
2023-10-20 09:49:19,738 ----------------------------------------------------------------------------------------------------
2023-10-20 09:49:21,498 epoch 3 - iter 77/773 - loss 0.18262781 - time (sec): 1.76 - samples/sec: 6333.46 - lr: 0.000044 - momentum: 0.000000
2023-10-20 09:49:23,277 epoch 3 - iter 154/773 - loss 0.16257290 - time (sec): 3.54 - samples/sec: 6881.12 - lr: 0.000043 - momentum: 0.000000
2023-10-20 09:49:25,002 epoch 3 - iter 231/773 - loss 0.16928954 - time (sec): 5.26 - samples/sec: 6976.00 - lr: 0.000043 - momentum: 0.000000
2023-10-20 09:49:26,737 epoch 3 - iter 308/773 - loss 0.16039613 - time (sec): 7.00 - samples/sec: 6906.62 - lr: 0.000042 - momentum: 0.000000
2023-10-20 09:49:28,481 epoch 3 - iter 385/773 - loss 0.15828599 - time (sec): 8.74 - samples/sec: 6981.46 - lr: 0.000042 - momentum: 0.000000
2023-10-20 09:49:30,238 epoch 3 - iter 462/773 - loss 0.15435750 - time (sec): 10.50 - samples/sec: 7057.56 - lr: 0.000041 - momentum: 0.000000
2023-10-20 09:49:31,993 epoch 3 - iter 539/773 - loss 0.15526689 - time (sec): 12.25 - samples/sec: 7053.55 - lr: 0.000041 - momentum: 0.000000
2023-10-20 09:49:33,650 epoch 3 - iter 616/773 - loss 0.15528455 - time (sec): 13.91 - samples/sec: 7090.48 - lr: 0.000040 - momentum: 0.000000
2023-10-20 09:49:35,406 epoch 3 - iter 693/773 - loss 0.15441732 - time (sec): 15.67 - samples/sec: 7124.04 - lr: 0.000039 - momentum: 0.000000
2023-10-20 09:49:37,244 epoch 3 - iter 770/773 - loss 0.15466267 - time (sec): 17.50 - samples/sec: 7073.24 - lr: 0.000039 - momentum: 0.000000
2023-10-20 09:49:37,308 ----------------------------------------------------------------------------------------------------
2023-10-20 09:49:37,309 EPOCH 3 done: loss 0.1552 - lr: 0.000039
2023-10-20 09:49:38,391 DEV : loss 0.09129446744918823 - f1-score (micro avg) 0.4545
2023-10-20 09:49:38,402 saving best model
2023-10-20 09:49:38,437 ----------------------------------------------------------------------------------------------------
2023-10-20 09:49:40,135 epoch 4 - iter 77/773 - loss 0.12897817 - time (sec): 1.70 - samples/sec: 8025.57 - lr: 0.000038 - momentum: 0.000000
2023-10-20 09:49:41,847 epoch 4 - iter 154/773 - loss 0.14162760 - time (sec): 3.41 - samples/sec: 7389.73 - lr: 0.000038 - momentum: 0.000000
2023-10-20 09:49:43,541 epoch 4 - iter 231/773 - loss 0.13190156 - time (sec): 5.10 - samples/sec: 7334.53 - lr: 0.000037 - momentum: 0.000000
2023-10-20 09:49:45,309 epoch 4 - iter 308/773 - loss 0.13315426 - time (sec): 6.87 - samples/sec: 7213.08 - lr: 0.000037 - momentum: 0.000000
2023-10-20 09:49:47,080 epoch 4 - iter 385/773 - loss 0.13213692 - time (sec): 8.64 - samples/sec: 7243.47 - lr: 0.000036 - momentum: 0.000000
2023-10-20 09:49:48,813 epoch 4 - iter 462/773 - loss 0.13554983 - time (sec): 10.38 - samples/sec: 7207.39 - lr: 0.000036 - momentum: 0.000000
2023-10-20 09:49:50,578 epoch 4 - iter 539/773 - loss 0.13667052 - time (sec): 12.14 - samples/sec: 7200.13 - lr: 0.000035 - momentum: 0.000000
2023-10-20 09:49:52,341 epoch 4 - iter 616/773 - loss 0.13714767 - time (sec): 13.90 - samples/sec: 7204.78 - lr: 0.000034 - momentum: 0.000000
2023-10-20 09:49:54,053 epoch 4 - iter 693/773 - loss 0.13504563 - time (sec): 15.62 - samples/sec: 7204.58 - lr: 0.000034 - momentum: 0.000000
2023-10-20 09:49:55,783 epoch 4 - iter 770/773 - loss 0.13630022 - time (sec): 17.35 - samples/sec: 7139.71 - lr: 0.000033 - momentum: 0.000000
2023-10-20 09:49:55,846 ----------------------------------------------------------------------------------------------------
2023-10-20 09:49:55,847 EPOCH 4 done: loss 0.1362 - lr: 0.000033
2023-10-20 09:49:56,915 DEV : loss 0.08750458806753159 - f1-score (micro avg) 0.5182
2023-10-20 09:49:56,927 saving best model
2023-10-20 09:49:56,969 ----------------------------------------------------------------------------------------------------
2023-10-20 09:49:58,649 epoch 5 - iter 77/773 - loss 0.11837622 - time (sec): 1.68 - samples/sec: 7426.52 - lr: 0.000033 - momentum: 0.000000
2023-10-20 09:50:00,345 epoch 5 - iter 154/773 - loss 0.11950342 - time (sec): 3.38 - samples/sec: 7326.73 - lr: 0.000032 - momentum: 0.000000
2023-10-20 09:50:02,083 epoch 5 - iter 231/773 - loss 0.12281718 - time (sec): 5.11 - samples/sec: 7176.13 - lr: 0.000032 - momentum: 0.000000
2023-10-20 09:50:03,759 epoch 5 - iter 308/773 - loss 0.12163778 - time (sec): 6.79 - samples/sec: 7307.34 - lr: 0.000031 - momentum: 0.000000
2023-10-20 09:50:05,449 epoch 5 - iter 385/773 - loss 0.11961392 - time (sec): 8.48 - samples/sec: 7367.32 - lr: 0.000031 - momentum: 0.000000
2023-10-20 09:50:07,225 epoch 5 - iter 462/773 - loss 0.12444845 - time (sec): 10.26 - samples/sec: 7314.02 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:50:08,940 epoch 5 - iter 539/773 - loss 0.12636329 - time (sec): 11.97 - samples/sec: 7281.68 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:50:10,627 epoch 5 - iter 616/773 - loss 0.12908626 - time (sec): 13.66 - samples/sec: 7262.66 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:50:12,304 epoch 5 - iter 693/773 - loss 0.12873301 - time (sec): 15.33 - samples/sec: 7273.24 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:50:14,033 epoch 5 - iter 770/773 - loss 0.12550489 - time (sec): 17.06 - samples/sec: 7257.85 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:50:14,093 ----------------------------------------------------------------------------------------------------
2023-10-20 09:50:14,093 EPOCH 5 done: loss 0.1256 - lr: 0.000028
2023-10-20 09:50:15,180 DEV : loss 0.08463426679372787 - f1-score (micro avg) 0.5385
2023-10-20 09:50:15,192 saving best model
2023-10-20 09:50:15,232 ----------------------------------------------------------------------------------------------------
2023-10-20 09:50:16,956 epoch 6 - iter 77/773 - loss 0.15202072 - time (sec): 1.72 - samples/sec: 7215.52 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:50:18,717 epoch 6 - iter 154/773 - loss 0.12842673 - time (sec): 3.48 - samples/sec: 7212.32 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:50:20,453 epoch 6 - iter 231/773 - loss 0.12246865 - time (sec): 5.22 - samples/sec: 7160.36 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:50:22,137 epoch 6 - iter 308/773 - loss 0.12273321 - time (sec): 6.90 - samples/sec: 7074.89 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:50:23,834 epoch 6 - iter 385/773 - loss 0.12023376 - time (sec): 8.60 - samples/sec: 7223.55 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:50:25,485 epoch 6 - iter 462/773 - loss 0.11889169 - time (sec): 10.25 - samples/sec: 7235.65 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:50:27,185 epoch 6 - iter 539/773 - loss 0.11869253 - time (sec): 11.95 - samples/sec: 7243.32 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:50:28,935 epoch 6 - iter 616/773 - loss 0.11634542 - time (sec): 13.70 - samples/sec: 7221.14 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:50:30,696 epoch 6 - iter 693/773 - loss 0.11714691 - time (sec): 15.46 - samples/sec: 7245.60 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:50:32,382 epoch 6 - iter 770/773 - loss 0.11633289 - time (sec): 17.15 - samples/sec: 7219.80 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:50:32,450 ----------------------------------------------------------------------------------------------------
2023-10-20 09:50:32,451 EPOCH 6 done: loss 0.1160 - lr: 0.000022
2023-10-20 09:50:33,540 DEV : loss 0.08559702336788177 - f1-score (micro avg) 0.5348
2023-10-20 09:50:33,551 ----------------------------------------------------------------------------------------------------
2023-10-20 09:50:35,193 epoch 7 - iter 77/773 - loss 0.13418942 - time (sec): 1.64 - samples/sec: 7322.35 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:50:36,886 epoch 7 - iter 154/773 - loss 0.11605993 - time (sec): 3.33 - samples/sec: 7552.83 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:50:38,581 epoch 7 - iter 231/773 - loss 0.11974916 - time (sec): 5.03 - samples/sec: 7346.15 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:50:40,361 epoch 7 - iter 308/773 - loss 0.11273530 - time (sec): 6.81 - samples/sec: 7309.01 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:50:42,109 epoch 7 - iter 385/773 - loss 0.11096649 - time (sec): 8.56 - samples/sec: 7100.74 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:50:43,841 epoch 7 - iter 462/773 - loss 0.11189813 - time (sec): 10.29 - samples/sec: 7072.68 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:50:45,715 epoch 7 - iter 539/773 - loss 0.11121427 - time (sec): 12.16 - samples/sec: 7043.16 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:50:47,508 epoch 7 - iter 616/773 - loss 0.11017605 - time (sec): 13.96 - samples/sec: 7059.63 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:50:49,317 epoch 7 - iter 693/773 - loss 0.10910748 - time (sec): 15.76 - samples/sec: 7072.09 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:50:51,038 epoch 7 - iter 770/773 - loss 0.11083517 - time (sec): 17.49 - samples/sec: 7079.64 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:50:51,101 ----------------------------------------------------------------------------------------------------
2023-10-20 09:50:51,101 EPOCH 7 done: loss 0.1107 - lr: 0.000017
2023-10-20 09:50:52,197 DEV : loss 0.08398106694221497 - f1-score (micro avg) 0.5572
2023-10-20 09:50:52,209 saving best model
2023-10-20 09:50:52,248 ----------------------------------------------------------------------------------------------------
2023-10-20 09:50:53,959 epoch 8 - iter 77/773 - loss 0.07735418 - time (sec): 1.71 - samples/sec: 7032.51 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:50:55,671 epoch 8 - iter 154/773 - loss 0.09799317 - time (sec): 3.42 - samples/sec: 6976.98 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:50:57,435 epoch 8 - iter 231/773 - loss 0.10334869 - time (sec): 5.19 - samples/sec: 7043.50 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:50:59,177 epoch 8 - iter 308/773 - loss 0.09645731 - time (sec): 6.93 - samples/sec: 7215.93 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:51:00,995 epoch 8 - iter 385/773 - loss 0.09864950 - time (sec): 8.75 - samples/sec: 7181.08 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:51:02,709 epoch 8 - iter 462/773 - loss 0.10127673 - time (sec): 10.46 - samples/sec: 7190.72 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:51:04,451 epoch 8 - iter 539/773 - loss 0.10595667 - time (sec): 12.20 - samples/sec: 7126.05 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:51:06,293 epoch 8 - iter 616/773 - loss 0.10535196 - time (sec): 14.05 - samples/sec: 7091.97 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:51:08,042 epoch 8 - iter 693/773 - loss 0.10603763 - time (sec): 15.79 - samples/sec: 7039.09 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:51:09,809 epoch 8 - iter 770/773 - loss 0.10450961 - time (sec): 17.56 - samples/sec: 7048.84 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:51:09,879 ----------------------------------------------------------------------------------------------------
2023-10-20 09:51:09,880 EPOCH 8 done: loss 0.1044 - lr: 0.000011
2023-10-20 09:51:10,964 DEV : loss 0.08721306174993515 - f1-score (micro avg) 0.557
2023-10-20 09:51:10,977 ----------------------------------------------------------------------------------------------------
2023-10-20 09:51:12,693 epoch 9 - iter 77/773 - loss 0.08508022 - time (sec): 1.72 - samples/sec: 6831.62 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:51:14,449 epoch 9 - iter 154/773 - loss 0.10086226 - time (sec): 3.47 - samples/sec: 7068.70 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:51:16,184 epoch 9 - iter 231/773 - loss 0.10411885 - time (sec): 5.21 - samples/sec: 7041.88 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:51:17,985 epoch 9 - iter 308/773 - loss 0.10360436 - time (sec): 7.01 - samples/sec: 7036.34 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:51:19,766 epoch 9 - iter 385/773 - loss 0.10607327 - time (sec): 8.79 - samples/sec: 7123.12 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:51:21,608 epoch 9 - iter 462/773 - loss 0.10462746 - time (sec): 10.63 - samples/sec: 6995.10 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:51:23,423 epoch 9 - iter 539/773 - loss 0.10355217 - time (sec): 12.45 - samples/sec: 6945.51 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:51:25,251 epoch 9 - iter 616/773 - loss 0.10298710 - time (sec): 14.27 - samples/sec: 6947.34 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:51:26,965 epoch 9 - iter 693/773 - loss 0.10206767 - time (sec): 15.99 - samples/sec: 6974.89 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:51:28,689 epoch 9 - iter 770/773 - loss 0.10193084 - time (sec): 17.71 - samples/sec: 6991.53 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:51:28,754 ----------------------------------------------------------------------------------------------------
2023-10-20 09:51:28,755 EPOCH 9 done: loss 0.1020 - lr: 0.000006
2023-10-20 09:51:29,845 DEV : loss 0.08746206760406494 - f1-score (micro avg) 0.5598
2023-10-20 09:51:29,857 saving best model
2023-10-20 09:51:29,898 ----------------------------------------------------------------------------------------------------
2023-10-20 09:51:31,632 epoch 10 - iter 77/773 - loss 0.09052757 - time (sec): 1.73 - samples/sec: 7036.26 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:51:33,295 epoch 10 - iter 154/773 - loss 0.08568932 - time (sec): 3.40 - samples/sec: 6743.47 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:51:35,057 epoch 10 - iter 231/773 - loss 0.09090122 - time (sec): 5.16 - samples/sec: 7114.36 - lr: 0.000004 - momentum: 0.000000
2023-10-20 09:51:36,834 epoch 10 - iter 308/773 - loss 0.09401948 - time (sec): 6.94 - samples/sec: 7126.06 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:51:38,537 epoch 10 - iter 385/773 - loss 0.09521242 - time (sec): 8.64 - samples/sec: 7168.24 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:51:40,283 epoch 10 - iter 462/773 - loss 0.09812942 - time (sec): 10.38 - samples/sec: 7175.35 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:51:42,013 epoch 10 - iter 539/773 - loss 0.09933101 - time (sec): 12.11 - samples/sec: 7131.17 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:51:43,770 epoch 10 - iter 616/773 - loss 0.09792858 - time (sec): 13.87 - samples/sec: 7186.32 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:51:45,469 epoch 10 - iter 693/773 - loss 0.09892283 - time (sec): 15.57 - samples/sec: 7150.93 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:51:47,214 epoch 10 - iter 770/773 - loss 0.10051481 - time (sec): 17.32 - samples/sec: 7154.09 - lr: 0.000000 - momentum: 0.000000
2023-10-20 09:51:47,278 ----------------------------------------------------------------------------------------------------
2023-10-20 09:51:47,278 EPOCH 10 done: loss 0.1004 - lr: 0.000000
2023-10-20 09:51:48,362 DEV : loss 0.08812259882688522 - f1-score (micro avg) 0.565
2023-10-20 09:51:48,374 saving best model
2023-10-20 09:51:48,448 ----------------------------------------------------------------------------------------------------
2023-10-20 09:51:48,449 Loading model from best epoch ...
2023-10-20 09:51:48,522 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-20 09:51:51,447 
Results:
- F-score (micro) 0.5627
- F-score (macro) 0.3215
- Accuracy 0.4019

By class:
              precision    recall  f1-score   support

         LOC     0.6176    0.6438    0.6304       946
    BUILDING     0.2535    0.0973    0.1406       185
      STREET     1.0000    0.1071    0.1935        56

   micro avg     0.5955    0.5333    0.5627      1187
   macro avg     0.6237    0.2827    0.3215      1187
weighted avg     0.5789    0.5333    0.5335      1187

2023-10-20 09:51:51,447 ----------------------------------------------------------------------------------------------------