2023-10-25 10:55:20,713 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,714 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,715 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,715 Train: 6183 sentences
2023-10-25 10:55:20,715 (train_with_dev=False, train_with_test=False)
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,715 Training Params:
2023-10-25 10:55:20,715 - learning_rate: "3e-05"
2023-10-25 10:55:20,715 - mini_batch_size: "4"
2023-10-25 10:55:20,715 - max_epochs: "10"
2023-10-25 10:55:20,715 - shuffle: "True"
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,715 Plugins:
2023-10-25 10:55:20,715 - TensorboardLogger
2023-10-25 10:55:20,715 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,715 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:55:20,715 - metric: "('micro avg', 'f1-score')"
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,716 Computation:
2023-10-25 10:55:20,716 - compute on device: cuda:0
2023-10-25 10:55:20,716 - embedding storage: none
2023-10-25 10:55:20,716 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,716 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 10:55:20,716 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,716 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,716 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:55:28,912 epoch 1 - iter 154/1546 - loss 1.65969452 - time (sec): 8.20 - samples/sec: 1555.63 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:55:36,950 epoch 1 - iter 308/1546 - loss 0.93754585 - time (sec): 16.23 - samples/sec: 1534.93 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:55:44,983 epoch 1 - iter 462/1546 - loss 0.68571099 - time (sec): 24.27 - samples/sec: 1522.86 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:55:53,052 epoch 1 - iter 616/1546 - loss 0.54254456 - time (sec): 32.33 - samples/sec: 1540.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:56:01,033 epoch 1 - iter 770/1546 - loss 0.45649672 - time (sec): 40.32 - samples/sec: 1542.76 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:56:09,335 epoch 1 - iter 924/1546 - loss 0.40353128 - time (sec): 48.62 - samples/sec: 1521.17 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:56:17,417 epoch 1 - iter 1078/1546 - loss 0.36171769 - time (sec): 56.70 - samples/sec: 1531.40 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:56:25,605 epoch 1 - iter 1232/1546 - loss 0.32774519 - time (sec): 64.89 - samples/sec: 1535.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:56:33,774 epoch 1 - iter 1386/1546 - loss 0.30231800 - time (sec): 73.06 - samples/sec: 1531.32 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:56:41,853 epoch 1 - iter 1540/1546 - loss 0.28338013 - time (sec): 81.14 - samples/sec: 1526.05 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:56:42,169 ----------------------------------------------------------------------------------------------------
2023-10-25 10:56:42,170 EPOCH 1 done: loss 0.2827 - lr: 0.000030
2023-10-25 10:56:45,326 DEV : loss 0.062169428914785385 - f1-score (micro avg) 0.7459
2023-10-25 10:56:45,343 saving best model
2023-10-25 10:56:45,846 ----------------------------------------------------------------------------------------------------
2023-10-25 10:56:53,944 epoch 2 - iter 154/1546 - loss 0.08228178 - time (sec): 8.10 - samples/sec: 1517.82 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:57:02,045 epoch 2 - iter 308/1546 - loss 0.09250857 - time (sec): 16.20 - samples/sec: 1529.83 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:57:10,520 epoch 2 - iter 462/1546 - loss 0.09380708 - time (sec): 24.67 - samples/sec: 1461.41 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:57:18,800 epoch 2 - iter 616/1546 - loss 0.09259481 - time (sec): 32.95 - samples/sec: 1485.60 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:57:27,133 epoch 2 - iter 770/1546 - loss 0.08939337 - time (sec): 41.29 - samples/sec: 1492.76 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:57:35,482 epoch 2 - iter 924/1546 - loss 0.08678052 - time (sec): 49.63 - samples/sec: 1496.72 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:57:43,622 epoch 2 - iter 1078/1546 - loss 0.08519177 - time (sec): 57.77 - samples/sec: 1506.85 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:57:51,769 epoch 2 - iter 1232/1546 - loss 0.08376653 - time (sec): 65.92 - samples/sec: 1497.63 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:58:00,015 epoch 2 - iter 1386/1546 - loss 0.08475831 - time (sec): 74.17 - samples/sec: 1489.46 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:58:08,149 epoch 2 - iter 1540/1546 - loss 0.08472997 - time (sec): 82.30 - samples/sec: 1503.75 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:58:08,448 ----------------------------------------------------------------------------------------------------
2023-10-25 10:58:08,449 EPOCH 2 done: loss 0.0846 - lr: 0.000027
2023-10-25 10:58:11,612 DEV : loss 0.07261277735233307 - f1-score (micro avg) 0.7277
2023-10-25 10:58:11,628 ----------------------------------------------------------------------------------------------------
2023-10-25 10:58:19,774 epoch 3 - iter 154/1546 - loss 0.05836815 - time (sec): 8.14 - samples/sec: 1498.97 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:58:27,451 epoch 3 - iter 308/1546 - loss 0.05750916 - time (sec): 15.82 - samples/sec: 1507.96 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:58:35,457 epoch 3 - iter 462/1546 - loss 0.05789807 - time (sec): 23.83 - samples/sec: 1501.32 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:58:43,631 epoch 3 - iter 616/1546 - loss 0.05366765 - time (sec): 32.00 - samples/sec: 1510.95 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:58:51,877 epoch 3 - iter 770/1546 - loss 0.05391996 - time (sec): 40.25 - samples/sec: 1503.99 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:59:00,343 epoch 3 - iter 924/1546 - loss 0.05489213 - time (sec): 48.71 - samples/sec: 1510.07 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:59:08,613 epoch 3 - iter 1078/1546 - loss 0.05440378 - time (sec): 56.98 - samples/sec: 1510.99 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:59:16,017 epoch 3 - iter 1232/1546 - loss 0.05523524 - time (sec): 64.39 - samples/sec: 1533.04 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:59:23,782 epoch 3 - iter 1386/1546 - loss 0.05498877 - time (sec): 72.15 - samples/sec: 1546.14 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:59:31,809 epoch 3 - iter 1540/1546 - loss 0.05337318 - time (sec): 80.18 - samples/sec: 1545.06 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:59:32,114 ----------------------------------------------------------------------------------------------------
2023-10-25 10:59:32,114 EPOCH 3 done: loss 0.0533 - lr: 0.000023
2023-10-25 10:59:34,713 DEV : loss 0.09182113409042358 - f1-score (micro avg) 0.7679
2023-10-25 10:59:34,735 saving best model
2023-10-25 10:59:35,483 ----------------------------------------------------------------------------------------------------
2023-10-25 10:59:43,457 epoch 4 - iter 154/1546 - loss 0.02507648 - time (sec): 7.97 - samples/sec: 1565.78 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:59:51,780 epoch 4 - iter 308/1546 - loss 0.03175758 - time (sec): 16.29 - samples/sec: 1550.68 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:59:59,992 epoch 4 - iter 462/1546 - loss 0.03194911 - time (sec): 24.51 - samples/sec: 1540.95 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:00:08,358 epoch 4 - iter 616/1546 - loss 0.03245614 - time (sec): 32.87 - samples/sec: 1530.37 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:00:16,776 epoch 4 - iter 770/1546 - loss 0.03363787 - time (sec): 41.29 - samples/sec: 1507.55 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:00:25,086 epoch 4 - iter 924/1546 - loss 0.03411037 - time (sec): 49.60 - samples/sec: 1494.55 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:00:33,183 epoch 4 - iter 1078/1546 - loss 0.03460341 - time (sec): 57.70 - samples/sec: 1505.14 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:00:41,566 epoch 4 - iter 1232/1546 - loss 0.03464367 - time (sec): 66.08 - samples/sec: 1507.10 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:00:50,390 epoch 4 - iter 1386/1546 - loss 0.03534451 - time (sec): 74.90 - samples/sec: 1499.75 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:00:59,048 epoch 4 - iter 1540/1546 - loss 0.03677995 - time (sec): 83.56 - samples/sec: 1481.68 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:00:59,372 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:59,372 EPOCH 4 done: loss 0.0367 - lr: 0.000020
2023-10-25 11:01:02,425 DEV : loss 0.09009607881307602 - f1-score (micro avg) 0.7484
2023-10-25 11:01:02,448 ----------------------------------------------------------------------------------------------------
2023-10-25 11:01:11,044 epoch 5 - iter 154/1546 - loss 0.02548701 - time (sec): 8.59 - samples/sec: 1432.16 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:01:19,545 epoch 5 - iter 308/1546 - loss 0.02652036 - time (sec): 17.10 - samples/sec: 1443.33 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:01:28,336 epoch 5 - iter 462/1546 - loss 0.02539993 - time (sec): 25.89 - samples/sec: 1452.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:01:36,776 epoch 5 - iter 616/1546 - loss 0.02612452 - time (sec): 34.33 - samples/sec: 1449.22 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:01:45,209 epoch 5 - iter 770/1546 - loss 0.02524168 - time (sec): 42.76 - samples/sec: 1466.66 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:01:53,861 epoch 5 - iter 924/1546 - loss 0.02572133 - time (sec): 51.41 - samples/sec: 1467.85 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:02:02,324 epoch 5 - iter 1078/1546 - loss 0.02424334 - time (sec): 59.87 - samples/sec: 1473.27 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:02:11,012 epoch 5 - iter 1232/1546 - loss 0.02389485 - time (sec): 68.56 - samples/sec: 1454.57 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:02:19,339 epoch 5 - iter 1386/1546 - loss 0.02362872 - time (sec): 76.89 - samples/sec: 1460.40 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:02:27,570 epoch 5 - iter 1540/1546 - loss 0.02429264 - time (sec): 85.12 - samples/sec: 1454.78 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:02:27,893 ----------------------------------------------------------------------------------------------------
2023-10-25 11:02:27,894 EPOCH 5 done: loss 0.0243 - lr: 0.000017
2023-10-25 11:02:30,998 DEV : loss 0.10089725255966187 - f1-score (micro avg) 0.7832
2023-10-25 11:02:31,025 saving best model
2023-10-25 11:02:31,719 ----------------------------------------------------------------------------------------------------
2023-10-25 11:02:40,219 epoch 6 - iter 154/1546 - loss 0.02003037 - time (sec): 8.50 - samples/sec: 1487.59 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:02:48,637 epoch 6 - iter 308/1546 - loss 0.02358693 - time (sec): 16.92 - samples/sec: 1486.16 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:02:57,000 epoch 6 - iter 462/1546 - loss 0.02117204 - time (sec): 25.28 - samples/sec: 1460.82 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:03:05,526 epoch 6 - iter 616/1546 - loss 0.02069151 - time (sec): 33.80 - samples/sec: 1472.70 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:03:13,317 epoch 6 - iter 770/1546 - loss 0.02110916 - time (sec): 41.60 - samples/sec: 1516.05 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:03:20,828 epoch 6 - iter 924/1546 - loss 0.01896075 - time (sec): 49.11 - samples/sec: 1543.87 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:03:28,603 epoch 6 - iter 1078/1546 - loss 0.01988098 - time (sec): 56.88 - samples/sec: 1544.66 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:03:36,263 epoch 6 - iter 1232/1546 - loss 0.01957224 - time (sec): 64.54 - samples/sec: 1543.12 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:03:43,921 epoch 6 - iter 1386/1546 - loss 0.01892299 - time (sec): 72.20 - samples/sec: 1547.68 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:03:51,944 epoch 6 - iter 1540/1546 - loss 0.01883330 - time (sec): 80.22 - samples/sec: 1544.46 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:03:52,223 ----------------------------------------------------------------------------------------------------
2023-10-25 11:03:52,223 EPOCH 6 done: loss 0.0188 - lr: 0.000013
2023-10-25 11:03:54,800 DEV : loss 0.11670850217342377 - f1-score (micro avg) 0.7555
2023-10-25 11:03:54,821 ----------------------------------------------------------------------------------------------------
2023-10-25 11:04:02,733 epoch 7 - iter 154/1546 - loss 0.01309731 - time (sec): 7.91 - samples/sec: 1610.31 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:04:10,327 epoch 7 - iter 308/1546 - loss 0.01392533 - time (sec): 15.50 - samples/sec: 1602.90 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:04:17,838 epoch 7 - iter 462/1546 - loss 0.01201188 - time (sec): 23.02 - samples/sec: 1683.89 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:04:25,232 epoch 7 - iter 616/1546 - loss 0.01294140 - time (sec): 30.41 - samples/sec: 1631.46 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:04:32,693 epoch 7 - iter 770/1546 - loss 0.01266465 - time (sec): 37.87 - samples/sec: 1633.04 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:04:40,354 epoch 7 - iter 924/1546 - loss 0.01178168 - time (sec): 45.53 - samples/sec: 1636.18 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:04:48,093 epoch 7 - iter 1078/1546 - loss 0.01250303 - time (sec): 53.27 - samples/sec: 1612.54 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:04:55,739 epoch 7 - iter 1232/1546 - loss 0.01293497 - time (sec): 60.92 - samples/sec: 1606.69 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:05:03,548 epoch 7 - iter 1386/1546 - loss 0.01307263 - time (sec): 68.72 - samples/sec: 1614.44 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:05:11,371 epoch 7 - iter 1540/1546 - loss 0.01279875 - time (sec): 76.55 - samples/sec: 1616.14 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:05:11,669 ----------------------------------------------------------------------------------------------------
2023-10-25 11:05:11,669 EPOCH 7 done: loss 0.0128 - lr: 0.000010
2023-10-25 11:05:15,197 DEV : loss 0.12019308656454086 - f1-score (micro avg) 0.7705
2023-10-25 11:05:15,216 ----------------------------------------------------------------------------------------------------
2023-10-25 11:05:23,412 epoch 8 - iter 154/1546 - loss 0.00807042 - time (sec): 8.19 - samples/sec: 1504.53 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:05:31,450 epoch 8 - iter 308/1546 - loss 0.00746783 - time (sec): 16.23 - samples/sec: 1529.78 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:05:39,597 epoch 8 - iter 462/1546 - loss 0.00860174 - time (sec): 24.38 - samples/sec: 1492.71 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:05:47,486 epoch 8 - iter 616/1546 - loss 0.00814822 - time (sec): 32.27 - samples/sec: 1505.31 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:05:55,271 epoch 8 - iter 770/1546 - loss 0.00790389 - time (sec): 40.05 - samples/sec: 1526.31 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:06:03,085 epoch 8 - iter 924/1546 - loss 0.00841588 - time (sec): 47.87 - samples/sec: 1555.20 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:06:10,637 epoch 8 - iter 1078/1546 - loss 0.00891299 - time (sec): 55.42 - samples/sec: 1569.17 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:06:18,161 epoch 8 - iter 1232/1546 - loss 0.00897866 - time (sec): 62.94 - samples/sec: 1575.65 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:06:26,204 epoch 8 - iter 1386/1546 - loss 0.00922351 - time (sec): 70.99 - samples/sec: 1566.90 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:06:34,371 epoch 8 - iter 1540/1546 - loss 0.00871974 - time (sec): 79.15 - samples/sec: 1563.32 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:06:34,699 ----------------------------------------------------------------------------------------------------
2023-10-25 11:06:34,700 EPOCH 8 done: loss 0.0087 - lr: 0.000007
2023-10-25 11:06:37,436 DEV : loss 0.11731629073619843 - f1-score (micro avg) 0.7885
2023-10-25 11:06:37,454 saving best model
2023-10-25 11:06:38,144 ----------------------------------------------------------------------------------------------------
2023-10-25 11:06:45,643 epoch 9 - iter 154/1546 - loss 0.00438247 - time (sec): 7.50 - samples/sec: 1680.48 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:06:52,902 epoch 9 - iter 308/1546 - loss 0.00372726 - time (sec): 14.76 - samples/sec: 1696.09 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:07:00,237 epoch 9 - iter 462/1546 - loss 0.00385813 - time (sec): 22.09 - samples/sec: 1693.20 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:07:07,380 epoch 9 - iter 616/1546 - loss 0.00377794 - time (sec): 29.23 - samples/sec: 1723.44 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:07:14,702 epoch 9 - iter 770/1546 - loss 0.00460608 - time (sec): 36.55 - samples/sec: 1714.33 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:07:22,280 epoch 9 - iter 924/1546 - loss 0.00436436 - time (sec): 44.13 - samples/sec: 1698.21 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:07:29,621 epoch 9 - iter 1078/1546 - loss 0.00482249 - time (sec): 51.47 - samples/sec: 1689.03 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:07:36,843 epoch 9 - iter 1232/1546 - loss 0.00450838 - time (sec): 58.70 - samples/sec: 1687.18 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:07:44,631 epoch 9 - iter 1386/1546 - loss 0.00440336 - time (sec): 66.48 - samples/sec: 1687.63 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:07:52,067 epoch 9 - iter 1540/1546 - loss 0.00411316 - time (sec): 73.92 - samples/sec: 1677.25 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:07:52,352 ----------------------------------------------------------------------------------------------------
2023-10-25 11:07:52,352 EPOCH 9 done: loss 0.0041 - lr: 0.000003
2023-10-25 11:07:54,893 DEV : loss 0.1307971030473709 - f1-score (micro avg) 0.7692
2023-10-25 11:07:54,916 ----------------------------------------------------------------------------------------------------
2023-10-25 11:08:02,148 epoch 10 - iter 154/1546 - loss 0.00204746 - time (sec): 7.23 - samples/sec: 1637.33 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:08:09,285 epoch 10 - iter 308/1546 - loss 0.00355914 - time (sec): 14.37 - samples/sec: 1636.15 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:08:16,545 epoch 10 - iter 462/1546 - loss 0.00381047 - time (sec): 21.63 - samples/sec: 1639.14 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:08:23,743 epoch 10 - iter 616/1546 - loss 0.00367323 - time (sec): 28.83 - samples/sec: 1655.86 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:08:31,133 epoch 10 - iter 770/1546 - loss 0.00339860 - time (sec): 36.22 - samples/sec: 1636.15 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:08:38,585 epoch 10 - iter 924/1546 - loss 0.00310024 - time (sec): 43.67 - samples/sec: 1652.22 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:08:46,049 epoch 10 - iter 1078/1546 - loss 0.00291881 - time (sec): 51.13 - samples/sec: 1664.18 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:08:53,576 epoch 10 - iter 1232/1546 - loss 0.00282386 - time (sec): 58.66 - samples/sec: 1675.15 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:09:01,133 epoch 10 - iter 1386/1546 - loss 0.00272207 - time (sec): 66.22 - samples/sec: 1680.98 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:09:08,863 epoch 10 - iter 1540/1546 - loss 0.00284420 - time (sec): 73.95 - samples/sec: 1671.06 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:09:09,151 ----------------------------------------------------------------------------------------------------
2023-10-25 11:09:09,151 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-25 11:09:12,044 DEV : loss 0.13086602091789246 - f1-score (micro avg) 0.7695
2023-10-25 11:09:12,513 ----------------------------------------------------------------------------------------------------
2023-10-25 11:09:12,514 Loading model from best epoch ...
2023-10-25 11:09:14,434 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 11:09:23,205
Results:
- F-score (micro) 0.8064
- F-score (macro) 0.7186
- Accuracy 0.6987

By class:
              precision    recall  f1-score   support

         LOC     0.8493    0.8520    0.8507       946
    BUILDING     0.6020    0.6378    0.6194       185
      STREET     0.7347    0.6429    0.6857        56

   micro avg     0.8040    0.8088    0.8064      1187
   macro avg     0.7287    0.7109    0.7186      1187
weighted avg     0.8054    0.8088    0.8068      1187

2023-10-25 11:09:23,205 ----------------------------------------------------------------------------------------------------
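
Note on the averaged scores: the micro, macro and weighted rows of the final report can be approximately re-derived from the per-class rows. The sketch below is not part of the training run; it is a minimal check that assumes the usual definitions (macro = unweighted mean of per-class F1, weighted = support-weighted mean, micro F1 = harmonic mean of micro precision and recall). The names `per_class`, `micro` and `f1` are hypothetical helpers, and the inputs are the 4-decimal values copied from the log, so agreement is only expected to about three decimal places.

```python
# Per-class scores copied from the "By class" table in the log above.
per_class = {
    "LOC":      {"precision": 0.8493, "recall": 0.8520, "f1": 0.8507, "support": 946},
    "BUILDING": {"precision": 0.6020, "recall": 0.6378, "f1": 0.6194, "support": 185},
    "STREET":   {"precision": 0.7347, "recall": 0.6429, "f1": 0.6857, "support": 56},
}
# Micro-averaged precision and recall as reported in the "micro avg" row.
micro = {"precision": 0.8040, "recall": 0.8088}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(c["support"] for c in per_class.values())

# Macro average: every class counts equally, regardless of its support.
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted average: each class F1 is weighted by its number of gold spans.
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total_support

# Micro F1: computed from the pooled precision and recall.
micro_f1 = f1(micro["precision"], micro["recall"])

print(f"support={total_support} micro={micro_f1:.4f} "
      f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

The gap between micro (0.8064) and macro (0.7186) F1 reflects the class imbalance: LOC accounts for 946 of the 1187 test spans and scores highest, while BUILDING and STREET pull the unweighted mean down.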