2023-10-25 11:40:57,109 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:57,110 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 11:40:57,110 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:57,110 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
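The `lr:` column in the iteration logs below follows from this run's parameters (6,183 training sentences, mini_batch_size 4, max_epochs 10, peak learning rate 3e-05, LinearScheduler with warmup_fraction 0.1). A minimal sketch of the implied step arithmetic and linear warmup/decay schedule — the closed form is an assumption about what the scheduler does, not code taken from Flair:

```python
import math

# Values copied from the run parameters logged below; the schedule shape
# (linear ramp over the warmup fraction, then linear decay to zero) is an
# assumption consistent with the logged lr column.
train_sentences = 6183
mini_batch_size = 4
max_epochs = 10
peak_lr = 3e-5
warmup_fraction = 0.1

iters_per_epoch = math.ceil(train_sentences / mini_batch_size)  # 1546 iters/epoch
total_steps = iters_per_epoch * max_epochs                      # 15460 optimizer steps
warmup_steps = int(warmup_fraction * total_steps)               # 1546 warmup steps

def lr_at(step: int) -> float:
    """Linear warmup to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, `lr_at(154)` gives about 0.000003 and `lr_at(1546 + 308)` about 0.000029, matching the values logged at those iterations in epochs 1 and 2; warmup ends exactly at the epoch 1/2 boundary, which is why the peak 0.000030 appears only there.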
2023-10-25 11:40:57,110 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:57,110 Train:  6183 sentences
2023-10-25 11:40:57,110 (train_with_dev=False, train_with_test=False)
2023-10-25 11:40:57,110 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:57,110 Training Params:
2023-10-25 11:40:57,110  - learning_rate: "3e-05"
2023-10-25 11:40:57,110  - mini_batch_size: "4"
2023-10-25 11:40:57,110  - max_epochs: "10"
2023-10-25 11:40:57,110  - shuffle: "True"
2023-10-25 11:40:57,110 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:57,110 Plugins:
2023-10-25 11:40:57,110  - TensorboardLogger
2023-10-25 11:40:57,110  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 11:40:57,110 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:57,110 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 11:40:57,110  - metric: "('micro avg', 'f1-score')"
2023-10-25 11:40:57,111 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:57,111 Computation:
2023-10-25 11:40:57,111  - compute on device: cuda:0
2023-10-25 11:40:57,111  - embedding storage: none
2023-10-25 11:40:57,111 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:57,111 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 11:40:57,111 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:57,111 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:57,111 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 11:41:05,057 epoch 1 - iter 154/1546 - loss 1.55266265 - time (sec): 7.95 - samples/sec: 1791.42 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:41:13,150 epoch 1 - iter 308/1546 - loss 0.94061942 - time (sec): 16.04 - samples/sec: 1624.65 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:41:21,568 epoch 1 - iter 462/1546 - loss 0.68349218 - time (sec): 24.46 - samples/sec: 1582.07 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:41:30,073 epoch 1 - iter 616/1546 - loss 0.54998564 - time (sec): 32.96 - samples/sec: 1542.02 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:41:38,325 epoch 1 - iter 770/1546 - loss 0.46698132 - time (sec): 41.21 - samples/sec: 1534.59 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:41:45,872 epoch 1 - iter 924/1546 - loss 0.41898577 - time (sec): 48.76 - samples/sec: 1527.28 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:41:53,216 epoch 1 - iter 1078/1546 - loss 0.37674351 - time (sec): 56.10 - samples/sec: 1543.34 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:42:00,450 epoch 1 - iter 1232/1546 - loss 0.34169079 - time (sec): 63.34 - samples/sec: 1565.07 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:42:07,964 epoch 1 - iter 1386/1546 - loss 0.31471666 - time (sec): 70.85 - samples/sec: 1573.11 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:42:15,740 epoch 1 - iter 1540/1546 - loss 0.29260162 - time (sec): 78.63 - samples/sec: 1573.88 - lr: 0.000030 - momentum: 0.000000
2023-10-25 11:42:16,041 ----------------------------------------------------------------------------------------------------
2023-10-25 11:42:16,041 EPOCH 1 done: loss 0.2921 - lr: 0.000030
2023-10-25 11:42:19,443 DEV : loss 0.09200076758861542 - f1-score (micro avg) 0.7104
2023-10-25 11:42:19,505 saving best model
2023-10-25 11:42:19,988 ----------------------------------------------------------------------------------------------------
2023-10-25 11:42:27,749 epoch 2 - iter 154/1546 - loss 0.08643753 - time (sec): 7.76 - samples/sec: 1518.18 - lr: 0.000030 - momentum: 0.000000
2023-10-25 11:42:35,209 epoch 2 - iter 308/1546 - loss 0.08754762 - time (sec): 15.22 - samples/sec: 1617.53 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:42:43,000 epoch 2 - iter 462/1546 - loss 0.08702917 - time (sec): 23.01 - samples/sec: 1652.19 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:42:50,731 epoch 2 - iter 616/1546 - loss 0.08590301 - time (sec): 30.74 - samples/sec: 1663.47 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:42:58,625 epoch 2 - iter 770/1546 - loss 0.08621250 - time (sec): 38.64 - samples/sec: 1642.86 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:43:06,846 epoch 2 - iter 924/1546 - loss 0.08561525 - time (sec): 46.86 - samples/sec: 1598.86 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:43:14,954 epoch 2 - iter 1078/1546 - loss 0.08510488 - time (sec): 54.96 - samples/sec: 1581.90 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:43:23,190 epoch 2 - iter 1232/1546 - loss 0.08390433 - time (sec): 63.20 - samples/sec: 1574.91 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:43:30,898 epoch 2 - iter 1386/1546 - loss 0.08577022 - time (sec): 70.91 - samples/sec: 1568.81 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:43:38,755 epoch 2 - iter 1540/1546 - loss 0.08535406 - time (sec): 78.77 - samples/sec: 1571.01 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:43:39,079 ----------------------------------------------------------------------------------------------------
2023-10-25 11:43:39,079 EPOCH 2 done: loss 0.0852 - lr: 0.000027
2023-10-25 11:43:42,007 DEV : loss 0.066593699157238 - f1-score (micro avg) 0.7393
2023-10-25 11:43:42,025 saving best model
2023-10-25 11:43:42,732 ----------------------------------------------------------------------------------------------------
2023-10-25 11:43:51,297 epoch 3 - iter 154/1546 - loss 0.04138820 - time (sec): 8.56 - samples/sec: 1447.77 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:43:58,801 epoch 3 - iter 308/1546 - loss 0.04277290 - time (sec): 16.07 - samples/sec: 1544.50 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:44:06,351 epoch 3 - iter 462/1546 - loss 0.04350451 - time (sec): 23.62 - samples/sec: 1629.16 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:44:13,821 epoch 3 - iter 616/1546 - loss 0.04607685 - time (sec): 31.09 - samples/sec: 1631.91 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:44:21,502 epoch 3 - iter 770/1546 - loss 0.05107878 - time (sec): 38.77 - samples/sec: 1608.19 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:44:28,906 epoch 3 - iter 924/1546 - loss 0.05156383 - time (sec): 46.17 - samples/sec: 1590.58 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:44:36,592 epoch 3 - iter 1078/1546 - loss 0.05068057 - time (sec): 53.86 - samples/sec: 1604.46 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:44:44,617 epoch 3 - iter 1232/1546 - loss 0.05052840 - time (sec): 61.88 - samples/sec: 1595.62 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:44:52,726 epoch 3 - iter 1386/1546 - loss 0.05069567 - time (sec): 69.99 - samples/sec: 1590.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:45:01,021 epoch 3 - iter 1540/1546 - loss 0.05027230 - time (sec): 78.29 - samples/sec: 1583.49 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:45:01,330 ----------------------------------------------------------------------------------------------------
2023-10-25 11:45:01,331 EPOCH 3 done: loss 0.0502 - lr: 0.000023
2023-10-25 11:45:03,901 DEV : loss 0.09328124672174454 - f1-score (micro avg) 0.7527
2023-10-25 11:45:03,917 saving best model
2023-10-25 11:45:04,593 ----------------------------------------------------------------------------------------------------
2023-10-25 11:45:12,869 epoch 4 - iter 154/1546 - loss 0.02327694 - time (sec): 8.27 - samples/sec: 1553.05 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:45:20,914 epoch 4 - iter 308/1546 - loss 0.02927266 - time (sec): 16.32 - samples/sec: 1577.43 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:45:28,426 epoch 4 - iter 462/1546 - loss 0.03046429 - time (sec): 23.83 - samples/sec: 1613.33 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:45:36,589 epoch 4 - iter 616/1546 - loss 0.03163201 - time (sec): 31.99 - samples/sec: 1587.16 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:45:44,726 epoch 4 - iter 770/1546 - loss 0.03021401 - time (sec): 40.13 - samples/sec: 1578.48 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:45:52,982 epoch 4 - iter 924/1546 - loss 0.03146767 - time (sec): 48.39 - samples/sec: 1567.45 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:46:00,669 epoch 4 - iter 1078/1546 - loss 0.03286338 - time (sec): 56.07 - samples/sec: 1560.80 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:46:08,054 epoch 4 - iter 1232/1546 - loss 0.03397091 - time (sec): 63.46 - samples/sec: 1562.28 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:46:15,204 epoch 4 - iter 1386/1546 - loss 0.03468811 - time (sec): 70.61 - samples/sec: 1566.87 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:46:22,911 epoch 4 - iter 1540/1546 - loss 0.03452000 - time (sec): 78.32 - samples/sec: 1579.84 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:46:23,217 ----------------------------------------------------------------------------------------------------
2023-10-25 11:46:23,217 EPOCH 4 done: loss 0.0346 - lr: 0.000020
2023-10-25 11:46:26,216 DEV : loss 0.09854534268379211 - f1-score (micro avg) 0.7613
2023-10-25 11:46:26,234 saving best model
2023-10-25 11:46:26,927 ----------------------------------------------------------------------------------------------------
2023-10-25 11:46:35,257 epoch 5 - iter 154/1546 - loss 0.02327685 - time (sec): 8.33 - samples/sec: 1408.83 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:46:43,409 epoch 5 - iter 308/1546 - loss 0.02278957 - time (sec): 16.48 - samples/sec: 1454.94 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:46:51,583 epoch 5 - iter 462/1546 - loss 0.02308603 - time (sec): 24.65 - samples/sec: 1488.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:46:59,313 epoch 5 - iter 616/1546 - loss 0.02319153 - time (sec): 32.38 - samples/sec: 1516.91 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:47:06,564 epoch 5 - iter 770/1546 - loss 0.02316930 - time (sec): 39.63 - samples/sec: 1529.36 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:47:14,022 epoch 5 - iter 924/1546 - loss 0.02359386 - time (sec): 47.09 - samples/sec: 1558.97 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:47:21,604 epoch 5 - iter 1078/1546 - loss 0.02311920 - time (sec): 54.67 - samples/sec: 1567.92 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:47:29,162 epoch 5 - iter 1232/1546 - loss 0.02275668 - time (sec): 62.23 - samples/sec: 1577.55 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:47:37,006 epoch 5 - iter 1386/1546 - loss 0.02359042 - time (sec): 70.08 - samples/sec: 1575.82 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:47:44,982 epoch 5 - iter 1540/1546 - loss 0.02268543 - time (sec): 78.05 - samples/sec: 1588.18 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:47:45,298 ----------------------------------------------------------------------------------------------------
2023-10-25 11:47:45,299 EPOCH 5 done: loss 0.0228 - lr: 0.000017
2023-10-25 11:47:48,095 DEV : loss 0.1056293249130249 - f1-score (micro avg) 0.7926
2023-10-25 11:47:48,116 saving best model
2023-10-25 11:47:48,876 ----------------------------------------------------------------------------------------------------
2023-10-25 11:47:57,191 epoch 6 - iter 154/1546 - loss 0.01292429 - time (sec): 8.31 - samples/sec: 1466.94 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:48:04,971 epoch 6 - iter 308/1546 - loss 0.01368133 - time (sec): 16.09 - samples/sec: 1535.52 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:48:12,959 epoch 6 - iter 462/1546 - loss 0.01365126 - time (sec): 24.08 - samples/sec: 1550.26 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:48:20,763 epoch 6 - iter 616/1546 - loss 0.01466652 - time (sec): 31.88 - samples/sec: 1545.83 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:48:28,824 epoch 6 - iter 770/1546 - loss 0.01706353 - time (sec): 39.95 - samples/sec: 1584.26 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:48:36,966 epoch 6 - iter 924/1546 - loss 0.01675685 - time (sec): 48.09 - samples/sec: 1575.71 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:48:45,163 epoch 6 - iter 1078/1546 - loss 0.01632322 - time (sec): 56.29 - samples/sec: 1552.50 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:48:53,237 epoch 6 - iter 1232/1546 - loss 0.01604706 - time (sec): 64.36 - samples/sec: 1547.33 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:49:01,192 epoch 6 - iter 1386/1546 - loss 0.01741202 - time (sec): 72.31 - samples/sec: 1542.73 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:49:08,780 epoch 6 - iter 1540/1546 - loss 0.01651650 - time (sec): 79.90 - samples/sec: 1549.84 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:49:09,076 ----------------------------------------------------------------------------------------------------
2023-10-25 11:49:09,076 EPOCH 6 done: loss 0.0165 - lr: 0.000013
2023-10-25 11:49:12,408 DEV : loss 0.10305089503526688 - f1-score (micro avg) 0.7934
2023-10-25 11:49:12,426 saving best model
2023-10-25 11:49:13,132 ----------------------------------------------------------------------------------------------------
2023-10-25 11:49:20,970 epoch 7 - iter 154/1546 - loss 0.00979127 - time (sec): 7.83 - samples/sec: 1555.77 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:49:28,763 epoch 7 - iter 308/1546 - loss 0.01181370 - time (sec): 15.63 - samples/sec: 1589.43 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:49:36,669 epoch 7 - iter 462/1546 - loss 0.01088413 - time (sec): 23.53 - samples/sec: 1591.24 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:49:44,608 epoch 7 - iter 616/1546 - loss 0.01066891 - time (sec): 31.47 - samples/sec: 1577.71 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:49:52,822 epoch 7 - iter 770/1546 - loss 0.01093293 - time (sec): 39.69 - samples/sec: 1546.79 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:50:00,722 epoch 7 - iter 924/1546 - loss 0.01055665 - time (sec): 47.59 - samples/sec: 1564.61 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:50:08,287 epoch 7 - iter 1078/1546 - loss 0.01119685 - time (sec): 55.15 - samples/sec: 1567.17 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:50:15,945 epoch 7 - iter 1232/1546 - loss 0.01087104 - time (sec): 62.81 - samples/sec: 1563.84 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:50:23,729 epoch 7 - iter 1386/1546 - loss 0.01036992 - time (sec): 70.59 - samples/sec: 1573.58 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:50:31,083 epoch 7 - iter 1540/1546 - loss 0.01045956 - time (sec): 77.95 - samples/sec: 1589.05 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:50:31,359 ----------------------------------------------------------------------------------------------------
2023-10-25 11:50:31,359 EPOCH 7 done: loss 0.0104 - lr: 0.000010
2023-10-25 11:50:34,513 DEV : loss 0.1298506259918213 - f1-score (micro avg) 0.7475
2023-10-25 11:50:34,532 ----------------------------------------------------------------------------------------------------
2023-10-25 11:50:42,351 epoch 8 - iter 154/1546 - loss 0.00701688 - time (sec): 7.82 - samples/sec: 1529.17 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:50:49,882 epoch 8 - iter 308/1546 - loss 0.00645018 - time (sec): 15.35 - samples/sec: 1629.63 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:50:57,282 epoch 8 - iter 462/1546 - loss 0.00628734 - time (sec): 22.75 - samples/sec: 1656.53 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:51:04,437 epoch 8 - iter 616/1546 - loss 0.00631298 - time (sec): 29.90 - samples/sec: 1680.34 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:51:11,671 epoch 8 - iter 770/1546 - loss 0.00790924 - time (sec): 37.14 - samples/sec: 1686.23 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:51:19,533 epoch 8 - iter 924/1546 - loss 0.00831544 - time (sec): 45.00 - samples/sec: 1647.23 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:51:27,297 epoch 8 - iter 1078/1546 - loss 0.00748678 - time (sec): 52.76 - samples/sec: 1643.33 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:51:34,604 epoch 8 - iter 1232/1546 - loss 0.00772769 - time (sec): 60.07 - samples/sec: 1647.28 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:51:42,052 epoch 8 - iter 1386/1546 - loss 0.00787427 - time (sec): 67.52 - samples/sec: 1646.17 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:51:49,581 epoch 8 - iter 1540/1546 - loss 0.00769658 - time (sec): 75.05 - samples/sec: 1647.48 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:51:49,868 ----------------------------------------------------------------------------------------------------
2023-10-25 11:51:49,869 EPOCH 8 done: loss 0.0077 - lr: 0.000007
2023-10-25 11:51:52,497 DEV : loss 0.12457352131605148 - f1-score (micro avg) 0.7711
2023-10-25 11:51:52,514 ----------------------------------------------------------------------------------------------------
2023-10-25 11:51:59,681 epoch 9 - iter 154/1546 - loss 0.00041456 - time (sec): 7.16 - samples/sec: 1711.85 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:52:06,803 epoch 9 - iter 308/1546 - loss 0.00177273 - time (sec): 14.29 - samples/sec: 1708.48 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:52:14,009 epoch 9 - iter 462/1546 - loss 0.00236157 - time (sec): 21.49 - samples/sec: 1700.40 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:52:21,509 epoch 9 - iter 616/1546 - loss 0.00235457 - time (sec): 28.99 - samples/sec: 1686.28 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:52:28,988 epoch 9 - iter 770/1546 - loss 0.00238341 - time (sec): 36.47 - samples/sec: 1708.73 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:52:36,229 epoch 9 - iter 924/1546 - loss 0.00293196 - time (sec): 43.71 - samples/sec: 1705.91 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:52:43,395 epoch 9 - iter 1078/1546 - loss 0.00332746 - time (sec): 50.88 - samples/sec: 1706.36 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:52:50,803 epoch 9 - iter 1232/1546 - loss 0.00325536 - time (sec): 58.29 - samples/sec: 1711.56 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:52:57,940 epoch 9 - iter 1386/1546 - loss 0.00305967 - time (sec): 65.42 - samples/sec: 1718.29 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:53:05,168 epoch 9 - iter 1540/1546 - loss 0.00331796 - time (sec): 72.65 - samples/sec: 1704.75 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:53:05,460 ----------------------------------------------------------------------------------------------------
2023-10-25 11:53:05,461 EPOCH 9 done: loss 0.0033 - lr: 0.000003
2023-10-25 11:53:08,203 DEV : loss 0.13910257816314697 - f1-score (micro avg) 0.7556
2023-10-25 11:53:08,222 ----------------------------------------------------------------------------------------------------
2023-10-25 11:53:16,231 epoch 10 - iter 154/1546 - loss 0.00160290 - time (sec): 8.01 - samples/sec: 1710.16 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:53:23,763 epoch 10 - iter 308/1546 - loss 0.00092618 - time (sec): 15.54 - samples/sec: 1641.12 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:53:31,052 epoch 10 - iter 462/1546 - loss 0.00089840 - time (sec): 22.83 - samples/sec: 1677.56 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:53:38,496 epoch 10 - iter 616/1546 - loss 0.00080765 - time (sec): 30.27 - samples/sec: 1673.54 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:53:46,437 epoch 10 - iter 770/1546 - loss 0.00123779 - time (sec): 38.21 - samples/sec: 1664.13 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:53:54,207 epoch 10 - iter 924/1546 - loss 0.00139221 - time (sec): 45.98 - samples/sec: 1651.90 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:54:01,443 epoch 10 - iter 1078/1546 - loss 0.00139080 - time (sec): 53.22 - samples/sec: 1649.75 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:54:08,582 epoch 10 - iter 1232/1546 - loss 0.00157486 - time (sec): 60.36 - samples/sec: 1651.99 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:54:15,731 epoch 10 - iter 1386/1546 - loss 0.00171419 - time (sec): 67.51 - samples/sec: 1658.30 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:54:22,847 epoch 10 - iter 1540/1546 - loss 0.00206141 - time (sec): 74.62 - samples/sec: 1660.27 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:54:23,121 ----------------------------------------------------------------------------------------------------
2023-10-25 11:54:23,121 EPOCH 10 done: loss 0.0022 - lr: 0.000000
2023-10-25 11:54:25,618 DEV : loss 0.1443907469511032 - f1-score (micro avg) 0.7711
2023-10-25 11:54:26,104 ----------------------------------------------------------------------------------------------------
2023-10-25 11:54:26,106 Loading model from best epoch ...
2023-10-25 11:54:27,965 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 11:54:37,116 
Results:
- F-score (micro) 0.7975
- F-score (macro) 0.7224
- Accuracy 0.6816

By class:
              precision    recall  f1-score   support

         LOC     0.8264    0.8552    0.8405       946
    BUILDING     0.5924    0.5892    0.5908       185
      STREET     0.7800    0.6964    0.7358        56

   micro avg     0.7890    0.8062    0.7975      1187
   macro avg     0.7329    0.7136    0.7224      1187
weighted avg     0.7877    0.8062    0.7967      1187

2023-10-25 11:54:37,116 ----------------------------------------------------------------------------------------------------
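The per-epoch dev micro-F1 values in the log explain why "saving best model" appears after epochs 1–6 and never again, and why the final evaluation reloads the epoch-6 checkpoint: 0.7934 is never beaten. A sketch of that keep-best bookkeeping (the logic is paraphrased from the log's behavior, not Flair's actual implementation):

```python
# Dev-set micro-F1 per epoch, copied from the DEV lines in the log above.
dev_f1 = {1: 0.7104, 2: 0.7393, 3: 0.7527, 4: 0.7613, 5: 0.7926,
          6: 0.7934, 7: 0.7475, 8: 0.7711, 9: 0.7556, 10: 0.7711}

# A checkpoint ("best-model.pt") is written only when the dev score
# improves on the best seen so far; the last-written one wins.
best_epoch, best_score = None, float("-inf")
for epoch in sorted(dev_f1):
    if dev_f1[epoch] > best_score:
        best_epoch, best_score = epoch, dev_f1[epoch]  # save checkpoint here
```

Running this yields `best_epoch == 6`, matching the ten dev evaluations and six "saving best model" lines above.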
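The aggregate rows of the final table can be sanity-checked from the per-class rows alone. Micro averaging pools true-positive, predicted, and gold counts over all classes before computing P/R/F1, while macro averaging is the unweighted mean of the per-class F1 scores. The integer counts below are recovered by rounding (an assumption, but consistent with the 4-decimal scores and supports in the table):

```python
# Per-class rows from the final evaluation table above:
# class -> (precision, recall, f1, support)
per_class = {
    "LOC":      (0.8264, 0.8552, 0.8405, 946),
    "BUILDING": (0.5924, 0.5892, 0.5908, 185),
    "STREET":   (0.7800, 0.6964, 0.7358, 56),
}

tp = pred = gold = 0
for p, r, _, support in per_class.values():
    tp_c = round(r * support)   # true positives: recall * gold spans
    tp += tp_c
    pred += round(tp_c / p)     # predicted spans: TP / precision
    gold += support

micro_p = tp / pred
micro_r = tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
```

Rounded to four decimals, this reproduces the logged micro avg row (0.7890 / 0.8062 / 0.7975) and the macro F-score 0.7224, confirming the table is internally consistent.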