2023-10-17 12:10:54,015 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:54,017 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:10:54,017 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:54,017 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 12:10:54,017 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:54,018 Train: 6183 sentences
2023-10-17 12:10:54,018 (train_with_dev=False, train_with_test=False)
2023-10-17 12:10:54,018 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:54,018 Training Params:
2023-10-17 12:10:54,018 - learning_rate: "5e-05"
2023-10-17 12:10:54,018 - mini_batch_size: "8"
2023-10-17 12:10:54,018 - max_epochs: "10"
2023-10-17 12:10:54,018 - shuffle: "True"
2023-10-17 12:10:54,018 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:54,018 Plugins:
2023-10-17 12:10:54,018 - TensorboardLogger
2023-10-17 12:10:54,018 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:10:54,018 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:54,018 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:10:54,018 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:10:54,019 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:54,019 Computation:
2023-10-17 12:10:54,019 - compute on device: cuda:0
2023-10-17 12:10:54,019 - embedding storage: none
2023-10-17 12:10:54,019 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:54,019 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 12:10:54,019 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:54,019 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:54,019 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 12:11:01,865 epoch 1 - iter 77/773 - loss 2.27358902 - time (sec): 7.84 - samples/sec: 1568.90 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:11:09,552 epoch 1 - iter 154/773 - loss 1.28492320 - time (sec): 15.53 - samples/sec: 1586.43 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:11:16,988 epoch 1 - iter 231/773 - loss 0.89630823 - time (sec): 22.97 - samples/sec: 1630.13 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:11:24,232 epoch 1 - iter 308/773 - loss 0.69503268 - time (sec): 30.21 - samples/sec: 1659.48 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:11:31,214 epoch 1 - iter 385/773 - loss 0.57422721 - time (sec): 37.19 - samples/sec: 1681.47 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:11:38,106 epoch 1 - iter 462/773 - loss 0.50399350 - time (sec): 44.09 - samples/sec: 1681.33 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:11:45,019 epoch 1 - iter 539/773 - loss 0.45234319 - time (sec): 51.00 - samples/sec: 1677.54 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:11:52,068 epoch 1 - iter 616/773 - loss 0.40829546 - time (sec): 58.05 - samples/sec: 1690.46 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:11:59,335 epoch 1 - iter 693/773 - loss 0.37216900 - time (sec): 65.31 - samples/sec: 1702.63 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:12:06,619 epoch 1 - iter 770/773 - loss 0.34504409 - time (sec): 72.60 - samples/sec: 1704.05 - lr: 0.000050 - momentum: 0.000000
2023-10-17 12:12:06,892 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:06,892 EPOCH 1 done: loss 0.3436 - lr: 0.000050
2023-10-17 12:12:09,714 DEV : loss 0.05054343491792679 - f1-score (micro avg) 0.7709
2023-10-17 12:12:09,743 saving best model
2023-10-17 12:12:10,290 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:17,247 epoch 2 - iter 77/773 - loss 0.08155197 - time (sec): 6.95 - samples/sec: 1705.10 - lr: 0.000049 - momentum: 0.000000
2023-10-17 12:12:24,390 epoch 2 - iter 154/773 - loss 0.07622488 - time (sec): 14.10 - samples/sec: 1702.47 - lr: 0.000049 - momentum: 0.000000
2023-10-17 12:12:31,488 epoch 2 - iter 231/773 - loss 0.07965895 - time (sec): 21.20 - samples/sec: 1718.95 - lr: 0.000048 - momentum: 0.000000
2023-10-17 12:12:38,688 epoch 2 - iter 308/773 - loss 0.07956966 - time (sec): 28.40 - samples/sec: 1742.65 - lr: 0.000048 - momentum: 0.000000
2023-10-17 12:12:45,612 epoch 2 - iter 385/773 - loss 0.07795751 - time (sec): 35.32 - samples/sec: 1741.30 - lr: 0.000047 - momentum: 0.000000
2023-10-17 12:12:53,054 epoch 2 - iter 462/773 - loss 0.07683204 - time (sec): 42.76 - samples/sec: 1742.62 - lr: 0.000047 - momentum: 0.000000
2023-10-17 12:13:00,341 epoch 2 - iter 539/773 - loss 0.07597155 - time (sec): 50.05 - samples/sec: 1728.99 - lr: 0.000046 - momentum: 0.000000
2023-10-17 12:13:07,374 epoch 2 - iter 616/773 - loss 0.07614787 - time (sec): 57.08 - samples/sec: 1719.02 - lr: 0.000046 - momentum: 0.000000
2023-10-17 12:13:14,672 epoch 2 - iter 693/773 - loss 0.07596161 - time (sec): 64.38 - samples/sec: 1740.23 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:13:21,624 epoch 2 - iter 770/773 - loss 0.07575798 - time (sec): 71.33 - samples/sec: 1736.44 - lr: 0.000044 - momentum: 0.000000
2023-10-17 12:13:21,880 ----------------------------------------------------------------------------------------------------
2023-10-17 12:13:21,880 EPOCH 2 done: loss 0.0764 - lr: 0.000044
2023-10-17 12:13:24,769 DEV : loss 0.060969553887844086 - f1-score (micro avg) 0.7797
2023-10-17 12:13:24,797 saving best model
2023-10-17 12:13:26,229 ----------------------------------------------------------------------------------------------------
2023-10-17 12:13:33,708 epoch 3 - iter 77/773 - loss 0.05691335 - time (sec): 7.48 - samples/sec: 1588.57 - lr: 0.000044 - momentum: 0.000000
2023-10-17 12:13:40,862 epoch 3 - iter 154/773 - loss 0.05543103 - time (sec): 14.63 - samples/sec: 1575.99 - lr: 0.000043 - momentum: 0.000000
2023-10-17 12:13:48,597 epoch 3 - iter 231/773 - loss 0.05223850 - time (sec): 22.37 - samples/sec: 1588.53 - lr: 0.000043 - momentum: 0.000000
2023-10-17 12:13:56,436 epoch 3 - iter 308/773 - loss 0.05701590 - time (sec): 30.20 - samples/sec: 1604.74 - lr: 0.000042 - momentum: 0.000000
2023-10-17 12:14:03,668 epoch 3 - iter 385/773 - loss 0.05680066 - time (sec): 37.44 - samples/sec: 1636.55 - lr: 0.000042 - momentum: 0.000000
2023-10-17 12:14:10,974 epoch 3 - iter 462/773 - loss 0.05553539 - time (sec): 44.74 - samples/sec: 1647.92 - lr: 0.000041 - momentum: 0.000000
2023-10-17 12:14:18,543 epoch 3 - iter 539/773 - loss 0.05371091 - time (sec): 52.31 - samples/sec: 1646.81 - lr: 0.000041 - momentum: 0.000000
2023-10-17 12:14:25,634 epoch 3 - iter 616/773 - loss 0.05290883 - time (sec): 59.40 - samples/sec: 1658.95 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:14:32,607 epoch 3 - iter 693/773 - loss 0.05303557 - time (sec): 66.38 - samples/sec: 1675.11 - lr: 0.000039 - momentum: 0.000000
2023-10-17 12:14:39,527 epoch 3 - iter 770/773 - loss 0.05343209 - time (sec): 73.30 - samples/sec: 1690.35 - lr: 0.000039 - momentum: 0.000000
2023-10-17 12:14:39,792 ----------------------------------------------------------------------------------------------------
2023-10-17 12:14:39,792 EPOCH 3 done: loss 0.0534 - lr: 0.000039
2023-10-17 12:14:43,369 DEV : loss 0.06324774026870728 - f1-score (micro avg) 0.7828
2023-10-17 12:14:43,398 saving best model
2023-10-17 12:14:44,792 ----------------------------------------------------------------------------------------------------
2023-10-17 12:14:51,947 epoch 4 - iter 77/773 - loss 0.03068313 - time (sec): 7.15 - samples/sec: 1814.87 - lr: 0.000038 - momentum: 0.000000
2023-10-17 12:14:58,828 epoch 4 - iter 154/773 - loss 0.03265423 - time (sec): 14.03 - samples/sec: 1878.85 - lr: 0.000038 - momentum: 0.000000
2023-10-17 12:15:05,950 epoch 4 - iter 231/773 - loss 0.03449530 - time (sec): 21.15 - samples/sec: 1807.32 - lr: 0.000037 - momentum: 0.000000
2023-10-17 12:15:13,436 epoch 4 - iter 308/773 - loss 0.03398335 - time (sec): 28.64 - samples/sec: 1761.03 - lr: 0.000037 - momentum: 0.000000
2023-10-17 12:15:20,851 epoch 4 - iter 385/773 - loss 0.03467835 - time (sec): 36.05 - samples/sec: 1738.11 - lr: 0.000036 - momentum: 0.000000
2023-10-17 12:15:27,828 epoch 4 - iter 462/773 - loss 0.03562380 - time (sec): 43.03 - samples/sec: 1740.98 - lr: 0.000036 - momentum: 0.000000
2023-10-17 12:15:34,828 epoch 4 - iter 539/773 - loss 0.03552931 - time (sec): 50.03 - samples/sec: 1732.13 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:15:42,164 epoch 4 - iter 616/773 - loss 0.03520683 - time (sec): 57.37 - samples/sec: 1714.17 - lr: 0.000034 - momentum: 0.000000
2023-10-17 12:15:49,162 epoch 4 - iter 693/773 - loss 0.03524038 - time (sec): 64.37 - samples/sec: 1730.61 - lr: 0.000034 - momentum: 0.000000
2023-10-17 12:15:56,435 epoch 4 - iter 770/773 - loss 0.03520498 - time (sec): 71.64 - samples/sec: 1727.23 - lr: 0.000033 - momentum: 0.000000
2023-10-17 12:15:56,704 ----------------------------------------------------------------------------------------------------
2023-10-17 12:15:56,704 EPOCH 4 done: loss 0.0351 - lr: 0.000033
2023-10-17 12:15:59,760 DEV : loss 0.07864461094141006 - f1-score (micro avg) 0.7883
2023-10-17 12:15:59,793 saving best model
2023-10-17 12:16:01,268 ----------------------------------------------------------------------------------------------------
2023-10-17 12:16:09,350 epoch 5 - iter 77/773 - loss 0.01806880 - time (sec): 8.08 - samples/sec: 1516.45 - lr: 0.000033 - momentum: 0.000000
2023-10-17 12:16:17,412 epoch 5 - iter 154/773 - loss 0.01634694 - time (sec): 16.14 - samples/sec: 1586.65 - lr: 0.000032 - momentum: 0.000000
2023-10-17 12:16:25,081 epoch 5 - iter 231/773 - loss 0.02116858 - time (sec): 23.81 - samples/sec: 1556.13 - lr: 0.000032 - momentum: 0.000000
2023-10-17 12:16:32,141 epoch 5 - iter 308/773 - loss 0.02265586 - time (sec): 30.87 - samples/sec: 1582.52 - lr: 0.000031 - momentum: 0.000000
2023-10-17 12:16:39,584 epoch 5 - iter 385/773 - loss 0.02309313 - time (sec): 38.31 - samples/sec: 1591.66 - lr: 0.000031 - momentum: 0.000000
2023-10-17 12:16:46,705 epoch 5 - iter 462/773 - loss 0.02432431 - time (sec): 45.43 - samples/sec: 1613.87 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:16:54,223 epoch 5 - iter 539/773 - loss 0.02426732 - time (sec): 52.95 - samples/sec: 1627.20 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:17:01,621 epoch 5 - iter 616/773 - loss 0.02452926 - time (sec): 60.35 - samples/sec: 1638.68 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:17:08,488 epoch 5 - iter 693/773 - loss 0.02576954 - time (sec): 67.22 - samples/sec: 1655.54 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:17:15,864 epoch 5 - iter 770/773 - loss 0.02497077 - time (sec): 74.59 - samples/sec: 1655.89 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:17:16,171 ----------------------------------------------------------------------------------------------------
2023-10-17 12:17:16,172 EPOCH 5 done: loss 0.0250 - lr: 0.000028
2023-10-17 12:17:19,089 DEV : loss 0.10095161199569702 - f1-score (micro avg) 0.7728
2023-10-17 12:17:19,118 ----------------------------------------------------------------------------------------------------
2023-10-17 12:17:26,175 epoch 6 - iter 77/773 - loss 0.01526363 - time (sec): 7.05 - samples/sec: 1679.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:17:33,267 epoch 6 - iter 154/773 - loss 0.01252046 - time (sec): 14.15 - samples/sec: 1634.37 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:17:40,472 epoch 6 - iter 231/773 - loss 0.01333030 - time (sec): 21.35 - samples/sec: 1678.20 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:17:47,744 epoch 6 - iter 308/773 - loss 0.01413712 - time (sec): 28.62 - samples/sec: 1716.06 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:17:55,009 epoch 6 - iter 385/773 - loss 0.01429188 - time (sec): 35.89 - samples/sec: 1715.87 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:18:02,070 epoch 6 - iter 462/773 - loss 0.01408333 - time (sec): 42.95 - samples/sec: 1718.77 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:18:09,083 epoch 6 - iter 539/773 - loss 0.01441330 - time (sec): 49.96 - samples/sec: 1730.73 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:18:16,336 epoch 6 - iter 616/773 - loss 0.01417851 - time (sec): 57.22 - samples/sec: 1726.44 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:18:23,259 epoch 6 - iter 693/773 - loss 0.01508300 - time (sec): 64.14 - samples/sec: 1736.84 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:18:30,644 epoch 6 - iter 770/773 - loss 0.01499628 - time (sec): 71.52 - samples/sec: 1730.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:18:30,936 ----------------------------------------------------------------------------------------------------
2023-10-17 12:18:30,936 EPOCH 6 done: loss 0.0149 - lr: 0.000022
2023-10-17 12:18:34,013 DEV : loss 0.11657045781612396 - f1-score (micro avg) 0.7821
2023-10-17 12:18:34,050 ----------------------------------------------------------------------------------------------------
2023-10-17 12:18:41,082 epoch 7 - iter 77/773 - loss 0.02008253 - time (sec): 7.03 - samples/sec: 1674.41 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:18:48,169 epoch 7 - iter 154/773 - loss 0.01535619 - time (sec): 14.12 - samples/sec: 1700.59 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:18:55,677 epoch 7 - iter 231/773 - loss 0.01294834 - time (sec): 21.62 - samples/sec: 1697.36 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:19:03,627 epoch 7 - iter 308/773 - loss 0.01346406 - time (sec): 29.57 - samples/sec: 1641.60 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:19:10,786 epoch 7 - iter 385/773 - loss 0.01342158 - time (sec): 36.73 - samples/sec: 1641.83 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:19:18,609 epoch 7 - iter 462/773 - loss 0.01355346 - time (sec): 44.56 - samples/sec: 1635.28 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:19:26,629 epoch 7 - iter 539/773 - loss 0.01366443 - time (sec): 52.58 - samples/sec: 1657.81 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:19:33,567 epoch 7 - iter 616/773 - loss 0.01485489 - time (sec): 59.52 - samples/sec: 1666.22 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:19:40,418 epoch 7 - iter 693/773 - loss 0.01448091 - time (sec): 66.37 - samples/sec: 1675.73 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:19:47,473 epoch 7 - iter 770/773 - loss 0.01438771 - time (sec): 73.42 - samples/sec: 1682.77 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:19:47,841 ----------------------------------------------------------------------------------------------------
2023-10-17 12:19:47,841 EPOCH 7 done: loss 0.0145 - lr: 0.000017
2023-10-17 12:19:51,213 DEV : loss 0.11184996366500854 - f1-score (micro avg) 0.7992
2023-10-17 12:19:51,254 saving best model
2023-10-17 12:19:54,198 ----------------------------------------------------------------------------------------------------
2023-10-17 12:20:01,469 epoch 8 - iter 77/773 - loss 0.01187014 - time (sec): 7.26 - samples/sec: 1607.87 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:20:08,493 epoch 8 - iter 154/773 - loss 0.00827544 - time (sec): 14.28 - samples/sec: 1695.05 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:20:15,539 epoch 8 - iter 231/773 - loss 0.00856734 - time (sec): 21.33 - samples/sec: 1670.13 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:20:22,640 epoch 8 - iter 308/773 - loss 0.00970665 - time (sec): 28.43 - samples/sec: 1664.01 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:20:30,456 epoch 8 - iter 385/773 - loss 0.00933979 - time (sec): 36.24 - samples/sec: 1668.25 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:20:37,908 epoch 8 - iter 462/773 - loss 0.00935672 - time (sec): 43.70 - samples/sec: 1689.61 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:20:44,824 epoch 8 - iter 539/773 - loss 0.00851661 - time (sec): 50.61 - samples/sec: 1703.67 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:20:51,845 epoch 8 - iter 616/773 - loss 0.00834268 - time (sec): 57.63 - samples/sec: 1719.14 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:20:59,467 epoch 8 - iter 693/773 - loss 0.00915411 - time (sec): 65.26 - samples/sec: 1704.30 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:21:07,364 epoch 8 - iter 770/773 - loss 0.00869560 - time (sec): 73.15 - samples/sec: 1691.62 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:21:07,665 ----------------------------------------------------------------------------------------------------
2023-10-17 12:21:07,665 EPOCH 8 done: loss 0.0087 - lr: 0.000011
2023-10-17 12:21:10,540 DEV : loss 0.11721143871545792 - f1-score (micro avg) 0.7928
2023-10-17 12:21:10,573 ----------------------------------------------------------------------------------------------------
2023-10-17 12:21:17,760 epoch 9 - iter 77/773 - loss 0.00404184 - time (sec): 7.19 - samples/sec: 1752.92 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:21:25,221 epoch 9 - iter 154/773 - loss 0.00338043 - time (sec): 14.65 - samples/sec: 1786.59 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:21:32,259 epoch 9 - iter 231/773 - loss 0.00420525 - time (sec): 21.68 - samples/sec: 1752.96 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:21:39,366 epoch 9 - iter 308/773 - loss 0.00385165 - time (sec): 28.79 - samples/sec: 1760.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:21:46,536 epoch 9 - iter 385/773 - loss 0.00371246 - time (sec): 35.96 - samples/sec: 1751.60 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:21:53,556 epoch 9 - iter 462/773 - loss 0.00392293 - time (sec): 42.98 - samples/sec: 1735.64 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:22:01,385 epoch 9 - iter 539/773 - loss 0.00392321 - time (sec): 50.81 - samples/sec: 1710.15 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:22:09,681 epoch 9 - iter 616/773 - loss 0.00415350 - time (sec): 59.11 - samples/sec: 1684.90 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:22:17,492 epoch 9 - iter 693/773 - loss 0.00398640 - time (sec): 66.92 - samples/sec: 1683.37 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:22:24,368 epoch 9 - iter 770/773 - loss 0.00430291 - time (sec): 73.79 - samples/sec: 1677.23 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:22:24,640 ----------------------------------------------------------------------------------------------------
2023-10-17 12:22:24,641 EPOCH 9 done: loss 0.0043 - lr: 0.000006
2023-10-17 12:22:27,525 DEV : loss 0.12214227765798569 - f1-score (micro avg) 0.7764
2023-10-17 12:22:27,555 ----------------------------------------------------------------------------------------------------
2023-10-17 12:22:34,540 epoch 10 - iter 77/773 - loss 0.00155759 - time (sec): 6.98 - samples/sec: 1732.40 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:22:41,794 epoch 10 - iter 154/773 - loss 0.00134873 - time (sec): 14.24 - samples/sec: 1811.13 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:22:48,787 epoch 10 - iter 231/773 - loss 0.00196218 - time (sec): 21.23 - samples/sec: 1803.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:22:55,719 epoch 10 - iter 308/773 - loss 0.00238035 - time (sec): 28.16 - samples/sec: 1780.78 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:23:02,983 epoch 10 - iter 385/773 - loss 0.00250737 - time (sec): 35.43 - samples/sec: 1776.31 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:23:10,609 epoch 10 - iter 462/773 - loss 0.00226211 - time (sec): 43.05 - samples/sec: 1760.71 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:23:18,057 epoch 10 - iter 539/773 - loss 0.00212807 - time (sec): 50.50 - samples/sec: 1745.25 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:23:24,780 epoch 10 - iter 616/773 - loss 0.00232170 - time (sec): 57.22 - samples/sec: 1728.97 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:23:32,018 epoch 10 - iter 693/773 - loss 0.00220710 - time (sec): 64.46 - samples/sec: 1736.39 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:23:38,926 epoch 10 - iter 770/773 - loss 0.00216413 - time (sec): 71.37 - samples/sec: 1734.60 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:23:39,203 ----------------------------------------------------------------------------------------------------
2023-10-17 12:23:39,203 EPOCH 10 done: loss 0.0022 - lr: 0.000000
2023-10-17 12:23:42,011 DEV : loss 0.12312635034322739 - f1-score (micro avg) 0.7927
2023-10-17 12:23:42,956 ----------------------------------------------------------------------------------------------------
2023-10-17 12:23:42,958 Loading model from best epoch ...
2023-10-17 12:23:45,060 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 12:23:52,985 Results:
- F-score (micro) 0.792
- F-score (macro) 0.6961
- Accuracy 0.6819

By class:
              precision    recall  f1-score   support

         LOC     0.7973    0.8732    0.8335       946
    BUILDING     0.6277    0.6378    0.6327       185
      STREET     0.5316    0.7500    0.6222        56

   micro avg     0.7567    0.8307    0.7920      1187
   macro avg     0.6522    0.7537    0.6961      1187
weighted avg     0.7583    0.8307    0.7922      1187

2023-10-17 12:23:52,985 ----------------------------------------------------------------------------------------------------
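
Note: the configuration recorded in the log header (hmteams/teams-base-historic-multilingual-discriminator backbone, last transformer layer only, subtoken pooling "first", no CRF, batch size 8, 10 epochs, learning rate 5e-05 with a 0.1 warmup fraction) corresponds roughly to the Flair fine-tuning sketch below. This is a minimal reconstruction from the logged parameters, not the script that produced this log; the corpus loader arguments and the omission of the TensorBoard plugin are assumptions.

# Minimal sketch reconstructed from the logged parameters above (assumptions noted in comments).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2022 "topres19th" corpus, English split (the log shows the variant with document separators;
# exact loader arguments for that variant are not recorded in the log).
corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
label_dictionary = corpus.make_label_dictionary(label_type="ner")

# Backbone and pooling as encoded in the base path: hmteams discriminator,
# last transformer layer only ("-1"), subtoken pooling "first", fine-tuned end-to-end.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# "crfFalse" in the base path and the printed architecture (LockedDropout + Linear only)
# imply a plain softmax tagger without CRF or RNN on top of the transformer.
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dictionary,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() uses AdamW with a linear learning-rate schedule and warmup,
# matching the LinearScheduler (warmup_fraction 0.1) listed under Plugins.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=5e-05,
    mini_batch_size=8,
    max_epochs=10,
)

The best checkpoint written during training (best-model.pt under the base path) can afterwards be loaded with SequenceTagger.load(...) and applied to a flair Sentence via tagger.predict(...), which is the step the log reports as "Loading model from best epoch ..." before the final test evaluation.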