2023-10-17 14:02:10,167 ----------------------------------------------------------------------------------------------------
2023-10-17 14:02:10,168 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 14:02:10,169 ----------------------------------------------------------------------------------------------------
2023-10-17 14:02:10,169 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-17 14:02:10,169 ----------------------------------------------------------------------------------------------------
2023-10-17 14:02:10,169 Train: 14465 sentences
2023-10-17 14:02:10,169 (train_with_dev=False, train_with_test=False)
2023-10-17 14:02:10,169 ----------------------------------------------------------------------------------------------------
2023-10-17 14:02:10,169 Training Params:
2023-10-17 14:02:10,169  - learning_rate: "5e-05"
2023-10-17 14:02:10,169  - mini_batch_size: "4"
2023-10-17 14:02:10,169  - max_epochs: "10"
2023-10-17 14:02:10,169  - shuffle: "True"
2023-10-17 14:02:10,169 ----------------------------------------------------------------------------------------------------
2023-10-17 14:02:10,169 Plugins:
2023-10-17 14:02:10,169  - TensorboardLogger
2023-10-17 14:02:10,169  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 14:02:10,169 ----------------------------------------------------------------------------------------------------
2023-10-17 14:02:10,169 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 14:02:10,169  - metric: "('micro avg', 'f1-score')"
2023-10-17 14:02:10,169 ----------------------------------------------------------------------------------------------------
2023-10-17 14:02:10,169 Computation:
2023-10-17 14:02:10,170  - compute on device: cuda:0
2023-10-17 14:02:10,170  - embedding storage: none
2023-10-17 14:02:10,170 ----------------------------------------------------------------------------------------------------
2023-10-17 14:02:10,170 Model training base path: "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 14:02:10,170 ----------------------------------------------------------------------------------------------------
2023-10-17 14:02:10,170 ----------------------------------------------------------------------------------------------------
2023-10-17 14:02:10,170 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 14:02:34,256 epoch 1 - iter 361/3617 - loss 1.48829931 - time (sec): 24.08 - samples/sec: 1540.09 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:02:56,755 epoch 1 - iter 722/3617 - loss 0.85033394 - time (sec): 46.58 - samples/sec: 1585.21 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:03:19,865 epoch 1 - iter 1083/3617 - loss 0.61422689 - time (sec): 69.69 - samples/sec: 1603.39 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:03:42,586 epoch 1 - iter 1444/3617 - loss 0.48871525 - time (sec): 92.42 - samples/sec: 1627.24 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:04:05,993 epoch 1 - iter 1805/3617 - loss 0.42029506 - time (sec): 115.82 - samples/sec: 1608.33 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:04:29,266 epoch 1 - iter 2166/3617 - loss 0.36961441 - time (sec): 139.10 - samples/sec: 1616.75 - lr: 0.000030 - momentum: 0.000000
2023-10-17 14:04:52,250 epoch 1 - iter 2527/3617 - loss 0.33613265 - time (sec): 162.08 - samples/sec: 1612.76 - lr: 0.000035 - momentum: 0.000000
2023-10-17 14:05:14,431 epoch 1 - iter 2888/3617 - loss 0.30650713 - time (sec): 184.26 - samples/sec: 1630.13 - lr: 0.000040 - momentum: 0.000000
2023-10-17 14:05:36,910 epoch 1 - iter 3249/3617 - loss 0.28708837 - time (sec): 206.74 - samples/sec: 1640.44 - lr: 0.000045 - momentum: 0.000000
2023-10-17 14:05:58,689 epoch 1 - iter 3610/3617 - loss 0.27015219 - time (sec): 228.52 - samples/sec: 1658.39 - lr: 0.000050 - momentum: 0.000000
2023-10-17 14:05:59,151 ----------------------------------------------------------------------------------------------------
2023-10-17 14:05:59,152 EPOCH 1 done: loss 0.2697 - lr: 0.000050
2023-10-17 14:06:05,187 DEV : loss 0.1629505157470703 - f1-score (micro avg) 0.5772
2023-10-17 14:06:05,251 saving best model
2023-10-17 14:06:05,769 ----------------------------------------------------------------------------------------------------
2023-10-17 14:06:28,184 epoch 2 - iter 361/3617 - loss 0.10265159 - time (sec): 22.41 - samples/sec: 1705.78 - lr: 0.000049 - momentum: 0.000000
2023-10-17 14:06:50,375 epoch 2 - iter 722/3617 - loss 0.10329841 - time (sec): 44.60 - samples/sec: 1719.99 - lr: 0.000049 - momentum: 0.000000
2023-10-17 14:07:14,154 epoch 2 - iter 1083/3617 - loss 0.10399428 - time (sec): 68.38 - samples/sec: 1692.32 - lr: 0.000048 - momentum: 0.000000
2023-10-17 14:07:36,459 epoch 2 - iter 1444/3617 - loss 0.10410871 - time (sec): 90.69 - samples/sec: 1699.62 - lr: 0.000048 - momentum: 0.000000
2023-10-17 14:07:58,704 epoch 2 - iter 1805/3617 - loss 0.10181973 - time (sec): 112.93 - samples/sec: 1700.26 - lr: 0.000047 - momentum: 0.000000
2023-10-17 14:08:20,360 epoch 2 - iter 2166/3617 - loss 0.10423388 - time (sec): 134.59 - samples/sec: 1705.53 - lr: 0.000047 - momentum: 0.000000
2023-10-17 14:08:43,107 epoch 2 - iter 2527/3617 - loss 0.10442492 - time (sec): 157.34 - samples/sec: 1699.32 - lr: 0.000046 - momentum: 0.000000
2023-10-17 14:09:05,172 epoch 2 - iter 2888/3617 - loss 0.10365023 - time (sec): 179.40 - samples/sec: 1697.94 - lr: 0.000046 - momentum: 0.000000
2023-10-17 14:09:27,667 epoch 2 - iter 3249/3617 - loss 0.10389621 - time (sec): 201.90 - samples/sec: 1697.09 - lr: 0.000045 - momentum: 0.000000
2023-10-17 14:09:50,045 epoch 2 - iter 3610/3617 - loss 0.10374026 - time (sec): 224.27 - samples/sec: 1690.41 - lr: 0.000044 - momentum: 0.000000
2023-10-17 14:09:50,465 ----------------------------------------------------------------------------------------------------
2023-10-17 14:09:50,466 EPOCH 2 done: loss 0.1043 - lr: 0.000044
2023-10-17 14:09:56,868 DEV : loss 0.11751864850521088 - f1-score (micro avg) 0.601
2023-10-17 14:09:56,911 saving best model
2023-10-17 14:09:58,265 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:20,522 epoch 3 - iter 361/3617 - loss 0.07782100 - time (sec): 22.26 - samples/sec: 1620.38 - lr: 0.000044 - momentum: 0.000000
2023-10-17 14:10:42,917 epoch 3 - iter 722/3617 - loss 0.13095645 - time (sec): 44.65 - samples/sec: 1649.46 - lr: 0.000043 - momentum: 0.000000
2023-10-17 14:11:06,504 epoch 3 - iter 1083/3617 - loss 0.19694256 - time (sec): 68.24 - samples/sec: 1650.76 - lr: 0.000043 - momentum: 0.000000
2023-10-17 14:11:29,931 epoch 3 - iter 1444/3617 - loss 0.22957207 - time (sec): 91.66 - samples/sec: 1629.45 - lr: 0.000042 - momentum: 0.000000
2023-10-17 14:11:53,130 epoch 3 - iter 1805/3617 - loss 0.24523156 - time (sec): 114.86 - samples/sec: 1630.73 - lr: 0.000042 - momentum: 0.000000
2023-10-17 14:12:16,115 epoch 3 - iter 2166/3617 - loss 0.25071181 - time (sec): 137.85 - samples/sec: 1636.92 - lr: 0.000041 - momentum: 0.000000
2023-10-17 14:12:39,825 epoch 3 - iter 2527/3617 - loss 0.25706230 - time (sec): 161.56 - samples/sec: 1632.66 - lr: 0.000041 - momentum: 0.000000
2023-10-17 14:13:02,066 epoch 3 - iter 2888/3617 - loss 0.26647748 - time (sec): 183.80 - samples/sec: 1641.06 - lr: 0.000040 - momentum: 0.000000
2023-10-17 14:13:24,095 epoch 3 - iter 3249/3617 - loss 0.26949558 - time (sec): 205.83 - samples/sec: 1655.05 - lr: 0.000039 - momentum: 0.000000
2023-10-17 14:13:46,900 epoch 3 - iter 3610/3617 - loss 0.27219290 - time (sec): 228.63 - samples/sec: 1658.50 - lr: 0.000039 - momentum: 0.000000
2023-10-17 14:13:47,341 ----------------------------------------------------------------------------------------------------
2023-10-17 14:13:47,342 EPOCH 3 done: loss 0.2720 - lr: 0.000039
2023-10-17 14:13:53,508 DEV : loss 0.27629604935646057 - f1-score (micro avg) 0.0
2023-10-17 14:13:53,549 ----------------------------------------------------------------------------------------------------
2023-10-17 14:14:15,770 epoch 4 - iter 361/3617 - loss 0.28434821 - time (sec): 22.22 - samples/sec: 1745.31 - lr: 0.000038 - momentum: 0.000000
2023-10-17 14:14:37,994 epoch 4 - iter 722/3617 - loss 0.28662856 - time (sec): 44.44 - samples/sec: 1709.19 - lr: 0.000038 - momentum: 0.000000
2023-10-17 14:15:01,635 epoch 4 - iter 1083/3617 - loss 0.28765719 - time (sec): 68.08 - samples/sec: 1682.03 - lr: 0.000037 - momentum: 0.000000
2023-10-17 14:15:25,259 epoch 4 - iter 1444/3617 - loss 0.29310210 - time (sec): 91.71 - samples/sec: 1658.94 - lr: 0.000037 - momentum: 0.000000
2023-10-17 14:15:47,651 epoch 4 - iter 1805/3617 - loss 0.29723610 - time (sec): 114.10 - samples/sec: 1661.49 - lr: 0.000036 - momentum: 0.000000
2023-10-17 14:16:09,310 epoch 4 - iter 2166/3617 - loss 0.29899007 - time (sec): 135.76 - samples/sec: 1679.40 - lr: 0.000036 - momentum: 0.000000
2023-10-17 14:16:29,608 epoch 4 - iter 2527/3617 - loss 0.30085377 - time (sec): 156.06 - samples/sec: 1698.84 - lr: 0.000035 - momentum: 0.000000
2023-10-17 14:16:50,719 epoch 4 - iter 2888/3617 - loss 0.29818638 - time (sec): 177.17 - samples/sec: 1708.80 - lr: 0.000034 - momentum: 0.000000
2023-10-17 14:17:12,618 epoch 4 - iter 3249/3617 - loss 0.29842786 - time (sec): 199.07 - samples/sec: 1710.84 - lr: 0.000034 - momentum: 0.000000
2023-10-17 14:17:34,892 epoch 4 - iter 3610/3617 - loss 0.29517798 - time (sec): 221.34 - samples/sec: 1713.54 - lr: 0.000033 - momentum: 0.000000
2023-10-17 14:17:35,313 ----------------------------------------------------------------------------------------------------
2023-10-17 14:17:35,313 EPOCH 4 done: loss 0.2954 - lr: 0.000033
2023-10-17 14:17:41,706 DEV : loss 0.27378401160240173 - f1-score (micro avg) 0.0
2023-10-17 14:17:41,753 ----------------------------------------------------------------------------------------------------
2023-10-17 14:18:03,827 epoch 5 - iter 361/3617 - loss 0.29585582 - time (sec): 22.07 - samples/sec: 1693.01 - lr: 0.000033 - momentum: 0.000000
2023-10-17 14:18:25,797 epoch 5 - iter 722/3617 - loss 0.30734634 - time (sec): 44.04 - samples/sec: 1741.31 - lr: 0.000032 - momentum: 0.000000
2023-10-17 14:18:47,751 epoch 5 - iter 1083/3617 - loss 0.30677295 - time (sec): 66.00 - samples/sec: 1732.69 - lr: 0.000032 - momentum: 0.000000
2023-10-17 14:19:09,931 epoch 5 - iter 1444/3617 - loss 0.30178700 - time (sec): 88.18 - samples/sec: 1722.96 - lr: 0.000031 - momentum: 0.000000
2023-10-17 14:19:29,287 epoch 5 - iter 1805/3617 - loss 0.30360762 - time (sec): 107.53 - samples/sec: 1760.45 - lr: 0.000031 - momentum: 0.000000
2023-10-17 14:19:47,284 epoch 5 - iter 2166/3617 - loss 0.29560017 - time (sec): 125.53 - samples/sec: 1819.40 - lr: 0.000030 - momentum: 0.000000
2023-10-17 14:20:07,689 epoch 5 - iter 2527/3617 - loss 0.29672511 - time (sec): 145.93 - samples/sec: 1819.18 - lr: 0.000029 - momentum: 0.000000
2023-10-17 14:20:31,679 epoch 5 - iter 2888/3617 - loss 0.29532907 - time (sec): 169.92 - samples/sec: 1780.66 - lr: 0.000029 - momentum: 0.000000
2023-10-17 14:20:56,188 epoch 5 - iter 3249/3617 - loss 0.29258929 - time (sec): 194.43 - samples/sec: 1748.47 - lr: 0.000028 - momentum: 0.000000
2023-10-17 14:21:19,125 epoch 5 - iter 3610/3617 - loss 0.29276919 - time (sec): 217.37 - samples/sec: 1744.62 - lr: 0.000028 - momentum: 0.000000
2023-10-17 14:21:19,529 ----------------------------------------------------------------------------------------------------
2023-10-17 14:21:19,529 EPOCH 5 done: loss 0.2923 - lr: 0.000028
2023-10-17 14:21:26,668 DEV : loss 0.28267860412597656 - f1-score (micro avg) 0.0
2023-10-17 14:21:26,710 ----------------------------------------------------------------------------------------------------
2023-10-17 14:21:49,628 epoch 6 - iter 361/3617 - loss 0.31633895 - time (sec): 22.92 - samples/sec: 1657.30 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:22:12,592 epoch 6 - iter 722/3617 - loss 0.30176974 - time (sec): 45.88 - samples/sec: 1675.35 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:22:35,655 epoch 6 - iter 1083/3617 - loss 0.29196082 - time (sec): 68.94 - samples/sec: 1628.11 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:22:57,362 epoch 6 - iter 1444/3617 - loss 0.29309770 - time (sec): 90.65 - samples/sec: 1655.27 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:23:19,415 epoch 6 - iter 1805/3617 - loss 0.29093340 - time (sec): 112.70 - samples/sec: 1670.26 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:23:42,194 epoch 6 - iter 2166/3617 - loss 0.28848656 - time (sec): 135.48 - samples/sec: 1663.06 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:24:02,974 epoch 6 - iter 2527/3617 - loss 0.28834683 - time (sec): 156.26 - samples/sec: 1688.07 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:24:25,241 epoch 6 - iter 2888/3617 - loss 0.28767752 - time (sec): 178.53 - samples/sec: 1700.00 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:24:46,320 epoch 6 - iter 3249/3617 - loss 0.28898388 - time (sec): 199.61 - samples/sec: 1708.79 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:25:08,367 epoch 6 - iter 3610/3617 - loss 0.29119534 - time (sec): 221.66 - samples/sec: 1711.55 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:25:08,813 ----------------------------------------------------------------------------------------------------
2023-10-17 14:25:08,813 EPOCH 6 done: loss 0.2909 - lr: 0.000022
2023-10-17 14:25:15,238 DEV : loss 0.2753566801548004 - f1-score (micro avg) 0.0
2023-10-17 14:25:15,284 ----------------------------------------------------------------------------------------------------
2023-10-17 14:25:38,480 epoch 7 - iter 361/3617 - loss 0.29459342 - time (sec): 23.19 - samples/sec: 1582.47 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:26:02,167 epoch 7 - iter 722/3617 - loss 0.29999554 - time (sec): 46.88 - samples/sec: 1574.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:26:22,244 epoch 7 - iter 1083/3617 - loss 0.30367724 - time (sec): 66.96 - samples/sec: 1675.78 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:26:42,989 epoch 7 - iter 1444/3617 - loss 0.29949911 - time (sec): 87.70 - samples/sec: 1729.77 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:27:04,907 epoch 7 - iter 1805/3617 - loss 0.29458090 - time (sec): 109.62 - samples/sec: 1730.90 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:27:27,152 epoch 7 - iter 2166/3617 - loss 0.29447266 - time (sec): 131.87 - samples/sec: 1722.54 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:27:49,678 epoch 7 - iter 2527/3617 - loss 0.29196948 - time (sec): 154.39 - samples/sec: 1713.11 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:28:12,093 epoch 7 - iter 2888/3617 - loss 0.29160380 - time (sec): 176.81 - samples/sec: 1714.83 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:28:32,813 epoch 7 - iter 3249/3617 - loss 0.29100505 - time (sec): 197.53 - samples/sec: 1720.07 - lr: 0.000017 - momentum: 0.000000
2023-10-17 14:28:55,485 epoch 7 - iter 3610/3617 - loss 0.28909597 - time (sec): 220.20 - samples/sec: 1722.46 - lr: 0.000017 - momentum: 0.000000
2023-10-17 14:28:55,904 ----------------------------------------------------------------------------------------------------
2023-10-17 14:28:55,904 EPOCH 7 done: loss 0.2891 - lr: 0.000017
2023-10-17 14:29:02,968 DEV : loss 0.27626705169677734 - f1-score (micro avg) 0.0
2023-10-17 14:29:03,011 ----------------------------------------------------------------------------------------------------
2023-10-17 14:29:25,242 epoch 8 - iter 361/3617 - loss 0.31236022 - time (sec): 22.23 - samples/sec: 1716.62 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:29:47,467 epoch 8 - iter 722/3617 - loss 0.28321192 - time (sec): 44.45 - samples/sec: 1720.68 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:30:09,788 epoch 8 - iter 1083/3617 - loss 0.29106394 - time (sec): 66.78 - samples/sec: 1723.35 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:30:28,711 epoch 8 - iter 1444/3617 - loss 0.28998021 - time (sec): 85.70 - samples/sec: 1792.72 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:30:48,687 epoch 8 - iter 1805/3617 - loss 0.29273776 - time (sec): 105.67 - samples/sec: 1807.53 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:31:12,091 epoch 8 - iter 2166/3617 - loss 0.29087306 - time (sec): 129.08 - samples/sec: 1765.52 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:31:35,020 epoch 8 - iter 2527/3617 - loss 0.29104844 - time (sec): 152.01 - samples/sec: 1752.55 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:31:57,321 epoch 8 - iter 2888/3617 - loss 0.28886061 - time (sec): 174.31 - samples/sec: 1744.19 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:32:19,421 epoch 8 - iter 3249/3617 - loss 0.28980765 - time (sec): 196.41 - samples/sec: 1736.96 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:32:42,527 epoch 8 - iter 3610/3617 - loss 0.28967184 - time (sec): 219.51 - samples/sec: 1727.61 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:32:42,982 ----------------------------------------------------------------------------------------------------
2023-10-17 14:32:42,983 EPOCH 8 done: loss 0.2896 - lr: 0.000011
2023-10-17 14:32:49,305 DEV : loss 0.27210137248039246 - f1-score (micro avg) 0.0
2023-10-17 14:32:49,346 ----------------------------------------------------------------------------------------------------
2023-10-17 14:33:13,178 epoch 9 - iter 361/3617 - loss 0.28067007 - time (sec): 23.83 - samples/sec: 1591.67 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:33:36,507 epoch 9 - iter 722/3617 - loss 0.29173550 - time (sec): 47.16 - samples/sec: 1600.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:33:58,625 epoch 9 - iter 1083/3617 - loss 0.29526277 - time (sec): 69.28 - samples/sec: 1635.40 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:34:20,661 epoch 9 - iter 1444/3617 - loss 0.29866563 - time (sec): 91.31 - samples/sec: 1655.50 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:34:42,535 epoch 9 - iter 1805/3617 - loss 0.29421108 - time (sec): 113.19 - samples/sec: 1656.52 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:35:05,072 epoch 9 - iter 2166/3617 - loss 0.29156705 - time (sec): 135.72 - samples/sec: 1650.25 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:35:27,983 epoch 9 - iter 2527/3617 - loss 0.28905986 - time (sec): 158.63 - samples/sec: 1657.66 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:35:51,105 epoch 9 - iter 2888/3617 - loss 0.28899897 - time (sec): 181.76 - samples/sec: 1660.96 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:36:13,747 epoch 9 - iter 3249/3617 - loss 0.28780211 - time (sec): 204.40 - samples/sec: 1665.07 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:36:36,682 epoch 9 - iter 3610/3617 - loss 0.28731489 - time (sec): 227.33 - samples/sec: 1668.06 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:36:37,104 ----------------------------------------------------------------------------------------------------
2023-10-17 14:36:37,105 EPOCH 9 done: loss 0.2872 - lr: 0.000006
2023-10-17 14:36:44,236 DEV : loss 0.2753986716270447 - f1-score (micro avg) 0.0
2023-10-17 14:36:44,285 ----------------------------------------------------------------------------------------------------
2023-10-17 14:37:06,410 epoch 10 - iter 361/3617 - loss 0.27043071 - time (sec): 22.12 - samples/sec: 1709.90 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:37:28,371 epoch 10 - iter 722/3617 - loss 0.28203851 - time (sec): 44.08 - samples/sec: 1741.00 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:37:51,166 epoch 10 - iter 1083/3617 - loss 0.28543814 - time (sec): 66.88 - samples/sec: 1699.35 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:38:13,771 epoch 10 - iter 1444/3617 - loss 0.29077042 - time (sec): 89.48 - samples/sec: 1675.01 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:38:36,539 epoch 10 - iter 1805/3617 - loss 0.28769869 - time (sec): 112.25 - samples/sec: 1672.98 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:38:59,473 epoch 10 - iter 2166/3617 - loss 0.28835903 - time (sec): 135.19 - samples/sec: 1667.81 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:39:21,741 epoch 10 - iter 2527/3617 - loss 0.28763744 - time (sec): 157.45 - samples/sec: 1685.07 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:39:43,531 epoch 10 - iter 2888/3617 - loss 0.28638070 - time (sec): 179.24 - samples/sec: 1686.58 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:40:05,491 epoch 10 - iter 3249/3617 - loss 0.28462302 - time (sec): 201.20 - samples/sec: 1694.56 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:40:27,823 epoch 10 - iter 3610/3617 - loss 0.28726530 - time (sec): 223.54 - samples/sec: 1696.99 - lr: 0.000000 - momentum: 0.000000
2023-10-17 14:40:28,262 ----------------------------------------------------------------------------------------------------
2023-10-17 14:40:28,262 EPOCH 10 done: loss 0.2875 - lr: 0.000000
2023-10-17 14:40:34,582 DEV : loss 0.27373775839805603 - f1-score (micro avg) 0.0
2023-10-17 14:40:35,146 ----------------------------------------------------------------------------------------------------
2023-10-17 14:40:35,148 Loading model from best epoch ...
2023-10-17 14:40:36,894 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-17 14:40:45,860 Results:
- F-score (micro) 0.6123
- F-score (macro) 0.403
- Accuracy 0.452

By class:
              precision    recall  f1-score   support

         loc     0.6609    0.7750    0.7134       591
        pers     0.5073    0.4846    0.4957       357
         org     0.0000    0.0000    0.0000        79

   micro avg     0.6103    0.6144    0.6123      1027
   macro avg     0.3894    0.4199    0.4030      1027
weighted avg     0.5567    0.6144    0.5828      1027

2023-10-17 14:40:45,860 ----------------------------------------------------------------------------------------------------
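
The header of this log (training params, LinearScheduler plugin with warmup_fraction 0.1, a TransformerWordEmbeddings-backed SequenceTagger without CRF) describes a standard Flair fine-tuning setup. The following is a minimal sketch of how such a run could be reproduced; the checkpoint name "hmteams/teams-base-historic-multilingual-discriminator" and the exact NER_HIPE_2022 constructor arguments are assumptions inferred from the base path and corpus lines, not stated verbatim in the log.

# Sketch of a Flair fine-tuning run matching the logged hyperparameters.
# Assumed (not verbatim in the log): the Hugging Face checkpoint name and the
# NER_HIPE_2022 arguments for the letemps/fr subset.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus: HIPE-2022 "letemps" (French), as reported in the MultiCorpus line.
corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")
label_type = "ner"
label_dict = corpus.make_label_dictionary(label_type=label_type)

# Embeddings: last layer only, first-subtoken pooling, fine-tuned
# ("poolingfirst-layers-1" in the base path naming).
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed checkpoint
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Tagger: plain linear head, no CRF and no RNN ("crfFalse" in the base path),
# which yields the LockedDropout + Linear(768, 13) head shown in the model dump.
tagger = SequenceTagger(
    hidden_size=256,  # required argument; unused when use_rnn=False
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type=label_type,
    use_crf=False,
    use_rnn=False,
)

# fine_tune() uses AdamW with a linear schedule and warmup, matching the
# LinearScheduler plugin (warmup_fraction 0.1) and the lr curve in the log.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
)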
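
The final scores above come from best-model.pt (the epoch-2 checkpoint, dev micro-F1 0.601, which the later collapsed epochs never improved on). A short usage sketch for loading that checkpoint and tagging a sentence follows; the example sentence is illustrative only, and the label type "ner" is assumed.

# Minimal sketch: load the best checkpoint written by the run above and tag text.
# The path is the training base path from the log plus Flair's standard file name.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

# Illustrative sentence (not taken from the corpus).
sentence = Sentence("Victor Hugo est arrivé à Genève .")
tagger.predict(sentence)

# Print recognized spans with their loc/pers/org labels and confidences.
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)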