2023-10-17 16:08:32,176 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:32,177 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:08:32,177 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:32,178 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-17 16:08:32,178 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:32,178 Train: 14465 sentences
2023-10-17 16:08:32,178 (train_with_dev=False, train_with_test=False)
2023-10-17 16:08:32,178 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:32,178 Training Params:
2023-10-17 16:08:32,178 - learning_rate: "5e-05"
2023-10-17 16:08:32,178 - mini_batch_size: "4"
2023-10-17 16:08:32,178 - max_epochs: "10"
2023-10-17 16:08:32,178 - shuffle: "True"
2023-10-17 16:08:32,178 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:32,178 Plugins:
2023-10-17 16:08:32,178 - TensorboardLogger
2023-10-17 16:08:32,178 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:08:32,178 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:32,178 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:08:32,178 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:08:32,178 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:32,178 Computation:
2023-10-17 16:08:32,178 - compute on device: cuda:0
2023-10-17 16:08:32,178 - embedding storage: none
2023-10-17 16:08:32,178 ----------------------------------------------------------------------------------------------------
"hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4" 2023-10-17 16:08:32,179 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:08:32,179 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:08:32,179 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 16:08:54,759 epoch 1 - iter 361/3617 - loss 1.59316003 - time (sec): 22.58 - samples/sec: 1623.07 - lr: 0.000005 - momentum: 0.000000 2023-10-17 16:09:16,935 epoch 1 - iter 722/3617 - loss 0.87718397 - time (sec): 44.76 - samples/sec: 1693.28 - lr: 0.000010 - momentum: 0.000000 2023-10-17 16:09:38,479 epoch 1 - iter 1083/3617 - loss 0.63431926 - time (sec): 66.30 - samples/sec: 1714.61 - lr: 0.000015 - momentum: 0.000000 2023-10-17 16:09:59,988 epoch 1 - iter 1444/3617 - loss 0.50679500 - time (sec): 87.81 - samples/sec: 1739.68 - lr: 0.000020 - momentum: 0.000000 2023-10-17 16:10:21,568 epoch 1 - iter 1805/3617 - loss 0.43084256 - time (sec): 109.39 - samples/sec: 1733.32 - lr: 0.000025 - momentum: 0.000000 2023-10-17 16:10:43,946 epoch 1 - iter 2166/3617 - loss 0.37907471 - time (sec): 131.77 - samples/sec: 1730.83 - lr: 0.000030 - momentum: 0.000000 2023-10-17 16:11:05,571 epoch 1 - iter 2527/3617 - loss 0.34218192 - time (sec): 153.39 - samples/sec: 1735.86 - lr: 0.000035 - momentum: 0.000000 2023-10-17 16:11:28,268 epoch 1 - iter 2888/3617 - loss 0.31329874 - time (sec): 176.09 - samples/sec: 1736.84 - lr: 0.000040 - momentum: 0.000000 2023-10-17 16:11:50,780 epoch 1 - iter 3249/3617 - loss 0.29265770 - time (sec): 198.60 - samples/sec: 1727.16 - lr: 0.000045 - momentum: 0.000000 2023-10-17 16:12:12,713 epoch 1 - iter 3610/3617 - loss 0.27878591 - time (sec): 220.53 - samples/sec: 1720.18 - lr: 0.000050 - momentum: 0.000000 2023-10-17 16:12:13,136 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:12:13,136 EPOCH 1 done: loss 0.2785 - lr: 0.000050 2023-10-17 16:12:18,675 DEV : loss 0.11415738612413406 - f1-score (micro avg) 0.5792 2023-10-17 16:12:18,723 saving best model 2023-10-17 16:12:19,229 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:12:41,586 epoch 2 - iter 361/3617 - loss 0.11018875 - time (sec): 22.36 - samples/sec: 1738.46 - lr: 0.000049 - momentum: 0.000000 2023-10-17 16:13:03,651 epoch 2 - iter 722/3617 - loss 0.10673270 - time (sec): 44.42 - samples/sec: 1724.62 - lr: 0.000049 - momentum: 0.000000 2023-10-17 16:13:25,253 epoch 2 - iter 1083/3617 - loss 0.09957929 - time (sec): 66.02 - samples/sec: 1741.81 - lr: 0.000048 - momentum: 0.000000 2023-10-17 16:13:46,707 epoch 2 - iter 1444/3617 - loss 0.09924106 - time (sec): 87.48 - samples/sec: 1734.06 - lr: 0.000048 - momentum: 0.000000 2023-10-17 16:14:08,448 epoch 2 - iter 1805/3617 - loss 0.09971665 - time (sec): 109.22 - samples/sec: 1728.06 - lr: 0.000047 - momentum: 0.000000 2023-10-17 16:14:30,098 epoch 2 - iter 2166/3617 - loss 0.10186845 - time (sec): 130.87 - samples/sec: 1720.27 - lr: 0.000047 - momentum: 0.000000 2023-10-17 16:14:51,935 epoch 2 - iter 2527/3617 - loss 0.10393233 - time (sec): 152.70 - samples/sec: 1721.16 - lr: 0.000046 - momentum: 0.000000 2023-10-17 16:15:13,576 epoch 2 - iter 2888/3617 - loss 0.10390794 - time (sec): 174.35 - samples/sec: 1731.93 - lr: 0.000046 - momentum: 
2023-10-17 16:15:35,151 epoch 2 - iter 3249/3617 - loss 0.10301978 - time (sec): 195.92 - samples/sec: 1732.01 - lr: 0.000045 - momentum: 0.000000
2023-10-17 16:15:56,894 epoch 2 - iter 3610/3617 - loss 0.10250492 - time (sec): 217.66 - samples/sec: 1741.45 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:15:57,298 ----------------------------------------------------------------------------------------------------
2023-10-17 16:15:57,298 EPOCH 2 done: loss 0.1025 - lr: 0.000044
2023-10-17 16:16:04,138 DEV : loss 0.1211915835738182 - f1-score (micro avg) 0.578
2023-10-17 16:16:04,179 ----------------------------------------------------------------------------------------------------
2023-10-17 16:16:25,632 epoch 3 - iter 361/3617 - loss 0.08273042 - time (sec): 21.45 - samples/sec: 1784.67 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:16:47,094 epoch 3 - iter 722/3617 - loss 0.08527928 - time (sec): 42.91 - samples/sec: 1787.74 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:17:08,726 epoch 3 - iter 1083/3617 - loss 0.08265513 - time (sec): 64.55 - samples/sec: 1767.47 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:17:31,084 epoch 3 - iter 1444/3617 - loss 0.08240273 - time (sec): 86.90 - samples/sec: 1749.29 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:17:54,260 epoch 3 - iter 1805/3617 - loss 0.08334927 - time (sec): 110.08 - samples/sec: 1722.08 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:18:17,478 epoch 3 - iter 2166/3617 - loss 0.08383538 - time (sec): 133.30 - samples/sec: 1705.38 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:18:39,969 epoch 3 - iter 2527/3617 - loss 0.08409338 - time (sec): 155.79 - samples/sec: 1706.81 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:19:01,962 epoch 3 - iter 2888/3617 - loss 0.08533779 - time (sec): 177.78 - samples/sec: 1703.34 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:19:23,533 epoch 3 - iter 3249/3617 - loss 0.08826896 - time (sec): 199.35 - samples/sec: 1706.16 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:19:45,437 epoch 3 - iter 3610/3617 - loss 0.08787095 - time (sec): 221.26 - samples/sec: 1714.34 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:19:45,845 ----------------------------------------------------------------------------------------------------
2023-10-17 16:19:45,845 EPOCH 3 done: loss 0.0881 - lr: 0.000039
2023-10-17 16:19:52,208 DEV : loss 0.17476259171962738 - f1-score (micro avg) 0.6021
2023-10-17 16:19:52,254 saving best model
2023-10-17 16:19:52,842 ----------------------------------------------------------------------------------------------------
2023-10-17 16:20:14,661 epoch 4 - iter 361/3617 - loss 0.05439196 - time (sec): 21.82 - samples/sec: 1719.29 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:20:36,456 epoch 4 - iter 722/3617 - loss 0.05961071 - time (sec): 43.61 - samples/sec: 1741.20 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:20:58,045 epoch 4 - iter 1083/3617 - loss 0.06537487 - time (sec): 65.20 - samples/sec: 1761.71 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:21:20,050 epoch 4 - iter 1444/3617 - loss 0.06504166 - time (sec): 87.21 - samples/sec: 1741.01 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:21:42,953 epoch 4 - iter 1805/3617 - loss 0.06566737 - time (sec): 110.11 - samples/sec: 1708.21 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:22:04,721 epoch 4 - iter 2166/3617 - loss 0.06563944 - time (sec): 131.88 - samples/sec: 1727.09 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:22:26,157 epoch 4 - iter 2527/3617 - loss 0.06616583 - time (sec): 153.31 - samples/sec: 1730.54 - lr: 0.000035 - momentum: 0.000000
2023-10-17 16:22:47,642 epoch 4 - iter 2888/3617 - loss 0.06528920 - time (sec): 174.80 - samples/sec: 1731.39 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:23:09,259 epoch 4 - iter 3249/3617 - loss 0.06547094 - time (sec): 196.42 - samples/sec: 1737.82 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:23:31,047 epoch 4 - iter 3610/3617 - loss 0.06536416 - time (sec): 218.20 - samples/sec: 1738.72 - lr: 0.000033 - momentum: 0.000000
2023-10-17 16:23:31,466 ----------------------------------------------------------------------------------------------------
2023-10-17 16:23:31,467 EPOCH 4 done: loss 0.0653 - lr: 0.000033
2023-10-17 16:23:38,400 DEV : loss 0.22877228260040283 - f1-score (micro avg) 0.6333
2023-10-17 16:23:38,440 saving best model
2023-10-17 16:23:39,040 ----------------------------------------------------------------------------------------------------
2023-10-17 16:24:01,038 epoch 5 - iter 361/3617 - loss 0.04861305 - time (sec): 22.00 - samples/sec: 1733.21 - lr: 0.000033 - momentum: 0.000000
2023-10-17 16:24:23,079 epoch 5 - iter 722/3617 - loss 0.04897049 - time (sec): 44.04 - samples/sec: 1722.65 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:24:44,814 epoch 5 - iter 1083/3617 - loss 0.05357731 - time (sec): 65.77 - samples/sec: 1717.83 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:25:06,457 epoch 5 - iter 1444/3617 - loss 0.04890673 - time (sec): 87.42 - samples/sec: 1727.37 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:25:28,982 epoch 5 - iter 1805/3617 - loss 0.04924110 - time (sec): 109.94 - samples/sec: 1710.23 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:25:50,419 epoch 5 - iter 2166/3617 - loss 0.04799027 - time (sec): 131.38 - samples/sec: 1711.81 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:26:12,067 epoch 5 - iter 2527/3617 - loss 0.04886923 - time (sec): 153.03 - samples/sec: 1723.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:26:33,507 epoch 5 - iter 2888/3617 - loss 0.04895532 - time (sec): 174.47 - samples/sec: 1730.14 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:26:54,980 epoch 5 - iter 3249/3617 - loss 0.04969032 - time (sec): 195.94 - samples/sec: 1736.77 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:27:16,714 epoch 5 - iter 3610/3617 - loss 0.04894855 - time (sec): 217.67 - samples/sec: 1742.34 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:27:17,106 ----------------------------------------------------------------------------------------------------
2023-10-17 16:27:17,106 EPOCH 5 done: loss 0.0489 - lr: 0.000028
2023-10-17 16:27:23,358 DEV : loss 0.23564380407333374 - f1-score (micro avg) 0.6461
2023-10-17 16:27:23,399 saving best model
2023-10-17 16:27:23,986 ----------------------------------------------------------------------------------------------------
2023-10-17 16:27:45,735 epoch 6 - iter 361/3617 - loss 0.03048999 - time (sec): 21.75 - samples/sec: 1751.55 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:28:08,416 epoch 6 - iter 722/3617 - loss 0.03137315 - time (sec): 44.43 - samples/sec: 1679.67 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:28:31,192 epoch 6 - iter 1083/3617 - loss 0.02991919 - time (sec): 67.20 - samples/sec: 1674.33 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:28:53,849 epoch 6 - iter 1444/3617 - loss 0.03053765 - time (sec): 89.86 - samples/sec: 1695.66 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:29:16,297 epoch 6 - iter 1805/3617 - loss 0.03086143 - time (sec): 112.31 - samples/sec: 1690.78 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:29:38,606 epoch 6 - iter 2166/3617 - loss 0.03193821 - time (sec): 134.62 - samples/sec: 1694.58 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:30:01,036 epoch 6 - iter 2527/3617 - loss 0.03226272 - time (sec): 157.05 - samples/sec: 1679.22 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:30:23,898 epoch 6 - iter 2888/3617 - loss 0.03126579 - time (sec): 179.91 - samples/sec: 1677.77 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:30:46,833 epoch 6 - iter 3249/3617 - loss 0.03176394 - time (sec): 202.85 - samples/sec: 1677.19 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:31:09,891 epoch 6 - iter 3610/3617 - loss 0.03149282 - time (sec): 225.90 - samples/sec: 1678.94 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:31:10,289 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:10,291 EPOCH 6 done: loss 0.0315 - lr: 0.000022
2023-10-17 16:31:17,347 DEV : loss 0.3068622648715973 - f1-score (micro avg) 0.6404
2023-10-17 16:31:17,387 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:39,287 epoch 7 - iter 361/3617 - loss 0.02154516 - time (sec): 21.90 - samples/sec: 1656.45 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:32:01,457 epoch 7 - iter 722/3617 - loss 0.02204868 - time (sec): 44.07 - samples/sec: 1657.47 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:32:23,647 epoch 7 - iter 1083/3617 - loss 0.02181568 - time (sec): 66.26 - samples/sec: 1653.97 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:32:45,349 epoch 7 - iter 1444/3617 - loss 0.02191833 - time (sec): 87.96 - samples/sec: 1689.89 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:33:07,781 epoch 7 - iter 1805/3617 - loss 0.02270628 - time (sec): 110.39 - samples/sec: 1708.20 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:33:29,669 epoch 7 - iter 2166/3617 - loss 0.02357930 - time (sec): 132.28 - samples/sec: 1722.88 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:33:51,178 epoch 7 - iter 2527/3617 - loss 0.02346310 - time (sec): 153.79 - samples/sec: 1723.77 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:34:12,913 epoch 7 - iter 2888/3617 - loss 0.02349435 - time (sec): 175.52 - samples/sec: 1722.81 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:34:34,514 epoch 7 - iter 3249/3617 - loss 0.02305226 - time (sec): 197.12 - samples/sec: 1727.36 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:34:56,192 epoch 7 - iter 3610/3617 - loss 0.02258401 - time (sec): 218.80 - samples/sec: 1732.80 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:34:56,582 ----------------------------------------------------------------------------------------------------
2023-10-17 16:34:56,582 EPOCH 7 done: loss 0.0226 - lr: 0.000017
2023-10-17 16:35:03,064 DEV : loss 0.36475077271461487 - f1-score (micro avg) 0.6412
2023-10-17 16:35:03,107 ----------------------------------------------------------------------------------------------------
2023-10-17 16:35:24,549 epoch 8 - iter 361/3617 - loss 0.02045659 - time (sec): 21.44 - samples/sec: 1736.80 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:35:46,178 epoch 8 - iter 722/3617 - loss 0.01647243 - time (sec): 43.07 - samples/sec: 1713.09 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:36:07,941 epoch 8 - iter 1083/3617 - loss 0.01445657 - time (sec): 64.83 - samples/sec: 1714.13 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:36:30,179 epoch 8 - iter 1444/3617 - loss 0.01444114 - time (sec): 87.07 - samples/sec: 1727.94 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:36:52,195 epoch 8 - iter 1805/3617 - loss 0.01502009 - time (sec): 109.09 - samples/sec: 1727.91 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:37:13,873 epoch 8 - iter 2166/3617 - loss 0.01501820 - time (sec): 130.76 - samples/sec: 1726.98 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:37:35,515 epoch 8 - iter 2527/3617 - loss 0.01528970 - time (sec): 152.41 - samples/sec: 1735.65 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:37:58,666 epoch 8 - iter 2888/3617 - loss 0.01583012 - time (sec): 175.56 - samples/sec: 1726.61 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:38:22,617 epoch 8 - iter 3249/3617 - loss 0.01525804 - time (sec): 199.51 - samples/sec: 1702.48 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:38:46,049 epoch 8 - iter 3610/3617 - loss 0.01498113 - time (sec): 222.94 - samples/sec: 1700.32 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:38:46,515 ----------------------------------------------------------------------------------------------------
2023-10-17 16:38:46,516 EPOCH 8 done: loss 0.0149 - lr: 0.000011
2023-10-17 16:38:52,987 DEV : loss 0.38312429189682007 - f1-score (micro avg) 0.6461
2023-10-17 16:38:53,028 ----------------------------------------------------------------------------------------------------
2023-10-17 16:39:14,936 epoch 9 - iter 361/3617 - loss 0.01109698 - time (sec): 21.91 - samples/sec: 1660.95 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:39:38,405 epoch 9 - iter 722/3617 - loss 0.01173978 - time (sec): 45.38 - samples/sec: 1632.06 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:40:02,186 epoch 9 - iter 1083/3617 - loss 0.01129520 - time (sec): 69.16 - samples/sec: 1624.14 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:40:26,563 epoch 9 - iter 1444/3617 - loss 0.01113245 - time (sec): 93.53 - samples/sec: 1613.99 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:40:48,569 epoch 9 - iter 1805/3617 - loss 0.01049652 - time (sec): 115.54 - samples/sec: 1634.64 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:41:12,010 epoch 9 - iter 2166/3617 - loss 0.01067123 - time (sec): 138.98 - samples/sec: 1632.70 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:41:35,070 epoch 9 - iter 2527/3617 - loss 0.01012252 - time (sec): 162.04 - samples/sec: 1648.39 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:41:56,800 epoch 9 - iter 2888/3617 - loss 0.00987860 - time (sec): 183.77 - samples/sec: 1661.48 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:42:18,535 epoch 9 - iter 3249/3617 - loss 0.00989462 - time (sec): 205.50 - samples/sec: 1667.99 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:42:41,057 epoch 9 - iter 3610/3617 - loss 0.00959614 - time (sec): 228.03 - samples/sec: 1663.07 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:42:41,508 ----------------------------------------------------------------------------------------------------
2023-10-17 16:42:41,508 EPOCH 9 done: loss 0.0096 - lr: 0.000006
2023-10-17 16:42:47,816 DEV : loss 0.4049243628978729 - f1-score (micro avg) 0.6505
2023-10-17 16:42:47,860 saving best model
2023-10-17 16:42:48,460 ----------------------------------------------------------------------------------------------------
2023-10-17 16:43:10,345 epoch 10 - iter 361/3617 - loss 0.00529713 - time (sec): 21.88 - samples/sec: 1724.95 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:43:32,081 epoch 10 - iter 722/3617 - loss 0.00553721 - time (sec): 43.62 - samples/sec: 1787.77 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:43:53,824 epoch 10 - iter 1083/3617 - loss 0.00483622 - time (sec): 65.36 - samples/sec: 1765.91 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:44:15,560 epoch 10 - iter 1444/3617 - loss 0.00556484 - time (sec): 87.10 - samples/sec: 1762.45 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:44:37,180 epoch 10 - iter 1805/3617 - loss 0.00604277 - time (sec): 108.72 - samples/sec: 1745.58 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:44:58,941 epoch 10 - iter 2166/3617 - loss 0.00588800 - time (sec): 130.48 - samples/sec: 1750.77 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:45:20,767 epoch 10 - iter 2527/3617 - loss 0.00594552 - time (sec): 152.30 - samples/sec: 1751.21 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:45:42,507 epoch 10 - iter 2888/3617 - loss 0.00620746 - time (sec): 174.05 - samples/sec: 1748.94 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:46:04,199 epoch 10 - iter 3249/3617 - loss 0.00595650 - time (sec): 195.74 - samples/sec: 1754.23 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:46:27,303 epoch 10 - iter 3610/3617 - loss 0.00602530 - time (sec): 218.84 - samples/sec: 1734.10 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:46:27,736 ----------------------------------------------------------------------------------------------------
2023-10-17 16:46:27,737 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-17 16:46:34,953 DEV : loss 0.41719192266464233 - f1-score (micro avg) 0.6449
2023-10-17 16:46:35,492 ----------------------------------------------------------------------------------------------------
2023-10-17 16:46:35,493 Loading model from best epoch ...
2023-10-17 16:46:37,250 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-17 16:46:45,305 Results:
- F-score (micro) 0.6493
- F-score (macro) 0.502
- Accuracy 0.4947

By class:
              precision    recall  f1-score   support

         loc     0.6341    0.8122    0.7122       591
        pers     0.5762    0.7199    0.6401       357
         org     0.1558    0.1519    0.1538        79

   micro avg     0.5852    0.7293    0.6493      1027
   macro avg     0.4554    0.5613    0.5020      1027
weighted avg     0.5772    0.7293    0.6442      1027
2023-10-17 16:46:45,305 ----------------------------------------------------------------------------------------------------
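
The run above can be restated as Flair code. The snippet below is a minimal sketch of the logged configuration (hmTEAMS discriminator backbone, last transformer layer only, first-subtoken pooling, no CRF, mini-batch size 4, 10 epochs, peak learning rate 5e-05 with linear warmup), not the exact script used for this run: the backbone identifier is inferred from the training base path, the NER_HIPE_2022 loader arguments and the "ner" label type are assumed, and the TensorBoard plugin is omitted.

# Minimal sketch, not the original training script: reconstructs the logged
# configuration under the assumptions stated above.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus line in the log: NER_HIPE_2022 / letemps / fr (loader arguments assumed).
corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")  # label type assumed to be "ner"

# Backbone inferred from the base path; "layers-1" and "poolingfirst" map to the
# layers and subtoken_pooling arguments.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Matches the printed model: embeddings -> LockedDropout -> Linear(768, 13), no RNN, no CRF.
tagger = SequenceTagger(
    hidden_size=256,  # unused when use_rnn=False
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() applies a linear LR schedule with 10% warmup by default,
# matching the LinearScheduler plugin in the log.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
)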
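
For inference, the selected checkpoint (best-model.pt, the epoch-9 model with dev micro-F1 0.6505) can be loaded back with SequenceTagger.load. This is a usage sketch only; the example sentence is invented and not taken from the corpus.

# Usage sketch: load best-model.pt from the training base path and tag a sentence
# with the loc/pers/org labels listed in the log.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("Le Temps est un journal publié à Genève.")  # invented example text
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)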