2023-10-18 20:26:42,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,123 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 20:26:42,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,123 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 20:26:42,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,123 Train: 7936 sentences
2023-10-18 20:26:42,123 (train_with_dev=False, train_with_test=False)
2023-10-18 20:26:42,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,123 Training Params:
2023-10-18 20:26:42,123 - learning_rate: "5e-05"
2023-10-18 20:26:42,123 - mini_batch_size: "4"
2023-10-18 20:26:42,124 - max_epochs: "10"
2023-10-18 20:26:42,124 - shuffle: "True"
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 Plugins:
2023-10-18 20:26:42,124 - TensorboardLogger
2023-10-18 20:26:42,124 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 20:26:42,124 - metric: "('micro avg', 'f1-score')"
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 Computation:
2023-10-18 20:26:42,124 - compute on device: cuda:0
2023-10-18 20:26:42,124 - embedding storage: none
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 20:26:45,221 epoch 1 - iter 198/1984 - loss 3.10528833 - time (sec): 3.10 - samples/sec: 5261.08 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:26:48,238 epoch 1 - iter 396/1984 - loss 2.51418141 - time (sec): 6.11 - samples/sec: 5312.12 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:26:51,300 epoch 1 - iter 594/1984 - loss 1.89531636 - time (sec): 9.18 - samples/sec: 5362.40 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:26:54,272 epoch 1 - iter 792/1984 - loss 1.54862823 - time (sec): 12.15 - samples/sec: 5412.02 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:26:57,272 epoch 1 - iter 990/1984 - loss 1.33496354 - time (sec): 15.15 - samples/sec: 5376.01 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:27:00,328 epoch 1 - iter 1188/1984 - loss 1.17925593 - time (sec): 18.20 - samples/sec: 5378.00 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:27:03,304 epoch 1 - iter 1386/1984 - loss 1.06364809 - time (sec): 21.18 - samples/sec: 5382.30 - lr: 0.000035 - momentum: 0.000000
2023-10-18 20:27:06,316 epoch 1 - iter 1584/1984 - loss 0.97153194 - time (sec): 24.19 - samples/sec: 5413.52 - lr: 0.000040 - momentum: 0.000000
2023-10-18 20:27:09,339 epoch 1 - iter 1782/1984 - loss 0.89943309 - time (sec): 27.21 - samples/sec: 5406.86 - lr: 0.000045 - momentum: 0.000000
2023-10-18 20:27:12,376 epoch 1 - iter 1980/1984 - loss 0.83845518 - time (sec): 30.25 - samples/sec: 5408.71 - lr: 0.000050 - momentum: 0.000000
2023-10-18 20:27:12,437 ----------------------------------------------------------------------------------------------------
2023-10-18 20:27:12,437 EPOCH 1 done: loss 0.8373 - lr: 0.000050
2023-10-18 20:27:14,301 DEV : loss 0.2123669981956482 - f1-score (micro avg) 0.3058
2023-10-18 20:27:14,319 saving best model
2023-10-18 20:27:14,350 ----------------------------------------------------------------------------------------------------
2023-10-18 20:27:17,415 epoch 2 - iter 198/1984 - loss 0.30751315 - time (sec): 3.06 - samples/sec: 5547.01 - lr: 0.000049 - momentum: 0.000000
2023-10-18 20:27:20,409 epoch 2 - iter 396/1984 - loss 0.28880847 - time (sec): 6.06 - samples/sec: 5584.38 - lr: 0.000049 - momentum: 0.000000
2023-10-18 20:27:23,444 epoch 2 - iter 594/1984 - loss 0.28273759 - time (sec): 9.09 - samples/sec: 5507.73 - lr: 0.000048 - momentum: 0.000000
2023-10-18 20:27:26,439 epoch 2 - iter 792/1984 - loss 0.27088719 - time (sec): 12.09 - samples/sec: 5458.52 - lr: 0.000048 - momentum: 0.000000
2023-10-18 20:27:29,491 epoch 2 - iter 990/1984 - loss 0.26504813 - time (sec): 15.14 - samples/sec: 5485.12 - lr: 0.000047 - momentum: 0.000000
2023-10-18 20:27:32,572 epoch 2 - iter 1188/1984 - loss 0.26339405 - time (sec): 18.22 - samples/sec: 5459.96 - lr: 0.000047 - momentum: 0.000000
2023-10-18 20:27:35,603 epoch 2 - iter 1386/1984 - loss 0.25689801 - time (sec): 21.25 - samples/sec: 5470.87 - lr: 0.000046 - momentum: 0.000000
2023-10-18 20:27:38,702 epoch 2 - iter 1584/1984 - loss 0.25609983 - time (sec): 24.35 - samples/sec: 5445.66 - lr: 0.000046 - momentum: 0.000000
2023-10-18 20:27:41,727 epoch 2 - iter 1782/1984 - loss 0.25426883 - time (sec): 27.38 - samples/sec: 5413.62 - lr: 0.000045 - momentum: 0.000000
2023-10-18 20:27:44,747 epoch 2 - iter 1980/1984 - loss 0.25189498 - time (sec): 30.40 - samples/sec: 5383.25 - lr: 0.000044 - momentum: 0.000000
2023-10-18 20:27:44,809 ----------------------------------------------------------------------------------------------------
2023-10-18 20:27:44,809 EPOCH 2 done: loss 0.2517 - lr: 0.000044
2023-10-18 20:27:46,652 DEV : loss 0.16580356657505035 - f1-score (micro avg) 0.4132
2023-10-18 20:27:46,671 saving best model
2023-10-18 20:27:46,706 ----------------------------------------------------------------------------------------------------
2023-10-18 20:27:49,900 epoch 3 - iter 198/1984 - loss 0.19382973 - time (sec): 3.19 - samples/sec: 5103.32 - lr: 0.000044 - momentum: 0.000000
2023-10-18 20:27:52,946 epoch 3 - iter 396/1984 - loss 0.18909038 - time (sec): 6.24 - samples/sec: 5257.64 - lr: 0.000043 - momentum: 0.000000
2023-10-18 20:27:55,955 epoch 3 - iter 594/1984 - loss 0.20993255 - time (sec): 9.25 - samples/sec: 5268.06 - lr: 0.000043 - momentum: 0.000000
2023-10-18 20:27:59,021 epoch 3 - iter 792/1984 - loss 0.20714776 - time (sec): 12.31 - samples/sec: 5332.54 - lr: 0.000042 - momentum: 0.000000
2023-10-18 20:28:02,011 epoch 3 - iter 990/1984 - loss 0.20713127 - time (sec): 15.30 - samples/sec: 5309.82 - lr: 0.000042 - momentum: 0.000000
2023-10-18 20:28:05,286 epoch 3 - iter 1188/1984 - loss 0.20727047 - time (sec): 18.58 - samples/sec: 5280.75 - lr: 0.000041 - momentum: 0.000000
2023-10-18 20:28:08,347 epoch 3 - iter 1386/1984 - loss 0.20873667 - time (sec): 21.64 - samples/sec: 5271.59 - lr: 0.000041 - momentum: 0.000000
2023-10-18 20:28:11,390 epoch 3 - iter 1584/1984 - loss 0.20804768 - time (sec): 24.68 - samples/sec: 5300.39 - lr: 0.000040 - momentum: 0.000000
2023-10-18 20:28:14,388 epoch 3 - iter 1782/1984 - loss 0.20568264 - time (sec): 27.68 - samples/sec: 5321.57 - lr: 0.000039 - momentum: 0.000000
2023-10-18 20:28:17,407 epoch 3 - iter 1980/1984 - loss 0.20417751 - time (sec): 30.70 - samples/sec: 5326.77 - lr: 0.000039 - momentum: 0.000000
2023-10-18 20:28:17,476 ----------------------------------------------------------------------------------------------------
2023-10-18 20:28:17,476 EPOCH 3 done: loss 0.2043 - lr: 0.000039
2023-10-18 20:28:19,319 DEV : loss 0.14851002395153046 - f1-score (micro avg) 0.513
2023-10-18 20:28:19,337 saving best model
2023-10-18 20:28:19,377 ----------------------------------------------------------------------------------------------------
2023-10-18 20:28:22,449 epoch 4 - iter 198/1984 - loss 0.18877601 - time (sec): 3.07 - samples/sec: 5327.50 - lr: 0.000038 - momentum: 0.000000
2023-10-18 20:28:25,530 epoch 4 - iter 396/1984 - loss 0.19191756 - time (sec): 6.15 - samples/sec: 5277.12 - lr: 0.000038 - momentum: 0.000000
2023-10-18 20:28:28,538 epoch 4 - iter 594/1984 - loss 0.18721169 - time (sec): 9.16 - samples/sec: 5235.98 - lr: 0.000037 - momentum: 0.000000
2023-10-18 20:28:31,607 epoch 4 - iter 792/1984 - loss 0.19120631 - time (sec): 12.23 - samples/sec: 5184.55 - lr: 0.000037 - momentum: 0.000000
2023-10-18 20:28:34,601 epoch 4 - iter 990/1984 - loss 0.18849200 - time (sec): 15.22 - samples/sec: 5215.47 - lr: 0.000036 - momentum: 0.000000
2023-10-18 20:28:37,624 epoch 4 - iter 1188/1984 - loss 0.18373488 - time (sec): 18.25 - samples/sec: 5250.54 - lr: 0.000036 - momentum: 0.000000
2023-10-18 20:28:40,678 epoch 4 - iter 1386/1984 - loss 0.18383898 - time (sec): 21.30 - samples/sec: 5325.35 - lr: 0.000035 - momentum: 0.000000
2023-10-18 20:28:43,747 epoch 4 - iter 1584/1984 - loss 0.18100800 - time (sec): 24.37 - samples/sec: 5341.76 - lr: 0.000034 - momentum: 0.000000
2023-10-18 20:28:46,758 epoch 4 - iter 1782/1984 - loss 0.18083878 - time (sec): 27.38 - samples/sec: 5332.30 - lr: 0.000034 - momentum: 0.000000
2023-10-18 20:28:49,823 epoch 4 - iter 1980/1984 - loss 0.17764091 - time (sec): 30.44 - samples/sec: 5375.24 - lr: 0.000033 - momentum: 0.000000
2023-10-18 20:28:49,882 ----------------------------------------------------------------------------------------------------
2023-10-18 20:28:49,883 EPOCH 4 done: loss 0.1776 - lr: 0.000033
2023-10-18 20:28:51,692 DEV : loss 0.14755003154277802 - f1-score (micro avg) 0.5567
2023-10-18 20:28:51,711 saving best model
2023-10-18 20:28:51,746 ----------------------------------------------------------------------------------------------------
2023-10-18 20:28:54,714 epoch 5 - iter 198/1984 - loss 0.19822197 - time (sec): 2.97 - samples/sec: 5025.42 - lr: 0.000033 - momentum: 0.000000
2023-10-18 20:28:57,801 epoch 5 - iter 396/1984 - loss 0.16894933 - time (sec): 6.05 - samples/sec: 5373.09 - lr: 0.000032 - momentum: 0.000000
2023-10-18 20:29:00,850 epoch 5 - iter 594/1984 - loss 0.16781950 - time (sec): 9.10 - samples/sec: 5386.35 - lr: 0.000032 - momentum: 0.000000
2023-10-18 20:29:03,958 epoch 5 - iter 792/1984 - loss 0.16415652 - time (sec): 12.21 - samples/sec: 5406.11 - lr: 0.000031 - momentum: 0.000000
2023-10-18 20:29:06,959 epoch 5 - iter 990/1984 - loss 0.16205996 - time (sec): 15.21 - samples/sec: 5362.44 - lr: 0.000031 - momentum: 0.000000
2023-10-18 20:29:09,981 epoch 5 - iter 1188/1984 - loss 0.16241670 - time (sec): 18.23 - samples/sec: 5393.56 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:29:13,014 epoch 5 - iter 1386/1984 - loss 0.16432711 - time (sec): 21.27 - samples/sec: 5406.25 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:29:16,025 epoch 5 - iter 1584/1984 - loss 0.16348040 - time (sec): 24.28 - samples/sec: 5420.92 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:29:19,029 epoch 5 - iter 1782/1984 - loss 0.16364024 - time (sec): 27.28 - samples/sec: 5416.91 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:29:22,045 epoch 5 - iter 1980/1984 - loss 0.16319243 - time (sec): 30.30 - samples/sec: 5401.10 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:29:22,104 ----------------------------------------------------------------------------------------------------
2023-10-18 20:29:22,104 EPOCH 5 done: loss 0.1631 - lr: 0.000028
2023-10-18 20:29:23,921 DEV : loss 0.13933779299259186 - f1-score (micro avg) 0.5857
2023-10-18 20:29:23,939 saving best model
2023-10-18 20:29:23,973 ----------------------------------------------------------------------------------------------------
2023-10-18 20:29:27,070 epoch 6 - iter 198/1984 - loss 0.17797563 - time (sec): 3.10 - samples/sec: 4977.75 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:29:30,099 epoch 6 - iter 396/1984 - loss 0.16344503 - time (sec): 6.12 - samples/sec: 5175.17 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:29:33,245 epoch 6 - iter 594/1984 - loss 0.16069556 - time (sec): 9.27 - samples/sec: 5117.65 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:29:36,261 epoch 6 - iter 792/1984 - loss 0.16257303 - time (sec): 12.29 - samples/sec: 5175.21 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:29:39,313 epoch 6 - iter 990/1984 - loss 0.16151177 - time (sec): 15.34 - samples/sec: 5240.88 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:29:42,401 epoch 6 - iter 1188/1984 - loss 0.15627333 - time (sec): 18.43 - samples/sec: 5283.96 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:29:45,429 epoch 6 - iter 1386/1984 - loss 0.15261274 - time (sec): 21.46 - samples/sec: 5332.64 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:29:48,133 epoch 6 - iter 1584/1984 - loss 0.15177407 - time (sec): 24.16 - samples/sec: 5415.88 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:29:50,811 epoch 6 - iter 1782/1984 - loss 0.15322561 - time (sec): 26.84 - samples/sec: 5487.27 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:29:53,582 epoch 6 - iter 1980/1984 - loss 0.15084002 - time (sec): 29.61 - samples/sec: 5527.18 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:29:53,643 ----------------------------------------------------------------------------------------------------
2023-10-18 20:29:53,644 EPOCH 6 done: loss 0.1508 - lr: 0.000022
2023-10-18 20:29:55,863 DEV : loss 0.14441362023353577 - f1-score (micro avg) 0.5832
2023-10-18 20:29:55,882 ----------------------------------------------------------------------------------------------------
2023-10-18 20:29:58,975 epoch 7 - iter 198/1984 - loss 0.17696709 - time (sec): 3.09 - samples/sec: 5179.02 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:30:01,974 epoch 7 - iter 396/1984 - loss 0.15547142 - time (sec): 6.09 - samples/sec: 5311.56 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:30:05,067 epoch 7 - iter 594/1984 - loss 0.15230504 - time (sec): 9.18 - samples/sec: 5276.61 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:30:08,135 epoch 7 - iter 792/1984 - loss 0.14464304 - time (sec): 12.25 - samples/sec: 5360.99 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:30:11,196 epoch 7 - iter 990/1984 - loss 0.14375802 - time (sec): 15.31 - samples/sec: 5387.28 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:30:14,217 epoch 7 - iter 1188/1984 - loss 0.14330999 - time (sec): 18.33 - samples/sec: 5362.74 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:30:17,322 epoch 7 - iter 1386/1984 - loss 0.14105304 - time (sec): 21.44 - samples/sec: 5365.32 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:30:20,389 epoch 7 - iter 1584/1984 - loss 0.14079064 - time (sec): 24.51 - samples/sec: 5340.24 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:30:23,438 epoch 7 - iter 1782/1984 - loss 0.14054323 - time (sec): 27.56 - samples/sec: 5332.07 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:30:26,642 epoch 7 - iter 1980/1984 - loss 0.14054386 - time (sec): 30.76 - samples/sec: 5324.44 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:30:26,701 ----------------------------------------------------------------------------------------------------
2023-10-18 20:30:26,701 EPOCH 7 done: loss 0.1405 - lr: 0.000017
2023-10-18 20:30:28,540 DEV : loss 0.14645995199680328 - f1-score (micro avg) 0.5839
2023-10-18 20:30:28,559 ----------------------------------------------------------------------------------------------------
2023-10-18 20:30:31,574 epoch 8 - iter 198/1984 - loss 0.13505995 - time (sec): 3.01 - samples/sec: 5338.26 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:30:34,585 epoch 8 - iter 396/1984 - loss 0.13300452 - time (sec): 6.03 - samples/sec: 5274.07 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:30:37,605 epoch 8 - iter 594/1984 - loss 0.13513264 - time (sec): 9.05 - samples/sec: 5218.46 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:30:40,619 epoch 8 - iter 792/1984 - loss 0.13631868 - time (sec): 12.06 - samples/sec: 5324.96 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:30:43,473 epoch 8 - iter 990/1984 - loss 0.13573158 - time (sec): 14.91 - samples/sec: 5368.03 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:30:46,628 epoch 8 - iter 1188/1984 - loss 0.13343905 - time (sec): 18.07 - samples/sec: 5410.44 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:30:49,733 epoch 8 - iter 1386/1984 - loss 0.13185324 - time (sec): 21.17 - samples/sec: 5358.71 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:30:52,837 epoch 8 - iter 1584/1984 - loss 0.13284754 - time (sec): 24.28 - samples/sec: 5373.28 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:30:55,870 epoch 8 - iter 1782/1984 - loss 0.13371484 - time (sec): 27.31 - samples/sec: 5380.92 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:30:58,973 epoch 8 - iter 1980/1984 - loss 0.13321577 - time (sec): 30.41 - samples/sec: 5383.22 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:30:59,035 ----------------------------------------------------------------------------------------------------
2023-10-18 20:30:59,035 EPOCH 8 done: loss 0.1331 - lr: 0.000011
2023-10-18 20:31:00,874 DEV : loss 0.1467994898557663 - f1-score (micro avg) 0.5978
2023-10-18 20:31:00,893 saving best model
2023-10-18 20:31:00,928 ----------------------------------------------------------------------------------------------------
2023-10-18 20:31:03,991 epoch 9 - iter 198/1984 - loss 0.11768644 - time (sec): 3.06 - samples/sec: 5303.58 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:31:07,031 epoch 9 - iter 396/1984 - loss 0.13017349 - time (sec): 6.10 - samples/sec: 5376.12 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:31:10,059 epoch 9 - iter 594/1984 - loss 0.13044117 - time (sec): 9.13 - samples/sec: 5287.11 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:31:13,053 epoch 9 - iter 792/1984 - loss 0.12921804 - time (sec): 12.12 - samples/sec: 5270.42 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:31:16,142 epoch 9 - iter 990/1984 - loss 0.12884571 - time (sec): 15.21 - samples/sec: 5310.97 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:31:19,206 epoch 9 - iter 1188/1984 - loss 0.12866639 - time (sec): 18.28 - samples/sec: 5297.82 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:31:22,278 epoch 9 - iter 1386/1984 - loss 0.12832734 - time (sec): 21.35 - samples/sec: 5353.47 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:31:25,302 epoch 9 - iter 1584/1984 - loss 0.12983739 - time (sec): 24.37 - samples/sec: 5340.26 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:31:28,367 epoch 9 - iter 1782/1984 - loss 0.12868462 - time (sec): 27.44 - samples/sec: 5343.64 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:31:31,449 epoch 9 - iter 1980/1984 - loss 0.12845863 - time (sec): 30.52 - samples/sec: 5363.40 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:31:31,511 ----------------------------------------------------------------------------------------------------
2023-10-18 20:31:31,511 EPOCH 9 done: loss 0.1285 - lr: 0.000006
2023-10-18 20:31:33,363 DEV : loss 0.1488402932882309 - f1-score (micro avg) 0.598
2023-10-18 20:31:33,382 saving best model
2023-10-18 20:31:33,419 ----------------------------------------------------------------------------------------------------
2023-10-18 20:31:36,322 epoch 10 - iter 198/1984 - loss 0.12152017 - time (sec): 2.90 - samples/sec: 5410.22 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:31:39,369 epoch 10 - iter 396/1984 - loss 0.12775156 - time (sec): 5.95 - samples/sec: 5394.26 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:31:42,403 epoch 10 - iter 594/1984 - loss 0.12825070 - time (sec): 8.98 - samples/sec: 5336.97 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:31:45,472 epoch 10 - iter 792/1984 - loss 0.12234506 - time (sec): 12.05 - samples/sec: 5362.92 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:31:48,524 epoch 10 - iter 990/1984 - loss 0.12321278 - time (sec): 15.11 - samples/sec: 5364.67 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:31:51,571 epoch 10 - iter 1188/1984 - loss 0.12468032 - time (sec): 18.15 - samples/sec: 5333.23 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:31:54,673 epoch 10 - iter 1386/1984 - loss 0.12362803 - time (sec): 21.25 - samples/sec: 5378.68 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:31:57,703 epoch 10 - iter 1584/1984 - loss 0.12226510 - time (sec): 24.28 - samples/sec: 5359.19 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:32:00,719 epoch 10 - iter 1782/1984 - loss 0.12224833 - time (sec): 27.30 - samples/sec: 5379.10 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:32:03,781 epoch 10 - iter 1980/1984 - loss 0.12319689 - time (sec): 30.36 - samples/sec: 5388.44 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:32:03,854 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:03,854 EPOCH 10 done: loss 0.1234 - lr: 0.000000
2023-10-18 20:32:05,695 DEV : loss 0.14801497757434845 - f1-score (micro avg) 0.6013
2023-10-18 20:32:05,713 saving best model
2023-10-18 20:32:05,773 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:05,774 Loading model from best epoch ...
2023-10-18 20:32:05,849 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 20:32:07,351 
Results:
- F-score (micro) 0.619
- F-score (macro) 0.4877
- Accuracy 0.4937

By class:
              precision    recall  f1-score   support

         LOC     0.7309    0.7008    0.7155       655
         PER     0.4337    0.6457    0.5189       223
         ORG     0.4167    0.1575    0.2286       127

   micro avg     0.6181    0.6199    0.6190      1005
   macro avg     0.5271    0.5013    0.4877      1005
weighted avg     0.6252    0.6199    0.6104      1005

2023-10-18 20:32:07,351 ----------------------------------------------------------------------------------------------------