2023-10-18 20:26:42,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,123 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 20:26:42,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,123 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 20:26:42,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,123 Train: 7936 sentences
2023-10-18 20:26:42,123 (train_with_dev=False, train_with_test=False)
2023-10-18 20:26:42,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,123 Training Params:
2023-10-18 20:26:42,123 - learning_rate: "5e-05"
2023-10-18 20:26:42,123 - mini_batch_size: "4"
2023-10-18 20:26:42,124 - max_epochs: "10"
2023-10-18 20:26:42,124 - shuffle: "True"
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 Plugins:
2023-10-18 20:26:42,124 - TensorboardLogger
2023-10-18 20:26:42,124 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 20:26:42,124 - metric: "('micro avg', 'f1-score')"
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 Computation:
2023-10-18 20:26:42,124 - compute on device: cuda:0
2023-10-18 20:26:42,124 - embedding storage: none
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:42,124 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 20:26:45,221 epoch 1 - iter 198/1984 - loss 3.10528833 - time (sec): 3.10 - samples/sec: 5261.08 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:26:48,238 epoch 1 - iter 396/1984 - loss 2.51418141 - time (sec): 6.11 - samples/sec: 5312.12 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:26:51,300 epoch 1 - iter 594/1984 - loss 1.89531636 - time (sec): 9.18 - samples/sec: 5362.40 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:26:54,272 epoch 1 - iter 792/1984 - loss 1.54862823 - time (sec): 12.15 - samples/sec: 5412.02 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:26:57,272 epoch 1 - iter 990/1984 - loss 1.33496354 - time (sec): 15.15 - samples/sec: 5376.01 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:27:00,328 epoch 1 - iter 1188/1984 - loss 1.17925593 - time (sec): 18.20 - samples/sec: 5378.00 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:27:03,304 epoch 1 - iter 1386/1984 - loss 1.06364809 - time (sec): 21.18 - samples/sec: 5382.30 - lr: 0.000035 - momentum: 0.000000
2023-10-18 20:27:06,316 epoch 1 - iter 1584/1984 - loss 0.97153194 - time (sec): 24.19 - samples/sec: 5413.52 - lr: 0.000040 - momentum: 0.000000
2023-10-18 20:27:09,339 epoch 1 - iter 1782/1984 - loss 0.89943309 - time (sec): 27.21 - samples/sec: 5406.86 - lr: 0.000045 - momentum: 0.000000
2023-10-18 20:27:12,376 epoch 1 - iter 1980/1984 - loss 0.83845518 - time (sec): 30.25 - samples/sec: 5408.71 - lr: 0.000050 - momentum: 0.000000
2023-10-18 20:27:12,437 ----------------------------------------------------------------------------------------------------
2023-10-18 20:27:12,437 EPOCH 1 done: loss 0.8373 - lr: 0.000050
2023-10-18 20:27:14,301 DEV : loss 0.2123669981956482 - f1-score (micro avg) 0.3058
2023-10-18 20:27:14,319 saving best model
2023-10-18 20:27:14,350 ----------------------------------------------------------------------------------------------------
2023-10-18 20:27:17,415 epoch 2 - iter 198/1984 - loss 0.30751315 - time (sec): 3.06 - samples/sec: 5547.01 - lr: 0.000049 - momentum: 0.000000
2023-10-18 20:27:20,409 epoch 2 - iter 396/1984 - loss 0.28880847 - time (sec): 6.06 - samples/sec: 5584.38 - lr: 0.000049 - momentum: 0.000000
2023-10-18 20:27:23,444 epoch 2 - iter 594/1984 - loss 0.28273759 - time (sec): 9.09 - samples/sec: 5507.73 - lr: 0.000048 - momentum: 0.000000
2023-10-18 20:27:26,439 epoch 2 - iter 792/1984 - loss 0.27088719 - time (sec): 12.09 - samples/sec: 5458.52 - lr: 0.000048 - momentum: 0.000000
2023-10-18 20:27:29,491 epoch 2 - iter 990/1984 - loss 0.26504813 - time (sec): 15.14 - samples/sec: 5485.12 - lr: 0.000047 - momentum: 0.000000
2023-10-18 20:27:32,572 epoch 2 - iter 1188/1984 - loss 0.26339405 - time (sec): 18.22 - samples/sec: 5459.96 - lr: 0.000047 - momentum: 0.000000
2023-10-18 20:27:35,603 epoch 2 - iter 1386/1984 - loss 0.25689801 - time (sec): 21.25 - samples/sec: 5470.87 - lr: 0.000046 - momentum: 0.000000
2023-10-18 20:27:38,702 epoch 2 - iter 1584/1984 - loss 0.25609983 - time (sec): 24.35 - samples/sec: 5445.66 - lr: 0.000046 - momentum: 0.000000
2023-10-18 20:27:41,727 epoch 2 - iter 1782/1984 - loss 0.25426883 - time (sec): 27.38 - samples/sec: 5413.62 - lr: 0.000045 - momentum: 0.000000
2023-10-18 20:27:44,747 epoch 2 - iter 1980/1984 - loss 0.25189498 - time (sec): 30.40 - samples/sec: 5383.25 - lr: 0.000044 - momentum: 0.000000
2023-10-18 20:27:44,809 ----------------------------------------------------------------------------------------------------
2023-10-18 20:27:44,809 EPOCH 2 done: loss 0.2517 - lr: 0.000044
2023-10-18 20:27:46,652 DEV : loss 0.16580356657505035 - f1-score (micro avg) 0.4132
2023-10-18 20:27:46,671 saving best model
2023-10-18 20:27:46,706 ----------------------------------------------------------------------------------------------------
2023-10-18 20:27:49,900 epoch 3 - iter 198/1984 - loss 0.19382973 - time (sec): 3.19 - samples/sec: 5103.32 - lr: 0.000044 - momentum: 0.000000
2023-10-18 20:27:52,946 epoch 3 - iter 396/1984 - loss 0.18909038 - time (sec): 6.24 - samples/sec: 5257.64 - lr: 0.000043 - momentum: 0.000000
2023-10-18 20:27:55,955 epoch 3 - iter 594/1984 - loss 0.20993255 - time (sec): 9.25 - samples/sec: 5268.06 - lr: 0.000043 - momentum: 0.000000
2023-10-18 20:27:59,021 epoch 3 - iter 792/1984 - loss 0.20714776 - time (sec): 12.31 - samples/sec: 5332.54 - lr: 0.000042 - momentum: 0.000000
2023-10-18 20:28:02,011 epoch 3 - iter 990/1984 - loss 0.20713127 - time (sec): 15.30 - samples/sec: 5309.82 - lr: 0.000042 - momentum: 0.000000
2023-10-18 20:28:05,286 epoch 3 - iter 1188/1984 - loss 0.20727047 - time (sec): 18.58 - samples/sec: 5280.75 - lr: 0.000041 - momentum: 0.000000
2023-10-18 20:28:08,347 epoch 3 - iter 1386/1984 - loss 0.20873667 - time (sec): 21.64 - samples/sec: 5271.59 - lr: 0.000041 - momentum: 0.000000
2023-10-18 20:28:11,390 epoch 3 - iter 1584/1984 - loss 0.20804768 - time (sec): 24.68 - samples/sec: 5300.39 - lr: 0.000040 - momentum: 0.000000
2023-10-18 20:28:14,388 epoch 3 - iter 1782/1984 - loss 0.20568264 - time (sec): 27.68 - samples/sec: 5321.57 - lr: 0.000039 - momentum: 0.000000
2023-10-18 20:28:17,407 epoch 3 - iter 1980/1984 - loss 0.20417751 - time (sec): 30.70 - samples/sec: 5326.77 - lr: 0.000039 - momentum: 0.000000
2023-10-18 20:28:17,476 ----------------------------------------------------------------------------------------------------
2023-10-18 20:28:17,476 EPOCH 3 done: loss 0.2043 - lr: 0.000039
2023-10-18 20:28:19,319 DEV : loss 0.14851002395153046 - f1-score (micro avg) 0.513
2023-10-18 20:28:19,337 saving best model
2023-10-18 20:28:19,377 ----------------------------------------------------------------------------------------------------
2023-10-18 20:28:22,449 epoch 4 - iter 198/1984 - loss 0.18877601 - time (sec): 3.07 - samples/sec: 5327.50 - lr: 0.000038 - momentum: 0.000000
2023-10-18 20:28:25,530 epoch 4 - iter 396/1984 - loss 0.19191756 - time (sec): 6.15 - samples/sec: 5277.12 - lr: 0.000038 - momentum: 0.000000
2023-10-18 20:28:28,538 epoch 4 - iter 594/1984 - loss 0.18721169 - time (sec): 9.16 - samples/sec: 5235.98 - lr: 0.000037 - momentum: 0.000000
2023-10-18 20:28:31,607 epoch 4 - iter 792/1984 - loss 0.19120631 - time (sec): 12.23 - samples/sec: 5184.55 - lr: 0.000037 - momentum: 0.000000
2023-10-18 20:28:34,601 epoch 4 - iter 990/1984 - loss 0.18849200 - time (sec): 15.22 - samples/sec: 5215.47 - lr: 0.000036 - momentum: 0.000000
2023-10-18 20:28:37,624 epoch 4 - iter 1188/1984 - loss 0.18373488 - time (sec): 18.25 - samples/sec: 5250.54 - lr: 0.000036 - momentum: 0.000000
2023-10-18 20:28:40,678 epoch 4 - iter 1386/1984 - loss 0.18383898 - time (sec): 21.30 - samples/sec: 5325.35 - lr: 0.000035 - momentum: 0.000000
2023-10-18 20:28:43,747 epoch 4 - iter 1584/1984 - loss 0.18100800 - time (sec): 24.37 - samples/sec: 5341.76 - lr: 0.000034 - momentum: 0.000000
2023-10-18 20:28:46,758 epoch 4 - iter 1782/1984 - loss 0.18083878 - time (sec): 27.38 - samples/sec: 5332.30 - lr: 0.000034 - momentum: 0.000000
2023-10-18 20:28:49,823 epoch 4 - iter 1980/1984 - loss 0.17764091 - time (sec): 30.44 - samples/sec: 5375.24 - lr: 0.000033 - momentum: 0.000000
2023-10-18 20:28:49,882 ----------------------------------------------------------------------------------------------------
2023-10-18 20:28:49,883 EPOCH 4 done: loss 0.1776 - lr: 0.000033
2023-10-18 20:28:51,692 DEV : loss 0.14755003154277802 - f1-score (micro avg) 0.5567
2023-10-18 20:28:51,711 saving best model
2023-10-18 20:28:51,746 ----------------------------------------------------------------------------------------------------
2023-10-18 20:28:54,714 epoch 5 - iter 198/1984 - loss 0.19822197 - time (sec): 2.97 - samples/sec: 5025.42 - lr: 0.000033 - momentum: 0.000000
2023-10-18 20:28:57,801 epoch 5 - iter 396/1984 - loss 0.16894933 - time (sec): 6.05 - samples/sec: 5373.09 - lr: 0.000032 - momentum: 0.000000
2023-10-18 20:29:00,850 epoch 5 - iter 594/1984 - loss 0.16781950 - time (sec): 9.10 - samples/sec: 5386.35 - lr: 0.000032 - momentum: 0.000000
2023-10-18 20:29:03,958 epoch 5 - iter 792/1984 - loss 0.16415652 - time (sec): 12.21 - samples/sec: 5406.11 - lr: 0.000031 - momentum: 0.000000
2023-10-18 20:29:06,959 epoch 5 - iter 990/1984 - loss 0.16205996 - time (sec): 15.21 - samples/sec: 5362.44 - lr: 0.000031 - momentum: 0.000000
2023-10-18 20:29:09,981 epoch 5 - iter 1188/1984 - loss 0.16241670 - time (sec): 18.23 - samples/sec: 5393.56 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:29:13,014 epoch 5 - iter 1386/1984 - loss 0.16432711 - time (sec): 21.27 - samples/sec: 5406.25 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:29:16,025 epoch 5 - iter 1584/1984 - loss 0.16348040 - time (sec): 24.28 - samples/sec: 5420.92 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:29:19,029 epoch 5 - iter 1782/1984 - loss 0.16364024 - time (sec): 27.28 - samples/sec: 5416.91 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:29:22,045 epoch 5 - iter 1980/1984 - loss 0.16319243 - time (sec): 30.30 - samples/sec: 5401.10 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:29:22,104 ----------------------------------------------------------------------------------------------------
2023-10-18 20:29:22,104 EPOCH 5 done: loss 0.1631 - lr: 0.000028
2023-10-18 20:29:23,921 DEV : loss 0.13933779299259186 - f1-score (micro avg) 0.5857
2023-10-18 20:29:23,939 saving best model
2023-10-18 20:29:23,973 ----------------------------------------------------------------------------------------------------
2023-10-18 20:29:27,070 epoch 6 - iter 198/1984 - loss 0.17797563 - time (sec): 3.10 - samples/sec: 4977.75 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:29:30,099 epoch 6 - iter 396/1984 - loss 0.16344503 - time (sec): 6.12 - samples/sec: 5175.17 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:29:33,245 epoch 6 - iter 594/1984 - loss 0.16069556 - time (sec): 9.27 - samples/sec: 5117.65 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:29:36,261 epoch 6 - iter 792/1984 - loss 0.16257303 - time (sec): 12.29 - samples/sec: 5175.21 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:29:39,313 epoch 6 - iter 990/1984 - loss 0.16151177 - time (sec): 15.34 - samples/sec: 5240.88 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:29:42,401 epoch 6 - iter 1188/1984 - loss 0.15627333 - time (sec): 18.43 - samples/sec: 5283.96 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:29:45,429 epoch 6 - iter 1386/1984 - loss 0.15261274 - time (sec): 21.46 - samples/sec: 5332.64 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:29:48,133 epoch 6 - iter 1584/1984 - loss 0.15177407 - time (sec): 24.16 - samples/sec: 5415.88 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:29:50,811 epoch 6 - iter 1782/1984 - loss 0.15322561 - time (sec): 26.84 - samples/sec: 5487.27 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:29:53,582 epoch 6 - iter 1980/1984 - loss 0.15084002 - time (sec): 29.61 - samples/sec: 5527.18 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:29:53,643 ----------------------------------------------------------------------------------------------------
2023-10-18 20:29:53,644 EPOCH 6 done: loss 0.1508 - lr: 0.000022
2023-10-18 20:29:55,863 DEV : loss 0.14441362023353577 - f1-score (micro avg) 0.5832
2023-10-18 20:29:55,882 ----------------------------------------------------------------------------------------------------
2023-10-18 20:29:58,975 epoch 7 - iter 198/1984 - loss 0.17696709 - time (sec): 3.09 - samples/sec: 5179.02 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:30:01,974 epoch 7 - iter 396/1984 - loss 0.15547142 - time (sec): 6.09 - samples/sec: 5311.56 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:30:05,067 epoch 7 - iter 594/1984 - loss 0.15230504 - time (sec): 9.18 - samples/sec: 5276.61 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:30:08,135 epoch 7 - iter 792/1984 - loss 0.14464304 - time (sec): 12.25 - samples/sec: 5360.99 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:30:11,196 epoch 7 - iter 990/1984 - loss 0.14375802 - time (sec): 15.31 - samples/sec: 5387.28 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:30:14,217 epoch 7 - iter 1188/1984 - loss 0.14330999 - time (sec): 18.33 - samples/sec: 5362.74 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:30:17,322 epoch 7 - iter 1386/1984 - loss 0.14105304 - time (sec): 21.44 - samples/sec: 5365.32 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:30:20,389 epoch 7 - iter 1584/1984 - loss 0.14079064 - time (sec): 24.51 - samples/sec: 5340.24 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:30:23,438 epoch 7 - iter 1782/1984 - loss 0.14054323 - time (sec): 27.56 - samples/sec: 5332.07 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:30:26,642 epoch 7 - iter 1980/1984 - loss 0.14054386 - time (sec): 30.76 - samples/sec: 5324.44 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:30:26,701 ----------------------------------------------------------------------------------------------------
2023-10-18 20:30:26,701 EPOCH 7 done: loss 0.1405 - lr: 0.000017
2023-10-18 20:30:28,540 DEV : loss 0.14645995199680328 - f1-score (micro avg) 0.5839
2023-10-18 20:30:28,559 ----------------------------------------------------------------------------------------------------
2023-10-18 20:30:31,574 epoch 8 - iter 198/1984 - loss 0.13505995 - time (sec): 3.01 - samples/sec: 5338.26 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:30:34,585 epoch 8 - iter 396/1984 - loss 0.13300452 - time (sec): 6.03 - samples/sec: 5274.07 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:30:37,605 epoch 8 - iter 594/1984 - loss 0.13513264 - time (sec): 9.05 - samples/sec: 5218.46 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:30:40,619 epoch 8 - iter 792/1984 - loss 0.13631868 - time (sec): 12.06 - samples/sec: 5324.96 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:30:43,473 epoch 8 - iter 990/1984 - loss 0.13573158 - time (sec): 14.91 - samples/sec: 5368.03 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:30:46,628 epoch 8 - iter 1188/1984 - loss 0.13343905 - time (sec): 18.07 - samples/sec: 5410.44 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:30:49,733 epoch 8 - iter 1386/1984 - loss 0.13185324 - time (sec): 21.17 - samples/sec: 5358.71 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:30:52,837 epoch 8 - iter 1584/1984 - loss 0.13284754 - time (sec): 24.28 - samples/sec: 5373.28 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:30:55,870 epoch 8 - iter 1782/1984 - loss 0.13371484 - time (sec): 27.31 - samples/sec: 5380.92 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:30:58,973 epoch 8 - iter 1980/1984 - loss 0.13321577 - time (sec): 30.41 - samples/sec: 5383.22 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:30:59,035 ----------------------------------------------------------------------------------------------------
2023-10-18 20:30:59,035 EPOCH 8 done: loss 0.1331 - lr: 0.000011
2023-10-18 20:31:00,874 DEV : loss 0.1467994898557663 - f1-score (micro avg) 0.5978
2023-10-18 20:31:00,893 saving best model
2023-10-18 20:31:00,928 ----------------------------------------------------------------------------------------------------
2023-10-18 20:31:03,991 epoch 9 - iter 198/1984 - loss 0.11768644 - time (sec): 3.06 - samples/sec: 5303.58 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:31:07,031 epoch 9 - iter 396/1984 - loss 0.13017349 - time (sec): 6.10 - samples/sec: 5376.12 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:31:10,059 epoch 9 - iter 594/1984 - loss 0.13044117 - time (sec): 9.13 - samples/sec: 5287.11 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:31:13,053 epoch 9 - iter 792/1984 - loss 0.12921804 - time (sec): 12.12 - samples/sec: 5270.42 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:31:16,142 epoch 9 - iter 990/1984 - loss 0.12884571 - time (sec): 15.21 - samples/sec: 5310.97 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:31:19,206 epoch 9 - iter 1188/1984 - loss 0.12866639 - time (sec): 18.28 - samples/sec: 5297.82 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:31:22,278 epoch 9 - iter 1386/1984 - loss 0.12832734 - time (sec): 21.35 - samples/sec: 5353.47 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:31:25,302 epoch 9 - iter 1584/1984 - loss 0.12983739 - time (sec): 24.37 - samples/sec: 5340.26 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:31:28,367 epoch 9 - iter 1782/1984 - loss 0.12868462 - time (sec): 27.44 - samples/sec: 5343.64 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:31:31,449 epoch 9 - iter 1980/1984 - loss 0.12845863 - time (sec): 30.52 - samples/sec: 5363.40 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:31:31,511 ----------------------------------------------------------------------------------------------------
2023-10-18 20:31:31,511 EPOCH 9 done: loss 0.1285 - lr: 0.000006
2023-10-18 20:31:33,363 DEV : loss 0.1488402932882309 - f1-score (micro avg) 0.598
2023-10-18 20:31:33,382 saving best model
2023-10-18 20:31:33,419 ----------------------------------------------------------------------------------------------------
2023-10-18 20:31:36,322 epoch 10 - iter 198/1984 - loss 0.12152017 - time (sec): 2.90 - samples/sec: 5410.22 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:31:39,369 epoch 10 - iter 396/1984 - loss 0.12775156 - time (sec): 5.95 - samples/sec: 5394.26 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:31:42,403 epoch 10 - iter 594/1984 - loss 0.12825070 - time (sec): 8.98 - samples/sec: 5336.97 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:31:45,472 epoch 10 - iter 792/1984 - loss 0.12234506 - time (sec): 12.05 - samples/sec: 5362.92 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:31:48,524 epoch 10 - iter 990/1984 - loss 0.12321278 - time (sec): 15.11 - samples/sec: 5364.67 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:31:51,571 epoch 10 - iter 1188/1984 - loss 0.12468032 - time (sec): 18.15 - samples/sec: 5333.23 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:31:54,673 epoch 10 - iter 1386/1984 - loss 0.12362803 - time (sec): 21.25 - samples/sec: 5378.68 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:31:57,703 epoch 10 - iter 1584/1984 - loss 0.12226510 - time (sec): 24.28 - samples/sec: 5359.19 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:32:00,719 epoch 10 - iter 1782/1984 - loss 0.12224833 - time (sec): 27.30 - samples/sec: 5379.10 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:32:03,781 epoch 10 - iter 1980/1984 - loss 0.12319689 - time (sec): 30.36 - samples/sec: 5388.44 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:32:03,854 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:03,854 EPOCH 10 done: loss 0.1234 - lr: 0.000000
2023-10-18 20:32:05,695 DEV : loss 0.14801497757434845 - f1-score (micro avg) 0.6013
2023-10-18 20:32:05,713 saving best model
2023-10-18 20:32:05,773 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:05,774 Loading model from best epoch ...
2023-10-18 20:32:05,849 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 20:32:07,351
Results:
- F-score (micro) 0.619
- F-score (macro) 0.4877
- Accuracy 0.4937

By class:
              precision    recall  f1-score   support

         LOC     0.7309    0.7008    0.7155       655
         PER     0.4337    0.6457    0.5189       223
         ORG     0.4167    0.1575    0.2286       127

   micro avg     0.6181    0.6199    0.6190      1005
   macro avg     0.5271    0.5013    0.4877      1005
weighted avg     0.6252    0.6199    0.6104      1005
2023-10-18 20:32:07,351 ----------------------------------------------------------------------------------------------------