2023-10-25 18:28:08,806 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
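
The module summary above fully determines the model's parameter count. As a sanity check (a sketch only; layer shapes are taken verbatim from the printout, tied weights and buffers are ignored), the totals can be reproduced with plain arithmetic:

```python
def linear_params(in_f, out_f, bias=True):
    """Parameters of a Linear(in_f, out_f): weights plus optional bias."""
    return in_f * out_f + (out_f if bias else 0)

# Shapes copied from the SequenceTagger printout above.
hidden, inter, vocab, max_pos, layers = 768, 3072, 64001, 512, 12

embeddings = (vocab * hidden       # word_embeddings
              + max_pos * hidden   # position_embeddings
              + 2 * hidden         # token_type_embeddings
              + 2 * hidden)        # LayerNorm weight + bias

per_layer = (3 * linear_params(hidden, hidden)  # query / key / value
             + linear_params(hidden, hidden)    # attention output dense
             + 2 * hidden                       # attention LayerNorm
             + linear_params(hidden, inter)     # intermediate dense
             + linear_params(inter, hidden)     # output dense
             + 2 * hidden)                      # output LayerNorm

pooler = linear_params(hidden, hidden)
head = linear_params(hidden, 17)                # 17 BIOES tags

total = embeddings + layers * per_layer + pooler + head
print(f"{total:,}")  # roughly 135M parameters
```

The ~26M extra parameters over a standard BERT-base come from the enlarged 64k vocabulary in the word embedding matrix.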
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Train: 20847 sentences
2023-10-25 18:28:08,807 (train_with_dev=False, train_with_test=False)
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Training Params:
2023-10-25 18:28:08,807 - learning_rate: "3e-05"
2023-10-25 18:28:08,807 - mini_batch_size: "8"
2023-10-25 18:28:08,807 - max_epochs: "10"
2023-10-25 18:28:08,807 - shuffle: "True"
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Plugins:
2023-10-25 18:28:08,807 - TensorboardLogger
2023-10-25 18:28:08,807 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
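
The LinearScheduler plugin with warmup_fraction 0.1 explains the lr column in the epoch logs below: the rate climbs linearly to the peak 3e-05 over the first 10% of all steps (here exactly epoch 1, 2606 of 26060 batches) and then decays linearly to zero. A minimal sketch of that schedule (the function name is illustrative, not Flair's API):

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup = int(total_steps * warmup_fraction)
    if step < warmup:
        return peak_lr * step / warmup
    return peak_lr * (total_steps - step) / (total_steps - warmup)

total = 10 * 2606  # 10 epochs x 2606 batches per epoch
print(linear_warmup_lr(260, total, 3e-05))   # ~0.000003, matches epoch 1 iter 260
print(linear_warmup_lr(2600, total, 3e-05))  # ~0.000030, end of warmup
print(linear_warmup_lr(5206, total, 3e-05))  # ~0.000027, matches end of epoch 2
```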
2023-10-25 18:28:08,807 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 18:28:08,807 - metric: "('micro avg', 'f1-score')"
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Computation:
2023-10-25 18:28:08,807 - compute on device: cuda:0
2023-10-25 18:28:08,807 - embedding storage: none
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 18:28:22,840 epoch 1 - iter 260/2606 - loss 1.44118564 - time (sec): 14.03 - samples/sec: 2575.65 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:28:36,666 epoch 1 - iter 520/2606 - loss 0.92810801 - time (sec): 27.86 - samples/sec: 2637.50 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:28:51,042 epoch 1 - iter 780/2606 - loss 0.70417279 - time (sec): 42.23 - samples/sec: 2644.02 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:29:05,298 epoch 1 - iter 1040/2606 - loss 0.59393290 - time (sec): 56.49 - samples/sec: 2629.87 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:29:18,805 epoch 1 - iter 1300/2606 - loss 0.52883614 - time (sec): 70.00 - samples/sec: 2619.81 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:29:32,598 epoch 1 - iter 1560/2606 - loss 0.48018899 - time (sec): 83.79 - samples/sec: 2605.48 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:29:47,075 epoch 1 - iter 1820/2606 - loss 0.43654950 - time (sec): 98.27 - samples/sec: 2624.14 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:30:01,512 epoch 1 - iter 2080/2606 - loss 0.40049267 - time (sec): 112.70 - samples/sec: 2624.17 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:30:15,541 epoch 1 - iter 2340/2606 - loss 0.37673764 - time (sec): 126.73 - samples/sec: 2615.94 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:30:29,155 epoch 1 - iter 2600/2606 - loss 0.35942056 - time (sec): 140.35 - samples/sec: 2609.76 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:30:29,546 ----------------------------------------------------------------------------------------------------
2023-10-25 18:30:29,547 EPOCH 1 done: loss 0.3587 - lr: 0.000030
2023-10-25 18:30:33,264 DEV : loss 0.1093616634607315 - f1-score (micro avg) 0.2831
2023-10-25 18:30:33,289 saving best model
2023-10-25 18:30:33,764 ----------------------------------------------------------------------------------------------------
2023-10-25 18:30:47,858 epoch 2 - iter 260/2606 - loss 0.15973395 - time (sec): 14.09 - samples/sec: 2596.30 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:31:01,897 epoch 2 - iter 520/2606 - loss 0.15273016 - time (sec): 28.13 - samples/sec: 2587.41 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:31:16,107 epoch 2 - iter 780/2606 - loss 0.15091510 - time (sec): 42.34 - samples/sec: 2602.93 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:31:29,906 epoch 2 - iter 1040/2606 - loss 0.14924346 - time (sec): 56.14 - samples/sec: 2625.36 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:31:43,744 epoch 2 - iter 1300/2606 - loss 0.14800277 - time (sec): 69.98 - samples/sec: 2613.86 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:31:57,687 epoch 2 - iter 1560/2606 - loss 0.14609768 - time (sec): 83.92 - samples/sec: 2639.91 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:32:11,802 epoch 2 - iter 1820/2606 - loss 0.14641189 - time (sec): 98.04 - samples/sec: 2628.39 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:32:25,136 epoch 2 - iter 2080/2606 - loss 0.14722873 - time (sec): 111.37 - samples/sec: 2618.82 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:32:39,368 epoch 2 - iter 2340/2606 - loss 0.14675373 - time (sec): 125.60 - samples/sec: 2616.38 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:32:53,254 epoch 2 - iter 2600/2606 - loss 0.14529176 - time (sec): 139.49 - samples/sec: 2626.49 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:32:53,635 ----------------------------------------------------------------------------------------------------
2023-10-25 18:32:53,635 EPOCH 2 done: loss 0.1449 - lr: 0.000027
2023-10-25 18:33:00,495 DEV : loss 0.12418132275342941 - f1-score (micro avg) 0.345
2023-10-25 18:33:00,522 saving best model
2023-10-25 18:33:01,001 ----------------------------------------------------------------------------------------------------
2023-10-25 18:33:14,815 epoch 3 - iter 260/2606 - loss 0.12123734 - time (sec): 13.81 - samples/sec: 2636.88 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:33:29,151 epoch 3 - iter 520/2606 - loss 0.10035032 - time (sec): 28.15 - samples/sec: 2699.66 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:33:43,128 epoch 3 - iter 780/2606 - loss 0.09873947 - time (sec): 42.12 - samples/sec: 2677.14 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:33:56,862 epoch 3 - iter 1040/2606 - loss 0.09621422 - time (sec): 55.86 - samples/sec: 2651.72 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:34:10,890 epoch 3 - iter 1300/2606 - loss 0.09492393 - time (sec): 69.89 - samples/sec: 2650.00 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:34:24,483 epoch 3 - iter 1560/2606 - loss 0.09360765 - time (sec): 83.48 - samples/sec: 2628.90 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:34:38,353 epoch 3 - iter 1820/2606 - loss 0.09935380 - time (sec): 97.35 - samples/sec: 2625.24 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:34:52,835 epoch 3 - iter 2080/2606 - loss 0.09836373 - time (sec): 111.83 - samples/sec: 2637.51 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:35:06,260 epoch 3 - iter 2340/2606 - loss 0.09861426 - time (sec): 125.26 - samples/sec: 2624.73 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:35:20,465 epoch 3 - iter 2600/2606 - loss 0.09836569 - time (sec): 139.46 - samples/sec: 2629.47 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:35:20,763 ----------------------------------------------------------------------------------------------------
2023-10-25 18:35:20,763 EPOCH 3 done: loss 0.0983 - lr: 0.000023
2023-10-25 18:35:27,694 DEV : loss 0.3102573752403259 - f1-score (micro avg) 0.3591
2023-10-25 18:35:27,720 saving best model
2023-10-25 18:35:28,194 ----------------------------------------------------------------------------------------------------
2023-10-25 18:35:42,333 epoch 4 - iter 260/2606 - loss 0.05744413 - time (sec): 14.14 - samples/sec: 2584.08 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:35:57,025 epoch 4 - iter 520/2606 - loss 0.06369754 - time (sec): 28.83 - samples/sec: 2660.47 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:36:11,068 epoch 4 - iter 780/2606 - loss 0.06347063 - time (sec): 42.87 - samples/sec: 2642.25 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:36:25,141 epoch 4 - iter 1040/2606 - loss 0.06530696 - time (sec): 56.94 - samples/sec: 2663.60 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:36:38,908 epoch 4 - iter 1300/2606 - loss 0.06387663 - time (sec): 70.71 - samples/sec: 2682.17 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:36:52,388 epoch 4 - iter 1560/2606 - loss 0.06503052 - time (sec): 84.19 - samples/sec: 2665.31 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:37:06,164 epoch 4 - iter 1820/2606 - loss 0.06576948 - time (sec): 97.97 - samples/sec: 2673.09 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:37:20,037 epoch 4 - iter 2080/2606 - loss 0.06561542 - time (sec): 111.84 - samples/sec: 2667.10 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:37:33,397 epoch 4 - iter 2340/2606 - loss 0.06595358 - time (sec): 125.20 - samples/sec: 2660.84 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:37:46,603 epoch 4 - iter 2600/2606 - loss 0.06624139 - time (sec): 138.41 - samples/sec: 2650.34 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:37:46,868 ----------------------------------------------------------------------------------------------------
2023-10-25 18:37:46,868 EPOCH 4 done: loss 0.0662 - lr: 0.000020
2023-10-25 18:37:53,780 DEV : loss 0.27316081523895264 - f1-score (micro avg) 0.3408
2023-10-25 18:37:53,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:38:07,884 epoch 5 - iter 260/2606 - loss 0.05585596 - time (sec): 14.08 - samples/sec: 2570.69 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:38:21,643 epoch 5 - iter 520/2606 - loss 0.05209107 - time (sec): 27.84 - samples/sec: 2558.79 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:38:35,682 epoch 5 - iter 780/2606 - loss 0.04718666 - time (sec): 41.87 - samples/sec: 2606.01 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:38:49,695 epoch 5 - iter 1040/2606 - loss 0.04694978 - time (sec): 55.89 - samples/sec: 2636.11 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:39:03,937 epoch 5 - iter 1300/2606 - loss 0.04779807 - time (sec): 70.13 - samples/sec: 2627.53 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:39:17,799 epoch 5 - iter 1560/2606 - loss 0.04766551 - time (sec): 83.99 - samples/sec: 2633.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:39:31,276 epoch 5 - iter 1820/2606 - loss 0.04834322 - time (sec): 97.47 - samples/sec: 2613.56 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:39:45,187 epoch 5 - iter 2080/2606 - loss 0.04845369 - time (sec): 111.38 - samples/sec: 2622.77 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:39:58,437 epoch 5 - iter 2340/2606 - loss 0.04889454 - time (sec): 124.63 - samples/sec: 2636.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:40:12,471 epoch 5 - iter 2600/2606 - loss 0.04779773 - time (sec): 138.66 - samples/sec: 2643.59 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:40:12,773 ----------------------------------------------------------------------------------------------------
2023-10-25 18:40:12,773 EPOCH 5 done: loss 0.0478 - lr: 0.000017
2023-10-25 18:40:19,656 DEV : loss 0.3309582471847534 - f1-score (micro avg) 0.3394
2023-10-25 18:40:19,682 ----------------------------------------------------------------------------------------------------
2023-10-25 18:40:33,761 epoch 6 - iter 260/2606 - loss 0.03565927 - time (sec): 14.08 - samples/sec: 2701.06 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:40:47,631 epoch 6 - iter 520/2606 - loss 0.03669726 - time (sec): 27.95 - samples/sec: 2615.10 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:41:01,384 epoch 6 - iter 780/2606 - loss 0.03458977 - time (sec): 41.70 - samples/sec: 2601.88 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:41:15,444 epoch 6 - iter 1040/2606 - loss 0.03370625 - time (sec): 55.76 - samples/sec: 2630.87 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:41:29,289 epoch 6 - iter 1300/2606 - loss 0.03659547 - time (sec): 69.61 - samples/sec: 2615.42 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:41:43,555 epoch 6 - iter 1560/2606 - loss 0.03566172 - time (sec): 83.87 - samples/sec: 2625.18 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:41:56,926 epoch 6 - iter 1820/2606 - loss 0.03750852 - time (sec): 97.24 - samples/sec: 2619.62 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:42:10,835 epoch 6 - iter 2080/2606 - loss 0.03612369 - time (sec): 111.15 - samples/sec: 2623.28 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:42:25,243 epoch 6 - iter 2340/2606 - loss 0.03526350 - time (sec): 125.56 - samples/sec: 2621.48 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:42:39,920 epoch 6 - iter 2600/2606 - loss 0.03565724 - time (sec): 140.24 - samples/sec: 2613.12 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:42:40,268 ----------------------------------------------------------------------------------------------------
2023-10-25 18:42:40,268 EPOCH 6 done: loss 0.0357 - lr: 0.000013
2023-10-25 18:42:46,462 DEV : loss 0.3563600778579712 - f1-score (micro avg) 0.3609
2023-10-25 18:42:46,488 saving best model
2023-10-25 18:42:47,091 ----------------------------------------------------------------------------------------------------
2023-10-25 18:43:01,182 epoch 7 - iter 260/2606 - loss 0.03348942 - time (sec): 14.09 - samples/sec: 2564.82 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:43:16,395 epoch 7 - iter 520/2606 - loss 0.03261560 - time (sec): 29.30 - samples/sec: 2522.13 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:43:30,249 epoch 7 - iter 780/2606 - loss 0.03131634 - time (sec): 43.16 - samples/sec: 2570.67 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:43:44,668 epoch 7 - iter 1040/2606 - loss 0.02913817 - time (sec): 57.57 - samples/sec: 2596.73 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:43:58,255 epoch 7 - iter 1300/2606 - loss 0.02865369 - time (sec): 71.16 - samples/sec: 2606.97 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:44:12,566 epoch 7 - iter 1560/2606 - loss 0.03000096 - time (sec): 85.47 - samples/sec: 2621.12 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:44:26,829 epoch 7 - iter 1820/2606 - loss 0.02935599 - time (sec): 99.74 - samples/sec: 2624.84 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:44:41,001 epoch 7 - iter 2080/2606 - loss 0.02816366 - time (sec): 113.91 - samples/sec: 2619.08 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:44:54,742 epoch 7 - iter 2340/2606 - loss 0.02786870 - time (sec): 127.65 - samples/sec: 2602.86 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:45:08,369 epoch 7 - iter 2600/2606 - loss 0.02810153 - time (sec): 141.28 - samples/sec: 2595.47 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:45:08,655 ----------------------------------------------------------------------------------------------------
2023-10-25 18:45:08,655 EPOCH 7 done: loss 0.0281 - lr: 0.000010
2023-10-25 18:45:14,892 DEV : loss 0.35052964091300964 - f1-score (micro avg) 0.4187
2023-10-25 18:45:14,917 saving best model
2023-10-25 18:45:15,604 ----------------------------------------------------------------------------------------------------
2023-10-25 18:45:29,274 epoch 8 - iter 260/2606 - loss 0.01679626 - time (sec): 13.67 - samples/sec: 2649.74 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:45:43,015 epoch 8 - iter 520/2606 - loss 0.01590055 - time (sec): 27.41 - samples/sec: 2630.44 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:45:56,822 epoch 8 - iter 780/2606 - loss 0.01681384 - time (sec): 41.22 - samples/sec: 2642.29 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:46:10,352 epoch 8 - iter 1040/2606 - loss 0.01711388 - time (sec): 54.75 - samples/sec: 2613.36 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:46:24,915 epoch 8 - iter 1300/2606 - loss 0.01693945 - time (sec): 69.31 - samples/sec: 2609.75 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:46:38,629 epoch 8 - iter 1560/2606 - loss 0.01717473 - time (sec): 83.02 - samples/sec: 2589.08 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:46:52,741 epoch 8 - iter 1820/2606 - loss 0.01849902 - time (sec): 97.14 - samples/sec: 2602.39 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:47:06,886 epoch 8 - iter 2080/2606 - loss 0.01830031 - time (sec): 111.28 - samples/sec: 2604.13 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:47:21,244 epoch 8 - iter 2340/2606 - loss 0.01818663 - time (sec): 125.64 - samples/sec: 2619.72 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:47:34,988 epoch 8 - iter 2600/2606 - loss 0.01797792 - time (sec): 139.38 - samples/sec: 2629.60 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:47:35,287 ----------------------------------------------------------------------------------------------------
2023-10-25 18:47:35,288 EPOCH 8 done: loss 0.0180 - lr: 0.000007
2023-10-25 18:47:41,544 DEV : loss 0.4683326184749603 - f1-score (micro avg) 0.3644
2023-10-25 18:47:41,569 ----------------------------------------------------------------------------------------------------
2023-10-25 18:47:55,554 epoch 9 - iter 260/2606 - loss 0.01570520 - time (sec): 13.98 - samples/sec: 2653.48 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:48:09,087 epoch 9 - iter 520/2606 - loss 0.01431690 - time (sec): 27.52 - samples/sec: 2602.52 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:48:22,993 epoch 9 - iter 780/2606 - loss 0.01274596 - time (sec): 41.42 - samples/sec: 2647.47 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:48:36,937 epoch 9 - iter 1040/2606 - loss 0.01326027 - time (sec): 55.37 - samples/sec: 2628.55 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:48:50,570 epoch 9 - iter 1300/2606 - loss 0.01338527 - time (sec): 69.00 - samples/sec: 2615.07 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:49:04,248 epoch 9 - iter 1560/2606 - loss 0.01381768 - time (sec): 82.68 - samples/sec: 2627.59 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:49:17,867 epoch 9 - iter 1820/2606 - loss 0.01373566 - time (sec): 96.30 - samples/sec: 2617.12 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:49:32,820 epoch 9 - iter 2080/2606 - loss 0.01338600 - time (sec): 111.25 - samples/sec: 2617.51 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:49:46,961 epoch 9 - iter 2340/2606 - loss 0.01388701 - time (sec): 125.39 - samples/sec: 2638.49 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:50:00,960 epoch 9 - iter 2600/2606 - loss 0.01382361 - time (sec): 139.39 - samples/sec: 2626.74 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:50:01,379 ----------------------------------------------------------------------------------------------------
2023-10-25 18:50:01,379 EPOCH 9 done: loss 0.0138 - lr: 0.000003
2023-10-25 18:50:07,639 DEV : loss 0.4295385181903839 - f1-score (micro avg) 0.3935
2023-10-25 18:50:07,666 ----------------------------------------------------------------------------------------------------
2023-10-25 18:50:21,916 epoch 10 - iter 260/2606 - loss 0.00628169 - time (sec): 14.25 - samples/sec: 2601.73 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:50:37,396 epoch 10 - iter 520/2606 - loss 0.00790052 - time (sec): 29.73 - samples/sec: 2463.62 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:50:50,955 epoch 10 - iter 780/2606 - loss 0.00841466 - time (sec): 43.29 - samples/sec: 2529.08 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:51:04,859 epoch 10 - iter 1040/2606 - loss 0.00940638 - time (sec): 57.19 - samples/sec: 2537.96 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:51:18,666 epoch 10 - iter 1300/2606 - loss 0.00936794 - time (sec): 71.00 - samples/sec: 2553.71 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:51:32,144 epoch 10 - iter 1560/2606 - loss 0.00994615 - time (sec): 84.48 - samples/sec: 2543.68 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:51:45,883 epoch 10 - iter 1820/2606 - loss 0.00957951 - time (sec): 98.22 - samples/sec: 2559.69 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:51:59,867 epoch 10 - iter 2080/2606 - loss 0.00938588 - time (sec): 112.20 - samples/sec: 2564.23 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:52:14,080 epoch 10 - iter 2340/2606 - loss 0.00938487 - time (sec): 126.41 - samples/sec: 2593.01 - lr: 0.000000 - momentum: 0.000000
2023-10-25 18:52:28,218 epoch 10 - iter 2600/2606 - loss 0.00945105 - time (sec): 140.55 - samples/sec: 2604.22 - lr: 0.000000 - momentum: 0.000000
2023-10-25 18:52:28,653 ----------------------------------------------------------------------------------------------------
2023-10-25 18:52:28,653 EPOCH 10 done: loss 0.0094 - lr: 0.000000
2023-10-25 18:52:35,606 DEV : loss 0.47907230257987976 - f1-score (micro avg) 0.3898
2023-10-25 18:52:36,103 ----------------------------------------------------------------------------------------------------
2023-10-25 18:52:36,104 Loading model from best epoch ...
2023-10-25 18:52:37,712 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
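
The 17 tags form a BIOES scheme over four entity types (S = single-token entity, B/I/E = begin/inside/end of a multi-token span, O = outside). A minimal sketch of how such a tag sequence decodes into entity spans (illustrative only, not Flair's decoder, and without label-consistency checks across B..E):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end inclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":
            spans.append((label, i, i))
        elif prefix == "B":
            start = i
        elif prefix == "E" and start is not None:
            spans.append((label, start, i))
            start = None
    return spans

tags = ["S-LOC", "O", "B-PER", "I-PER", "E-PER", "O"]
print(bioes_to_spans(tags))  # [('LOC', 0, 0), ('PER', 2, 4)]
```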
2023-10-25 18:52:47,559
Results:
- F-score (micro) 0.4885
- F-score (macro) 0.3364
- Accuracy 0.3266
By class:
              precision    recall  f1-score   support

         LOC     0.5243    0.5964    0.5580      1214
         PER     0.4407    0.4505    0.4455       808
         ORG     0.3605    0.3258    0.3423       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4746    0.5033    0.4885      2390
   macro avg     0.3314    0.3432    0.3364      2390
weighted avg     0.4685    0.5033    0.4846      2390
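
The averages follow directly from the per-class rows: macro-F1 is the unweighted mean of the four class F1 scores, weighted-F1 weights them by support, and micro-F1 is the harmonic mean of the pooled precision and recall. A quick check in plain Python:

```python
# Per-class (f1-score, support) rows from the table above.
rows = {"LOC": (0.5580, 1214), "PER": (0.4455, 808),
        "ORG": (0.3423, 353), "HumanProd": (0.0000, 15)}

macro_f1 = sum(f1 for f1, _ in rows.values()) / len(rows)
total = sum(n for _, n in rows.values())
weighted_f1 = sum(f1 * n for f1, n in rows.values()) / total

# Micro-F1: harmonic mean of the pooled (micro avg) precision and recall.
p, r = 0.4746, 0.5033
micro_f1 = 2 * p * r / (p + r)

# Agree with the table's 0.3364 / 0.4846 / 0.4885 up to rounding.
print(macro_f1, weighted_f1, micro_f1)
```

The zero F1 on HumanProd (only 15 test mentions) is what drags the macro average well below the micro average.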
2023-10-25 18:52:47,560 ----------------------------------------------------------------------------------------------------