2023-10-25 18:28:08,806 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Train: 20847 sentences
2023-10-25 18:28:08,807 (train_with_dev=False, train_with_test=False)
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Training Params:
2023-10-25 18:28:08,807 - learning_rate: "3e-05"
2023-10-25 18:28:08,807 - mini_batch_size: "8"
2023-10-25 18:28:08,807 - max_epochs: "10"
2023-10-25 18:28:08,807 - shuffle: "True"
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Plugins:
2023-10-25 18:28:08,807 - TensorboardLogger
2023-10-25 18:28:08,807 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 18:28:08,807 - metric: "('micro avg', 'f1-score')"
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Computation:
2023-10-25 18:28:08,807 - compute on device: cuda:0
2023-10-25 18:28:08,807 - embedding storage: none
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:28:08,807 Logging anything other than scalars to TensorBoard is currently not supported.
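For reference, a run with these parameters could be launched in Flair roughly as sketched below. This sketch is not part of the log: the model name is inferred from the base path above ("de-dbmdz/bert-base-historic-multilingual-64k-td-cased"), the NER_HIPE_2022 loader arguments and the fine_tune keyword names are assumptions about the Flair version used, and the TensorboardLogger plugin wiring is omitted.

# Plausible Flair fine-tuning script for this run (assumptions noted above).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = NER_HIPE_2022(dataset_name="newseye", language="de")    # 20847 train / 1123 dev / 3350 test sentences
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-64k-td-cased",  # inferred from the base path
    layers="-1",                                                 # "layers-1" in the base path
    subtoken_pooling="first",                                    # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,                                             # unused when use_rnn=False
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_rnn=False,
    use_crf=False,                                               # "crfFalse" in the base path
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=3e-05,
    mini_batch_size=8,
    max_epochs=10,
    warmup_fraction=0.1,                                         # matches the LinearScheduler plugin above
)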
2023-10-25 18:28:22,840 epoch 1 - iter 260/2606 - loss 1.44118564 - time (sec): 14.03 - samples/sec: 2575.65 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:28:36,666 epoch 1 - iter 520/2606 - loss 0.92810801 - time (sec): 27.86 - samples/sec: 2637.50 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:28:51,042 epoch 1 - iter 780/2606 - loss 0.70417279 - time (sec): 42.23 - samples/sec: 2644.02 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:29:05,298 epoch 1 - iter 1040/2606 - loss 0.59393290 - time (sec): 56.49 - samples/sec: 2629.87 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:29:18,805 epoch 1 - iter 1300/2606 - loss 0.52883614 - time (sec): 70.00 - samples/sec: 2619.81 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:29:32,598 epoch 1 - iter 1560/2606 - loss 0.48018899 - time (sec): 83.79 - samples/sec: 2605.48 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:29:47,075 epoch 1 - iter 1820/2606 - loss 0.43654950 - time (sec): 98.27 - samples/sec: 2624.14 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:30:01,512 epoch 1 - iter 2080/2606 - loss 0.40049267 - time (sec): 112.70 - samples/sec: 2624.17 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:30:15,541 epoch 1 - iter 2340/2606 - loss 0.37673764 - time (sec): 126.73 - samples/sec: 2615.94 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:30:29,155 epoch 1 - iter 2600/2606 - loss 0.35942056 - time (sec): 140.35 - samples/sec: 2609.76 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:30:29,546 ----------------------------------------------------------------------------------------------------
2023-10-25 18:30:29,547 EPOCH 1 done: loss 0.3587 - lr: 0.000030
2023-10-25 18:30:33,264 DEV : loss 0.1093616634607315 - f1-score (micro avg) 0.2831
2023-10-25 18:30:33,289 saving best model
2023-10-25 18:30:33,764 ----------------------------------------------------------------------------------------------------
2023-10-25 18:30:47,858 epoch 2 - iter 260/2606 - loss 0.15973395 - time (sec): 14.09 - samples/sec: 2596.30 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:31:01,897 epoch 2 - iter 520/2606 - loss 0.15273016 - time (sec): 28.13 - samples/sec: 2587.41 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:31:16,107 epoch 2 - iter 780/2606 - loss 0.15091510 - time (sec): 42.34 - samples/sec: 2602.93 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:31:29,906 epoch 2 - iter 1040/2606 - loss 0.14924346 - time (sec): 56.14 - samples/sec: 2625.36 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:31:43,744 epoch 2 - iter 1300/2606 - loss 0.14800277 - time (sec): 69.98 - samples/sec: 2613.86 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:31:57,687 epoch 2 - iter 1560/2606 - loss 0.14609768 - time (sec): 83.92 - samples/sec: 2639.91 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:32:11,802 epoch 2 - iter 1820/2606 - loss 0.14641189 - time (sec): 98.04 - samples/sec: 2628.39 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:32:25,136 epoch 2 - iter 2080/2606 - loss 0.14722873 - time (sec): 111.37 - samples/sec: 2618.82 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:32:39,368 epoch 2 - iter 2340/2606 - loss 0.14675373 - time (sec): 125.60 - samples/sec: 2616.38 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:32:53,254 epoch 2 - iter 2600/2606 - loss 0.14529176 - time (sec): 139.49 - samples/sec: 2626.49 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:32:53,635 ----------------------------------------------------------------------------------------------------
2023-10-25 18:32:53,635 EPOCH 2 done: loss 0.1449 - lr: 0.000027
2023-10-25 18:33:00,495 DEV : loss 0.12418132275342941 - f1-score (micro avg) 0.345
2023-10-25 18:33:00,522 saving best model
2023-10-25 18:33:01,001 ----------------------------------------------------------------------------------------------------
2023-10-25 18:33:14,815 epoch 3 - iter 260/2606 - loss 0.12123734 - time (sec): 13.81 - samples/sec: 2636.88 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:33:29,151 epoch 3 - iter 520/2606 - loss 0.10035032 - time (sec): 28.15 - samples/sec: 2699.66 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:33:43,128 epoch 3 - iter 780/2606 - loss 0.09873947 - time (sec): 42.12 - samples/sec: 2677.14 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:33:56,862 epoch 3 - iter 1040/2606 - loss 0.09621422 - time (sec): 55.86 - samples/sec: 2651.72 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:34:10,890 epoch 3 - iter 1300/2606 - loss 0.09492393 - time (sec): 69.89 - samples/sec: 2650.00 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:34:24,483 epoch 3 - iter 1560/2606 - loss 0.09360765 - time (sec): 83.48 - samples/sec: 2628.90 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:34:38,353 epoch 3 - iter 1820/2606 - loss 0.09935380 - time (sec): 97.35 - samples/sec: 2625.24 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:34:52,835 epoch 3 - iter 2080/2606 - loss 0.09836373 - time (sec): 111.83 - samples/sec: 2637.51 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:35:06,260 epoch 3 - iter 2340/2606 - loss 0.09861426 - time (sec): 125.26 - samples/sec: 2624.73 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:35:20,465 epoch 3 - iter 2600/2606 - loss 0.09836569 - time (sec): 139.46 - samples/sec: 2629.47 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:35:20,763 ----------------------------------------------------------------------------------------------------
2023-10-25 18:35:20,763 EPOCH 3 done: loss 0.0983 - lr: 0.000023
2023-10-25 18:35:27,694 DEV : loss 0.3102573752403259 - f1-score (micro avg) 0.3591
2023-10-25 18:35:27,720 saving best model
2023-10-25 18:35:28,194 ----------------------------------------------------------------------------------------------------
2023-10-25 18:35:42,333 epoch 4 - iter 260/2606 - loss 0.05744413 - time (sec): 14.14 - samples/sec: 2584.08 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:35:57,025 epoch 4 - iter 520/2606 - loss 0.06369754 - time (sec): 28.83 - samples/sec: 2660.47 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:36:11,068 epoch 4 - iter 780/2606 - loss 0.06347063 - time (sec): 42.87 - samples/sec: 2642.25 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:36:25,141 epoch 4 - iter 1040/2606 - loss 0.06530696 - time (sec): 56.94 - samples/sec: 2663.60 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:36:38,908 epoch 4 - iter 1300/2606 - loss 0.06387663 - time (sec): 70.71 - samples/sec: 2682.17 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:36:52,388 epoch 4 - iter 1560/2606 - loss 0.06503052 - time (sec): 84.19 - samples/sec: 2665.31 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:37:06,164 epoch 4 - iter 1820/2606 - loss 0.06576948 - time (sec): 97.97 - samples/sec: 2673.09 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:37:20,037 epoch 4 - iter 2080/2606 - loss 0.06561542 - time (sec): 111.84 - samples/sec: 2667.10 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:37:33,397 epoch 4 - iter 2340/2606 - loss 0.06595358 - time (sec): 125.20 - samples/sec: 2660.84 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:37:46,603 epoch 4 - iter 2600/2606 - loss 0.06624139 - time (sec): 138.41 - samples/sec: 2650.34 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:37:46,868 ----------------------------------------------------------------------------------------------------
2023-10-25 18:37:46,868 EPOCH 4 done: loss 0.0662 - lr: 0.000020
2023-10-25 18:37:53,780 DEV : loss 0.27316081523895264 - f1-score (micro avg) 0.3408
2023-10-25 18:37:53,807 ----------------------------------------------------------------------------------------------------
2023-10-25 18:38:07,884 epoch 5 - iter 260/2606 - loss 0.05585596 - time (sec): 14.08 - samples/sec: 2570.69 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:38:21,643 epoch 5 - iter 520/2606 - loss 0.05209107 - time (sec): 27.84 - samples/sec: 2558.79 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:38:35,682 epoch 5 - iter 780/2606 - loss 0.04718666 - time (sec): 41.87 - samples/sec: 2606.01 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:38:49,695 epoch 5 - iter 1040/2606 - loss 0.04694978 - time (sec): 55.89 - samples/sec: 2636.11 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:39:03,937 epoch 5 - iter 1300/2606 - loss 0.04779807 - time (sec): 70.13 - samples/sec: 2627.53 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:39:17,799 epoch 5 - iter 1560/2606 - loss 0.04766551 - time (sec): 83.99 - samples/sec: 2633.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:39:31,276 epoch 5 - iter 1820/2606 - loss 0.04834322 - time (sec): 97.47 - samples/sec: 2613.56 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:39:45,187 epoch 5 - iter 2080/2606 - loss 0.04845369 - time (sec): 111.38 - samples/sec: 2622.77 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:39:58,437 epoch 5 - iter 2340/2606 - loss 0.04889454 - time (sec): 124.63 - samples/sec: 2636.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:40:12,471 epoch 5 - iter 2600/2606 - loss 0.04779773 - time (sec): 138.66 - samples/sec: 2643.59 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:40:12,773 ----------------------------------------------------------------------------------------------------
2023-10-25 18:40:12,773 EPOCH 5 done: loss 0.0478 - lr: 0.000017
2023-10-25 18:40:19,656 DEV : loss 0.3309582471847534 - f1-score (micro avg) 0.3394
2023-10-25 18:40:19,682 ----------------------------------------------------------------------------------------------------
2023-10-25 18:40:33,761 epoch 6 - iter 260/2606 - loss 0.03565927 - time (sec): 14.08 - samples/sec: 2701.06 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:40:47,631 epoch 6 - iter 520/2606 - loss 0.03669726 - time (sec): 27.95 - samples/sec: 2615.10 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:41:01,384 epoch 6 - iter 780/2606 - loss 0.03458977 - time (sec): 41.70 - samples/sec: 2601.88 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:41:15,444 epoch 6 - iter 1040/2606 - loss 0.03370625 - time (sec): 55.76 - samples/sec: 2630.87 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:41:29,289 epoch 6 - iter 1300/2606 - loss 0.03659547 - time (sec): 69.61 - samples/sec: 2615.42 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:41:43,555 epoch 6 - iter 1560/2606 - loss 0.03566172 - time (sec): 83.87 - samples/sec: 2625.18 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:41:56,926 epoch 6 - iter 1820/2606 - loss 0.03750852 - time (sec): 97.24 - samples/sec: 2619.62 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:42:10,835 epoch 6 - iter 2080/2606 - loss 0.03612369 - time (sec): 111.15 - samples/sec: 2623.28 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:42:25,243 epoch 6 - iter 2340/2606 - loss 0.03526350 - time (sec): 125.56 - samples/sec: 2621.48 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:42:39,920 epoch 6 - iter 2600/2606 - loss 0.03565724 - time (sec): 140.24 - samples/sec: 2613.12 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:42:40,268 ----------------------------------------------------------------------------------------------------
2023-10-25 18:42:40,268 EPOCH 6 done: loss 0.0357 - lr: 0.000013
2023-10-25 18:42:46,462 DEV : loss 0.3563600778579712 - f1-score (micro avg) 0.3609
2023-10-25 18:42:46,488 saving best model
2023-10-25 18:42:47,091 ----------------------------------------------------------------------------------------------------
2023-10-25 18:43:01,182 epoch 7 - iter 260/2606 - loss 0.03348942 - time (sec): 14.09 - samples/sec: 2564.82 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:43:16,395 epoch 7 - iter 520/2606 - loss 0.03261560 - time (sec): 29.30 - samples/sec: 2522.13 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:43:30,249 epoch 7 - iter 780/2606 - loss 0.03131634 - time (sec): 43.16 - samples/sec: 2570.67 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:43:44,668 epoch 7 - iter 1040/2606 - loss 0.02913817 - time (sec): 57.57 - samples/sec: 2596.73 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:43:58,255 epoch 7 - iter 1300/2606 - loss 0.02865369 - time (sec): 71.16 - samples/sec: 2606.97 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:44:12,566 epoch 7 - iter 1560/2606 - loss 0.03000096 - time (sec): 85.47 - samples/sec: 2621.12 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:44:26,829 epoch 7 - iter 1820/2606 - loss 0.02935599 - time (sec): 99.74 - samples/sec: 2624.84 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:44:41,001 epoch 7 - iter 2080/2606 - loss 0.02816366 - time (sec): 113.91 - samples/sec: 2619.08 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:44:54,742 epoch 7 - iter 2340/2606 - loss 0.02786870 - time (sec): 127.65 - samples/sec: 2602.86 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:45:08,369 epoch 7 - iter 2600/2606 - loss 0.02810153 - time (sec): 141.28 - samples/sec: 2595.47 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:45:08,655 ----------------------------------------------------------------------------------------------------
2023-10-25 18:45:08,655 EPOCH 7 done: loss 0.0281 - lr: 0.000010
2023-10-25 18:45:14,892 DEV : loss 0.35052964091300964 - f1-score (micro avg) 0.4187
2023-10-25 18:45:14,917 saving best model
2023-10-25 18:45:15,604 ----------------------------------------------------------------------------------------------------
2023-10-25 18:45:29,274 epoch 8 - iter 260/2606 - loss 0.01679626 - time (sec): 13.67 - samples/sec: 2649.74 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:45:43,015 epoch 8 - iter 520/2606 - loss 0.01590055 - time (sec): 27.41 - samples/sec: 2630.44 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:45:56,822 epoch 8 - iter 780/2606 - loss 0.01681384 - time (sec): 41.22 - samples/sec: 2642.29 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:46:10,352 epoch 8 - iter 1040/2606 - loss 0.01711388 - time (sec): 54.75 - samples/sec: 2613.36 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:46:24,915 epoch 8 - iter 1300/2606 - loss 0.01693945 - time (sec): 69.31 - samples/sec: 2609.75 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:46:38,629 epoch 8 - iter 1560/2606 - loss 0.01717473 - time (sec): 83.02 - samples/sec: 2589.08 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:46:52,741 epoch 8 - iter 1820/2606 - loss 0.01849902 - time (sec): 97.14 - samples/sec: 2602.39 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:47:06,886 epoch 8 - iter 2080/2606 - loss 0.01830031 - time (sec): 111.28 - samples/sec: 2604.13 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:47:21,244 epoch 8 - iter 2340/2606 - loss 0.01818663 - time (sec): 125.64 - samples/sec: 2619.72 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:47:34,988 epoch 8 - iter 2600/2606 - loss 0.01797792 - time (sec): 139.38 - samples/sec: 2629.60 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:47:35,287 ----------------------------------------------------------------------------------------------------
2023-10-25 18:47:35,288 EPOCH 8 done: loss 0.0180 - lr: 0.000007
2023-10-25 18:47:41,544 DEV : loss 0.4683326184749603 - f1-score (micro avg) 0.3644
2023-10-25 18:47:41,569 ----------------------------------------------------------------------------------------------------
2023-10-25 18:47:55,554 epoch 9 - iter 260/2606 - loss 0.01570520 - time (sec): 13.98 - samples/sec: 2653.48 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:48:09,087 epoch 9 - iter 520/2606 - loss 0.01431690 - time (sec): 27.52 - samples/sec: 2602.52 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:48:22,993 epoch 9 - iter 780/2606 - loss 0.01274596 - time (sec): 41.42 - samples/sec: 2647.47 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:48:36,937 epoch 9 - iter 1040/2606 - loss 0.01326027 - time (sec): 55.37 - samples/sec: 2628.55 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:48:50,570 epoch 9 - iter 1300/2606 - loss 0.01338527 - time (sec): 69.00 - samples/sec: 2615.07 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:49:04,248 epoch 9 - iter 1560/2606 - loss 0.01381768 - time (sec): 82.68 - samples/sec: 2627.59 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:49:17,867 epoch 9 - iter 1820/2606 - loss 0.01373566 - time (sec): 96.30 - samples/sec: 2617.12 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:49:32,820 epoch 9 - iter 2080/2606 - loss 0.01338600 - time (sec): 111.25 - samples/sec: 2617.51 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:49:46,961 epoch 9 - iter 2340/2606 - loss 0.01388701 - time (sec): 125.39 - samples/sec: 2638.49 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:50:00,960 epoch 9 - iter 2600/2606 - loss 0.01382361 - time (sec): 139.39 - samples/sec: 2626.74 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:50:01,379 ----------------------------------------------------------------------------------------------------
2023-10-25 18:50:01,379 EPOCH 9 done: loss 0.0138 - lr: 0.000003
2023-10-25 18:50:07,639 DEV : loss 0.4295385181903839 - f1-score (micro avg) 0.3935
2023-10-25 18:50:07,666 ----------------------------------------------------------------------------------------------------
2023-10-25 18:50:21,916 epoch 10 - iter 260/2606 - loss 0.00628169 - time (sec): 14.25 - samples/sec: 2601.73 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:50:37,396 epoch 10 - iter 520/2606 - loss 0.00790052 - time (sec): 29.73 - samples/sec: 2463.62 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:50:50,955 epoch 10 - iter 780/2606 - loss 0.00841466 - time (sec): 43.29 - samples/sec: 2529.08 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:51:04,859 epoch 10 - iter 1040/2606 - loss 0.00940638 - time (sec): 57.19 - samples/sec: 2537.96 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:51:18,666 epoch 10 - iter 1300/2606 - loss 0.00936794 - time (sec): 71.00 - samples/sec: 2553.71 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:51:32,144 epoch 10 - iter 1560/2606 - loss 0.00994615 - time (sec): 84.48 - samples/sec: 2543.68 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:51:45,883 epoch 10 - iter 1820/2606 - loss 0.00957951 - time (sec): 98.22 - samples/sec: 2559.69 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:51:59,867 epoch 10 - iter 2080/2606 - loss 0.00938588 - time (sec): 112.20 - samples/sec: 2564.23 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:52:14,080 epoch 10 - iter 2340/2606 - loss 0.00938487 - time (sec): 126.41 - samples/sec: 2593.01 - lr: 0.000000 - momentum: 0.000000
2023-10-25 18:52:28,218 epoch 10 - iter 2600/2606 - loss 0.00945105 - time (sec): 140.55 - samples/sec: 2604.22 - lr: 0.000000 - momentum: 0.000000
2023-10-25 18:52:28,653 ----------------------------------------------------------------------------------------------------
2023-10-25 18:52:28,653 EPOCH 10 done: loss 0.0094 - lr: 0.000000
2023-10-25 18:52:35,606 DEV : loss 0.47907230257987976 - f1-score (micro avg) 0.3898
2023-10-25 18:52:36,103 ----------------------------------------------------------------------------------------------------
2023-10-25 18:52:36,104 Loading model from best epoch ...
2023-10-25 18:52:37,712 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 18:52:47,559
Results:
- F-score (micro) 0.4885
- F-score (macro) 0.3364
- Accuracy 0.3266

By class:
              precision    recall  f1-score   support

         LOC     0.5243    0.5964    0.5580      1214
         PER     0.4407    0.4505    0.4455       808
         ORG     0.3605    0.3258    0.3423       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4746    0.5033    0.4885      2390
   macro avg     0.3314    0.3432    0.3364      2390
weighted avg     0.4685    0.5033    0.4846      2390

2023-10-25 18:52:47,560 ----------------------------------------------------------------------------------------------------
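Once training finishes, the best checkpoint saved above (best-model.pt under the logged base path) can be loaded back for tagging. A small usage sketch, not part of the log; the example sentence is made up:

from flair.data import Sentence
from flair.models import SequenceTagger

# Path assumed to be <model training base path>/best-model.pt as logged above.
tagger = SequenceTagger.load(
    "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Der Verein wurde 1897 in Wien gegründet .")  # made-up example sentence
tagger.predict(sentence)

# Print each predicted entity span with its tag and confidence.
for entity in sentence.get_spans("ner"):
    print(entity.text, entity.get_label("ner").value, entity.get_label("ner").score)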