2023-10-25 10:37:21,901 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,902 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:37:21,902 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,902 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Train: 6183 sentences
2023-10-25 10:37:21,903 (train_with_dev=False, train_with_test=False)
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Training Params:
2023-10-25 10:37:21,903 - learning_rate: "3e-05"
2023-10-25 10:37:21,903 - mini_batch_size: "8"
2023-10-25 10:37:21,903 - max_epochs: "10"
2023-10-25 10:37:21,903 - shuffle: "True"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Plugins:
2023-10-25 10:37:21,903 - TensorboardLogger
2023-10-25 10:37:21,903 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:37:21,903 - metric: "('micro avg', 'f1-score')"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Computation:
2023-10-25 10:37:21,903 - compute on device: cuda:0
2023-10-25 10:37:21,903 - embedding storage: none
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:37:26,605 epoch 1 - iter 77/773 - loss 2.00342293 - time (sec): 4.70 - samples/sec: 2712.01 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:37:31,262 epoch 1 - iter 154/773 - loss 1.14055389 - time (sec): 9.36 - samples/sec: 2662.74 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:37:35,889 epoch 1 - iter 231/773 - loss 0.82625202 - time (sec): 13.98 - samples/sec: 2642.41 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:37:40,575 epoch 1 - iter 308/773 - loss 0.64493288 - time (sec): 18.67 - samples/sec: 2668.34 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:37:45,208 epoch 1 - iter 385/773 - loss 0.53761507 - time (sec): 23.30 - samples/sec: 2669.10 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:37:49,742 epoch 1 - iter 462/773 - loss 0.47184545 - time (sec): 27.84 - samples/sec: 2656.78 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:37:54,392 epoch 1 - iter 539/773 - loss 0.41778036 - time (sec): 32.49 - samples/sec: 2672.72 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:37:59,140 epoch 1 - iter 616/773 - loss 0.37526149 - time (sec): 37.24 - samples/sec: 2675.01 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:38:03,677 epoch 1 - iter 693/773 - loss 0.34438623 - time (sec): 41.77 - samples/sec: 2678.19 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:08,219 epoch 1 - iter 770/773 - loss 0.32016433 - time (sec): 46.31 - samples/sec: 2673.41 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:38:08,404 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:08,404 EPOCH 1 done: loss 0.3193 - lr: 0.000030
2023-10-25 10:38:11,901 DEV : loss 0.05575157329440117 - f1-score (micro avg)  0.7258
2023-10-25 10:38:11,920 saving best model
2023-10-25 10:38:12,442 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:17,190 epoch 2 - iter 77/773 - loss 0.06917782 - time (sec): 4.75 - samples/sec: 2589.73 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:38:21,845 epoch 2 - iter 154/773 - loss 0.07295557 - time (sec): 9.40 - samples/sec: 2635.98 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:26,432 epoch 2 - iter 231/773 - loss 0.07445178 - time (sec): 13.99 - samples/sec: 2577.67 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:31,123 epoch 2 - iter 308/773 - loss 0.07546608 - time (sec): 18.68 - samples/sec: 2620.82 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:35,870 epoch 2 - iter 385/773 - loss 0.07298737 - time (sec): 23.43 - samples/sec: 2630.85 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:40,555 epoch 2 - iter 462/773 - loss 0.07131937 - time (sec): 28.11 - samples/sec: 2642.70 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:45,260 epoch 2 - iter 539/773 - loss 0.07085615 - time (sec): 32.82 - samples/sec: 2652.90 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:50,000 epoch 2 - iter 616/773 - loss 0.07015877 - time (sec): 37.56 - samples/sec: 2628.79 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:54,537 epoch 2 - iter 693/773 - loss 0.06882789 - time (sec): 42.09 - samples/sec: 2624.41 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:59,395 epoch 2 - iter 770/773 - loss 0.06928794 - time (sec): 46.95 - samples/sec: 2635.97 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:59,585 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:59,586 EPOCH 2 done: loss 0.0691 - lr: 0.000027
2023-10-25 10:39:02,404 DEV : loss 0.049297548830509186 - f1-score (micro avg)  0.8142
2023-10-25 10:39:02,422 saving best model
2023-10-25 10:39:03,089 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:07,755 epoch 3 - iter 77/773 - loss 0.04741193 - time (sec): 4.66 - samples/sec: 2618.48 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:12,662 epoch 3 - iter 154/773 - loss 0.04548219 - time (sec): 9.57 - samples/sec: 2493.23 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:17,185 epoch 3 - iter 231/773 - loss 0.04459085 - time (sec): 14.09 - samples/sec: 2538.42 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:21,796 epoch 3 - iter 308/773 - loss 0.04386816 - time (sec): 18.70 - samples/sec: 2585.25 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:26,414 epoch 3 - iter 385/773 - loss 0.04428707 - time (sec): 23.32 - samples/sec: 2595.54 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:31,080 epoch 3 - iter 462/773 - loss 0.04386941 - time (sec): 27.99 - samples/sec: 2628.33 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:35,843 epoch 3 - iter 539/773 - loss 0.04425161 - time (sec): 32.75 - samples/sec: 2629.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:40,653 epoch 3 - iter 616/773 - loss 0.04565354 - time (sec): 37.56 - samples/sec: 2628.00 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:45,336 epoch 3 - iter 693/773 - loss 0.04638286 - time (sec): 42.24 - samples/sec: 2640.83 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:50,069 epoch 3 - iter 770/773 - loss 0.04580281 - time (sec): 46.98 - samples/sec: 2637.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:39:50,249 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:50,249 EPOCH 3 done: loss 0.0457 - lr: 0.000023
2023-10-25 10:39:53,011 DEV : loss 0.07478724420070648 - f1-score (micro avg)  0.7705
2023-10-25 10:39:53,029 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:57,751 epoch 4 - iter 77/773 - loss 0.02381115 - time (sec): 4.72 - samples/sec: 2644.65 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:40:02,403 epoch 4 - iter 154/773 - loss 0.02272232 - time (sec): 9.37 - samples/sec: 2696.05 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:40:07,048 epoch 4 - iter 231/773 - loss 0.02306162 - time (sec): 14.02 - samples/sec: 2694.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:11,691 epoch 4 - iter 308/773 - loss 0.02500263 - time (sec): 18.66 - samples/sec: 2695.98 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:16,351 epoch 4 - iter 385/773 - loss 0.02577652 - time (sec): 23.32 - samples/sec: 2669.27 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:21,109 epoch 4 - iter 462/773 - loss 0.02834569 - time (sec): 28.08 - samples/sec: 2640.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:26,055 epoch 4 - iter 539/773 - loss 0.02860700 - time (sec): 33.02 - samples/sec: 2629.75 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:30,766 epoch 4 - iter 616/773 - loss 0.02820990 - time (sec): 37.73 - samples/sec: 2639.20 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:35,442 epoch 4 - iter 693/773 - loss 0.02789789 - time (sec): 42.41 - samples/sec: 2648.82 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:39,951 epoch 4 - iter 770/773 - loss 0.02986146 - time (sec): 46.92 - samples/sec: 2638.85 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:40,133 ----------------------------------------------------------------------------------------------------
2023-10-25 10:40:40,133 EPOCH 4 done: loss 0.0299 - lr: 0.000020
2023-10-25 10:40:42,825 DEV : loss 0.08224356174468994 - f1-score (micro avg)  0.7658
2023-10-25 10:40:42,842 ----------------------------------------------------------------------------------------------------
2023-10-25 10:40:47,542 epoch 5 - iter 77/773 - loss 0.02345774 - time (sec): 4.70 - samples/sec: 2619.37 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:52,241 epoch 5 - iter 154/773 - loss 0.02122391 - time (sec): 9.40 - samples/sec: 2625.76 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:40:56,970 epoch 5 - iter 231/773 - loss 0.01984080 - time (sec): 14.13 - samples/sec: 2661.60 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:41:01,364 epoch 5 - iter 308/773 - loss 0.02263347 - time (sec): 18.52 - samples/sec: 2686.02 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:41:06,019 epoch 5 - iter 385/773 - loss 0.02194096 - time (sec): 23.18 - samples/sec: 2706.00 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:10,705 epoch 5 - iter 462/773 - loss 0.02179624 - time (sec): 27.86 - samples/sec: 2708.53 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:15,249 epoch 5 - iter 539/773 - loss 0.02057243 - time (sec): 32.41 - samples/sec: 2722.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:19,743 epoch 5 - iter 616/773 - loss 0.02059981 - time (sec): 36.90 - samples/sec: 2702.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:24,238 epoch 5 - iter 693/773 - loss 0.02048126 - time (sec): 41.39 - samples/sec: 2712.66 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:28,644 epoch 5 - iter 770/773 - loss 0.02099495 - time (sec): 45.80 - samples/sec: 2703.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:28,839 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:28,840 EPOCH 5 done: loss 0.0212 - lr: 0.000017
2023-10-25 10:41:31,552 DEV : loss 0.09945573657751083 - f1-score (micro avg)  0.781
2023-10-25 10:41:31,572 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:36,179 epoch 6 - iter 77/773 - loss 0.01534591 - time (sec): 4.60 - samples/sec: 2745.09 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:40,809 epoch 6 - iter 154/773 - loss 0.01519805 - time (sec): 9.24 - samples/sec: 2722.23 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:45,397 epoch 6 - iter 231/773 - loss 0.01464584 - time (sec): 13.82 - samples/sec: 2671.47 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:50,157 epoch 6 - iter 308/773 - loss 0.01414850 - time (sec): 18.58 - samples/sec: 2679.03 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:41:54,914 epoch 6 - iter 385/773 - loss 0.01433663 - time (sec): 23.34 - samples/sec: 2701.85 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:41:59,690 epoch 6 - iter 462/773 - loss 0.01277122 - time (sec): 28.12 - samples/sec: 2696.50 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:42:04,329 epoch 6 - iter 539/773 - loss 0.01360655 - time (sec): 32.76 - samples/sec: 2682.46 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:09,181 epoch 6 - iter 616/773 - loss 0.01363450 - time (sec): 37.61 - samples/sec: 2648.33 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:14,051 epoch 6 - iter 693/773 - loss 0.01353720 - time (sec): 42.48 - samples/sec: 2630.62 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:18,779 epoch 6 - iter 770/773 - loss 0.01363499 - time (sec): 47.21 - samples/sec: 2624.71 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:18,961 ----------------------------------------------------------------------------------------------------
2023-10-25 10:42:18,962 EPOCH 6 done: loss 0.0140 - lr: 0.000013
2023-10-25 10:42:22,522 DEV : loss 0.11278796941041946 - f1-score (micro avg)  0.7753
2023-10-25 10:42:22,540 ----------------------------------------------------------------------------------------------------
2023-10-25 10:42:27,293 epoch 7 - iter 77/773 - loss 0.00960456 - time (sec): 4.75 - samples/sec: 2680.96 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:32,004 epoch 7 - iter 154/773 - loss 0.00936374 - time (sec): 9.46 - samples/sec: 2626.52 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:36,826 epoch 7 - iter 231/773 - loss 0.00747500 - time (sec): 14.28 - samples/sec: 2713.22 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:41,284 epoch 7 - iter 308/773 - loss 0.00789143 - time (sec): 18.74 - samples/sec: 2647.02 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:45,927 epoch 7 - iter 385/773 - loss 0.00801181 - time (sec): 23.39 - samples/sec: 2644.58 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:50,560 epoch 7 - iter 462/773 - loss 0.00730589 - time (sec): 28.02 - samples/sec: 2658.91 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:42:55,182 epoch 7 - iter 539/773 - loss 0.00808199 - time (sec): 32.64 - samples/sec: 2631.74 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:42:59,804 epoch 7 - iter 616/773 - loss 0.00863132 - time (sec): 37.26 - samples/sec: 2626.61 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:43:04,706 epoch 7 - iter 693/773 - loss 0.00876967 - time (sec): 42.16 - samples/sec: 2631.43 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:09,406 epoch 7 - iter 770/773 - loss 0.00915212 - time (sec): 46.86 - samples/sec: 2639.82 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:09,584 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:09,584 EPOCH 7 done: loss 0.0091 - lr: 0.000010
2023-10-25 10:43:12,629 DEV : loss 0.11861388385295868 - f1-score (micro avg)  0.7724
2023-10-25 10:43:12,647 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:17,325 epoch 8 - iter 77/773 - loss 0.00461380 - time (sec): 4.68 - samples/sec: 2636.24 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:21,940 epoch 8 - iter 154/773 - loss 0.00702621 - time (sec): 9.29 - samples/sec: 2672.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:26,655 epoch 8 - iter 231/773 - loss 0.00812508 - time (sec): 14.01 - samples/sec: 2598.23 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:31,362 epoch 8 - iter 308/773 - loss 0.00661780 - time (sec): 18.71 - samples/sec: 2595.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:35,901 epoch 8 - iter 385/773 - loss 0.00640004 - time (sec): 23.25 - samples/sec: 2629.17 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:40,469 epoch 8 - iter 462/773 - loss 0.00633717 - time (sec): 27.82 - samples/sec: 2675.90 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:45,129 epoch 8 - iter 539/773 - loss 0.00653842 - time (sec): 32.48 - samples/sec: 2677.43 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:49,745 epoch 8 - iter 616/773 - loss 0.00732939 - time (sec): 37.10 - samples/sec: 2673.52 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:54,376 epoch 8 - iter 693/773 - loss 0.00709463 - time (sec): 41.73 - samples/sec: 2665.60 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:59,079 epoch 8 - iter 770/773 - loss 0.00668988 - time (sec): 46.43 - samples/sec: 2665.10 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:59,266 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:59,266 EPOCH 8 done: loss 0.0067 - lr: 0.000007
2023-10-25 10:44:02,424 DEV : loss 0.10935225337743759 - f1-score (micro avg)  0.7901
2023-10-25 10:44:02,442 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:07,211 epoch 9 - iter 77/773 - loss 0.00280222 - time (sec): 4.77 - samples/sec: 2641.93 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:11,756 epoch 9 - iter 154/773 - loss 0.00290731 - time (sec): 9.31 - samples/sec: 2687.34 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:16,394 epoch 9 - iter 231/773 - loss 0.00405151 - time (sec): 13.95 - samples/sec: 2681.13 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:21,122 epoch 9 - iter 308/773 - loss 0.00435854 - time (sec): 18.68 - samples/sec: 2697.32 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:25,805 epoch 9 - iter 385/773 - loss 0.00434929 - time (sec): 23.36 - samples/sec: 2682.52 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:30,607 epoch 9 - iter 462/773 - loss 0.00405141 - time (sec): 28.16 - samples/sec: 2661.17 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:35,355 epoch 9 - iter 539/773 - loss 0.00398165 - time (sec): 32.91 - samples/sec: 2641.67 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:40,206 epoch 9 - iter 616/773 - loss 0.00428787 - time (sec): 37.76 - samples/sec: 2622.47 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:44,909 epoch 9 - iter 693/773 - loss 0.00416455 - time (sec): 42.47 - samples/sec: 2642.13 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:49,527 epoch 9 - iter 770/773 - loss 0.00393684 - time (sec): 47.08 - samples/sec: 2633.23 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:44:49,707 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:49,707 EPOCH 9 done: loss 0.0039 - lr: 0.000003
2023-10-25 10:44:52,325 DEV : loss 0.11163745075464249 - f1-score (micro avg)  0.7942
2023-10-25 10:44:52,342 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:57,023 epoch 10 - iter 77/773 - loss 0.00160620 - time (sec): 4.68 - samples/sec: 2530.57 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:45:01,740 epoch 10 - iter 154/773 - loss 0.00218583 - time (sec): 9.40 - samples/sec: 2501.82 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:45:06,385 epoch 10 - iter 231/773 - loss 0.00253136 - time (sec): 14.04 - samples/sec: 2524.83 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:10,823 epoch 10 - iter 308/773 - loss 0.00321432 - time (sec): 18.48 - samples/sec: 2583.04 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:15,327 epoch 10 - iter 385/773 - loss 0.00292953 - time (sec): 22.98 - samples/sec: 2578.23 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:20,048 epoch 10 - iter 462/773 - loss 0.00283533 - time (sec): 27.70 - samples/sec: 2604.20 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:24,719 epoch 10 - iter 539/773 - loss 0.00275001 - time (sec): 32.37 - samples/sec: 2628.38 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:29,618 epoch 10 - iter 616/773 - loss 0.00267312 - time (sec): 37.27 - samples/sec: 2636.23 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:34,364 epoch 10 - iter 693/773 - loss 0.00242572 - time (sec): 42.02 - samples/sec: 2648.91 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:45:39,113 epoch 10 - iter 770/773 - loss 0.00279658 - time (sec): 46.77 - samples/sec: 2642.08 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:45:39,310 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:39,310 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-25 10:45:42,349 DEV : loss 0.11523404717445374 - f1-score (micro avg)  0.7884
2023-10-25 10:45:43,297 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:43,299 Loading model from best epoch ...
2023-10-25 10:45:45,437 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
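The 13 tags are the BIOES encoding of the three entity types in this corpus (LOC, BUILDING, STREET): one outside tag O plus four positional prefixes (S-ingle, B-egin, E-nd, I-nside) per type. A quick sketch of how such a tag set is enumerated (illustrative helper, not Flair code):

```python
def bioes_tags(entity_types):
    """Enumerate a BIOES tag set: O plus S-/B-/E-/I- for each entity type."""
    tags = ["O"]
    for entity_type in entity_types:
        tags.extend(f"{prefix}-{entity_type}" for prefix in ("S", "B", "E", "I"))
    return tags

tags = bioes_tags(["LOC", "BUILDING", "STREET"])  # 1 + 3 * 4 = 13 tags
```

This count matches the final `Linear(in_features=768, out_features=13)` classification head in the model summary above.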
2023-10-25 10:45:55,712
Results:
- F-score (micro) 0.7656
- F-score (macro) 0.6513
- Accuracy 0.641

By class:
              precision    recall  f1-score   support

         LOC     0.8262    0.8140    0.8200       946
    BUILDING     0.5258    0.5514    0.5383       185
      STREET     0.7368    0.5000    0.5957        56

   micro avg     0.7732    0.7582    0.7656      1187
   macro avg     0.6963    0.6218    0.6513      1187
weighted avg     0.7751    0.7582    0.7655      1187
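The average rows follow directly from the per-class rows: micro averaging pools true/false positives and negatives across all classes before computing F1, macro is the unweighted mean over classes, and weighted is the support-weighted mean. A sketch recomputing the macro and weighted F1 from the per-class values above (micro-F1 needs the raw counts, which the rounded table does not expose):

```python
per_class = {              # (f1-score, support) from the table above
    "LOC":      (0.8200, 946),
    "BUILDING": (0.5383, 185),
    "STREET":   (0.5957, 56),
}

# Macro: unweighted mean over classes.
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted: mean weighted by class support.
total_support = sum(n for _, n in per_class.values())
weighted_f1 = sum(f1 * n for f1, n in per_class.values()) / total_support
```

Rounded to four decimals, these reproduce the reported macro F1 of 0.6513 and weighted F1 of 0.7655 over the 1187 test entities.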

2023-10-25 10:45:55,712 ----------------------------------------------------------------------------------------------------