2023-10-25 10:55:20,713 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,714 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
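Editor's note: the architecture above can be assembled with Flair roughly as in the following hypothetical sketch (this is not the original training script). The checkpoint name is inferred from the training base path further below, and `label_dict` is assumed to come from the data-loading sketch after the corpus summary; exact argument names can vary between Flair versions.

    # Hypothetical reconstruction of the tagger printed above.
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-base-historic-multilingual-64k-td-cased",  # inferred from the base path below
        layers="-1",               # "layers-1" in the run name: last transformer layer only
        subtoken_pooling="first",  # "poolingfirst" in the run name
        fine_tune=True,
    )

    tagger = SequenceTagger(
        hidden_size=256,           # unused here, since no RNN is added
        embeddings=embeddings,
        tag_dictionary=label_dict, # built from the corpus, see the sketch after the data summary
        tag_type="ner",
        use_crf=False,             # "crfFalse" in the run name
        use_rnn=False,             # plain linear head (768 -> 13), matching the module dump
        reproject_embeddings=False,
    )

With three span types (LOC, BUILDING, STREET) encoded in BIOES plus the O tag, the output layer needs 4 x 3 + 1 = 13 classes, which matches the out_features=13 of the linear head above.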
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,715 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,715 Train: 6183 sentences
2023-10-25 10:55:20,715 (train_with_dev=False, train_with_test=False)
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
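Editor's note: the corpus above is the English "topres19th" configuration of the HIPE-2022 data shipped with Flair. A minimal, hedged loading sketch (constructor arguments may differ between Flair releases; the logged dataset path ends in "with_doc_seperator", i.e. the document-separator variant):

    # Hypothetical data-loading sketch, not taken from this run.
    from flair.data import MultiCorpus
    from flair.datasets import NER_HIPE_2022

    corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
    multi_corpus = MultiCorpus([corpus])  # the log wraps the single corpus in a MultiCorpus

    # should report 6183 train / 680 dev / 2113 test sentences, as logged above
    print(len(corpus.train), len(corpus.dev), len(corpus.test))

    # label dictionary over the span types LOC, BUILDING and STREET
    label_dict = corpus.make_label_dictionary(label_type="ner")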
2023-10-25 10:55:20,715 Training Params:
2023-10-25 10:55:20,715 - learning_rate: "3e-05"
2023-10-25 10:55:20,715 - mini_batch_size: "4"
2023-10-25 10:55:20,715 - max_epochs: "10"
2023-10-25 10:55:20,715 - shuffle: "True"
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,715 Plugins:
2023-10-25 10:55:20,715 - TensorboardLogger
2023-10-25 10:55:20,715 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
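Editor's note: the parameters and plugins above map onto Flair's fine-tuning API roughly as in this hedged sketch, assuming the `tagger` and `multi_corpus` objects from the previous sketches. `fine_tune` installs the linear warmup schedule that appears as the LinearScheduler plugin; how the TensorBoard plugin is attached depends on the Flair version, so it is only noted in a comment.

    # Hypothetical training call mirroring the logged hyperparameters.
    from flair.trainers import ModelTrainer

    trainer = ModelTrainer(tagger, multi_corpus)

    trainer.fine_tune(
        "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2",
        learning_rate=3e-05,   # Training Params above
        mini_batch_size=4,
        max_epochs=10,
        shuffle=True,
        warmup_fraction=0.1,   # corresponds to the LinearScheduler plugin line
        # TensorBoard logging (the TensorboardLogger plugin above) is enabled via the
        # version-specific plugin/argument; only scalar values are logged, as noted below.
    )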
2023-10-25 10:55:20,715 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:55:20,715 - metric: "('micro avg', 'f1-score')"
2023-10-25 10:55:20,715 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,716 Computation:
2023-10-25 10:55:20,716 - compute on device: cuda:0
2023-10-25 10:55:20,716 - embedding storage: none
2023-10-25 10:55:20,716 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,716 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 10:55:20,716 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,716 ----------------------------------------------------------------------------------------------------
2023-10-25 10:55:20,716 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:55:28,912 epoch 1 - iter 154/1546 - loss 1.65969452 - time (sec): 8.20 - samples/sec: 1555.63 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:55:36,950 epoch 1 - iter 308/1546 - loss 0.93754585 - time (sec): 16.23 - samples/sec: 1534.93 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:55:44,983 epoch 1 - iter 462/1546 - loss 0.68571099 - time (sec): 24.27 - samples/sec: 1522.86 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:55:53,052 epoch 1 - iter 616/1546 - loss 0.54254456 - time (sec): 32.33 - samples/sec: 1540.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:56:01,033 epoch 1 - iter 770/1546 - loss 0.45649672 - time (sec): 40.32 - samples/sec: 1542.76 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:56:09,335 epoch 1 - iter 924/1546 - loss 0.40353128 - time (sec): 48.62 - samples/sec: 1521.17 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:56:17,417 epoch 1 - iter 1078/1546 - loss 0.36171769 - time (sec): 56.70 - samples/sec: 1531.40 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:56:25,605 epoch 1 - iter 1232/1546 - loss 0.32774519 - time (sec): 64.89 - samples/sec: 1535.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:56:33,774 epoch 1 - iter 1386/1546 - loss 0.30231800 - time (sec): 73.06 - samples/sec: 1531.32 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:56:41,853 epoch 1 - iter 1540/1546 - loss 0.28338013 - time (sec): 81.14 - samples/sec: 1526.05 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:56:42,169 ----------------------------------------------------------------------------------------------------
2023-10-25 10:56:42,170 EPOCH 1 done: loss 0.2827 - lr: 0.000030
2023-10-25 10:56:45,326 DEV : loss 0.062169428914785385 - f1-score (micro avg) 0.7459
2023-10-25 10:56:45,343 saving best model
2023-10-25 10:56:45,846 ----------------------------------------------------------------------------------------------------
2023-10-25 10:56:53,944 epoch 2 - iter 154/1546 - loss 0.08228178 - time (sec): 8.10 - samples/sec: 1517.82 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:57:02,045 epoch 2 - iter 308/1546 - loss 0.09250857 - time (sec): 16.20 - samples/sec: 1529.83 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:57:10,520 epoch 2 - iter 462/1546 - loss 0.09380708 - time (sec): 24.67 - samples/sec: 1461.41 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:57:18,800 epoch 2 - iter 616/1546 - loss 0.09259481 - time (sec): 32.95 - samples/sec: 1485.60 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:57:27,133 epoch 2 - iter 770/1546 - loss 0.08939337 - time (sec): 41.29 - samples/sec: 1492.76 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:57:35,482 epoch 2 - iter 924/1546 - loss 0.08678052 - time (sec): 49.63 - samples/sec: 1496.72 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:57:43,622 epoch 2 - iter 1078/1546 - loss 0.08519177 - time (sec): 57.77 - samples/sec: 1506.85 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:57:51,769 epoch 2 - iter 1232/1546 - loss 0.08376653 - time (sec): 65.92 - samples/sec: 1497.63 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:58:00,015 epoch 2 - iter 1386/1546 - loss 0.08475831 - time (sec): 74.17 - samples/sec: 1489.46 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:58:08,149 epoch 2 - iter 1540/1546 - loss 0.08472997 - time (sec): 82.30 - samples/sec: 1503.75 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:58:08,448 ----------------------------------------------------------------------------------------------------
2023-10-25 10:58:08,449 EPOCH 2 done: loss 0.0846 - lr: 0.000027
2023-10-25 10:58:11,612 DEV : loss 0.07261277735233307 - f1-score (micro avg) 0.7277
2023-10-25 10:58:11,628 ----------------------------------------------------------------------------------------------------
2023-10-25 10:58:19,774 epoch 3 - iter 154/1546 - loss 0.05836815 - time (sec): 8.14 - samples/sec: 1498.97 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:58:27,451 epoch 3 - iter 308/1546 - loss 0.05750916 - time (sec): 15.82 - samples/sec: 1507.96 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:58:35,457 epoch 3 - iter 462/1546 - loss 0.05789807 - time (sec): 23.83 - samples/sec: 1501.32 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:58:43,631 epoch 3 - iter 616/1546 - loss 0.05366765 - time (sec): 32.00 - samples/sec: 1510.95 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:58:51,877 epoch 3 - iter 770/1546 - loss 0.05391996 - time (sec): 40.25 - samples/sec: 1503.99 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:59:00,343 epoch 3 - iter 924/1546 - loss 0.05489213 - time (sec): 48.71 - samples/sec: 1510.07 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:59:08,613 epoch 3 - iter 1078/1546 - loss 0.05440378 - time (sec): 56.98 - samples/sec: 1510.99 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:59:16,017 epoch 3 - iter 1232/1546 - loss 0.05523524 - time (sec): 64.39 - samples/sec: 1533.04 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:59:23,782 epoch 3 - iter 1386/1546 - loss 0.05498877 - time (sec): 72.15 - samples/sec: 1546.14 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:59:31,809 epoch 3 - iter 1540/1546 - loss 0.05337318 - time (sec): 80.18 - samples/sec: 1545.06 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:59:32,114 ----------------------------------------------------------------------------------------------------
2023-10-25 10:59:32,114 EPOCH 3 done: loss 0.0533 - lr: 0.000023
2023-10-25 10:59:34,713 DEV : loss 0.09182113409042358 - f1-score (micro avg) 0.7679
2023-10-25 10:59:34,735 saving best model
2023-10-25 10:59:35,483 ----------------------------------------------------------------------------------------------------
2023-10-25 10:59:43,457 epoch 4 - iter 154/1546 - loss 0.02507648 - time (sec): 7.97 - samples/sec: 1565.78 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:59:51,780 epoch 4 - iter 308/1546 - loss 0.03175758 - time (sec): 16.29 - samples/sec: 1550.68 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:59:59,992 epoch 4 - iter 462/1546 - loss 0.03194911 - time (sec): 24.51 - samples/sec: 1540.95 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:00:08,358 epoch 4 - iter 616/1546 - loss 0.03245614 - time (sec): 32.87 - samples/sec: 1530.37 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:00:16,776 epoch 4 - iter 770/1546 - loss 0.03363787 - time (sec): 41.29 - samples/sec: 1507.55 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:00:25,086 epoch 4 - iter 924/1546 - loss 0.03411037 - time (sec): 49.60 - samples/sec: 1494.55 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:00:33,183 epoch 4 - iter 1078/1546 - loss 0.03460341 - time (sec): 57.70 - samples/sec: 1505.14 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:00:41,566 epoch 4 - iter 1232/1546 - loss 0.03464367 - time (sec): 66.08 - samples/sec: 1507.10 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:00:50,390 epoch 4 - iter 1386/1546 - loss 0.03534451 - time (sec): 74.90 - samples/sec: 1499.75 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:00:59,048 epoch 4 - iter 1540/1546 - loss 0.03677995 - time (sec): 83.56 - samples/sec: 1481.68 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:00:59,372 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:59,372 EPOCH 4 done: loss 0.0367 - lr: 0.000020
2023-10-25 11:01:02,425 DEV : loss 0.09009607881307602 - f1-score (micro avg) 0.7484
2023-10-25 11:01:02,448 ----------------------------------------------------------------------------------------------------
2023-10-25 11:01:11,044 epoch 5 - iter 154/1546 - loss 0.02548701 - time (sec): 8.59 - samples/sec: 1432.16 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:01:19,545 epoch 5 - iter 308/1546 - loss 0.02652036 - time (sec): 17.10 - samples/sec: 1443.33 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:01:28,336 epoch 5 - iter 462/1546 - loss 0.02539993 - time (sec): 25.89 - samples/sec: 1452.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:01:36,776 epoch 5 - iter 616/1546 - loss 0.02612452 - time (sec): 34.33 - samples/sec: 1449.22 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:01:45,209 epoch 5 - iter 770/1546 - loss 0.02524168 - time (sec): 42.76 - samples/sec: 1466.66 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:01:53,861 epoch 5 - iter 924/1546 - loss 0.02572133 - time (sec): 51.41 - samples/sec: 1467.85 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:02:02,324 epoch 5 - iter 1078/1546 - loss 0.02424334 - time (sec): 59.87 - samples/sec: 1473.27 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:02:11,012 epoch 5 - iter 1232/1546 - loss 0.02389485 - time (sec): 68.56 - samples/sec: 1454.57 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:02:19,339 epoch 5 - iter 1386/1546 - loss 0.02362872 - time (sec): 76.89 - samples/sec: 1460.40 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:02:27,570 epoch 5 - iter 1540/1546 - loss 0.02429264 - time (sec): 85.12 - samples/sec: 1454.78 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:02:27,893 ----------------------------------------------------------------------------------------------------
2023-10-25 11:02:27,894 EPOCH 5 done: loss 0.0243 - lr: 0.000017
2023-10-25 11:02:30,998 DEV : loss 0.10089725255966187 - f1-score (micro avg) 0.7832
2023-10-25 11:02:31,025 saving best model
2023-10-25 11:02:31,719 ----------------------------------------------------------------------------------------------------
2023-10-25 11:02:40,219 epoch 6 - iter 154/1546 - loss 0.02003037 - time (sec): 8.50 - samples/sec: 1487.59 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:02:48,637 epoch 6 - iter 308/1546 - loss 0.02358693 - time (sec): 16.92 - samples/sec: 1486.16 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:02:57,000 epoch 6 - iter 462/1546 - loss 0.02117204 - time (sec): 25.28 - samples/sec: 1460.82 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:03:05,526 epoch 6 - iter 616/1546 - loss 0.02069151 - time (sec): 33.80 - samples/sec: 1472.70 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:03:13,317 epoch 6 - iter 770/1546 - loss 0.02110916 - time (sec): 41.60 - samples/sec: 1516.05 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:03:20,828 epoch 6 - iter 924/1546 - loss 0.01896075 - time (sec): 49.11 - samples/sec: 1543.87 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:03:28,603 epoch 6 - iter 1078/1546 - loss 0.01988098 - time (sec): 56.88 - samples/sec: 1544.66 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:03:36,263 epoch 6 - iter 1232/1546 - loss 0.01957224 - time (sec): 64.54 - samples/sec: 1543.12 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:03:43,921 epoch 6 - iter 1386/1546 - loss 0.01892299 - time (sec): 72.20 - samples/sec: 1547.68 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:03:51,944 epoch 6 - iter 1540/1546 - loss 0.01883330 - time (sec): 80.22 - samples/sec: 1544.46 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:03:52,223 ----------------------------------------------------------------------------------------------------
2023-10-25 11:03:52,223 EPOCH 6 done: loss 0.0188 - lr: 0.000013
2023-10-25 11:03:54,800 DEV : loss 0.11670850217342377 - f1-score (micro avg) 0.7555
2023-10-25 11:03:54,821 ----------------------------------------------------------------------------------------------------
2023-10-25 11:04:02,733 epoch 7 - iter 154/1546 - loss 0.01309731 - time (sec): 7.91 - samples/sec: 1610.31 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:04:10,327 epoch 7 - iter 308/1546 - loss 0.01392533 - time (sec): 15.50 - samples/sec: 1602.90 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:04:17,838 epoch 7 - iter 462/1546 - loss 0.01201188 - time (sec): 23.02 - samples/sec: 1683.89 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:04:25,232 epoch 7 - iter 616/1546 - loss 0.01294140 - time (sec): 30.41 - samples/sec: 1631.46 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:04:32,693 epoch 7 - iter 770/1546 - loss 0.01266465 - time (sec): 37.87 - samples/sec: 1633.04 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:04:40,354 epoch 7 - iter 924/1546 - loss 0.01178168 - time (sec): 45.53 - samples/sec: 1636.18 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:04:48,093 epoch 7 - iter 1078/1546 - loss 0.01250303 - time (sec): 53.27 - samples/sec: 1612.54 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:04:55,739 epoch 7 - iter 1232/1546 - loss 0.01293497 - time (sec): 60.92 - samples/sec: 1606.69 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:05:03,548 epoch 7 - iter 1386/1546 - loss 0.01307263 - time (sec): 68.72 - samples/sec: 1614.44 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:05:11,371 epoch 7 - iter 1540/1546 - loss 0.01279875 - time (sec): 76.55 - samples/sec: 1616.14 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:05:11,669 ----------------------------------------------------------------------------------------------------
2023-10-25 11:05:11,669 EPOCH 7 done: loss 0.0128 - lr: 0.000010
2023-10-25 11:05:15,197 DEV : loss 0.12019308656454086 - f1-score (micro avg) 0.7705
2023-10-25 11:05:15,216 ----------------------------------------------------------------------------------------------------
2023-10-25 11:05:23,412 epoch 8 - iter 154/1546 - loss 0.00807042 - time (sec): 8.19 - samples/sec: 1504.53 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:05:31,450 epoch 8 - iter 308/1546 - loss 0.00746783 - time (sec): 16.23 - samples/sec: 1529.78 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:05:39,597 epoch 8 - iter 462/1546 - loss 0.00860174 - time (sec): 24.38 - samples/sec: 1492.71 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:05:47,486 epoch 8 - iter 616/1546 - loss 0.00814822 - time (sec): 32.27 - samples/sec: 1505.31 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:05:55,271 epoch 8 - iter 770/1546 - loss 0.00790389 - time (sec): 40.05 - samples/sec: 1526.31 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:06:03,085 epoch 8 - iter 924/1546 - loss 0.00841588 - time (sec): 47.87 - samples/sec: 1555.20 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:06:10,637 epoch 8 - iter 1078/1546 - loss 0.00891299 - time (sec): 55.42 - samples/sec: 1569.17 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:06:18,161 epoch 8 - iter 1232/1546 - loss 0.00897866 - time (sec): 62.94 - samples/sec: 1575.65 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:06:26,204 epoch 8 - iter 1386/1546 - loss 0.00922351 - time (sec): 70.99 - samples/sec: 1566.90 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:06:34,371 epoch 8 - iter 1540/1546 - loss 0.00871974 - time (sec): 79.15 - samples/sec: 1563.32 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:06:34,699 ----------------------------------------------------------------------------------------------------
2023-10-25 11:06:34,700 EPOCH 8 done: loss 0.0087 - lr: 0.000007
2023-10-25 11:06:37,436 DEV : loss 0.11731629073619843 - f1-score (micro avg) 0.7885
2023-10-25 11:06:37,454 saving best model
2023-10-25 11:06:38,144 ----------------------------------------------------------------------------------------------------
2023-10-25 11:06:45,643 epoch 9 - iter 154/1546 - loss 0.00438247 - time (sec): 7.50 - samples/sec: 1680.48 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:06:52,902 epoch 9 - iter 308/1546 - loss 0.00372726 - time (sec): 14.76 - samples/sec: 1696.09 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:07:00,237 epoch 9 - iter 462/1546 - loss 0.00385813 - time (sec): 22.09 - samples/sec: 1693.20 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:07:07,380 epoch 9 - iter 616/1546 - loss 0.00377794 - time (sec): 29.23 - samples/sec: 1723.44 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:07:14,702 epoch 9 - iter 770/1546 - loss 0.00460608 - time (sec): 36.55 - samples/sec: 1714.33 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:07:22,280 epoch 9 - iter 924/1546 - loss 0.00436436 - time (sec): 44.13 - samples/sec: 1698.21 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:07:29,621 epoch 9 - iter 1078/1546 - loss 0.00482249 - time (sec): 51.47 - samples/sec: 1689.03 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:07:36,843 epoch 9 - iter 1232/1546 - loss 0.00450838 - time (sec): 58.70 - samples/sec: 1687.18 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:07:44,631 epoch 9 - iter 1386/1546 - loss 0.00440336 - time (sec): 66.48 - samples/sec: 1687.63 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:07:52,067 epoch 9 - iter 1540/1546 - loss 0.00411316 - time (sec): 73.92 - samples/sec: 1677.25 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:07:52,352 ----------------------------------------------------------------------------------------------------
2023-10-25 11:07:52,352 EPOCH 9 done: loss 0.0041 - lr: 0.000003
2023-10-25 11:07:54,893 DEV : loss 0.1307971030473709 - f1-score (micro avg) 0.7692
2023-10-25 11:07:54,916 ----------------------------------------------------------------------------------------------------
2023-10-25 11:08:02,148 epoch 10 - iter 154/1546 - loss 0.00204746 - time (sec): 7.23 - samples/sec: 1637.33 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:08:09,285 epoch 10 - iter 308/1546 - loss 0.00355914 - time (sec): 14.37 - samples/sec: 1636.15 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:08:16,545 epoch 10 - iter 462/1546 - loss 0.00381047 - time (sec): 21.63 - samples/sec: 1639.14 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:08:23,743 epoch 10 - iter 616/1546 - loss 0.00367323 - time (sec): 28.83 - samples/sec: 1655.86 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:08:31,133 epoch 10 - iter 770/1546 - loss 0.00339860 - time (sec): 36.22 - samples/sec: 1636.15 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:08:38,585 epoch 10 - iter 924/1546 - loss 0.00310024 - time (sec): 43.67 - samples/sec: 1652.22 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:08:46,049 epoch 10 - iter 1078/1546 - loss 0.00291881 - time (sec): 51.13 - samples/sec: 1664.18 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:08:53,576 epoch 10 - iter 1232/1546 - loss 0.00282386 - time (sec): 58.66 - samples/sec: 1675.15 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:09:01,133 epoch 10 - iter 1386/1546 - loss 0.00272207 - time (sec): 66.22 - samples/sec: 1680.98 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:09:08,863 epoch 10 - iter 1540/1546 - loss 0.00284420 - time (sec): 73.95 - samples/sec: 1671.06 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:09:09,151 ----------------------------------------------------------------------------------------------------
2023-10-25 11:09:09,151 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-25 11:09:12,044 DEV : loss 0.13086602091789246 - f1-score (micro avg) 0.7695
2023-10-25 11:09:12,513 ----------------------------------------------------------------------------------------------------
2023-10-25 11:09:12,514 Loading model from best epoch ...
2023-10-25 11:09:14,434 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 11:09:23,205
Results:
- F-score (micro) 0.8064
- F-score (macro) 0.7186
- Accuracy 0.6987

By class:
              precision    recall  f1-score   support

         LOC     0.8493    0.8520    0.8507       946
    BUILDING     0.6020    0.6378    0.6194       185
      STREET     0.7347    0.6429    0.6857        56

   micro avg     0.8040    0.8088    0.8064      1187
   macro avg     0.7287    0.7109    0.7186      1187
weighted avg     0.8054    0.8088    0.8068      1187

2023-10-25 11:09:23,205 ----------------------------------------------------------------------------------------------------
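Editor's note: to use the checkpoint saved by this run, the best model can be loaded and applied roughly as in the following hedged sketch. The path is the training base path from above plus best-model.pt; the example sentence is made up.

    # Hypothetical inference sketch using the checkpoint produced by this run.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(
        "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
    )

    sentence = Sentence("The new bridge over the Thames was opened near Westminster .")  # made-up example
    tagger.predict(sentence)

    for label in sentence.get_labels("ner"):
        print(label)  # predicted LOC / BUILDING / STREET spans with confidence scores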