2023-10-25 17:10:39,021 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Train:  20847 sentences
2023-10-25 17:10:39,022         (train_with_dev=False, train_with_test=False)
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Training Params:
2023-10-25 17:10:39,022  - learning_rate: "3e-05"
2023-10-25 17:10:39,022  - mini_batch_size: "4"
2023-10-25 17:10:39,022  - max_epochs: "10"
2023-10-25 17:10:39,022  - shuffle: "True"
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Plugins:
2023-10-25 17:10:39,022  - TensorboardLogger
2023-10-25 17:10:39,022  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 17:10:39,022  - metric: "('micro avg', 'f1-score')"
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Computation:
2023-10-25 17:10:39,022  - compute on device: cuda:0
2023-10-25 17:10:39,022  - embedding storage: none
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,023 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,023 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 17:11:01,402 epoch 1 - iter 521/5212 - loss 1.42812369 - time (sec): 22.38 - samples/sec: 1563.36 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:11:23,707 epoch 1 - iter 1042/5212 - loss 0.88349774 - time (sec): 44.68 - samples/sec: 1637.20 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:11:45,540 epoch 1 - iter 1563/5212 - loss 0.69007273 - time (sec): 66.52 - samples/sec: 1598.78 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:12:07,631 epoch 1 - iter 2084/5212 - loss 0.57540747 - time (sec): 88.61 - samples/sec: 1644.33 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:12:29,962 epoch 1 - iter 2605/5212 - loss 0.50612270 - time (sec): 110.94 - samples/sec: 1647.07 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:12:51,920 epoch 1 - iter 3126/5212 - loss 0.45647568 - time (sec): 132.90 - samples/sec: 1658.24 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:13:13,904 epoch 1 - iter 3647/5212 - loss 0.41815257 - time (sec): 154.88 - samples/sec: 1670.30 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:13:36,167 epoch 1 - iter 4168/5212 - loss 0.39256890 - time (sec): 177.14 - samples/sec: 1665.90 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:13:58,652 epoch 1 - iter 4689/5212 - loss 0.37195175 - time (sec): 199.63 - samples/sec: 1659.85 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:14:21,064 epoch 1 - iter 5210/5212 - loss 0.35466532 - time (sec): 222.04 - samples/sec: 1654.66 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:14:21,146 ----------------------------------------------------------------------------------------------------
2023-10-25 17:14:21,147 EPOCH 1 done: loss 0.3546 - lr: 0.000030
2023-10-25 17:14:24,869 DEV : loss 0.12253964692354202 - f1-score (micro avg)  0.1856
2023-10-25 17:14:24,895 saving best model
2023-10-25 17:14:25,372 ----------------------------------------------------------------------------------------------------
2023-10-25 17:14:47,269 epoch 2 - iter 521/5212 - loss 0.17884856 - time (sec): 21.90 - samples/sec: 1682.18 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:15:09,326 epoch 2 - iter 1042/5212 - loss 0.17084209 - time (sec): 43.95 - samples/sec: 1746.73 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:15:31,350 epoch 2 - iter 1563/5212 - loss 0.17404067 - time (sec): 65.98 - samples/sec: 1744.50 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:15:53,385 epoch 2 - iter 2084/5212 - loss 0.17709317 - time (sec): 88.01 - samples/sec: 1724.88 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:16:15,212 epoch 2 - iter 2605/5212 - loss 0.17555934 - time (sec): 109.84 - samples/sec: 1693.49 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:16:36,826 epoch 2 - iter 3126/5212 - loss 0.17428858 - time (sec): 131.45 - samples/sec: 1697.41 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:16:58,764 epoch 2 - iter 3647/5212 - loss 0.17078016 - time (sec): 153.39 - samples/sec: 1704.73 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:17:20,835 epoch 2 - iter 4168/5212 - loss 0.16926793 - time (sec): 175.46 - samples/sec: 1708.53 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:17:42,843 epoch 2 - iter 4689/5212 - loss 0.16791217 - time (sec): 197.47 - samples/sec: 1689.55 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:18:04,488 epoch 2 - iter 5210/5212 - loss 0.16890519 - time (sec): 219.11 - samples/sec: 1675.45 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:18:04,576 ----------------------------------------------------------------------------------------------------
2023-10-25 17:18:04,576 EPOCH 2 done: loss 0.1691 - lr: 0.000027
2023-10-25 17:18:11,491 DEV : loss 0.1517297476530075 - f1-score (micro avg)  0.3623
2023-10-25 17:18:11,516 saving best model
2023-10-25 17:18:12,122 ----------------------------------------------------------------------------------------------------
2023-10-25 17:18:34,231 epoch 3 - iter 521/5212 - loss 0.10765596 - time (sec): 22.10 - samples/sec: 1778.85 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:18:55,895 epoch 3 - iter 1042/5212 - loss 0.10781574 - time (sec): 43.77 - samples/sec: 1737.42 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:19:17,465 epoch 3 - iter 1563/5212 - loss 0.11097772 - time (sec): 65.34 - samples/sec: 1697.21 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:19:40,795 epoch 3 - iter 2084/5212 - loss 0.11524448 - time (sec): 88.67 - samples/sec: 1696.34 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:20:02,364 epoch 3 - iter 2605/5212 - loss 0.11375610 - time (sec): 110.24 - samples/sec: 1681.78 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:20:24,583 epoch 3 - iter 3126/5212 - loss 0.11298056 - time (sec): 132.46 - samples/sec: 1671.52 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:20:46,653 epoch 3 - iter 3647/5212 - loss 0.11408116 - time (sec): 154.53 - samples/sec: 1664.56 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:21:08,743 epoch 3 - iter 4168/5212 - loss 0.11524557 - time (sec): 176.62 - samples/sec: 1668.38 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:21:30,574 epoch 3 - iter 4689/5212 - loss 0.11726213 - time (sec): 198.45 - samples/sec: 1665.60 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:21:52,582 epoch 3 - iter 5210/5212 - loss 0.11559631 - time (sec): 220.45 - samples/sec: 1666.47 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:21:52,662 ----------------------------------------------------------------------------------------------------
2023-10-25 17:21:52,662 EPOCH 3 done: loss 0.1156 - lr: 0.000023
2023-10-25 17:21:58,870 DEV : loss 0.2178632616996765 - f1-score (micro avg)  0.4066
2023-10-25 17:21:58,897 saving best model
2023-10-25 17:21:59,382 ----------------------------------------------------------------------------------------------------
2023-10-25 17:22:21,118 epoch 4 - iter 521/5212 - loss 0.08093001 - time (sec): 21.73 - samples/sec: 1628.13 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:22:42,719 epoch 4 - iter 1042/5212 - loss 0.08353216 - time (sec): 43.34 - samples/sec: 1609.79 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:23:05,589 epoch 4 - iter 1563/5212 - loss 0.07892836 - time (sec): 66.21 - samples/sec: 1646.97 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:23:27,349 epoch 4 - iter 2084/5212 - loss 0.08223074 - time (sec): 87.97 - samples/sec: 1671.53 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:23:49,217 epoch 4 - iter 2605/5212 - loss 0.08120219 - time (sec): 109.83 - samples/sec: 1669.44 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:24:11,386 epoch 4 - iter 3126/5212 - loss 0.08400528 - time (sec): 132.00 - samples/sec: 1666.34 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:24:33,293 epoch 4 - iter 3647/5212 - loss 0.08287545 - time (sec): 153.91 - samples/sec: 1670.99 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:24:55,564 epoch 4 - iter 4168/5212 - loss 0.08112875 - time (sec): 176.18 - samples/sec: 1676.36 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:25:17,655 epoch 4 - iter 4689/5212 - loss 0.08165353 - time (sec): 198.27 - samples/sec: 1675.32 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:25:39,735 epoch 4 - iter 5210/5212 - loss 0.08271518 - time (sec): 220.35 - samples/sec: 1667.18 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:25:39,814 ----------------------------------------------------------------------------------------------------
2023-10-25 17:25:39,814 EPOCH 4 done: loss 0.0827 - lr: 0.000020
2023-10-25 17:25:46,171 DEV : loss 0.22580073773860931 - f1-score (micro avg)  0.3863
2023-10-25 17:25:46,199 ----------------------------------------------------------------------------------------------------
2023-10-25 17:26:08,425 epoch 5 - iter 521/5212 - loss 0.06486546 - time (sec): 22.22 - samples/sec: 1728.16 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:26:30,115 epoch 5 - iter 1042/5212 - loss 0.06119080 - time (sec): 43.92 - samples/sec: 1733.02 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:26:51,896 epoch 5 - iter 1563/5212 - loss 0.05815220 - time (sec): 65.70 - samples/sec: 1720.52 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:27:13,839 epoch 5 - iter 2084/5212 - loss 0.05823306 - time (sec): 87.64 - samples/sec: 1688.74 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:27:35,819 epoch 5 - iter 2605/5212 - loss 0.06005092 - time (sec): 109.62 - samples/sec: 1692.64 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:27:58,545 epoch 5 - iter 3126/5212 - loss 0.05992666 - time (sec): 132.34 - samples/sec: 1678.65 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:28:20,832 epoch 5 - iter 3647/5212 - loss 0.06213572 - time (sec): 154.63 - samples/sec: 1674.98 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:28:42,976 epoch 5 - iter 4168/5212 - loss 0.06186123 - time (sec): 176.78 - samples/sec: 1673.56 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:29:04,588 epoch 5 - iter 4689/5212 - loss 0.06183511 - time (sec): 198.39 - samples/sec: 1675.21 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:29:25,823 epoch 5 - iter 5210/5212 - loss 0.06174339 - time (sec): 219.62 - samples/sec: 1672.76 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:29:25,901 ----------------------------------------------------------------------------------------------------
2023-10-25 17:29:25,901 EPOCH 5 done: loss 0.0617 - lr: 0.000017
2023-10-25 17:29:32,121 DEV : loss 0.26587387919425964 - f1-score (micro avg)  0.4384
2023-10-25 17:29:32,148 saving best model
2023-10-25 17:29:32,748 ----------------------------------------------------------------------------------------------------
2023-10-25 17:29:55,254 epoch 6 - iter 521/5212 - loss 0.04134678 - time (sec): 22.50 - samples/sec: 1747.44 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:30:17,890 epoch 6 - iter 1042/5212 - loss 0.03876985 - time (sec): 45.14 - samples/sec: 1695.68 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:30:40,689 epoch 6 - iter 1563/5212 - loss 0.03994635 - time (sec): 67.94 - samples/sec: 1680.45 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:31:02,501 epoch 6 - iter 2084/5212 - loss 0.04059620 - time (sec): 89.75 - samples/sec: 1674.02 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:31:24,817 epoch 6 - iter 2605/5212 - loss 0.04082835 - time (sec): 112.07 - samples/sec: 1649.42 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:31:46,564 epoch 6 - iter 3126/5212 - loss 0.04210774 - time (sec): 133.81 - samples/sec: 1643.11 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:32:09,005 epoch 6 - iter 3647/5212 - loss 0.04150600 - time (sec): 156.26 - samples/sec: 1658.69 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:32:31,318 epoch 6 - iter 4168/5212 - loss 0.04190735 - time (sec): 178.57 - samples/sec: 1652.85 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:32:53,308 epoch 6 - iter 4689/5212 - loss 0.04188495 - time (sec): 200.56 - samples/sec: 1651.75 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:33:15,961 epoch 6 - iter 5210/5212 - loss 0.04187712 - time (sec): 223.21 - samples/sec: 1645.76 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:33:16,042 ----------------------------------------------------------------------------------------------------
2023-10-25 17:33:16,043 EPOCH 6 done: loss 0.0419 - lr: 0.000013
2023-10-25 17:33:22,241 DEV : loss 0.3574075996875763 - f1-score (micro avg)  0.397
2023-10-25 17:33:22,267 ----------------------------------------------------------------------------------------------------
2023-10-25 17:33:44,354 epoch 7 - iter 521/5212 - loss 0.03039355 - time (sec): 22.09 - samples/sec: 1677.53 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:34:07,659 epoch 7 - iter 1042/5212 - loss 0.03164622 - time (sec): 45.39 - samples/sec: 1622.40 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:34:29,729 epoch 7 - iter 1563/5212 - loss 0.03063082 - time (sec): 67.46 - samples/sec: 1650.05 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:34:51,817 epoch 7 - iter 2084/5212 - loss 0.03002049 - time (sec): 89.55 - samples/sec: 1647.24 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:35:14,190 epoch 7 - iter 2605/5212 - loss 0.02845181 - time (sec): 111.92 - samples/sec: 1641.54 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:35:36,402 epoch 7 - iter 3126/5212 - loss 0.02817657 - time (sec): 134.13 - samples/sec: 1652.56 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:35:58,702 epoch 7 - iter 3647/5212 - loss 0.02817558 - time (sec): 156.43 - samples/sec: 1675.68 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:36:20,947 epoch 7 - iter 4168/5212 - loss 0.03018002 - time (sec): 178.68 - samples/sec: 1667.05 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:36:42,851 epoch 7 - iter 4689/5212 - loss 0.02972812 - time (sec): 200.58 - samples/sec: 1657.09 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:37:04,941 epoch 7 - iter 5210/5212 - loss 0.02981065 - time (sec): 222.67 - samples/sec: 1649.75 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:37:05,019 ----------------------------------------------------------------------------------------------------
2023-10-25 17:37:05,019 EPOCH 7 done: loss 0.0298 - lr: 0.000010
2023-10-25 17:37:11,912 DEV : loss 0.41519442200660706 - f1-score (micro avg)  0.3892
2023-10-25 17:37:11,938 ----------------------------------------------------------------------------------------------------
2023-10-25 17:37:33,570 epoch 8 - iter 521/5212 - loss 0.01779766 - time (sec): 21.63 - samples/sec: 1669.98 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:37:55,659 epoch 8 - iter 1042/5212 - loss 0.01916489 - time (sec): 43.72 - samples/sec: 1668.19 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:38:17,995 epoch 8 - iter 1563/5212 - loss 0.02003561 - time (sec): 66.06 - samples/sec: 1647.34 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:38:40,210 epoch 8 - iter 2084/5212 - loss 0.01983414 - time (sec): 88.27 - samples/sec: 1640.49 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:39:02,258 epoch 8 - iter 2605/5212 - loss 0.02006743 - time (sec): 110.32 - samples/sec: 1647.44 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:39:24,542 epoch 8 - iter 3126/5212 - loss 0.02243560 - time (sec): 132.60 - samples/sec: 1646.87 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:39:46,461 epoch 8 - iter 3647/5212 - loss 0.02188283 - time (sec): 154.52 - samples/sec: 1668.89 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:40:08,478 epoch 8 - iter 4168/5212 - loss 0.02198132 - time (sec): 176.54 - samples/sec: 1666.15 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:40:31,186 epoch 8 - iter 4689/5212 - loss 0.02180658 - time (sec): 199.25 - samples/sec: 1656.91 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:40:53,169 epoch 8 - iter 5210/5212 - loss 0.02133745 - time (sec): 221.23 - samples/sec: 1660.21 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:40:53,252 ----------------------------------------------------------------------------------------------------
2023-10-25 17:40:53,252 EPOCH 8 done: loss 0.0213 - lr: 0.000007
2023-10-25 17:41:00,281 DEV : loss 0.3952998220920563 - f1-score (micro avg)  0.3988
2023-10-25 17:41:00,307 ----------------------------------------------------------------------------------------------------
2023-10-25 17:41:22,341 epoch 9 - iter 521/5212 - loss 0.01866035 - time (sec): 22.03 - samples/sec: 1715.77 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:41:45,005 epoch 9 - iter 1042/5212 - loss 0.01578284 - time (sec): 44.70 - samples/sec: 1662.33 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:42:07,148 epoch 9 - iter 1563/5212 - loss 0.01409868 - time (sec): 66.84 - samples/sec: 1649.15 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:42:29,360 epoch 9 - iter 2084/5212 - loss 0.01328640 - time (sec): 89.05 - samples/sec: 1672.80 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:42:51,215 epoch 9 - iter 2605/5212 - loss 0.01404307 - time (sec): 110.91 - samples/sec: 1671.13 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:43:13,076 epoch 9 - iter 3126/5212 - loss 0.01393654 - time (sec): 132.77 - samples/sec: 1662.03 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:43:35,371 epoch 9 - iter 3647/5212 - loss 0.01363863 - time (sec): 155.06 - samples/sec: 1667.51 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:43:57,538 epoch 9 - iter 4168/5212 - loss 0.01418941 - time (sec): 177.23 - samples/sec: 1662.81 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:44:19,964 epoch 9 - iter 4689/5212 - loss 0.01443190 - time (sec): 199.66 - samples/sec: 1664.10 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:44:42,080 epoch 9 - iter 5210/5212 - loss 0.01434777 - time (sec): 221.77 - samples/sec: 1656.30 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:44:42,163 ----------------------------------------------------------------------------------------------------
2023-10-25 17:44:42,163 EPOCH 9 done: loss 0.0143 - lr: 0.000003
2023-10-25 17:44:49,057 DEV : loss 0.4204791486263275 - f1-score (micro avg)  0.4073
2023-10-25 17:44:49,084 ----------------------------------------------------------------------------------------------------
2023-10-25 17:45:11,388 epoch 10 - iter 521/5212 - loss 0.01094782 - time (sec): 22.30 - samples/sec: 1662.01 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:45:33,362 epoch 10 - iter 1042/5212 - loss 0.00960251 - time (sec): 44.28 - samples/sec: 1649.16 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:45:55,731 epoch 10 - iter 1563/5212 - loss 0.00887085 - time (sec): 66.65 - samples/sec: 1680.98 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:46:17,794 epoch 10 - iter 2084/5212 - loss 0.00932400 - time (sec): 88.71 - samples/sec: 1696.50 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:46:40,350 epoch 10 - iter 2605/5212 - loss 0.00933893 - time (sec): 111.27 - samples/sec: 1701.90 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:47:02,076 epoch 10 - iter 3126/5212 - loss 0.00908651 - time (sec): 132.99 - samples/sec: 1681.57 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:47:23,742 epoch 10 - iter 3647/5212 - loss 0.00933220 - time (sec): 154.66 - samples/sec: 1667.79 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:47:45,231 epoch 10 - iter 4168/5212 - loss 0.00889164 - time (sec): 176.15 - samples/sec: 1661.09 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:48:06,956 epoch 10 - iter 4689/5212 - loss 0.00870107 - time (sec): 197.87 - samples/sec: 1669.40 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:48:28,945 epoch 10 - iter 5210/5212 - loss 0.00875027 - time (sec): 219.86 - samples/sec: 1669.93 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:48:29,040 ----------------------------------------------------------------------------------------------------
2023-10-25 17:48:29,040 EPOCH 10 done: loss 0.0087 - lr: 0.000000
2023-10-25 17:48:35,948 DEV : loss 0.4886936545372009 - f1-score (micro avg)  0.3916
2023-10-25 17:48:36,442 ----------------------------------------------------------------------------------------------------
2023-10-25 17:48:36,443 Loading model from best epoch ...
2023-10-25 17:48:38,032 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 17:48:48,031
Results:
- F-score (micro) 0.4406
- F-score (macro) 0.298
- Accuracy 0.2872

By class:
              precision    recall  f1-score   support

         LOC     0.4976    0.5049    0.5012      1214
         PER     0.4266    0.4282    0.4274       808
         ORG     0.2867    0.2436    0.2634       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4441    0.4372    0.4406      2390
   macro avg     0.3027    0.2942    0.2980      2390
weighted avg     0.4393    0.4372    0.4380      2390

2023-10-25 17:48:48,031 ----------------------------------------------------------------------------------------------------