2023-10-17 13:35:36,187 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,189 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
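The module sizes in this printout can be sanity-checked by summing parameters by hand. A sketch in pure arithmetic (no framework needed), assuming every Linear carries the bias and every LayerNorm the affine weight/bias shown above:

```python
# Parameter count implied by the architecture printout above.
d, ff, vocab, pos, types, tags = 768, 3072, 32001, 512, 2, 17

# ElectraEmbeddings: three embedding tables + LayerNorm (weight + bias).
embeddings = (vocab + pos + types) * d + 2 * d

# One ElectraLayer: Q/K/V/output projections with biases, two LayerNorms,
# and the feed-forward (intermediate + output) dense layers.
attention = 4 * (d * d + d) + 2 * d           # projections + ElectraSelfOutput LayerNorm
ffn = (d * ff + ff) + (ff * d + d) + 2 * d    # intermediate, output, ElectraOutput LayerNorm
per_layer = attention + ffn

encoder = 12 * per_layer
head = d * tags + tags                        # final (linear): 768 -> 17

total = embeddings + encoder + head
print(per_layer, total)  # 7087872 110040593
```

The LockedDropout and CrossEntropyLoss modules contribute no parameters, so the total above covers the whole tagger.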
2023-10-17 13:35:36,189 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,189 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 13:35:36,189 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,189 Train: 20847 sentences
2023-10-17 13:35:36,189 (train_with_dev=False, train_with_test=False)
2023-10-17 13:35:36,189 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,189 Training Params:
2023-10-17 13:35:36,189 - learning_rate: "3e-05"
2023-10-17 13:35:36,189 - mini_batch_size: "8"
2023-10-17 13:35:36,190 - max_epochs: "10"
2023-10-17 13:35:36,190 - shuffle: "True"
2023-10-17 13:35:36,190 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,190 Plugins:
2023-10-17 13:35:36,190 - TensorboardLogger
2023-10-17 13:35:36,190 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 13:35:36,190 ----------------------------------------------------------------------------------------------------
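The LinearScheduler with warmup_fraction 0.1 explains the lr column in the iteration logs below: over 10 epochs x 2606 iterations, the rate climbs linearly to 3e-05 during the first tenth of training (exactly epoch 1), then decays linearly to zero. A minimal sketch of that schedule shape (the function name is illustrative, not Flair's API):

```python
def linear_warmup_decay(step, total_steps=10 * 2606, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the log: lr ~0.000003 at iter 260 of epoch 1, 0.000030 at the end of epoch 1.
print(linear_warmup_decay(260), linear_warmup_decay(2606))
```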
2023-10-17 13:35:36,190 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 13:35:36,190 - metric: "('micro avg', 'f1-score')"
2023-10-17 13:35:36,190 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,190 Computation:
2023-10-17 13:35:36,190 - compute on device: cuda:0
2023-10-17 13:35:36,190 - embedding storage: none
2023-10-17 13:35:36,190 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,190 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 13:35:36,190 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,191 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,191 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 13:36:03,247 epoch 1 - iter 260/2606 - loss 2.23417308 - time (sec): 27.05 - samples/sec: 1321.16 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:36:32,899 epoch 1 - iter 520/2606 - loss 1.31459848 - time (sec): 56.71 - samples/sec: 1299.74 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:37:00,740 epoch 1 - iter 780/2606 - loss 0.98767282 - time (sec): 84.55 - samples/sec: 1322.73 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:37:28,917 epoch 1 - iter 1040/2606 - loss 0.80515632 - time (sec): 112.72 - samples/sec: 1326.87 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:37:57,443 epoch 1 - iter 1300/2606 - loss 0.69065513 - time (sec): 141.25 - samples/sec: 1325.36 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:38:24,254 epoch 1 - iter 1560/2606 - loss 0.62104172 - time (sec): 168.06 - samples/sec: 1325.26 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:38:49,550 epoch 1 - iter 1820/2606 - loss 0.56057291 - time (sec): 193.36 - samples/sec: 1349.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:39:15,981 epoch 1 - iter 2080/2606 - loss 0.51882473 - time (sec): 219.79 - samples/sec: 1343.75 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:39:42,923 epoch 1 - iter 2340/2606 - loss 0.48259607 - time (sec): 246.73 - samples/sec: 1345.69 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:40:08,974 epoch 1 - iter 2600/2606 - loss 0.45516666 - time (sec): 272.78 - samples/sec: 1344.72 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:40:09,555 ----------------------------------------------------------------------------------------------------
2023-10-17 13:40:09,555 EPOCH 1 done: loss 0.4546 - lr: 0.000030
2023-10-17 13:40:17,382 DEV : loss 0.12860068678855896 - f1-score (micro avg) 0.2748
2023-10-17 13:40:17,440 saving best model
2023-10-17 13:40:17,985 ----------------------------------------------------------------------------------------------------
2023-10-17 13:40:45,255 epoch 2 - iter 260/2606 - loss 0.17952761 - time (sec): 27.27 - samples/sec: 1355.53 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:41:12,418 epoch 2 - iter 520/2606 - loss 0.17895549 - time (sec): 54.43 - samples/sec: 1353.01 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:41:39,526 epoch 2 - iter 780/2606 - loss 0.17220507 - time (sec): 81.54 - samples/sec: 1357.16 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:42:06,127 epoch 2 - iter 1040/2606 - loss 0.16936553 - time (sec): 108.14 - samples/sec: 1359.56 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:42:34,312 epoch 2 - iter 1300/2606 - loss 0.16526069 - time (sec): 136.32 - samples/sec: 1362.58 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:43:01,989 epoch 2 - iter 1560/2606 - loss 0.16534493 - time (sec): 164.00 - samples/sec: 1355.56 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:43:28,081 epoch 2 - iter 1820/2606 - loss 0.16526124 - time (sec): 190.09 - samples/sec: 1355.72 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:43:55,714 epoch 2 - iter 2080/2606 - loss 0.15960605 - time (sec): 217.73 - samples/sec: 1351.07 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:44:23,104 epoch 2 - iter 2340/2606 - loss 0.15681530 - time (sec): 245.12 - samples/sec: 1351.33 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:44:50,177 epoch 2 - iter 2600/2606 - loss 0.15743046 - time (sec): 272.19 - samples/sec: 1347.84 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:44:50,695 ----------------------------------------------------------------------------------------------------
2023-10-17 13:44:50,696 EPOCH 2 done: loss 0.1573 - lr: 0.000027
2023-10-17 13:45:02,783 DEV : loss 0.18409471213817596 - f1-score (micro avg) 0.2931
2023-10-17 13:45:02,846 saving best model
2023-10-17 13:45:04,273 ----------------------------------------------------------------------------------------------------
2023-10-17 13:45:31,452 epoch 3 - iter 260/2606 - loss 0.12993628 - time (sec): 27.17 - samples/sec: 1373.20 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:45:56,995 epoch 3 - iter 520/2606 - loss 0.12197809 - time (sec): 52.72 - samples/sec: 1378.15 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:46:23,269 epoch 3 - iter 780/2606 - loss 0.11338115 - time (sec): 78.99 - samples/sec: 1349.83 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:46:51,571 epoch 3 - iter 1040/2606 - loss 0.11312511 - time (sec): 107.29 - samples/sec: 1356.63 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:47:19,624 epoch 3 - iter 1300/2606 - loss 0.11104534 - time (sec): 135.35 - samples/sec: 1356.87 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:47:47,029 epoch 3 - iter 1560/2606 - loss 0.10839794 - time (sec): 162.75 - samples/sec: 1358.23 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:48:13,376 epoch 3 - iter 1820/2606 - loss 0.10823766 - time (sec): 189.10 - samples/sec: 1345.95 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:48:41,227 epoch 3 - iter 2080/2606 - loss 0.10901249 - time (sec): 216.95 - samples/sec: 1347.31 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:49:08,936 epoch 3 - iter 2340/2606 - loss 0.10998392 - time (sec): 244.66 - samples/sec: 1343.61 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:49:35,404 epoch 3 - iter 2600/2606 - loss 0.11017179 - time (sec): 271.13 - samples/sec: 1351.05 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:49:36,143 ----------------------------------------------------------------------------------------------------
2023-10-17 13:49:36,143 EPOCH 3 done: loss 0.1099 - lr: 0.000023
2023-10-17 13:49:47,073 DEV : loss 0.19609655439853668 - f1-score (micro avg) 0.3532
2023-10-17 13:49:47,126 saving best model
2023-10-17 13:49:48,541 ----------------------------------------------------------------------------------------------------
2023-10-17 13:50:16,836 epoch 4 - iter 260/2606 - loss 0.06308814 - time (sec): 28.29 - samples/sec: 1315.79 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:50:44,010 epoch 4 - iter 520/2606 - loss 0.06791172 - time (sec): 55.46 - samples/sec: 1324.28 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:51:10,499 epoch 4 - iter 780/2606 - loss 0.06827024 - time (sec): 81.95 - samples/sec: 1332.80 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:51:37,765 epoch 4 - iter 1040/2606 - loss 0.07212856 - time (sec): 109.22 - samples/sec: 1329.72 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:52:04,808 epoch 4 - iter 1300/2606 - loss 0.07317313 - time (sec): 136.26 - samples/sec: 1346.19 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:52:31,550 epoch 4 - iter 1560/2606 - loss 0.07332059 - time (sec): 163.00 - samples/sec: 1346.05 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:52:59,143 epoch 4 - iter 1820/2606 - loss 0.07391824 - time (sec): 190.60 - samples/sec: 1345.38 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:53:26,459 epoch 4 - iter 2080/2606 - loss 0.07288315 - time (sec): 217.91 - samples/sec: 1337.06 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:53:54,604 epoch 4 - iter 2340/2606 - loss 0.07585692 - time (sec): 246.06 - samples/sec: 1341.65 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:54:21,565 epoch 4 - iter 2600/2606 - loss 0.07713987 - time (sec): 273.02 - samples/sec: 1342.54 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:54:22,228 ----------------------------------------------------------------------------------------------------
2023-10-17 13:54:22,228 EPOCH 4 done: loss 0.0771 - lr: 0.000020
2023-10-17 13:54:33,663 DEV : loss 0.30645275115966797 - f1-score (micro avg) 0.3598
2023-10-17 13:54:33,734 saving best model
2023-10-17 13:54:35,171 ----------------------------------------------------------------------------------------------------
2023-10-17 13:55:03,942 epoch 5 - iter 260/2606 - loss 0.04419626 - time (sec): 28.77 - samples/sec: 1307.72 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:55:32,835 epoch 5 - iter 520/2606 - loss 0.05530628 - time (sec): 57.66 - samples/sec: 1297.68 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:56:01,826 epoch 5 - iter 780/2606 - loss 0.05500235 - time (sec): 86.65 - samples/sec: 1315.68 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:56:30,166 epoch 5 - iter 1040/2606 - loss 0.05623530 - time (sec): 114.99 - samples/sec: 1325.25 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:56:57,009 epoch 5 - iter 1300/2606 - loss 0.05618477 - time (sec): 141.83 - samples/sec: 1328.11 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:57:24,218 epoch 5 - iter 1560/2606 - loss 0.05551545 - time (sec): 169.04 - samples/sec: 1335.73 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:57:50,677 epoch 5 - iter 1820/2606 - loss 0.05518345 - time (sec): 195.50 - samples/sec: 1342.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:58:16,808 epoch 5 - iter 2080/2606 - loss 0.05536442 - time (sec): 221.63 - samples/sec: 1340.42 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:58:43,597 epoch 5 - iter 2340/2606 - loss 0.05513621 - time (sec): 248.42 - samples/sec: 1338.91 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:59:10,177 epoch 5 - iter 2600/2606 - loss 0.05550176 - time (sec): 275.00 - samples/sec: 1333.24 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:59:10,813 ----------------------------------------------------------------------------------------------------
2023-10-17 13:59:10,813 EPOCH 5 done: loss 0.0556 - lr: 0.000017
2023-10-17 13:59:22,685 DEV : loss 0.38089314103126526 - f1-score (micro avg) 0.3689
2023-10-17 13:59:22,758 saving best model
2023-10-17 13:59:24,340 ----------------------------------------------------------------------------------------------------
2023-10-17 13:59:51,579 epoch 6 - iter 260/2606 - loss 0.03634504 - time (sec): 27.24 - samples/sec: 1312.08 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:00:19,359 epoch 6 - iter 520/2606 - loss 0.03563055 - time (sec): 55.02 - samples/sec: 1327.10 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:00:45,525 epoch 6 - iter 780/2606 - loss 0.03541167 - time (sec): 81.18 - samples/sec: 1310.80 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:01:13,098 epoch 6 - iter 1040/2606 - loss 0.03662892 - time (sec): 108.75 - samples/sec: 1311.16 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:01:41,840 epoch 6 - iter 1300/2606 - loss 0.03651950 - time (sec): 137.50 - samples/sec: 1316.73 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:02:09,866 epoch 6 - iter 1560/2606 - loss 0.03685032 - time (sec): 165.52 - samples/sec: 1311.43 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:02:36,853 epoch 6 - iter 1820/2606 - loss 0.03810930 - time (sec): 192.51 - samples/sec: 1319.49 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:03:04,418 epoch 6 - iter 2080/2606 - loss 0.03893618 - time (sec): 220.07 - samples/sec: 1331.23 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:03:30,680 epoch 6 - iter 2340/2606 - loss 0.03981572 - time (sec): 246.34 - samples/sec: 1332.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:03:58,240 epoch 6 - iter 2600/2606 - loss 0.03973320 - time (sec): 273.90 - samples/sec: 1337.44 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:03:59,011 ----------------------------------------------------------------------------------------------------
2023-10-17 14:03:59,011 EPOCH 6 done: loss 0.0397 - lr: 0.000013
2023-10-17 14:04:10,365 DEV : loss 0.3598186671733856 - f1-score (micro avg) 0.3703
2023-10-17 14:04:10,425 saving best model
2023-10-17 14:04:11,836 ----------------------------------------------------------------------------------------------------
2023-10-17 14:04:38,872 epoch 7 - iter 260/2606 - loss 0.02914898 - time (sec): 27.03 - samples/sec: 1354.67 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:05:05,820 epoch 7 - iter 520/2606 - loss 0.02552344 - time (sec): 53.98 - samples/sec: 1376.27 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:05:31,950 epoch 7 - iter 780/2606 - loss 0.02775229 - time (sec): 80.11 - samples/sec: 1363.39 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:06:00,317 epoch 7 - iter 1040/2606 - loss 0.02635722 - time (sec): 108.48 - samples/sec: 1364.31 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:06:28,215 epoch 7 - iter 1300/2606 - loss 0.02810019 - time (sec): 136.37 - samples/sec: 1367.56 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:06:55,868 epoch 7 - iter 1560/2606 - loss 0.02775623 - time (sec): 164.03 - samples/sec: 1361.77 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:07:24,511 epoch 7 - iter 1820/2606 - loss 0.02725007 - time (sec): 192.67 - samples/sec: 1356.57 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:07:52,475 epoch 7 - iter 2080/2606 - loss 0.02794776 - time (sec): 220.64 - samples/sec: 1351.16 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:08:18,907 epoch 7 - iter 2340/2606 - loss 0.02788600 - time (sec): 247.07 - samples/sec: 1343.37 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:08:45,921 epoch 7 - iter 2600/2606 - loss 0.02768411 - time (sec): 274.08 - samples/sec: 1339.04 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:08:46,550 ----------------------------------------------------------------------------------------------------
2023-10-17 14:08:46,550 EPOCH 7 done: loss 0.0277 - lr: 0.000010
2023-10-17 14:08:58,244 DEV : loss 0.46519169211387634 - f1-score (micro avg) 0.3683
2023-10-17 14:08:58,320 ----------------------------------------------------------------------------------------------------
2023-10-17 14:09:25,709 epoch 8 - iter 260/2606 - loss 0.01354824 - time (sec): 27.39 - samples/sec: 1270.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:09:52,894 epoch 8 - iter 520/2606 - loss 0.01746483 - time (sec): 54.57 - samples/sec: 1295.91 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:10:20,826 epoch 8 - iter 780/2606 - loss 0.01722493 - time (sec): 82.50 - samples/sec: 1322.69 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:10:48,150 epoch 8 - iter 1040/2606 - loss 0.01799795 - time (sec): 109.83 - samples/sec: 1318.28 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:11:14,849 epoch 8 - iter 1300/2606 - loss 0.02034286 - time (sec): 136.53 - samples/sec: 1321.37 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:11:41,434 epoch 8 - iter 1560/2606 - loss 0.02094045 - time (sec): 163.11 - samples/sec: 1329.60 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:12:08,680 epoch 8 - iter 1820/2606 - loss 0.02116218 - time (sec): 190.36 - samples/sec: 1339.00 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:12:35,381 epoch 8 - iter 2080/2606 - loss 0.02089064 - time (sec): 217.06 - samples/sec: 1342.79 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:13:02,686 epoch 8 - iter 2340/2606 - loss 0.02211357 - time (sec): 244.36 - samples/sec: 1347.86 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:13:28,922 epoch 8 - iter 2600/2606 - loss 0.02165998 - time (sec): 270.60 - samples/sec: 1354.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:13:29,518 ----------------------------------------------------------------------------------------------------
2023-10-17 14:13:29,519 EPOCH 8 done: loss 0.0217 - lr: 0.000007
2023-10-17 14:13:41,241 DEV : loss 0.4531383812427521 - f1-score (micro avg) 0.3797
2023-10-17 14:13:41,306 saving best model
2023-10-17 14:13:42,719 ----------------------------------------------------------------------------------------------------
2023-10-17 14:14:10,088 epoch 9 - iter 260/2606 - loss 0.01374020 - time (sec): 27.36 - samples/sec: 1343.84 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:14:37,237 epoch 9 - iter 520/2606 - loss 0.01399606 - time (sec): 54.51 - samples/sec: 1373.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:15:04,557 epoch 9 - iter 780/2606 - loss 0.01498264 - time (sec): 81.83 - samples/sec: 1362.74 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:15:30,917 epoch 9 - iter 1040/2606 - loss 0.01450304 - time (sec): 108.19 - samples/sec: 1363.37 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:15:57,984 epoch 9 - iter 1300/2606 - loss 0.01479871 - time (sec): 135.26 - samples/sec: 1366.94 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:16:25,970 epoch 9 - iter 1560/2606 - loss 0.01476623 - time (sec): 163.25 - samples/sec: 1360.61 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:16:53,168 epoch 9 - iter 1820/2606 - loss 0.01440257 - time (sec): 190.44 - samples/sec: 1345.53 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:17:22,071 epoch 9 - iter 2080/2606 - loss 0.01471634 - time (sec): 219.35 - samples/sec: 1356.71 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:17:48,025 epoch 9 - iter 2340/2606 - loss 0.01462050 - time (sec): 245.30 - samples/sec: 1347.69 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:18:15,810 epoch 9 - iter 2600/2606 - loss 0.01420519 - time (sec): 273.09 - samples/sec: 1343.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:18:16,365 ----------------------------------------------------------------------------------------------------
2023-10-17 14:18:16,366 EPOCH 9 done: loss 0.0142 - lr: 0.000003
2023-10-17 14:18:29,180 DEV : loss 0.5335880517959595 - f1-score (micro avg) 0.3697
2023-10-17 14:18:29,239 ----------------------------------------------------------------------------------------------------
2023-10-17 14:18:57,797 epoch 10 - iter 260/2606 - loss 0.00788040 - time (sec): 28.56 - samples/sec: 1317.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:19:25,473 epoch 10 - iter 520/2606 - loss 0.00890575 - time (sec): 56.23 - samples/sec: 1311.63 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:19:53,273 epoch 10 - iter 780/2606 - loss 0.00939444 - time (sec): 84.03 - samples/sec: 1292.56 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:20:21,086 epoch 10 - iter 1040/2606 - loss 0.01008456 - time (sec): 111.84 - samples/sec: 1280.70 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:20:50,402 epoch 10 - iter 1300/2606 - loss 0.01075747 - time (sec): 141.16 - samples/sec: 1271.11 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:21:17,606 epoch 10 - iter 1560/2606 - loss 0.01058155 - time (sec): 168.36 - samples/sec: 1273.33 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:21:45,269 epoch 10 - iter 1820/2606 - loss 0.01068539 - time (sec): 196.03 - samples/sec: 1275.73 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:22:13,609 epoch 10 - iter 2080/2606 - loss 0.01111151 - time (sec): 224.37 - samples/sec: 1285.50 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:22:42,423 epoch 10 - iter 2340/2606 - loss 0.01084573 - time (sec): 253.18 - samples/sec: 1298.87 - lr: 0.000000 - momentum: 0.000000
2023-10-17 14:23:09,546 epoch 10 - iter 2600/2606 - loss 0.01098140 - time (sec): 280.30 - samples/sec: 1308.85 - lr: 0.000000 - momentum: 0.000000
2023-10-17 14:23:10,109 ----------------------------------------------------------------------------------------------------
2023-10-17 14:23:10,109 EPOCH 10 done: loss 0.0110 - lr: 0.000000
2023-10-17 14:23:22,465 DEV : loss 0.536491334438324 - f1-score (micro avg) 0.3636
2023-10-17 14:23:23,103 ----------------------------------------------------------------------------------------------------
2023-10-17 14:23:23,105 Loading model from best epoch ...
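"Best epoch" here is simply the argmax over the per-epoch dev scores logged above (note that "saving best model" fires only when the dev F1 improves, which is why epochs 7, 9, and 10 did not save). Collected in one place, with values copied from the DEV lines:

```python
# micro-avg dev F1 per epoch, copied from the DEV lines above
dev_f1 = [0.2748, 0.2931, 0.3532, 0.3598, 0.3689,
          0.3703, 0.3683, 0.3797, 0.3697, 0.3636]

best_epoch = max(range(1, len(dev_f1) + 1), key=lambda e: dev_f1[e - 1])
print(best_epoch, dev_f1[best_epoch - 1])  # 8 0.3797
```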
2023-10-17 14:23:25,695 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
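The 17-entry tag dictionary is the BIOES encoding of the four NewsEye entity types plus O, which matches the classifier head above (out_features=17). A quick check that reproduces the dictionary in the logged order:

```python
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tag_dictionary = ["O"] + [f"{prefix}-{etype}"
                          for etype in entity_types
                          for prefix in ("S", "B", "E", "I")]
print(len(tag_dictionary))  # 17
```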
2023-10-17 14:23:45,508
Results:
- F-score (micro) 0.4513
- F-score (macro) 0.3196
- Accuracy 0.2959

By class:
              precision    recall  f1-score   support

         LOC     0.4768    0.5074    0.4916      1214
         PER     0.4066    0.5062    0.4509       808
         ORG     0.3068    0.3711    0.3359       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4230    0.4837    0.4513      2390
   macro avg     0.2975    0.3462    0.3196      2390
weighted avg     0.4249    0.4837    0.4518      2390
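The averaged rows follow directly from the per-class rows: macro is the unweighted mean over classes, weighted is the support-weighted mean, and micro F1 is the harmonic mean of the micro precision/recall. Recomputing from the table as a cross-check:

```python
# (f1, support) per class, copied from the table above
per_class = {"LOC": (0.4916, 1214), "PER": (0.4509, 808),
             "ORG": (0.3359, 353), "HumanProd": (0.0000, 15)}

macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)
total_support = sum(n for _, n in per_class.values())
weighted_f1 = sum(f1 * n for f1, n in per_class.values()) / total_support

micro_p, micro_r = 0.4230, 0.4837
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))  # 0.3196 0.4518 0.4513
```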
2023-10-17 14:23:45,509 ----------------------------------------------------------------------------------------------------