2023-10-17 13:35:36,187 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,189 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 13:35:36,189 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,189 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 13:35:36,189 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,189 Train:  20847 sentences
2023-10-17 13:35:36,189         (train_with_dev=False, train_with_test=False)
2023-10-17 13:35:36,189 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,189 Training Params:
2023-10-17 13:35:36,189  - learning_rate: "3e-05"
2023-10-17 13:35:36,189  - mini_batch_size: "8"
2023-10-17 13:35:36,190  - max_epochs: "10"
2023-10-17 13:35:36,190  - shuffle: "True"
2023-10-17 13:35:36,190 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,190 Plugins:
2023-10-17 13:35:36,190  - TensorboardLogger
2023-10-17 13:35:36,190  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 13:35:36,190 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,190 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 13:35:36,190  - metric: "('micro avg', 'f1-score')"
2023-10-17 13:35:36,190 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,190 Computation:
2023-10-17 13:35:36,190  - compute on device: cuda:0
2023-10-17 13:35:36,190  - embedding storage: none
2023-10-17 13:35:36,190 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,190 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 13:35:36,190 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,191 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:36,191 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 13:36:03,247 epoch 1 - iter 260/2606 - loss 2.23417308 - time (sec): 27.05 - samples/sec: 1321.16 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:36:32,899 epoch 1 - iter 520/2606 - loss 1.31459848 - time (sec): 56.71 - samples/sec: 1299.74 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:37:00,740 epoch 1 - iter 780/2606 - loss 0.98767282 - time (sec): 84.55 - samples/sec: 1322.73 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:37:28,917 epoch 1 - iter 1040/2606 - loss 0.80515632 - time (sec): 112.72 - samples/sec: 1326.87 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:37:57,443 epoch 1 - iter 1300/2606 - loss 0.69065513 - time (sec): 141.25 - samples/sec: 1325.36 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:38:24,254 epoch 1 - iter 1560/2606 - loss 0.62104172 - time (sec): 168.06 - samples/sec: 1325.26 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:38:49,550 epoch 1 - iter 1820/2606 - loss 0.56057291 - time (sec): 193.36 - samples/sec: 1349.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:39:15,981 epoch 1 - iter 2080/2606 - loss 0.51882473 - time (sec): 219.79 - samples/sec: 1343.75 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:39:42,923 epoch 1 - iter 2340/2606 - loss 0.48259607 - time (sec): 246.73 - samples/sec: 1345.69 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:40:08,974 epoch 1 - iter 2600/2606 - loss 0.45516666 - time (sec): 272.78 - samples/sec: 1344.72 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:40:09,555 ----------------------------------------------------------------------------------------------------
2023-10-17 13:40:09,555 EPOCH 1 done: loss 0.4546 - lr: 0.000030
2023-10-17 13:40:17,382 DEV : loss 0.12860068678855896 - f1-score (micro avg)  0.2748
2023-10-17 13:40:17,440 saving best model
2023-10-17 13:40:17,985 ----------------------------------------------------------------------------------------------------
2023-10-17 13:40:45,255 epoch 2 - iter 260/2606 - loss 0.17952761 - time (sec): 27.27 - samples/sec: 1355.53 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:41:12,418 epoch 2 - iter 520/2606 - loss 0.17895549 - time (sec): 54.43 - samples/sec: 1353.01 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:41:39,526 epoch 2 - iter 780/2606 - loss 0.17220507 - time (sec): 81.54 - samples/sec: 1357.16 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:42:06,127 epoch 2 - iter 1040/2606 - loss 0.16936553 - time (sec): 108.14 - samples/sec: 1359.56 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:42:34,312 epoch 2 - iter 1300/2606 - loss 0.16526069 - time (sec): 136.32 - samples/sec: 1362.58 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:43:01,989 epoch 2 - iter 1560/2606 - loss 0.16534493 - time (sec): 164.00 - samples/sec: 1355.56 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:43:28,081 epoch 2 - iter 1820/2606 - loss 0.16526124 - time (sec): 190.09 - samples/sec: 1355.72 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:43:55,714 epoch 2 - iter 2080/2606 - loss 0.15960605 - time (sec): 217.73 - samples/sec: 1351.07 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:44:23,104 epoch 2 - iter 2340/2606 - loss 0.15681530 - time (sec): 245.12 - samples/sec: 1351.33 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:44:50,177 epoch 2 - iter 2600/2606 - loss 0.15743046 - time (sec): 272.19 - samples/sec: 1347.84 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:44:50,695 ----------------------------------------------------------------------------------------------------
2023-10-17 13:44:50,696 EPOCH 2 done: loss 0.1573 - lr: 0.000027
2023-10-17 13:45:02,783 DEV : loss 0.18409471213817596 - f1-score (micro avg)  0.2931
2023-10-17 13:45:02,846 saving best model
2023-10-17 13:45:04,273 ----------------------------------------------------------------------------------------------------
2023-10-17 13:45:31,452 epoch 3 - iter 260/2606 - loss 0.12993628 - time (sec): 27.17 - samples/sec: 1373.20 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:45:56,995 epoch 3 - iter 520/2606 - loss 0.12197809 - time (sec): 52.72 - samples/sec: 1378.15 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:46:23,269 epoch 3 - iter 780/2606 - loss 0.11338115 - time (sec): 78.99 - samples/sec: 1349.83 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:46:51,571 epoch 3 - iter 1040/2606 - loss 0.11312511 - time (sec): 107.29 - samples/sec: 1356.63 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:47:19,624 epoch 3 - iter 1300/2606 - loss 0.11104534 - time (sec): 135.35 - samples/sec: 1356.87 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:47:47,029 epoch 3 - iter 1560/2606 - loss 0.10839794 - time (sec): 162.75 - samples/sec: 1358.23 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:48:13,376 epoch 3 - iter 1820/2606 - loss 0.10823766 - time (sec): 189.10 - samples/sec: 1345.95 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:48:41,227 epoch 3 - iter 2080/2606 - loss 0.10901249 - time (sec): 216.95 - samples/sec: 1347.31 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:49:08,936 epoch 3 - iter 2340/2606 - loss 0.10998392 - time (sec): 244.66 - samples/sec: 1343.61 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:49:35,404 epoch 3 - iter 2600/2606 - loss 0.11017179 - time (sec): 271.13 - samples/sec: 1351.05 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:49:36,143 ----------------------------------------------------------------------------------------------------
2023-10-17 13:49:36,143 EPOCH 3 done: loss 0.1099 - lr: 0.000023
2023-10-17 13:49:47,073 DEV : loss 0.19609655439853668 - f1-score (micro avg)  0.3532
2023-10-17 13:49:47,126 saving best model
2023-10-17 13:49:48,541 ----------------------------------------------------------------------------------------------------
2023-10-17 13:50:16,836 epoch 4 - iter 260/2606 - loss 0.06308814 - time (sec): 28.29 - samples/sec: 1315.79 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:50:44,010 epoch 4 - iter 520/2606 - loss 0.06791172 - time (sec): 55.46 - samples/sec: 1324.28 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:51:10,499 epoch 4 - iter 780/2606 - loss 0.06827024 - time (sec): 81.95 - samples/sec: 1332.80 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:51:37,765 epoch 4 - iter 1040/2606 - loss 0.07212856 - time (sec): 109.22 - samples/sec: 1329.72 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:52:04,808 epoch 4 - iter 1300/2606 - loss 0.07317313 - time (sec): 136.26 - samples/sec: 1346.19 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:52:31,550 epoch 4 - iter 1560/2606 - loss 0.07332059 - time (sec): 163.00 - samples/sec: 1346.05 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:52:59,143 epoch 4 - iter 1820/2606 - loss 0.07391824 - time (sec): 190.60 - samples/sec: 1345.38 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:53:26,459 epoch 4 - iter 2080/2606 - loss 0.07288315 - time (sec): 217.91 - samples/sec: 1337.06 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:53:54,604 epoch 4 - iter 2340/2606 - loss 0.07585692 - time (sec): 246.06 - samples/sec: 1341.65 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:54:21,565 epoch 4 - iter 2600/2606 - loss 0.07713987 - time (sec): 273.02 - samples/sec: 1342.54 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:54:22,228 ----------------------------------------------------------------------------------------------------
2023-10-17 13:54:22,228 EPOCH 4 done: loss 0.0771 - lr: 0.000020
2023-10-17 13:54:33,663 DEV : loss 0.30645275115966797 - f1-score (micro avg)  0.3598
2023-10-17 13:54:33,734 saving best model
2023-10-17 13:54:35,171 ----------------------------------------------------------------------------------------------------
2023-10-17 13:55:03,942 epoch 5 - iter 260/2606 - loss 0.04419626 - time (sec): 28.77 - samples/sec: 1307.72 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:55:32,835 epoch 5 - iter 520/2606 - loss 0.05530628 - time (sec): 57.66 - samples/sec: 1297.68 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:56:01,826 epoch 5 - iter 780/2606 - loss 0.05500235 - time (sec): 86.65 - samples/sec: 1315.68 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:56:30,166 epoch 5 - iter 1040/2606 - loss 0.05623530 - time (sec): 114.99 - samples/sec: 1325.25 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:56:57,009 epoch 5 - iter 1300/2606 - loss 0.05618477 - time (sec): 141.83 - samples/sec: 1328.11 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:57:24,218 epoch 5 - iter 1560/2606 - loss 0.05551545 - time (sec): 169.04 - samples/sec: 1335.73 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:57:50,677 epoch 5 - iter 1820/2606 - loss 0.05518345 - time (sec): 195.50 - samples/sec: 1342.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:58:16,808 epoch 5 - iter 2080/2606 - loss 0.05536442 - time (sec): 221.63 - samples/sec: 1340.42 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:58:43,597 epoch 5 - iter 2340/2606 - loss 0.05513621 - time (sec): 248.42 - samples/sec: 1338.91 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:59:10,177 epoch 5 - iter 2600/2606 - loss 0.05550176 - time (sec): 275.00 - samples/sec: 1333.24 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:59:10,813 ----------------------------------------------------------------------------------------------------
2023-10-17 13:59:10,813 EPOCH 5 done: loss 0.0556 - lr: 0.000017
2023-10-17 13:59:22,685 DEV : loss 0.38089314103126526 - f1-score (micro avg)  0.3689
2023-10-17 13:59:22,758 saving best model
2023-10-17 13:59:24,340 ----------------------------------------------------------------------------------------------------
2023-10-17 13:59:51,579 epoch 6 - iter 260/2606 - loss 0.03634504 - time (sec): 27.24 - samples/sec: 1312.08 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:00:19,359 epoch 6 - iter 520/2606 - loss 0.03563055 - time (sec): 55.02 - samples/sec: 1327.10 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:00:45,525 epoch 6 - iter 780/2606 - loss 0.03541167 - time (sec): 81.18 - samples/sec: 1310.80 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:01:13,098 epoch 6 - iter 1040/2606 - loss 0.03662892 - time (sec): 108.75 - samples/sec: 1311.16 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:01:41,840 epoch 6 - iter 1300/2606 - loss 0.03651950 - time (sec): 137.50 - samples/sec: 1316.73 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:02:09,866 epoch 6 - iter 1560/2606 - loss 0.03685032 - time (sec): 165.52 - samples/sec: 1311.43 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:02:36,853 epoch 6 - iter 1820/2606 - loss 0.03810930 - time (sec): 192.51 - samples/sec: 1319.49 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:03:04,418 epoch 6 - iter 2080/2606 - loss 0.03893618 - time (sec): 220.07 - samples/sec: 1331.23 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:03:30,680 epoch 6 - iter 2340/2606 - loss 0.03981572 - time (sec): 246.34 - samples/sec: 1332.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:03:58,240 epoch 6 - iter 2600/2606 - loss 0.03973320 - time (sec): 273.90 - samples/sec: 1337.44 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:03:59,011 ----------------------------------------------------------------------------------------------------
2023-10-17 14:03:59,011 EPOCH 6 done: loss 0.0397 - lr: 0.000013
2023-10-17 14:04:10,365 DEV : loss 0.3598186671733856 - f1-score (micro avg)  0.3703
2023-10-17 14:04:10,425 saving best model
2023-10-17 14:04:11,836 ----------------------------------------------------------------------------------------------------
2023-10-17 14:04:38,872 epoch 7 - iter 260/2606 - loss 0.02914898 - time (sec): 27.03 - samples/sec: 1354.67 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:05:05,820 epoch 7 - iter 520/2606 - loss 0.02552344 - time (sec): 53.98 - samples/sec: 1376.27 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:05:31,950 epoch 7 - iter 780/2606 - loss 0.02775229 - time (sec): 80.11 - samples/sec: 1363.39 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:06:00,317 epoch 7 - iter 1040/2606 - loss 0.02635722 - time (sec): 108.48 - samples/sec: 1364.31 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:06:28,215 epoch 7 - iter 1300/2606 - loss 0.02810019 - time (sec): 136.37 - samples/sec: 1367.56 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:06:55,868 epoch 7 - iter 1560/2606 - loss 0.02775623 - time (sec): 164.03 - samples/sec: 1361.77 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:07:24,511 epoch 7 - iter 1820/2606 - loss 0.02725007 - time (sec): 192.67 - samples/sec: 1356.57 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:07:52,475 epoch 7 - iter 2080/2606 - loss 0.02794776 - time (sec): 220.64 - samples/sec: 1351.16 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:08:18,907 epoch 7 - iter 2340/2606 - loss 0.02788600 - time (sec): 247.07 - samples/sec: 1343.37 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:08:45,921 epoch 7 - iter 2600/2606 - loss 0.02768411 - time (sec): 274.08 - samples/sec: 1339.04 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:08:46,550 ----------------------------------------------------------------------------------------------------
2023-10-17 14:08:46,550 EPOCH 7 done: loss 0.0277 - lr: 0.000010
2023-10-17 14:08:58,244 DEV : loss 0.46519169211387634 - f1-score (micro avg)  0.3683
2023-10-17 14:08:58,320 ----------------------------------------------------------------------------------------------------
2023-10-17 14:09:25,709 epoch 8 - iter 260/2606 - loss 0.01354824 - time (sec): 27.39 - samples/sec: 1270.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:09:52,894 epoch 8 - iter 520/2606 - loss 0.01746483 - time (sec): 54.57 - samples/sec: 1295.91 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:10:20,826 epoch 8 - iter 780/2606 - loss 0.01722493 - time (sec): 82.50 - samples/sec: 1322.69 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:10:48,150 epoch 8 - iter 1040/2606 - loss 0.01799795 - time (sec): 109.83 - samples/sec: 1318.28 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:11:14,849 epoch 8 - iter 1300/2606 - loss 0.02034286 - time (sec): 136.53 - samples/sec: 1321.37 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:11:41,434 epoch 8 - iter 1560/2606 - loss 0.02094045 - time (sec): 163.11 - samples/sec: 1329.60 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:12:08,680 epoch 8 - iter 1820/2606 - loss 0.02116218 - time (sec): 190.36 - samples/sec: 1339.00 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:12:35,381 epoch 8 - iter 2080/2606 - loss 0.02089064 - time (sec): 217.06 - samples/sec: 1342.79 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:13:02,686 epoch 8 - iter 2340/2606 - loss 0.02211357 - time (sec): 244.36 - samples/sec: 1347.86 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:13:28,922 epoch 8 - iter 2600/2606 - loss 0.02165998 - time (sec): 270.60 - samples/sec: 1354.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:13:29,518 ----------------------------------------------------------------------------------------------------
2023-10-17 14:13:29,519 EPOCH 8 done: loss 0.0217 - lr: 0.000007
2023-10-17 14:13:41,241 DEV : loss 0.4531383812427521 - f1-score (micro avg)  0.3797
2023-10-17 14:13:41,306 saving best model
2023-10-17 14:13:42,719 ----------------------------------------------------------------------------------------------------
2023-10-17 14:14:10,088 epoch 9 - iter 260/2606 - loss 0.01374020 - time (sec): 27.36 - samples/sec: 1343.84 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:14:37,237 epoch 9 - iter 520/2606 - loss 0.01399606 - time (sec): 54.51 - samples/sec: 1373.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:15:04,557 epoch 9 - iter 780/2606 - loss 0.01498264 - time (sec): 81.83 - samples/sec: 1362.74 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:15:30,917 epoch 9 - iter 1040/2606 - loss 0.01450304 - time (sec): 108.19 - samples/sec: 1363.37 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:15:57,984 epoch 9 - iter 1300/2606 - loss 0.01479871 - time (sec): 135.26 - samples/sec: 1366.94 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:16:25,970 epoch 9 - iter 1560/2606 - loss 0.01476623 - time (sec): 163.25 - samples/sec: 1360.61 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:16:53,168 epoch 9 - iter 1820/2606 - loss 0.01440257 - time (sec): 190.44 - samples/sec: 1345.53 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:17:22,071 epoch 9 - iter 2080/2606 - loss 0.01471634 - time (sec): 219.35 - samples/sec: 1356.71 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:17:48,025 epoch 9 - iter 2340/2606 - loss 0.01462050 - time (sec): 245.30 - samples/sec: 1347.69 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:18:15,810 epoch 9 - iter 2600/2606 - loss 0.01420519 - time (sec): 273.09 - samples/sec: 1343.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:18:16,365 ----------------------------------------------------------------------------------------------------
2023-10-17 14:18:16,366 EPOCH 9 done: loss 0.0142 - lr: 0.000003
2023-10-17 14:18:29,180 DEV : loss 0.5335880517959595 - f1-score (micro avg)  0.3697
2023-10-17 14:18:29,239 ----------------------------------------------------------------------------------------------------
2023-10-17 14:18:57,797 epoch 10 - iter 260/2606 - loss 0.00788040 - time (sec): 28.56 - samples/sec: 1317.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:19:25,473 epoch 10 - iter 520/2606 - loss 0.00890575 - time (sec): 56.23 - samples/sec: 1311.63 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:19:53,273 epoch 10 - iter 780/2606 - loss 0.00939444 - time (sec): 84.03 - samples/sec: 1292.56 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:20:21,086 epoch 10 - iter 1040/2606 - loss 0.01008456 - time (sec): 111.84 - samples/sec: 1280.70 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:20:50,402 epoch 10 - iter 1300/2606 - loss 0.01075747 - time (sec): 141.16 - samples/sec: 1271.11 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:21:17,606 epoch 10 - iter 1560/2606 - loss 0.01058155 - time (sec): 168.36 - samples/sec: 1273.33 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:21:45,269 epoch 10 - iter 1820/2606 - loss 0.01068539 - time (sec): 196.03 - samples/sec: 1275.73 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:22:13,609 epoch 10 - iter 2080/2606 - loss 0.01111151 - time (sec): 224.37 - samples/sec: 1285.50 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:22:42,423 epoch 10 - iter 2340/2606 - loss 0.01084573 - time (sec): 253.18 - samples/sec: 1298.87 - lr: 0.000000 - momentum: 0.000000
2023-10-17 14:23:09,546 epoch 10 - iter 2600/2606 - loss 0.01098140 - time (sec): 280.30 - samples/sec: 1308.85 - lr: 0.000000 - momentum: 0.000000
2023-10-17 14:23:10,109 ----------------------------------------------------------------------------------------------------
2023-10-17 14:23:10,109 EPOCH 10 done: loss 0.0110 - lr: 0.000000
2023-10-17 14:23:22,465 DEV : loss 0.536491334438324 - f1-score (micro avg)  0.3636
2023-10-17 14:23:23,103 ----------------------------------------------------------------------------------------------------
2023-10-17 14:23:23,105 Loading model from best epoch ...
2023-10-17 14:23:25,695 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 14:23:45,508 Results:
- F-score (micro) 0.4513
- F-score (macro) 0.3196
- Accuracy 0.2959

By class:
              precision    recall  f1-score   support

         LOC     0.4768    0.5074    0.4916      1214
         PER     0.4066    0.5062    0.4509       808
         ORG     0.3068    0.3711    0.3359       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4230    0.4837    0.4513      2390
   macro avg     0.2975    0.3462    0.3196      2390
weighted avg     0.4249    0.4837    0.4518      2390

2023-10-17 14:23:45,509 ----------------------------------------------------------------------------------------------------
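The logged learning rates and the final micro F-score can be cross-checked with a short, dependency-free script. The schedule formula below is inferred from the log itself (2606 iterations per epoch over 10 epochs, peak lr 3e-05, `warmup_fraction: '0.1'`), not taken from Flair's source, so treat it as a sketch of the LinearScheduler's behavior rather than its exact implementation.

```python
# Sanity checks against the training log above (stdlib only).
# Assumed from the log: 2606 steps/epoch x 10 epochs = 26060 total steps,
# linear warmup over the first 10% of steps to 3e-05, then linear decay to 0.

PEAK_LR = 3e-05
STEPS_PER_EPOCH = 2606
TOTAL_STEPS = STEPS_PER_EPOCH * 10
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup_fraction: '0.1'

def linear_warmup_lr(step: int) -> float:
    """Learning rate at a global step under linear warmup + linear decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# The log prints lr rounded to six decimals; compare a few logged entries.
assert round(linear_warmup_lr(260), 6) == 0.000003    # epoch 1, iter 260
assert round(linear_warmup_lr(2600), 6) == 0.000030   # epoch 1, iter 2600
assert round(linear_warmup_lr(26000), 6) == 0.000000  # epoch 10, iter 2600

def f1(precision: float, recall: float) -> float:
    """Micro-averaged F1 is the harmonic mean of micro precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported micro avg: precision 0.4230, recall 0.4837 -> F-score 0.4513.
print(round(f1(0.4230, 0.4837), 4))
```

The same harmonic-mean check applies per class (e.g. LOC: 2·0.4768·0.5074 / (0.4768+0.5074) ≈ 0.4916), which is a quick way to spot transcription errors when copying results tables.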