stefan-it: Upload ./training.log with huggingface_hub (181ad88)
2023-10-25 10:07:45,243 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,244 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
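The module shapes printed above fully determine the model's parameter count. As a quick sanity check (pure Python, using only the dimensions shown in the repr; the grouping into embeddings/encoder/pooler/head mirrors the module tree):

```python
# Parameter counts implied by the module shapes printed above.
# Linear(i, o) with bias has i*o + o params; LayerNorm((d,)) has 2*d.
d, ffn, vocab, layers, tags = 768, 3072, 64001, 12, 13

linear = lambda i, o: i * o + o          # weight + bias
layer_norm = lambda dim: 2 * dim         # gamma + beta

embeddings = vocab * d + 512 * d + 2 * d + layer_norm(d)
per_layer = (
    4 * linear(d, d)                     # query, key, value, attention output
    + 2 * layer_norm(d)                  # post-attention and post-FFN LayerNorm
    + linear(d, ffn) + linear(ffn, d)    # intermediate + output dense
)
encoder = layers * per_layer
pooler = linear(d, d)
tag_head = linear(d, tags)               # the (linear) layer over 13 tags

total = embeddings + encoder + pooler + tag_head
print(f"{total:,}")  # roughly 135.2M parameters
```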
2023-10-25 10:07:45,244 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,244 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Train: 6183 sentences
2023-10-25 10:07:45,245 (train_with_dev=False, train_with_test=False)
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Training Params:
2023-10-25 10:07:45,245 - learning_rate: "3e-05"
2023-10-25 10:07:45,245 - mini_batch_size: "4"
2023-10-25 10:07:45,245 - max_epochs: "10"
2023-10-25 10:07:45,245 - shuffle: "True"
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Plugins:
2023-10-25 10:07:45,245 - TensorboardLogger
2023-10-25 10:07:45,245 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
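The lr values in the per-iteration lines below follow from the LinearScheduler settings: with warmup_fraction 0.1 over 10 epochs of 1546 iterations each, the learning rate climbs linearly to 3e-05 during the first 1546 steps, then decays linearly to zero. A minimal sketch (the function name and step convention are illustrative, not Flair's API):

```python
# Illustrative reimplementation of linear warmup + linear decay.
# Numbers taken from this log: 10 epochs x 1546 iterations, warmup_fraction 0.1.
def lr_at(step, base_lr=3e-05, total_steps=10 * 1546, warmup_fraction=0.1):
    warmup_steps = int(warmup_fraction * total_steps)  # 1546
    if step < warmup_steps:
        return base_lr * step / warmup_steps           # linear warmup
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the logged values, e.g. lr 0.000003 at epoch 1 iter 154
# and lr 0.000027 near the end of epoch 2.
```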
2023-10-25 10:07:45,245 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:07:45,245 - metric: "('micro avg', 'f1-score')"
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Computation:
2023-10-25 10:07:45,245 - compute on device: cuda:0
2023-10-25 10:07:45,245 - embedding storage: none
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:07:54,382 epoch 1 - iter 154/1546 - loss 1.95898349 - time (sec): 9.14 - samples/sec: 1382.04 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:08:03,947 epoch 1 - iter 308/1546 - loss 1.08649823 - time (sec): 18.70 - samples/sec: 1356.53 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:08:13,323 epoch 1 - iter 462/1546 - loss 0.77749102 - time (sec): 28.08 - samples/sec: 1339.78 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:08:22,401 epoch 1 - iter 616/1546 - loss 0.61557554 - time (sec): 37.15 - samples/sec: 1350.74 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:08:31,651 epoch 1 - iter 770/1546 - loss 0.52609909 - time (sec): 46.40 - samples/sec: 1327.24 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:08:41,081 epoch 1 - iter 924/1546 - loss 0.45827477 - time (sec): 55.83 - samples/sec: 1323.47 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:08:50,887 epoch 1 - iter 1078/1546 - loss 0.40834105 - time (sec): 65.64 - samples/sec: 1316.31 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:09:00,366 epoch 1 - iter 1232/1546 - loss 0.37098150 - time (sec): 75.12 - samples/sec: 1322.36 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:09:09,449 epoch 1 - iter 1386/1546 - loss 0.34265115 - time (sec): 84.20 - samples/sec: 1322.78 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:09:18,649 epoch 1 - iter 1540/1546 - loss 0.31711290 - time (sec): 93.40 - samples/sec: 1327.83 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:09:18,971 ----------------------------------------------------------------------------------------------------
2023-10-25 10:09:18,972 EPOCH 1 done: loss 0.3168 - lr: 0.000030
2023-10-25 10:09:23,096 DEV : loss 0.06953319162130356 - f1-score (micro avg) 0.728
2023-10-25 10:09:23,121 saving best model
2023-10-25 10:09:23,684 ----------------------------------------------------------------------------------------------------
2023-10-25 10:09:32,952 epoch 2 - iter 154/1546 - loss 0.08565728 - time (sec): 9.27 - samples/sec: 1332.61 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:09:41,220 epoch 2 - iter 308/1546 - loss 0.07829584 - time (sec): 17.53 - samples/sec: 1394.51 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:09:49,136 epoch 2 - iter 462/1546 - loss 0.08041971 - time (sec): 25.45 - samples/sec: 1449.04 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:09:57,343 epoch 2 - iter 616/1546 - loss 0.07988986 - time (sec): 33.66 - samples/sec: 1467.37 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:10:05,884 epoch 2 - iter 770/1546 - loss 0.07995253 - time (sec): 42.20 - samples/sec: 1457.42 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:10:14,384 epoch 2 - iter 924/1546 - loss 0.07950843 - time (sec): 50.70 - samples/sec: 1456.83 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:10:23,311 epoch 2 - iter 1078/1546 - loss 0.07844762 - time (sec): 59.62 - samples/sec: 1450.06 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:10:31,932 epoch 2 - iter 1232/1546 - loss 0.07929517 - time (sec): 68.25 - samples/sec: 1451.56 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:10:40,728 epoch 2 - iter 1386/1546 - loss 0.07977055 - time (sec): 77.04 - samples/sec: 1450.59 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:10:49,454 epoch 2 - iter 1540/1546 - loss 0.08051213 - time (sec): 85.77 - samples/sec: 1444.81 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:10:49,779 ----------------------------------------------------------------------------------------------------
2023-10-25 10:10:49,779 EPOCH 2 done: loss 0.0805 - lr: 0.000027
2023-10-25 10:10:52,439 DEV : loss 0.06576813757419586 - f1-score (micro avg) 0.7718
2023-10-25 10:10:52,455 saving best model
2023-10-25 10:10:53,216 ----------------------------------------------------------------------------------------------------
2023-10-25 10:11:01,862 epoch 3 - iter 154/1546 - loss 0.04411862 - time (sec): 8.64 - samples/sec: 1443.78 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:11:10,682 epoch 3 - iter 308/1546 - loss 0.04134666 - time (sec): 17.46 - samples/sec: 1398.66 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:11:19,067 epoch 3 - iter 462/1546 - loss 0.04200707 - time (sec): 25.85 - samples/sec: 1413.11 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:11:28,242 epoch 3 - iter 616/1546 - loss 0.04883651 - time (sec): 35.02 - samples/sec: 1394.56 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:11:37,518 epoch 3 - iter 770/1546 - loss 0.04944480 - time (sec): 44.30 - samples/sec: 1380.19 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:11:46,001 epoch 3 - iter 924/1546 - loss 0.04932366 - time (sec): 52.78 - samples/sec: 1403.01 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:11:54,294 epoch 3 - iter 1078/1546 - loss 0.05053819 - time (sec): 61.08 - samples/sec: 1417.76 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:12:02,472 epoch 3 - iter 1232/1546 - loss 0.05116452 - time (sec): 69.25 - samples/sec: 1428.39 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:12:10,945 epoch 3 - iter 1386/1546 - loss 0.05124381 - time (sec): 77.73 - samples/sec: 1429.55 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:12:19,189 epoch 3 - iter 1540/1546 - loss 0.05213069 - time (sec): 85.97 - samples/sec: 1438.07 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:12:19,499 ----------------------------------------------------------------------------------------------------
2023-10-25 10:12:19,499 EPOCH 3 done: loss 0.0521 - lr: 0.000023
2023-10-25 10:12:22,512 DEV : loss 0.08034052699804306 - f1-score (micro avg) 0.7676
2023-10-25 10:12:22,534 ----------------------------------------------------------------------------------------------------
2023-10-25 10:12:31,143 epoch 4 - iter 154/1546 - loss 0.03047819 - time (sec): 8.61 - samples/sec: 1494.64 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:12:39,417 epoch 4 - iter 308/1546 - loss 0.03210798 - time (sec): 16.88 - samples/sec: 1449.02 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:12:47,693 epoch 4 - iter 462/1546 - loss 0.03589036 - time (sec): 25.16 - samples/sec: 1492.13 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:12:56,033 epoch 4 - iter 616/1546 - loss 0.03542909 - time (sec): 33.50 - samples/sec: 1506.75 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:13:04,396 epoch 4 - iter 770/1546 - loss 0.03538510 - time (sec): 41.86 - samples/sec: 1487.02 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:13:12,795 epoch 4 - iter 924/1546 - loss 0.03504173 - time (sec): 50.26 - samples/sec: 1489.25 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:13:21,058 epoch 4 - iter 1078/1546 - loss 0.03479597 - time (sec): 58.52 - samples/sec: 1501.68 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:13:29,439 epoch 4 - iter 1232/1546 - loss 0.03494097 - time (sec): 66.90 - samples/sec: 1495.78 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:13:37,640 epoch 4 - iter 1386/1546 - loss 0.03530495 - time (sec): 75.10 - samples/sec: 1491.99 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:13:45,873 epoch 4 - iter 1540/1546 - loss 0.03579614 - time (sec): 83.34 - samples/sec: 1485.33 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:13:46,164 ----------------------------------------------------------------------------------------------------
2023-10-25 10:13:46,164 EPOCH 4 done: loss 0.0358 - lr: 0.000020
2023-10-25 10:13:49,325 DEV : loss 0.10077176988124847 - f1-score (micro avg) 0.7794
2023-10-25 10:13:49,343 saving best model
2023-10-25 10:13:50,397 ----------------------------------------------------------------------------------------------------
2023-10-25 10:13:58,433 epoch 5 - iter 154/1546 - loss 0.02159984 - time (sec): 8.03 - samples/sec: 1417.90 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:14:06,479 epoch 5 - iter 308/1546 - loss 0.01984275 - time (sec): 16.08 - samples/sec: 1531.20 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:14:14,358 epoch 5 - iter 462/1546 - loss 0.01943297 - time (sec): 23.96 - samples/sec: 1561.52 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:14:22,405 epoch 5 - iter 616/1546 - loss 0.02182829 - time (sec): 32.01 - samples/sec: 1557.29 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:14:30,348 epoch 5 - iter 770/1546 - loss 0.02422414 - time (sec): 39.95 - samples/sec: 1552.65 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:14:38,271 epoch 5 - iter 924/1546 - loss 0.02304780 - time (sec): 47.87 - samples/sec: 1558.65 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:14:46,226 epoch 5 - iter 1078/1546 - loss 0.02401278 - time (sec): 55.83 - samples/sec: 1554.19 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:14:54,088 epoch 5 - iter 1232/1546 - loss 0.02393514 - time (sec): 63.69 - samples/sec: 1556.48 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:15:02,055 epoch 5 - iter 1386/1546 - loss 0.02427300 - time (sec): 71.66 - samples/sec: 1557.22 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:15:09,948 epoch 5 - iter 1540/1546 - loss 0.02413449 - time (sec): 79.55 - samples/sec: 1557.93 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:15:10,234 ----------------------------------------------------------------------------------------------------
2023-10-25 10:15:10,234 EPOCH 5 done: loss 0.0241 - lr: 0.000017
2023-10-25 10:15:13,016 DEV : loss 0.10452549159526825 - f1-score (micro avg) 0.7705
2023-10-25 10:15:13,036 ----------------------------------------------------------------------------------------------------
2023-10-25 10:15:21,266 epoch 6 - iter 154/1546 - loss 0.01168533 - time (sec): 8.23 - samples/sec: 1528.49 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:15:29,267 epoch 6 - iter 308/1546 - loss 0.01065419 - time (sec): 16.23 - samples/sec: 1537.60 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:15:37,298 epoch 6 - iter 462/1546 - loss 0.01526262 - time (sec): 24.26 - samples/sec: 1541.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:15:45,424 epoch 6 - iter 616/1546 - loss 0.01612627 - time (sec): 32.39 - samples/sec: 1542.05 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:15:53,337 epoch 6 - iter 770/1546 - loss 0.01582392 - time (sec): 40.30 - samples/sec: 1513.88 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:16:01,223 epoch 6 - iter 924/1546 - loss 0.01644036 - time (sec): 48.19 - samples/sec: 1517.02 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:16:09,384 epoch 6 - iter 1078/1546 - loss 0.01548608 - time (sec): 56.35 - samples/sec: 1520.60 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:16:17,233 epoch 6 - iter 1232/1546 - loss 0.01600009 - time (sec): 64.20 - samples/sec: 1541.94 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:16:25,302 epoch 6 - iter 1386/1546 - loss 0.01604283 - time (sec): 72.26 - samples/sec: 1542.64 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:16:33,560 epoch 6 - iter 1540/1546 - loss 0.01611172 - time (sec): 80.52 - samples/sec: 1538.36 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:16:33,865 ----------------------------------------------------------------------------------------------------
2023-10-25 10:16:33,865 EPOCH 6 done: loss 0.0161 - lr: 0.000013
2023-10-25 10:16:37,133 DEV : loss 0.10936635732650757 - f1-score (micro avg) 0.7838
2023-10-25 10:16:37,151 saving best model
2023-10-25 10:16:37,850 ----------------------------------------------------------------------------------------------------
2023-10-25 10:16:46,325 epoch 7 - iter 154/1546 - loss 0.01008689 - time (sec): 8.47 - samples/sec: 1438.72 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:16:54,564 epoch 7 - iter 308/1546 - loss 0.01216195 - time (sec): 16.71 - samples/sec: 1492.79 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:17:02,857 epoch 7 - iter 462/1546 - loss 0.00994628 - time (sec): 25.00 - samples/sec: 1525.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:17:11,081 epoch 7 - iter 616/1546 - loss 0.01019350 - time (sec): 33.23 - samples/sec: 1513.71 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:17:19,490 epoch 7 - iter 770/1546 - loss 0.01005783 - time (sec): 41.64 - samples/sec: 1499.81 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:17:27,757 epoch 7 - iter 924/1546 - loss 0.01009497 - time (sec): 49.90 - samples/sec: 1482.68 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:17:36,242 epoch 7 - iter 1078/1546 - loss 0.01001395 - time (sec): 58.39 - samples/sec: 1488.54 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:17:44,675 epoch 7 - iter 1232/1546 - loss 0.00998899 - time (sec): 66.82 - samples/sec: 1488.34 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:17:53,200 epoch 7 - iter 1386/1546 - loss 0.01038485 - time (sec): 75.35 - samples/sec: 1479.83 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:18:01,461 epoch 7 - iter 1540/1546 - loss 0.01017453 - time (sec): 83.61 - samples/sec: 1478.58 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:18:01,791 ----------------------------------------------------------------------------------------------------
2023-10-25 10:18:01,791 EPOCH 7 done: loss 0.0102 - lr: 0.000010
2023-10-25 10:18:05,007 DEV : loss 0.1250709891319275 - f1-score (micro avg) 0.7574
2023-10-25 10:18:05,026 ----------------------------------------------------------------------------------------------------
2023-10-25 10:18:13,181 epoch 8 - iter 154/1546 - loss 0.00665759 - time (sec): 8.15 - samples/sec: 1538.14 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:18:21,404 epoch 8 - iter 308/1546 - loss 0.00574229 - time (sec): 16.38 - samples/sec: 1574.33 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:18:29,281 epoch 8 - iter 462/1546 - loss 0.00845782 - time (sec): 24.25 - samples/sec: 1557.05 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:18:37,411 epoch 8 - iter 616/1546 - loss 0.00893510 - time (sec): 32.38 - samples/sec: 1529.51 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:18:45,240 epoch 8 - iter 770/1546 - loss 0.00897181 - time (sec): 40.21 - samples/sec: 1523.81 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:18:53,140 epoch 8 - iter 924/1546 - loss 0.00912196 - time (sec): 48.11 - samples/sec: 1520.34 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:19:01,009 epoch 8 - iter 1078/1546 - loss 0.00899284 - time (sec): 55.98 - samples/sec: 1517.08 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:19:09,092 epoch 8 - iter 1232/1546 - loss 0.00792079 - time (sec): 64.06 - samples/sec: 1537.54 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:19:17,137 epoch 8 - iter 1386/1546 - loss 0.00751409 - time (sec): 72.11 - samples/sec: 1545.03 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:19:25,122 epoch 8 - iter 1540/1546 - loss 0.00765103 - time (sec): 80.09 - samples/sec: 1545.22 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:19:25,437 ----------------------------------------------------------------------------------------------------
2023-10-25 10:19:25,438 EPOCH 8 done: loss 0.0076 - lr: 0.000007
2023-10-25 10:19:28,416 DEV : loss 0.138199582695961 - f1-score (micro avg) 0.7826
2023-10-25 10:19:28,435 ----------------------------------------------------------------------------------------------------
2023-10-25 10:19:36,386 epoch 9 - iter 154/1546 - loss 0.00388127 - time (sec): 7.95 - samples/sec: 1454.52 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:19:44,184 epoch 9 - iter 308/1546 - loss 0.00445294 - time (sec): 15.75 - samples/sec: 1513.85 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:19:52,139 epoch 9 - iter 462/1546 - loss 0.00384866 - time (sec): 23.70 - samples/sec: 1546.23 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:19:59,987 epoch 9 - iter 616/1546 - loss 0.00454109 - time (sec): 31.55 - samples/sec: 1565.36 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:20:08,351 epoch 9 - iter 770/1546 - loss 0.00384940 - time (sec): 39.91 - samples/sec: 1567.95 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:20:16,436 epoch 9 - iter 924/1546 - loss 0.00331364 - time (sec): 48.00 - samples/sec: 1565.11 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:20:24,623 epoch 9 - iter 1078/1546 - loss 0.00371152 - time (sec): 56.19 - samples/sec: 1570.81 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:20:32,735 epoch 9 - iter 1232/1546 - loss 0.00348174 - time (sec): 64.30 - samples/sec: 1561.45 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:20:40,421 epoch 9 - iter 1386/1546 - loss 0.00368389 - time (sec): 71.98 - samples/sec: 1552.80 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:20:48,214 epoch 9 - iter 1540/1546 - loss 0.00399989 - time (sec): 79.78 - samples/sec: 1551.76 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:20:48,507 ----------------------------------------------------------------------------------------------------
2023-10-25 10:20:48,507 EPOCH 9 done: loss 0.0040 - lr: 0.000003
2023-10-25 10:20:51,401 DEV : loss 0.14553460478782654 - f1-score (micro avg) 0.7775
2023-10-25 10:20:51,418 ----------------------------------------------------------------------------------------------------
2023-10-25 10:20:59,196 epoch 10 - iter 154/1546 - loss 0.00354279 - time (sec): 7.78 - samples/sec: 1585.87 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:21:07,094 epoch 10 - iter 308/1546 - loss 0.00332450 - time (sec): 15.67 - samples/sec: 1497.57 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:21:15,194 epoch 10 - iter 462/1546 - loss 0.00258446 - time (sec): 23.77 - samples/sec: 1505.04 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:21:23,161 epoch 10 - iter 616/1546 - loss 0.00309299 - time (sec): 31.74 - samples/sec: 1515.18 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:21:31,077 epoch 10 - iter 770/1546 - loss 0.00275583 - time (sec): 39.66 - samples/sec: 1538.77 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:21:39,083 epoch 10 - iter 924/1546 - loss 0.00295097 - time (sec): 47.66 - samples/sec: 1538.93 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:21:46,922 epoch 10 - iter 1078/1546 - loss 0.00274496 - time (sec): 55.50 - samples/sec: 1541.17 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:21:54,953 epoch 10 - iter 1232/1546 - loss 0.00293898 - time (sec): 63.53 - samples/sec: 1546.02 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:22:02,893 epoch 10 - iter 1386/1546 - loss 0.00260959 - time (sec): 71.47 - samples/sec: 1549.52 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:22:10,803 epoch 10 - iter 1540/1546 - loss 0.00281452 - time (sec): 79.38 - samples/sec: 1559.37 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:22:11,140 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:11,141 EPOCH 10 done: loss 0.0028 - lr: 0.000000
2023-10-25 10:22:14,123 DEV : loss 0.14905457198619843 - f1-score (micro avg) 0.7683
2023-10-25 10:22:14,660 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:14,662 Loading model from best epoch ...
2023-10-25 10:22:16,789 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
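The 13-tag dictionary is the BIOES encoding of the three topres19th entity types (LOC, BUILDING, STREET) plus the O tag, and can be reproduced directly:

```python
# BIOES tagset for the three entity types, in the order the log prints them.
types = ("LOC", "BUILDING", "STREET")
tags = ["O"] + [f"{prefix}-{t}" for t in types for prefix in "SBEI"]
print(len(tags), tags)  # 13 tags, matching the dictionary above
```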
2023-10-25 10:22:26,927
Results:
- F-score (micro) 0.7887
- F-score (macro) 0.6962
- Accuracy 0.6725
By class:
              precision    recall  f1-score   support

         LOC     0.8172    0.8552    0.8357       946
    BUILDING     0.5736    0.6108    0.5916       185
      STREET     0.6029    0.7321    0.6613        56

   micro avg     0.7673    0.8113    0.7887      1187
   macro avg     0.6646    0.7327    0.6962      1187
weighted avg     0.7691    0.8113    0.7895      1187
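The aggregate rows are consistent with the per-class rows. Reconstructing approximate true-positive and prediction counts from the rounded precision/recall/support values reproduces the micro, macro, and weighted averages (a sanity check on the table, not part of the training run):

```python
# (precision, recall, support) per class, copied from the table above.
classes = {
    "LOC":      (0.8172, 0.8552, 946),
    "BUILDING": (0.5736, 0.6108, 185),
    "STREET":   (0.6029, 0.7321, 56),
}

tp = {c: round(r * s) for c, (p, r, s) in classes.items()}        # true positives
pred = {c: round(tp[c] / p) for c, (p, r, s) in classes.items()}  # predicted spans

support = sum(s for _, _, s in classes.values())                   # 1187
micro_p = sum(tp.values()) / sum(pred.values())                    # ~0.7673
micro_r = sum(tp.values()) / support                               # ~0.8113
micro_f1 = 2 * sum(tp.values()) / (sum(pred.values()) + support)   # ~0.7887

f1 = {c: 2 * p * r / (p + r) for c, (p, r, s) in classes.items()}
macro_f1 = sum(f1.values()) / len(f1)                              # ~0.6962
weighted_f1 = sum(f1[c] * s for c, (p, r, s) in classes.items()) / support  # ~0.7895
```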
2023-10-25 10:22:26,928 ----------------------------------------------------------------------------------------------------