2023-10-18 16:48:35,521 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,521 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
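The module summary above pins down every tensor shape, so the size of this tiny two-layer BERT can be tallied by hand. A minimal sketch of that arithmetic (the helper names are illustrative, and the totals are derived from the printed shapes, not reported in the log itself):

```python
# Parameter tally derived from the module summary above
# (illustrative helper names; counts follow the printed shapes).

def linear(n_in, n_out):
    return n_in * n_out + n_out   # weight + bias

def layer_norm(dim):
    return 2 * dim                # gamma + beta

embeddings = (
    32001 * 128                   # word_embeddings
    + 512 * 128                   # position_embeddings
    + 2 * 128                     # token_type_embeddings
    + layer_norm(128)
)

per_layer = (
    3 * linear(128, 128)          # query, key, value
    + linear(128, 128)            # attention output dense
    + layer_norm(128)
    + linear(128, 512)            # intermediate
    + linear(512, 128)            # output
    + layer_norm(128)
)

total = (
    embeddings
    + 2 * per_layer               # (0-1): 2 x BertLayer
    + linear(128, 128)            # pooler
    + linear(128, 25)             # 25-tag classification head
)

print(total)                      # ~4.6M parameters in all
```

Notably, the 32001 x 128 vocabulary embedding alone accounts for roughly 90% of the parameters, which is typical for such small historic-multilingual encoders.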
2023-10-18 16:48:35,521 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,521 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-18 16:48:35,521 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,521 Train: 966 sentences
2023-10-18 16:48:35,521 (train_with_dev=False, train_with_test=False)
2023-10-18 16:48:35,521 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,521 Training Params:
2023-10-18 16:48:35,521 - learning_rate: "3e-05"
2023-10-18 16:48:35,522 - mini_batch_size: "8"
2023-10-18 16:48:35,522 - max_epochs: "10"
2023-10-18 16:48:35,522 - shuffle: "True"
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 Plugins:
2023-10-18 16:48:35,522 - TensorboardLogger
2023-10-18 16:48:35,522 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:48:35,522 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 Computation:
2023-10-18 16:48:35,522 - compute on device: cuda:0
2023-10-18 16:48:35,522 - embedding storage: none
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 Logging anything other than scalars to TensorBoard is currently not supported.
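The LinearScheduler with warmup_fraction '0.1' explains the lr column in the per-iteration lines that follow: with 121 iterations per epoch over 10 epochs (1210 steps), the rate ramps from 0 to the peak 3e-05 across the first ~121 steps (all of epoch 1) and then decays linearly back to 0. A minimal sketch of that schedule (a plain re-implementation for illustration, not Flair's actual LinearScheduler class):

```python
# One-cycle linear warmup/decay, mirroring the lr column in the log below.
# Plain re-implementation for illustration; not Flair's internal scheduler.

def linear_schedule(step, total_steps=10 * 121, peak_lr=3e-05, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)  # 121 steps = all of epoch 1
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps           # linear ramp up
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # linear decay

print(linear_schedule(12))    # ≈ 0.000003, matching "epoch 1 - iter 12/121"
print(linear_schedule(121))   # peak 0.000030 at the end of epoch 1
print(linear_schedule(1210))  # 0.0 at the final step
```

The logged lr values (0.000003 rising to 0.000030 during epoch 1, then falling to 0.000000 by epoch 10) trace exactly this triangle.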
2023-10-18 16:48:35,781 epoch 1 - iter 12/121 - loss 3.73135666 - time (sec): 0.26 - samples/sec: 8989.30 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:48:36,037 epoch 1 - iter 24/121 - loss 3.70230323 - time (sec): 0.51 - samples/sec: 8365.49 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:48:36,312 epoch 1 - iter 36/121 - loss 3.64697934 - time (sec): 0.79 - samples/sec: 9056.15 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:48:36,585 epoch 1 - iter 48/121 - loss 3.64897593 - time (sec): 1.06 - samples/sec: 8930.69 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:48:36,853 epoch 1 - iter 60/121 - loss 3.61059523 - time (sec): 1.33 - samples/sec: 8967.33 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:48:37,110 epoch 1 - iter 72/121 - loss 3.52905406 - time (sec): 1.59 - samples/sec: 8849.71 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:48:37,383 epoch 1 - iter 84/121 - loss 3.40711096 - time (sec): 1.86 - samples/sec: 9028.03 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:48:37,654 epoch 1 - iter 96/121 - loss 3.27300372 - time (sec): 2.13 - samples/sec: 9245.22 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:48:37,931 epoch 1 - iter 108/121 - loss 3.11948802 - time (sec): 2.41 - samples/sec: 9258.34 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:48:38,186 epoch 1 - iter 120/121 - loss 2.98945042 - time (sec): 2.66 - samples/sec: 9256.13 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:48:38,202 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:38,202 EPOCH 1 done: loss 2.9857 - lr: 0.000030
2023-10-18 16:48:38,475 DEV : loss 0.888546884059906 - f1-score (micro avg) 0.0
2023-10-18 16:48:38,480 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:38,743 epoch 2 - iter 12/121 - loss 1.34267957 - time (sec): 0.26 - samples/sec: 8726.53 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:48:39,014 epoch 2 - iter 24/121 - loss 1.20006107 - time (sec): 0.53 - samples/sec: 8960.14 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:48:39,281 epoch 2 - iter 36/121 - loss 1.07692122 - time (sec): 0.80 - samples/sec: 9001.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:48:39,559 epoch 2 - iter 48/121 - loss 1.02313150 - time (sec): 1.08 - samples/sec: 9151.99 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:48:39,830 epoch 2 - iter 60/121 - loss 0.98698774 - time (sec): 1.35 - samples/sec: 8953.45 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:48:40,110 epoch 2 - iter 72/121 - loss 0.95003231 - time (sec): 1.63 - samples/sec: 9022.75 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:48:40,342 epoch 2 - iter 84/121 - loss 0.89889328 - time (sec): 1.86 - samples/sec: 9294.91 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:48:40,744 epoch 2 - iter 96/121 - loss 0.86386461 - time (sec): 2.26 - samples/sec: 8747.32 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:48:41,018 epoch 2 - iter 108/121 - loss 0.86218533 - time (sec): 2.54 - samples/sec: 8706.77 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:48:41,283 epoch 2 - iter 120/121 - loss 0.84499695 - time (sec): 2.80 - samples/sec: 8744.14 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:48:41,305 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:41,305 EPOCH 2 done: loss 0.8462 - lr: 0.000027
2023-10-18 16:48:41,719 DEV : loss 0.6577260494232178 - f1-score (micro avg) 0.0
2023-10-18 16:48:41,724 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:42,009 epoch 3 - iter 12/121 - loss 0.72147088 - time (sec): 0.28 - samples/sec: 8779.43 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:48:42,279 epoch 3 - iter 24/121 - loss 0.76590949 - time (sec): 0.56 - samples/sec: 8647.80 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:48:42,457 epoch 3 - iter 36/121 - loss 0.74445104 - time (sec): 0.73 - samples/sec: 9633.18 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:48:42,640 epoch 3 - iter 48/121 - loss 0.73264098 - time (sec): 0.92 - samples/sec: 10434.44 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:48:42,827 epoch 3 - iter 60/121 - loss 0.71998033 - time (sec): 1.10 - samples/sec: 10910.00 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:48:43,005 epoch 3 - iter 72/121 - loss 0.70956971 - time (sec): 1.28 - samples/sec: 11157.50 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:48:43,192 epoch 3 - iter 84/121 - loss 0.69546580 - time (sec): 1.47 - samples/sec: 11494.33 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:48:43,392 epoch 3 - iter 96/121 - loss 0.68506052 - time (sec): 1.67 - samples/sec: 11815.01 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:48:43,616 epoch 3 - iter 108/121 - loss 0.67453678 - time (sec): 1.89 - samples/sec: 11736.39 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:48:43,840 epoch 3 - iter 120/121 - loss 0.67556474 - time (sec): 2.12 - samples/sec: 11614.32 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:48:43,855 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:43,855 EPOCH 3 done: loss 0.6758 - lr: 0.000023
2023-10-18 16:48:44,269 DEV : loss 0.5368312001228333 - f1-score (micro avg) 0.0
2023-10-18 16:48:44,273 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:44,540 epoch 4 - iter 12/121 - loss 0.67001378 - time (sec): 0.27 - samples/sec: 7861.72 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:48:44,819 epoch 4 - iter 24/121 - loss 0.64732484 - time (sec): 0.55 - samples/sec: 8204.05 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:48:45,098 epoch 4 - iter 36/121 - loss 0.61911750 - time (sec): 0.82 - samples/sec: 8823.53 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:48:45,356 epoch 4 - iter 48/121 - loss 0.61389990 - time (sec): 1.08 - samples/sec: 8897.35 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:48:45,623 epoch 4 - iter 60/121 - loss 0.61282173 - time (sec): 1.35 - samples/sec: 8954.13 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:48:45,889 epoch 4 - iter 72/121 - loss 0.59776583 - time (sec): 1.61 - samples/sec: 9038.88 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:48:46,159 epoch 4 - iter 84/121 - loss 0.59023191 - time (sec): 1.89 - samples/sec: 9178.13 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:48:46,434 epoch 4 - iter 96/121 - loss 0.58243842 - time (sec): 2.16 - samples/sec: 9158.22 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:48:46,721 epoch 4 - iter 108/121 - loss 0.58528172 - time (sec): 2.45 - samples/sec: 9086.95 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:48:46,991 epoch 4 - iter 120/121 - loss 0.57638690 - time (sec): 2.72 - samples/sec: 9077.62 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:48:47,008 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:47,008 EPOCH 4 done: loss 0.5767 - lr: 0.000020
2023-10-18 16:48:47,426 DEV : loss 0.43159720301628113 - f1-score (micro avg) 0.0952
2023-10-18 16:48:47,430 saving best model
2023-10-18 16:48:47,463 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:47,734 epoch 5 - iter 12/121 - loss 0.54535688 - time (sec): 0.27 - samples/sec: 9126.35 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:48:48,003 epoch 5 - iter 24/121 - loss 0.53077766 - time (sec): 0.54 - samples/sec: 9335.20 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:48:48,272 epoch 5 - iter 36/121 - loss 0.51623647 - time (sec): 0.81 - samples/sec: 9313.01 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:48:48,551 epoch 5 - iter 48/121 - loss 0.52815520 - time (sec): 1.09 - samples/sec: 9360.77 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:48:48,739 epoch 5 - iter 60/121 - loss 0.52803554 - time (sec): 1.27 - samples/sec: 9878.01 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:48:48,937 epoch 5 - iter 72/121 - loss 0.52301950 - time (sec): 1.47 - samples/sec: 10188.78 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:48:49,159 epoch 5 - iter 84/121 - loss 0.51809378 - time (sec): 1.69 - samples/sec: 10286.49 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:48:49,354 epoch 5 - iter 96/121 - loss 0.51630126 - time (sec): 1.89 - samples/sec: 10552.00 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:48:49,577 epoch 5 - iter 108/121 - loss 0.50390050 - time (sec): 2.11 - samples/sec: 10559.03 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:48:49,829 epoch 5 - iter 120/121 - loss 0.50184025 - time (sec): 2.36 - samples/sec: 10415.44 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:48:49,847 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:49,847 EPOCH 5 done: loss 0.5019 - lr: 0.000017
2023-10-18 16:48:50,277 DEV : loss 0.38968807458877563 - f1-score (micro avg) 0.236
2023-10-18 16:48:50,283 saving best model
2023-10-18 16:48:50,320 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:50,611 epoch 6 - iter 12/121 - loss 0.48606714 - time (sec): 0.29 - samples/sec: 8841.20 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:48:50,929 epoch 6 - iter 24/121 - loss 0.47768069 - time (sec): 0.61 - samples/sec: 8232.48 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:48:51,229 epoch 6 - iter 36/121 - loss 0.49199054 - time (sec): 0.91 - samples/sec: 8306.29 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:48:51,507 epoch 6 - iter 48/121 - loss 0.45279491 - time (sec): 1.19 - samples/sec: 8222.37 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:48:51,781 epoch 6 - iter 60/121 - loss 0.45754327 - time (sec): 1.46 - samples/sec: 8416.10 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:48:52,049 epoch 6 - iter 72/121 - loss 0.46451499 - time (sec): 1.73 - samples/sec: 8553.99 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:48:52,313 epoch 6 - iter 84/121 - loss 0.46666705 - time (sec): 1.99 - samples/sec: 8596.65 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:48:52,598 epoch 6 - iter 96/121 - loss 0.46486521 - time (sec): 2.28 - samples/sec: 8641.31 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:48:52,856 epoch 6 - iter 108/121 - loss 0.46672906 - time (sec): 2.54 - samples/sec: 8654.28 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:48:53,087 epoch 6 - iter 120/121 - loss 0.46874155 - time (sec): 2.77 - samples/sec: 8879.28 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:48:53,104 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:53,104 EPOCH 6 done: loss 0.4705 - lr: 0.000013
2023-10-18 16:48:53,519 DEV : loss 0.36645838618278503 - f1-score (micro avg) 0.4151
2023-10-18 16:48:53,524 saving best model
2023-10-18 16:48:53,558 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:53,838 epoch 7 - iter 12/121 - loss 0.50523558 - time (sec): 0.28 - samples/sec: 9255.92 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:48:54,101 epoch 7 - iter 24/121 - loss 0.52339164 - time (sec): 0.54 - samples/sec: 8657.21 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:48:54,392 epoch 7 - iter 36/121 - loss 0.48059441 - time (sec): 0.83 - samples/sec: 8492.88 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:48:54,658 epoch 7 - iter 48/121 - loss 0.46494498 - time (sec): 1.10 - samples/sec: 8560.68 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:48:54,908 epoch 7 - iter 60/121 - loss 0.45264593 - time (sec): 1.35 - samples/sec: 8831.21 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:48:55,174 epoch 7 - iter 72/121 - loss 0.43933276 - time (sec): 1.62 - samples/sec: 8974.31 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:48:55,445 epoch 7 - iter 84/121 - loss 0.43595279 - time (sec): 1.89 - samples/sec: 8984.70 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:48:55,726 epoch 7 - iter 96/121 - loss 0.43515044 - time (sec): 2.17 - samples/sec: 9174.93 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:48:56,000 epoch 7 - iter 108/121 - loss 0.43071109 - time (sec): 2.44 - samples/sec: 9085.00 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:48:56,270 epoch 7 - iter 120/121 - loss 0.43660508 - time (sec): 2.71 - samples/sec: 9052.66 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:48:56,291 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:56,291 EPOCH 7 done: loss 0.4377 - lr: 0.000010
2023-10-18 16:48:56,717 DEV : loss 0.34206482768058777 - f1-score (micro avg) 0.4824
2023-10-18 16:48:56,722 saving best model
2023-10-18 16:48:56,754 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:57,038 epoch 8 - iter 12/121 - loss 0.39657285 - time (sec): 0.28 - samples/sec: 9431.46 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:48:57,301 epoch 8 - iter 24/121 - loss 0.41035317 - time (sec): 0.55 - samples/sec: 9594.04 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:48:57,576 epoch 8 - iter 36/121 - loss 0.43193841 - time (sec): 0.82 - samples/sec: 9142.53 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:48:57,855 epoch 8 - iter 48/121 - loss 0.42462507 - time (sec): 1.10 - samples/sec: 8958.80 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:48:58,128 epoch 8 - iter 60/121 - loss 0.42070240 - time (sec): 1.37 - samples/sec: 9029.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:48:58,400 epoch 8 - iter 72/121 - loss 0.43213186 - time (sec): 1.64 - samples/sec: 9184.13 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:48:58,659 epoch 8 - iter 84/121 - loss 0.43272228 - time (sec): 1.90 - samples/sec: 9123.10 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:48:58,943 epoch 8 - iter 96/121 - loss 0.42418418 - time (sec): 2.19 - samples/sec: 9023.54 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:48:59,233 epoch 8 - iter 108/121 - loss 0.42133625 - time (sec): 2.48 - samples/sec: 8978.89 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:48:59,501 epoch 8 - iter 120/121 - loss 0.42313856 - time (sec): 2.75 - samples/sec: 8956.45 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:48:59,521 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:59,521 EPOCH 8 done: loss 0.4254 - lr: 0.000007
2023-10-18 16:48:59,953 DEV : loss 0.33342665433883667 - f1-score (micro avg) 0.4978
2023-10-18 16:48:59,958 saving best model
2023-10-18 16:48:59,989 ----------------------------------------------------------------------------------------------------
2023-10-18 16:49:00,250 epoch 9 - iter 12/121 - loss 0.38063288 - time (sec): 0.26 - samples/sec: 7829.16 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:49:00,517 epoch 9 - iter 24/121 - loss 0.39518238 - time (sec): 0.53 - samples/sec: 8546.75 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:49:00,801 epoch 9 - iter 36/121 - loss 0.41125975 - time (sec): 0.81 - samples/sec: 8801.67 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:49:01,072 epoch 9 - iter 48/121 - loss 0.42758374 - time (sec): 1.08 - samples/sec: 8857.82 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:49:01,308 epoch 9 - iter 60/121 - loss 0.43010899 - time (sec): 1.32 - samples/sec: 9144.43 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:49:01,487 epoch 9 - iter 72/121 - loss 0.41322277 - time (sec): 1.50 - samples/sec: 9543.12 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:49:01,756 epoch 9 - iter 84/121 - loss 0.41753163 - time (sec): 1.77 - samples/sec: 9339.69 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:49:02,050 epoch 9 - iter 96/121 - loss 0.41759424 - time (sec): 2.06 - samples/sec: 9336.66 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:49:02,327 epoch 9 - iter 108/121 - loss 0.41214860 - time (sec): 2.34 - samples/sec: 9410.62 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:49:02,609 epoch 9 - iter 120/121 - loss 0.40768587 - time (sec): 2.62 - samples/sec: 9430.79 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:49:02,630 ----------------------------------------------------------------------------------------------------
2023-10-18 16:49:02,630 EPOCH 9 done: loss 0.4079 - lr: 0.000004
2023-10-18 16:49:03,057 DEV : loss 0.3280840814113617 - f1-score (micro avg) 0.5015
2023-10-18 16:49:03,061 saving best model
2023-10-18 16:49:03,093 ----------------------------------------------------------------------------------------------------
2023-10-18 16:49:03,354 epoch 10 - iter 12/121 - loss 0.36093503 - time (sec): 0.26 - samples/sec: 8626.32 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:49:03,619 epoch 10 - iter 24/121 - loss 0.36817826 - time (sec): 0.53 - samples/sec: 8930.16 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:49:03,896 epoch 10 - iter 36/121 - loss 0.38006792 - time (sec): 0.80 - samples/sec: 8872.17 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:49:04,162 epoch 10 - iter 48/121 - loss 0.37238255 - time (sec): 1.07 - samples/sec: 9074.65 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:49:04,428 epoch 10 - iter 60/121 - loss 0.39144881 - time (sec): 1.33 - samples/sec: 9013.20 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:49:04,698 epoch 10 - iter 72/121 - loss 0.40218852 - time (sec): 1.60 - samples/sec: 9206.61 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:49:04,990 epoch 10 - iter 84/121 - loss 0.38920723 - time (sec): 1.90 - samples/sec: 9100.69 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:49:05,255 epoch 10 - iter 96/121 - loss 0.39328880 - time (sec): 2.16 - samples/sec: 9091.63 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:49:05,531 epoch 10 - iter 108/121 - loss 0.39338392 - time (sec): 2.44 - samples/sec: 9085.86 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:49:05,795 epoch 10 - iter 120/121 - loss 0.40198011 - time (sec): 2.70 - samples/sec: 9092.62 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:49:05,814 ----------------------------------------------------------------------------------------------------
2023-10-18 16:49:05,814 EPOCH 10 done: loss 0.4003 - lr: 0.000000
2023-10-18 16:49:06,241 DEV : loss 0.32836541533470154 - f1-score (micro avg) 0.4978
2023-10-18 16:49:06,276 ----------------------------------------------------------------------------------------------------
2023-10-18 16:49:06,276 Loading model from best epoch ...
2023-10-18 16:49:06,355 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
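The 25 tags form a BIOES scheme (Single, Begin, Inside, End, plus O) over six entity types (scope, pers, work, loc, object, date): 6 types x 4 positional tags + O = 25. Turning a predicted tag sequence back into entity spans can be sketched as follows (a simplified greedy decoder for illustration, not Flair's internal one):

```python
# Greedy BIOES-to-span decoder (simplified illustration, not Flair's internals).

def bioes_to_spans(tags):
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
        elif tag.startswith("S-"):                 # single-token entity
            spans.append((i, i, tag[2:]))
            start = None
        elif tag.startswith("B-"):                 # entity opens
            start, label = i, tag[2:]
        elif tag.startswith("E-") and start is not None and tag[2:] == label:
            spans.append((start, i, label))        # entity closes
            start = None
        # I- tags simply continue the currently open span
    return spans

print(bioes_to_spans(["S-pers", "O", "B-work", "I-work", "E-work"]))
# → [(0, 0, 'pers'), (2, 4, 'work')]
```

The span-level precision/recall/f1 figures below are computed over exactly such decoded spans, which is why a single wrong boundary tag costs the whole entity.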
2023-10-18 16:49:06,782
Results:
- F-score (micro) 0.4535
- F-score (macro) 0.2114
- Accuracy 0.3051

By class:
              precision    recall  f1-score   support

        pers     0.6099    0.6187    0.6143       139
       scope     0.4361    0.4496    0.4427       129
        work     0.0000    0.0000    0.0000        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.5236    0.4000    0.4535       360
   macro avg     0.2092    0.2137    0.2114       360
weighted avg     0.3918    0.4000    0.3958       360
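The macro and weighted rows follow directly from the per-class rows: macro averages the five classes equally (so the three zero-f1 classes drag it down), while weighted scales each class by its support. A quick consistency check of that arithmetic, using the numbers as printed:

```python
# Recompute the macro and weighted averages from the per-class rows above.

rows = {  # class: (precision, recall, f1-score, support)
    "pers":  (0.6099, 0.6187, 0.6143, 139),
    "scope": (0.4361, 0.4496, 0.4427, 129),
    "work":  (0.0000, 0.0000, 0.0000, 80),
    "loc":   (0.0000, 0.0000, 0.0000, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}
total = sum(s for *_, s in rows.values())  # 360 gold spans

# Macro: unweighted mean over classes; weighted: support-weighted mean.
macro = [round(sum(r[i] for r in rows.values()) / len(rows), 4) for i in range(3)]
weighted = [round(sum(r[i] * r[3] for r in rows.values()) / total, 4) for i in range(3)]

print(macro)     # [0.2092, 0.2137, 0.2114]
print(weighted)  # [0.3918, 0.4, 0.3958]
```

The micro row cannot be reproduced from this table alone, since it requires the raw TP/FP/FN counts pooled over all classes rather than the rounded per-class rates.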
2023-10-18 16:49:06,782 ----------------------------------------------------------------------------------------------------