2023-10-18 16:37:10,346 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:10,346 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:37:10,346 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:10,346 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-18 16:37:10,346 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:10,346 Train: 966 sentences
2023-10-18 16:37:10,346 (train_with_dev=False, train_with_test=False)
2023-10-18 16:37:10,346 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:10,346 Training Params:
2023-10-18 16:37:10,346 - learning_rate: "3e-05"
2023-10-18 16:37:10,346 - mini_batch_size: "8"
2023-10-18 16:37:10,346 - max_epochs: "10"
2023-10-18 16:37:10,346 - shuffle: "True"
2023-10-18 16:37:10,346 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:10,346 Plugins:
2023-10-18 16:37:10,346 - TensorboardLogger
2023-10-18 16:37:10,346 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:37:10,346 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:10,346 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:37:10,347 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:37:10,347 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:10,347 Computation:
2023-10-18 16:37:10,347 - compute on device: cuda:0
2023-10-18 16:37:10,347 - embedding storage: none
2023-10-18 16:37:10,347 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:10,347 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 16:37:10,347 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:10,347 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:10,347 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:37:10,642 epoch 1 - iter 12/121 - loss 4.06949337 - time (sec): 0.29 - samples/sec: 8569.37 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:37:10,868 epoch 1 - iter 24/121 - loss 3.92402343 - time (sec): 0.52 - samples/sec: 9461.52 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:37:11,154 epoch 1 - iter 36/121 - loss 3.90924281 - time (sec): 0.81 - samples/sec: 9363.92 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:37:11,430 epoch 1 - iter 48/121 - loss 3.82420542 - time (sec): 1.08 - samples/sec: 9588.44 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:37:11,693 epoch 1 - iter 60/121 - loss 3.73197307 - time (sec): 1.35 - samples/sec: 9266.92 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:37:11,954 epoch 1 - iter 72/121 - loss 3.64700568 - time (sec): 1.61 - samples/sec: 9243.81 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:37:12,213 epoch 1 - iter 84/121 - loss 3.53731266 - time (sec): 1.87 - samples/sec: 9200.17 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:37:12,486 epoch 1 - iter 96/121 - loss 3.39255785 - time (sec): 2.14 - samples/sec: 9150.27 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:37:12,760 epoch 1 - iter 108/121 - loss 3.23217265 - time (sec): 2.41 - samples/sec: 9237.27 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:37:13,041 epoch 1 - iter 120/121 - loss 3.07201923 - time (sec): 2.69 - samples/sec: 9114.86 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:37:13,062 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:13,062 EPOCH 1 done: loss 3.0616 - lr: 0.000030
2023-10-18 16:37:13,460 DEV : loss 0.8901988863945007 - f1-score (micro avg) 0.0
2023-10-18 16:37:13,464 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:13,722 epoch 2 - iter 12/121 - loss 1.24918228 - time (sec): 0.26 - samples/sec: 9602.14 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:37:13,986 epoch 2 - iter 24/121 - loss 1.18625425 - time (sec): 0.52 - samples/sec: 10018.70 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:37:14,250 epoch 2 - iter 36/121 - loss 1.08567570 - time (sec): 0.79 - samples/sec: 10150.88 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:37:14,509 epoch 2 - iter 48/121 - loss 1.00221450 - time (sec): 1.04 - samples/sec: 10187.16 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:37:14,762 epoch 2 - iter 60/121 - loss 0.96635268 - time (sec): 1.30 - samples/sec: 9676.55 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:37:15,021 epoch 2 - iter 72/121 - loss 0.93612091 - time (sec): 1.56 - samples/sec: 9460.36 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:37:15,284 epoch 2 - iter 84/121 - loss 0.91233165 - time (sec): 1.82 - samples/sec: 9377.43 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:37:15,570 epoch 2 - iter 96/121 - loss 0.89752174 - time (sec): 2.11 - samples/sec: 9341.05 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:37:15,833 epoch 2 - iter 108/121 - loss 0.87281283 - time (sec): 2.37 - samples/sec: 9350.18 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:37:16,120 epoch 2 - iter 120/121 - loss 0.84135366 - time (sec): 2.66 - samples/sec: 9264.18 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:37:16,139 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:16,139 EPOCH 2 done: loss 0.8412 - lr: 0.000027
2023-10-18 16:37:16,554 DEV : loss 0.6392806172370911 - f1-score (micro avg) 0.0
2023-10-18 16:37:16,559 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:16,819 epoch 3 - iter 12/121 - loss 0.65506124 - time (sec): 0.26 - samples/sec: 9540.80 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:37:17,076 epoch 3 - iter 24/121 - loss 0.67407294 - time (sec): 0.52 - samples/sec: 8975.88 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:37:17,340 epoch 3 - iter 36/121 - loss 0.70988086 - time (sec): 0.78 - samples/sec: 8922.45 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:37:17,609 epoch 3 - iter 48/121 - loss 0.69876544 - time (sec): 1.05 - samples/sec: 9045.37 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:37:17,883 epoch 3 - iter 60/121 - loss 0.69988096 - time (sec): 1.32 - samples/sec: 8994.96 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:37:18,176 epoch 3 - iter 72/121 - loss 0.67348692 - time (sec): 1.62 - samples/sec: 9060.80 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:37:18,461 epoch 3 - iter 84/121 - loss 0.66848577 - time (sec): 1.90 - samples/sec: 8985.50 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:37:18,748 epoch 3 - iter 96/121 - loss 0.66604896 - time (sec): 2.19 - samples/sec: 8896.87 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:37:19,033 epoch 3 - iter 108/121 - loss 0.65559393 - time (sec): 2.47 - samples/sec: 8921.85 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:37:19,332 epoch 3 - iter 120/121 - loss 0.65564264 - time (sec): 2.77 - samples/sec: 8849.12 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:37:19,356 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:19,356 EPOCH 3 done: loss 0.6560 - lr: 0.000023
2023-10-18 16:37:19,767 DEV : loss 0.5729600191116333 - f1-score (micro avg) 0.0
2023-10-18 16:37:19,772 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:20,041 epoch 4 - iter 12/121 - loss 0.67795030 - time (sec): 0.27 - samples/sec: 10089.09 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:37:20,308 epoch 4 - iter 24/121 - loss 0.63753020 - time (sec): 0.54 - samples/sec: 9285.86 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:37:20,579 epoch 4 - iter 36/121 - loss 0.63036612 - time (sec): 0.81 - samples/sec: 9042.59 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:37:20,852 epoch 4 - iter 48/121 - loss 0.61503396 - time (sec): 1.08 - samples/sec: 9010.56 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:37:21,127 epoch 4 - iter 60/121 - loss 0.61113920 - time (sec): 1.36 - samples/sec: 9213.20 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:37:21,397 epoch 4 - iter 72/121 - loss 0.60566216 - time (sec): 1.63 - samples/sec: 9134.02 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:37:21,663 epoch 4 - iter 84/121 - loss 0.59600441 - time (sec): 1.89 - samples/sec: 9078.00 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:37:21,939 epoch 4 - iter 96/121 - loss 0.59427862 - time (sec): 2.17 - samples/sec: 9138.26 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:37:22,209 epoch 4 - iter 108/121 - loss 0.60191767 - time (sec): 2.44 - samples/sec: 9099.19 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:37:22,488 epoch 4 - iter 120/121 - loss 0.59792216 - time (sec): 2.72 - samples/sec: 9060.52 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:37:22,507 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:22,507 EPOCH 4 done: loss 0.5972 - lr: 0.000020
2023-10-18 16:37:22,925 DEV : loss 0.4878302812576294 - f1-score (micro avg) 0.0
2023-10-18 16:37:22,929 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:23,190 epoch 5 - iter 12/121 - loss 0.55930794 - time (sec): 0.26 - samples/sec: 8987.35 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:37:23,417 epoch 5 - iter 24/121 - loss 0.57064222 - time (sec): 0.49 - samples/sec: 9799.07 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:37:23,676 epoch 5 - iter 36/121 - loss 0.54446297 - time (sec): 0.75 - samples/sec: 9575.48 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:37:23,954 epoch 5 - iter 48/121 - loss 0.53475011 - time (sec): 1.02 - samples/sec: 9721.46 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:37:24,234 epoch 5 - iter 60/121 - loss 0.54399535 - time (sec): 1.30 - samples/sec: 9633.64 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:37:24,503 epoch 5 - iter 72/121 - loss 0.53901660 - time (sec): 1.57 - samples/sec: 9547.64 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:37:24,773 epoch 5 - iter 84/121 - loss 0.53479758 - time (sec): 1.84 - samples/sec: 9385.47 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:37:25,042 epoch 5 - iter 96/121 - loss 0.53036123 - time (sec): 2.11 - samples/sec: 9407.08 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:37:25,308 epoch 5 - iter 108/121 - loss 0.53524369 - time (sec): 2.38 - samples/sec: 9357.12 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:37:25,582 epoch 5 - iter 120/121 - loss 0.52939372 - time (sec): 2.65 - samples/sec: 9271.87 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:37:25,600 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:25,600 EPOCH 5 done: loss 0.5310 - lr: 0.000017
2023-10-18 16:37:26,020 DEV : loss 0.42453038692474365 - f1-score (micro avg) 0.2495
2023-10-18 16:37:26,024 saving best model
2023-10-18 16:37:26,054 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:26,333 epoch 6 - iter 12/121 - loss 0.47014416 - time (sec): 0.28 - samples/sec: 9218.56 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:37:26,609 epoch 6 - iter 24/121 - loss 0.48371661 - time (sec): 0.55 - samples/sec: 9277.08 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:37:26,874 epoch 6 - iter 36/121 - loss 0.47938174 - time (sec): 0.82 - samples/sec: 9329.95 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:37:27,142 epoch 6 - iter 48/121 - loss 0.48943329 - time (sec): 1.09 - samples/sec: 9350.29 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:37:27,413 epoch 6 - iter 60/121 - loss 0.47826806 - time (sec): 1.36 - samples/sec: 9329.89 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:37:27,690 epoch 6 - iter 72/121 - loss 0.47219903 - time (sec): 1.63 - samples/sec: 9269.02 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:37:27,955 epoch 6 - iter 84/121 - loss 0.46913442 - time (sec): 1.90 - samples/sec: 9137.79 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:37:28,230 epoch 6 - iter 96/121 - loss 0.48002819 - time (sec): 2.18 - samples/sec: 9105.99 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:37:28,517 epoch 6 - iter 108/121 - loss 0.48759090 - time (sec): 2.46 - samples/sec: 9057.06 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:37:28,779 epoch 6 - iter 120/121 - loss 0.48404846 - time (sec): 2.72 - samples/sec: 9033.51 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:37:28,798 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:28,798 EPOCH 6 done: loss 0.4821 - lr: 0.000013
2023-10-18 16:37:29,224 DEV : loss 0.3911217451095581 - f1-score (micro avg) 0.3772
2023-10-18 16:37:29,229 saving best model
2023-10-18 16:37:29,270 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:29,542 epoch 7 - iter 12/121 - loss 0.43796750 - time (sec): 0.27 - samples/sec: 10272.77 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:37:29,808 epoch 7 - iter 24/121 - loss 0.44618841 - time (sec): 0.54 - samples/sec: 9812.14 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:37:30,087 epoch 7 - iter 36/121 - loss 0.44995230 - time (sec): 0.82 - samples/sec: 9382.21 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:37:30,354 epoch 7 - iter 48/121 - loss 0.46082776 - time (sec): 1.08 - samples/sec: 9179.47 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:37:30,629 epoch 7 - iter 60/121 - loss 0.46132547 - time (sec): 1.36 - samples/sec: 9012.35 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:37:30,895 epoch 7 - iter 72/121 - loss 0.45612570 - time (sec): 1.62 - samples/sec: 8974.32 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:37:31,177 epoch 7 - iter 84/121 - loss 0.45907753 - time (sec): 1.91 - samples/sec: 8961.68 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:37:31,437 epoch 7 - iter 96/121 - loss 0.45990670 - time (sec): 2.17 - samples/sec: 9045.83 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:37:31,704 epoch 7 - iter 108/121 - loss 0.46245562 - time (sec): 2.43 - samples/sec: 9068.47 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:37:31,965 epoch 7 - iter 120/121 - loss 0.46073503 - time (sec): 2.70 - samples/sec: 9125.68 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:37:31,983 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:31,983 EPOCH 7 done: loss 0.4611 - lr: 0.000010
2023-10-18 16:37:32,401 DEV : loss 0.367324560880661 - f1-score (micro avg) 0.4359
2023-10-18 16:37:32,405 saving best model
2023-10-18 16:37:32,440 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:32,716 epoch 8 - iter 12/121 - loss 0.58329257 - time (sec): 0.28 - samples/sec: 9925.29 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:37:32,967 epoch 8 - iter 24/121 - loss 0.49244715 - time (sec): 0.53 - samples/sec: 9450.43 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:37:33,232 epoch 8 - iter 36/121 - loss 0.46752021 - time (sec): 0.79 - samples/sec: 9373.41 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:37:33,492 epoch 8 - iter 48/121 - loss 0.44705756 - time (sec): 1.05 - samples/sec: 9475.98 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:37:33,767 epoch 8 - iter 60/121 - loss 0.44367698 - time (sec): 1.33 - samples/sec: 9467.50 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:37:34,025 epoch 8 - iter 72/121 - loss 0.43995896 - time (sec): 1.58 - samples/sec: 9341.63 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:37:34,297 epoch 8 - iter 84/121 - loss 0.43460896 - time (sec): 1.86 - samples/sec: 9304.63 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:37:34,559 epoch 8 - iter 96/121 - loss 0.43632047 - time (sec): 2.12 - samples/sec: 9361.30 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:37:34,826 epoch 8 - iter 108/121 - loss 0.44408811 - time (sec): 2.39 - samples/sec: 9375.39 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:37:35,080 epoch 8 - iter 120/121 - loss 0.44238220 - time (sec): 2.64 - samples/sec: 9333.91 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:37:35,100 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:35,100 EPOCH 8 done: loss 0.4417 - lr: 0.000007
2023-10-18 16:37:35,537 DEV : loss 0.35510483384132385 - f1-score (micro avg) 0.4573
2023-10-18 16:37:35,542 saving best model
2023-10-18 16:37:35,579 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:35,866 epoch 9 - iter 12/121 - loss 0.48007808 - time (sec): 0.29 - samples/sec: 8257.07 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:37:36,143 epoch 9 - iter 24/121 - loss 0.45715634 - time (sec): 0.56 - samples/sec: 8307.93 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:37:36,433 epoch 9 - iter 36/121 - loss 0.44856353 - time (sec): 0.85 - samples/sec: 8518.03 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:37:36,685 epoch 9 - iter 48/121 - loss 0.43417257 - time (sec): 1.11 - samples/sec: 8654.16 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:37:36,969 epoch 9 - iter 60/121 - loss 0.43697026 - time (sec): 1.39 - samples/sec: 8679.61 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:37:37,242 epoch 9 - iter 72/121 - loss 0.42932000 - time (sec): 1.66 - samples/sec: 8799.54 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:37:37,513 epoch 9 - iter 84/121 - loss 0.43959700 - time (sec): 1.93 - samples/sec: 8849.40 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:37:37,777 epoch 9 - iter 96/121 - loss 0.43591501 - time (sec): 2.20 - samples/sec: 8898.91 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:37:38,047 epoch 9 - iter 108/121 - loss 0.43011628 - time (sec): 2.47 - samples/sec: 8975.11 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:37:38,323 epoch 9 - iter 120/121 - loss 0.42551016 - time (sec): 2.74 - samples/sec: 8973.30 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:37:38,343 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:38,343 EPOCH 9 done: loss 0.4272 - lr: 0.000004
2023-10-18 16:37:38,773 DEV : loss 0.3516523838043213 - f1-score (micro avg) 0.4615
2023-10-18 16:37:38,777 saving best model
2023-10-18 16:37:38,815 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:39,069 epoch 10 - iter 12/121 - loss 0.43269963 - time (sec): 0.25 - samples/sec: 8269.27 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:37:39,340 epoch 10 - iter 24/121 - loss 0.43360888 - time (sec): 0.52 - samples/sec: 8630.86 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:37:39,605 epoch 10 - iter 36/121 - loss 0.42821907 - time (sec): 0.79 - samples/sec: 8866.28 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:37:39,891 epoch 10 - iter 48/121 - loss 0.43666661 - time (sec): 1.07 - samples/sec: 8924.94 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:37:40,165 epoch 10 - iter 60/121 - loss 0.42576494 - time (sec): 1.35 - samples/sec: 8963.72 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:37:40,441 epoch 10 - iter 72/121 - loss 0.41673630 - time (sec): 1.63 - samples/sec: 9017.34 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:37:40,710 epoch 10 - iter 84/121 - loss 0.43130808 - time (sec): 1.89 - samples/sec: 9017.85 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:37:40,971 epoch 10 - iter 96/121 - loss 0.43872535 - time (sec): 2.15 - samples/sec: 9065.70 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:37:41,234 epoch 10 - iter 108/121 - loss 0.43176197 - time (sec): 2.42 - samples/sec: 9085.36 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:37:41,505 epoch 10 - iter 120/121 - loss 0.42563670 - time (sec): 2.69 - samples/sec: 9174.29 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:37:41,524 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:41,524 EPOCH 10 done: loss 0.4254 - lr: 0.000000
2023-10-18 16:37:41,958 DEV : loss 0.34855473041534424 - f1-score (micro avg) 0.4695
2023-10-18 16:37:41,963 saving best model
2023-10-18 16:37:42,024 ----------------------------------------------------------------------------------------------------
2023-10-18 16:37:42,024 Loading model from best epoch ...
2023-10-18 16:37:42,104 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 16:37:42,523 Results:
- F-score (micro) 0.4185
- F-score (macro) 0.1967
- Accuracy 0.2735

By class:
              precision    recall  f1-score   support

       scope     0.4056    0.4496    0.4265       129
        pers     0.5935    0.5252    0.5573       139
        work     0.0000    0.0000    0.0000        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4925    0.3639    0.4185       360
   macro avg     0.1998    0.1950    0.1967       360
weighted avg     0.3745    0.3639    0.3680       360

2023-10-18 16:37:42,524 ----------------------------------------------------------------------------------------------------
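As a sanity check on the final evaluation table, the sketch below (plain Python, no Flair dependency) recovers the micro and macro averages from the per-class rows in the log. The per-class precision/recall/f1/support values are copied verbatim from the table; the true-positive and prediction counts are inferred from them, under the assumption that the zero-precision classes (work, loc, date) produced no predictions, which is consistent with the reported micro precision.

```python
# Per-class rows from the final evaluation table: label -> (precision, recall, f1, support)
per_class = {
    "scope": (0.4056, 0.4496, 0.4265, 129),
    "pers":  (0.5935, 0.5252, 0.5573, 139),
    "work":  (0.0,    0.0,    0.0,     80),
    "loc":   (0.0,    0.0,    0.0,      9),
    "date":  (0.0,    0.0,    0.0,      3),
}

# Recover integer counts: TP = recall * support; predictions = TP / precision.
tp = sum(round(r * s) for p, r, f, s in per_class.values())
predicted = sum(round(round(r * s) / p) for p, r, f, s in per_class.values() if p > 0)
support = sum(s for p, r, f, s in per_class.values())

# Micro averages pool the counts over all classes; macro f1 is the unweighted mean.
micro_p = tp / predicted                                  # 131 / 266 ≈ 0.4925
micro_r = tp / support                                    # 131 / 360 ≈ 0.3639
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)    # ≈ 0.4185
macro_f1 = sum(f for p, r, f, s in per_class.values()) / len(per_class)  # ≈ 0.1967
```

The recovered counts (131 true positives out of 266 predictions over 360 gold entities) reproduce the logged micro avg row to four decimal places, confirming the table is internally consistent.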