2023-10-18 16:10:45,266 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:45,266 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:10:45,266 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:45,266 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-18 16:10:45,266 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:45,266 Train: 1214 sentences
2023-10-18 16:10:45,266 (train_with_dev=False, train_with_test=False)
2023-10-18 16:10:45,266 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:45,267 Training Params:
2023-10-18 16:10:45,267 - learning_rate: "3e-05"
2023-10-18 16:10:45,267 - mini_batch_size: "8"
2023-10-18 16:10:45,267 - max_epochs: "10"
2023-10-18 16:10:45,267 - shuffle: "True"
2023-10-18 16:10:45,267 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:45,267 Plugins:
2023-10-18 16:10:45,267 - TensorboardLogger
2023-10-18 16:10:45,267 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:10:45,267 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:45,267 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:10:45,267 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:10:45,267 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:45,267 Computation:
2023-10-18 16:10:45,267 - compute on device: cuda:0
2023-10-18 16:10:45,267 - embedding storage: none
2023-10-18 16:10:45,267 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:45,267 Model training base path: "hmbench-ajmc/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 16:10:45,267 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:45,267 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:45,267 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:10:45,643 epoch 1 - iter 15/152 - loss 4.00194862 - time (sec): 0.38 - samples/sec: 8872.39 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:10:46,007 epoch 1 - iter 30/152 - loss 3.93032449 - time (sec): 0.74 - samples/sec: 8857.31 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:10:46,391 epoch 1 - iter 45/152 - loss 3.90027579 - time (sec): 1.12 - samples/sec: 8845.87 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:10:46,761 epoch 1 - iter 60/152 - loss 3.82549513 - time (sec): 1.49 - samples/sec: 8629.62 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:10:47,105 epoch 1 - iter 75/152 - loss 3.71965164 - time (sec): 1.84 - samples/sec: 8711.14 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:10:47,450 epoch 1 - iter 90/152 - loss 3.58454282 - time (sec): 2.18 - samples/sec: 8739.34 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:10:47,795 epoch 1 - iter 105/152 - loss 3.43789527 - time (sec): 2.53 - samples/sec: 8731.14 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:10:48,129 epoch 1 - iter 120/152 - loss 3.30319039 - time (sec): 2.86 - samples/sec: 8678.74 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:10:48,470 epoch 1 - iter 135/152 - loss 3.11978823 - time (sec): 3.20 - samples/sec: 8750.76 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:10:48,824 epoch 1 - iter 150/152 - loss 2.97198898 - time (sec): 3.56 - samples/sec: 8635.08 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:10:48,872 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:48,873 EPOCH 1 done: loss 2.9587 - lr: 0.000029
2023-10-18 16:10:49,350 DEV : loss 0.8460401296615601 - f1-score (micro avg) 0.0
2023-10-18 16:10:49,356 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:49,733 epoch 2 - iter 15/152 - loss 0.99039188 - time (sec): 0.38 - samples/sec: 8243.63 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:10:50,111 epoch 2 - iter 30/152 - loss 0.89924649 - time (sec): 0.75 - samples/sec: 8463.66 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:10:50,447 epoch 2 - iter 45/152 - loss 0.89484187 - time (sec): 1.09 - samples/sec: 8546.18 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:10:50,772 epoch 2 - iter 60/152 - loss 0.90047475 - time (sec): 1.42 - samples/sec: 8772.68 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:10:51,082 epoch 2 - iter 75/152 - loss 0.85006453 - time (sec): 1.73 - samples/sec: 8869.57 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:10:51,408 epoch 2 - iter 90/152 - loss 0.85261376 - time (sec): 2.05 - samples/sec: 8921.11 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:10:51,737 epoch 2 - iter 105/152 - loss 0.84452527 - time (sec): 2.38 - samples/sec: 9077.56 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:10:52,067 epoch 2 - iter 120/152 - loss 0.84136420 - time (sec): 2.71 - samples/sec: 9127.68 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:10:52,389 epoch 2 - iter 135/152 - loss 0.82452902 - time (sec): 3.03 - samples/sec: 9111.12 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:10:52,713 epoch 2 - iter 150/152 - loss 0.82874038 - time (sec): 3.36 - samples/sec: 9147.55 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:10:52,755 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:52,755 EPOCH 2 done: loss 0.8277 - lr: 0.000027
2023-10-18 16:10:53,250 DEV : loss 0.6869751214981079 - f1-score (micro avg) 0.0
2023-10-18 16:10:53,256 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:53,594 epoch 3 - iter 15/152 - loss 0.71007861 - time (sec): 0.34 - samples/sec: 8663.36 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:10:53,915 epoch 3 - iter 30/152 - loss 0.72173453 - time (sec): 0.66 - samples/sec: 9286.33 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:10:54,243 epoch 3 - iter 45/152 - loss 0.71552143 - time (sec): 0.99 - samples/sec: 9286.50 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:10:54,580 epoch 3 - iter 60/152 - loss 0.69879105 - time (sec): 1.32 - samples/sec: 9501.57 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:10:54,910 epoch 3 - iter 75/152 - loss 0.66378317 - time (sec): 1.65 - samples/sec: 9334.83 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:10:55,239 epoch 3 - iter 90/152 - loss 0.64699215 - time (sec): 1.98 - samples/sec: 9166.23 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:10:55,579 epoch 3 - iter 105/152 - loss 0.63360388 - time (sec): 2.32 - samples/sec: 9073.35 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:10:55,926 epoch 3 - iter 120/152 - loss 0.63428746 - time (sec): 2.67 - samples/sec: 9078.16 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:10:56,278 epoch 3 - iter 135/152 - loss 0.63744324 - time (sec): 3.02 - samples/sec: 9102.45 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:10:56,627 epoch 3 - iter 150/152 - loss 0.62855569 - time (sec): 3.37 - samples/sec: 9084.64 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:10:56,672 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:56,672 EPOCH 3 done: loss 0.6253 - lr: 0.000023
2023-10-18 16:10:57,159 DEV : loss 0.5135114789009094 - f1-score (micro avg) 0.0181
2023-10-18 16:10:57,164 saving best model
2023-10-18 16:10:57,197 ----------------------------------------------------------------------------------------------------
2023-10-18 16:10:57,464 epoch 4 - iter 15/152 - loss 0.59486776 - time (sec): 0.27 - samples/sec: 10646.48 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:10:57,733 epoch 4 - iter 30/152 - loss 0.62103914 - time (sec): 0.54 - samples/sec: 11352.89 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:10:57,992 epoch 4 - iter 45/152 - loss 0.59834986 - time (sec): 0.79 - samples/sec: 11531.11 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:10:58,252 epoch 4 - iter 60/152 - loss 0.58045269 - time (sec): 1.05 - samples/sec: 11638.12 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:10:58,513 epoch 4 - iter 75/152 - loss 0.57649671 - time (sec): 1.32 - samples/sec: 11623.58 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:10:58,774 epoch 4 - iter 90/152 - loss 0.56525450 - time (sec): 1.58 - samples/sec: 11620.33 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:10:59,055 epoch 4 - iter 105/152 - loss 0.55064825 - time (sec): 1.86 - samples/sec: 11574.46 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:10:59,336 epoch 4 - iter 120/152 - loss 0.53920561 - time (sec): 2.14 - samples/sec: 11569.34 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:10:59,655 epoch 4 - iter 135/152 - loss 0.53041123 - time (sec): 2.46 - samples/sec: 11269.83 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:10:59,990 epoch 4 - iter 150/152 - loss 0.52477527 - time (sec): 2.79 - samples/sec: 10967.30 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:11:00,034 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:00,034 EPOCH 4 done: loss 0.5230 - lr: 0.000020
2023-10-18 16:11:00,549 DEV : loss 0.42890670895576477 - f1-score (micro avg) 0.2158
2023-10-18 16:11:00,554 saving best model
2023-10-18 16:11:00,587 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:00,913 epoch 5 - iter 15/152 - loss 0.42012484 - time (sec): 0.33 - samples/sec: 9828.59 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:11:01,250 epoch 5 - iter 30/152 - loss 0.45240271 - time (sec): 0.66 - samples/sec: 9358.30 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:11:01,575 epoch 5 - iter 45/152 - loss 0.44060881 - time (sec): 0.99 - samples/sec: 9068.59 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:11:01,913 epoch 5 - iter 60/152 - loss 0.48649653 - time (sec): 1.33 - samples/sec: 8980.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:11:02,245 epoch 5 - iter 75/152 - loss 0.48018059 - time (sec): 1.66 - samples/sec: 9075.73 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:11:02,572 epoch 5 - iter 90/152 - loss 0.47988015 - time (sec): 1.98 - samples/sec: 9281.66 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:11:02,886 epoch 5 - iter 105/152 - loss 0.47894319 - time (sec): 2.30 - samples/sec: 9371.53 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:11:03,203 epoch 5 - iter 120/152 - loss 0.47141827 - time (sec): 2.62 - samples/sec: 9356.51 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:11:03,541 epoch 5 - iter 135/152 - loss 0.46962915 - time (sec): 2.95 - samples/sec: 9355.58 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:11:03,858 epoch 5 - iter 150/152 - loss 0.45784441 - time (sec): 3.27 - samples/sec: 9367.37 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:11:03,896 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:03,897 EPOCH 5 done: loss 0.4599 - lr: 0.000017
2023-10-18 16:11:04,412 DEV : loss 0.38793084025382996 - f1-score (micro avg) 0.2661
2023-10-18 16:11:04,417 saving best model
2023-10-18 16:11:04,449 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:04,780 epoch 6 - iter 15/152 - loss 0.38728378 - time (sec): 0.33 - samples/sec: 8716.49 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:11:05,103 epoch 6 - iter 30/152 - loss 0.43917481 - time (sec): 0.65 - samples/sec: 9334.42 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:11:05,419 epoch 6 - iter 45/152 - loss 0.42210125 - time (sec): 0.97 - samples/sec: 9261.61 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:11:05,739 epoch 6 - iter 60/152 - loss 0.43358170 - time (sec): 1.29 - samples/sec: 9215.79 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:11:06,062 epoch 6 - iter 75/152 - loss 0.43677357 - time (sec): 1.61 - samples/sec: 9142.77 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:11:06,400 epoch 6 - iter 90/152 - loss 0.43060696 - time (sec): 1.95 - samples/sec: 9191.67 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:11:06,719 epoch 6 - iter 105/152 - loss 0.42919675 - time (sec): 2.27 - samples/sec: 9261.38 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:11:07,046 epoch 6 - iter 120/152 - loss 0.41978387 - time (sec): 2.60 - samples/sec: 9343.64 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:11:07,399 epoch 6 - iter 135/152 - loss 0.42473343 - time (sec): 2.95 - samples/sec: 9277.22 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:11:07,750 epoch 6 - iter 150/152 - loss 0.42455247 - time (sec): 3.30 - samples/sec: 9270.99 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:11:07,792 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:07,792 EPOCH 6 done: loss 0.4240 - lr: 0.000013
2023-10-18 16:11:08,294 DEV : loss 0.36627092957496643 - f1-score (micro avg) 0.3059
2023-10-18 16:11:08,299 saving best model
2023-10-18 16:11:08,332 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:08,684 epoch 7 - iter 15/152 - loss 0.42443388 - time (sec): 0.35 - samples/sec: 9353.96 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:11:09,013 epoch 7 - iter 30/152 - loss 0.43107976 - time (sec): 0.68 - samples/sec: 9315.62 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:11:09,355 epoch 7 - iter 45/152 - loss 0.43213838 - time (sec): 1.02 - samples/sec: 9195.84 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:11:09,677 epoch 7 - iter 60/152 - loss 0.43004851 - time (sec): 1.34 - samples/sec: 9259.06 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:11:10,021 epoch 7 - iter 75/152 - loss 0.43143165 - time (sec): 1.69 - samples/sec: 9178.06 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:11:10,343 epoch 7 - iter 90/152 - loss 0.42030296 - time (sec): 2.01 - samples/sec: 9080.88 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:11:10,670 epoch 7 - iter 105/152 - loss 0.41714489 - time (sec): 2.34 - samples/sec: 9168.85 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:11:10,989 epoch 7 - iter 120/152 - loss 0.41080530 - time (sec): 2.66 - samples/sec: 9172.11 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:11:11,304 epoch 7 - iter 135/152 - loss 0.40597047 - time (sec): 2.97 - samples/sec: 9266.47 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:11:11,630 epoch 7 - iter 150/152 - loss 0.40237893 - time (sec): 3.30 - samples/sec: 9282.50 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:11:11,674 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:11,674 EPOCH 7 done: loss 0.3989 - lr: 0.000010
2023-10-18 16:11:12,189 DEV : loss 0.34827205538749695 - f1-score (micro avg) 0.3443
2023-10-18 16:11:12,195 saving best model
2023-10-18 16:11:12,228 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:12,561 epoch 8 - iter 15/152 - loss 0.39763025 - time (sec): 0.33 - samples/sec: 9434.85 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:11:12,887 epoch 8 - iter 30/152 - loss 0.39423381 - time (sec): 0.66 - samples/sec: 9268.55 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:11:13,213 epoch 8 - iter 45/152 - loss 0.40063428 - time (sec): 0.98 - samples/sec: 9205.29 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:11:13,548 epoch 8 - iter 60/152 - loss 0.39201927 - time (sec): 1.32 - samples/sec: 9205.39 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:11:13,904 epoch 8 - iter 75/152 - loss 0.39276233 - time (sec): 1.67 - samples/sec: 9062.29 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:11:14,257 epoch 8 - iter 90/152 - loss 0.39413049 - time (sec): 2.03 - samples/sec: 8991.94 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:11:14,598 epoch 8 - iter 105/152 - loss 0.39315056 - time (sec): 2.37 - samples/sec: 9030.40 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:11:14,938 epoch 8 - iter 120/152 - loss 0.39718127 - time (sec): 2.71 - samples/sec: 9105.15 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:11:15,275 epoch 8 - iter 135/152 - loss 0.39900868 - time (sec): 3.05 - samples/sec: 9122.55 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:11:15,604 epoch 8 - iter 150/152 - loss 0.38543416 - time (sec): 3.37 - samples/sec: 9100.19 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:11:15,642 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:15,643 EPOCH 8 done: loss 0.3853 - lr: 0.000007
2023-10-18 16:11:16,160 DEV : loss 0.3415715992450714 - f1-score (micro avg) 0.3607
2023-10-18 16:11:16,166 saving best model
2023-10-18 16:11:16,198 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:16,550 epoch 9 - iter 15/152 - loss 0.37449454 - time (sec): 0.35 - samples/sec: 9501.05 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:11:16,873 epoch 9 - iter 30/152 - loss 0.35152597 - time (sec): 0.67 - samples/sec: 9574.63 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:11:17,192 epoch 9 - iter 45/152 - loss 0.36547659 - time (sec): 0.99 - samples/sec: 9754.90 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:11:17,530 epoch 9 - iter 60/152 - loss 0.35377707 - time (sec): 1.33 - samples/sec: 9596.35 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:11:17,857 epoch 9 - iter 75/152 - loss 0.37707230 - time (sec): 1.66 - samples/sec: 9577.50 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:11:18,184 epoch 9 - iter 90/152 - loss 0.38457408 - time (sec): 1.98 - samples/sec: 9608.64 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:11:18,499 epoch 9 - iter 105/152 - loss 0.38246205 - time (sec): 2.30 - samples/sec: 9499.41 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:11:18,821 epoch 9 - iter 120/152 - loss 0.37841646 - time (sec): 2.62 - samples/sec: 9440.01 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:11:19,157 epoch 9 - iter 135/152 - loss 0.38219358 - time (sec): 2.96 - samples/sec: 9318.53 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:11:19,474 epoch 9 - iter 150/152 - loss 0.37840206 - time (sec): 3.27 - samples/sec: 9363.98 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:11:19,513 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:19,513 EPOCH 9 done: loss 0.3758 - lr: 0.000004
2023-10-18 16:11:20,019 DEV : loss 0.33774086833000183 - f1-score (micro avg) 0.3705
2023-10-18 16:11:20,025 saving best model
2023-10-18 16:11:20,058 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:20,382 epoch 10 - iter 15/152 - loss 0.29312128 - time (sec): 0.32 - samples/sec: 9094.57 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:11:20,702 epoch 10 - iter 30/152 - loss 0.34740243 - time (sec): 0.64 - samples/sec: 9449.94 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:11:21,037 epoch 10 - iter 45/152 - loss 0.33882855 - time (sec): 0.98 - samples/sec: 9418.48 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:11:21,363 epoch 10 - iter 60/152 - loss 0.34352526 - time (sec): 1.30 - samples/sec: 9374.66 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:11:21,701 epoch 10 - iter 75/152 - loss 0.34753687 - time (sec): 1.64 - samples/sec: 9215.78 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:11:22,038 epoch 10 - iter 90/152 - loss 0.35076983 - time (sec): 1.98 - samples/sec: 9267.92 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:11:22,380 epoch 10 - iter 105/152 - loss 0.35893496 - time (sec): 2.32 - samples/sec: 9208.24 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:11:22,721 epoch 10 - iter 120/152 - loss 0.35889586 - time (sec): 2.66 - samples/sec: 9224.37 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:11:23,061 epoch 10 - iter 135/152 - loss 0.35917971 - time (sec): 3.00 - samples/sec: 9205.51 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:11:23,397 epoch 10 - iter 150/152 - loss 0.36218218 - time (sec): 3.34 - samples/sec: 9180.46 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:11:23,438 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:23,438 EPOCH 10 done: loss 0.3616 - lr: 0.000000
2023-10-18 16:11:23,957 DEV : loss 0.33346304297447205 - f1-score (micro avg) 0.3808
2023-10-18 16:11:23,963 saving best model
2023-10-18 16:11:24,024 ----------------------------------------------------------------------------------------------------
2023-10-18 16:11:24,025 Loading model from best epoch ...
2023-10-18 16:11:24,104 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-18 16:11:24,588 Results:
- F-score (micro) 0.3636
- F-score (macro) 0.2306
- Accuracy 0.2285

By class:
              precision    recall  f1-score   support

       scope     0.3536    0.4238    0.3855       151
        work     0.1533    0.2421    0.1878        95
        pers     0.6375    0.5312    0.5795        96
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.3358    0.3966    0.3636       348
   macro avg     0.2289    0.2394    0.2306       348
weighted avg     0.3711    0.3966    0.3784       348

2023-10-18 16:11:24,589 ----------------------------------------------------------------------------------------------------