2023-10-18 22:43:26,205 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:26,205 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:43:26,205 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:26,205 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:43:26,205 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:26,205 Train: 5777 sentences
2023-10-18 22:43:26,205 (train_with_dev=False, train_with_test=False)
2023-10-18 22:43:26,205 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:26,205 Training Params:
2023-10-18 22:43:26,205  - learning_rate: "5e-05"
2023-10-18 22:43:26,205  - mini_batch_size: "8"
2023-10-18 22:43:26,205  - max_epochs: "10"
2023-10-18 22:43:26,205  - shuffle: "True"
2023-10-18 22:43:26,205 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:26,205 Plugins:
2023-10-18 22:43:26,206  - TensorboardLogger
2023-10-18 22:43:26,206  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:43:26,206 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:26,206 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:43:26,206  - metric: "('micro avg', 'f1-score')"
2023-10-18 22:43:26,206 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:26,206 Computation:
2023-10-18 22:43:26,206  - compute on device: cuda:0
2023-10-18 22:43:26,206  - embedding storage: none
2023-10-18 22:43:26,206 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:26,206 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 22:43:26,206 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:26,206 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:26,206 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:43:27,760 epoch 1 - iter 72/723 - loss 2.94746003 - time (sec): 1.55 - samples/sec: 11479.97 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:43:29,590 epoch 1 - iter 144/723 - loss 2.63223089 - time (sec): 3.38 - samples/sec: 10690.87 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:43:31,387 epoch 1 - iter 216/723 - loss 2.22712743 - time (sec): 5.18 - samples/sec: 10302.01 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:43:33,149 epoch 1 - iter 288/723 - loss 1.81640375 - time (sec): 6.94 - samples/sec: 10247.87 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:43:34,912 epoch 1 - iter 360/723 - loss 1.51715194 - time (sec): 8.71 - samples/sec: 10274.19 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:43:36,675 epoch 1 - iter 432/723 - loss 1.32254370 - time (sec): 10.47 - samples/sec: 10224.03 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:43:38,449 epoch 1 - iter 504/723 - loss 1.17279273 - time (sec): 12.24 - samples/sec: 10228.14 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:43:40,193 epoch 1 - iter 576/723 - loss 1.06826804 - time (sec): 13.99 - samples/sec: 10182.37 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:43:41,921 epoch 1 - iter 648/723 - loss 0.98793177 - time (sec): 15.71 - samples/sec: 10124.74 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:43:43,652 epoch 1 - iter 720/723 - loss 0.91913895 - time (sec): 17.45 - samples/sec: 10077.21 - lr: 0.000050 - momentum: 0.000000
2023-10-18 22:43:43,709 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:43,709 EPOCH 1 done: loss 0.9178 - lr: 0.000050
2023-10-18 22:43:44,970 DEV : loss 0.3095312714576721 - f1-score (micro avg) 0.0103
2023-10-18 22:43:44,984 saving best model
2023-10-18 22:43:45,015 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:46,740 epoch 2 - iter 72/723 - loss 0.23038443 - time (sec): 1.72 - samples/sec: 9816.59 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:43:48,472 epoch 2 - iter 144/723 - loss 0.22913658 - time (sec): 3.46 - samples/sec: 9990.04 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:43:50,329 epoch 2 - iter 216/723 - loss 0.21771951 - time (sec): 5.31 - samples/sec: 9885.36 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:43:52,078 epoch 2 - iter 288/723 - loss 0.22092270 - time (sec): 7.06 - samples/sec: 9815.71 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:43:53,843 epoch 2 - iter 360/723 - loss 0.21669715 - time (sec): 8.83 - samples/sec: 9797.59 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:43:55,641 epoch 2 - iter 432/723 - loss 0.21121141 - time (sec): 10.63 - samples/sec: 9825.29 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:43:57,429 epoch 2 - iter 504/723 - loss 0.21097921 - time (sec): 12.41 - samples/sec: 9841.68 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:43:59,222 epoch 2 - iter 576/723 - loss 0.21002392 - time (sec): 14.21 - samples/sec: 9936.06 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:44:00,969 epoch 2 - iter 648/723 - loss 0.20469560 - time (sec): 15.95 - samples/sec: 9954.58 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:44:02,759 epoch 2 - iter 720/723 - loss 0.20751716 - time (sec): 17.74 - samples/sec: 9900.49 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:44:02,823 ----------------------------------------------------------------------------------------------------
2023-10-18 22:44:02,823 EPOCH 2 done: loss 0.2076 - lr: 0.000044
2023-10-18 22:44:04,913 DEV : loss 0.22874997556209564 - f1-score (micro avg) 0.3226
2023-10-18 22:44:04,927 saving best model
2023-10-18 22:44:04,963 ----------------------------------------------------------------------------------------------------
2023-10-18 22:44:06,681 epoch 3 - iter 72/723 - loss 0.19702431 - time (sec): 1.72 - samples/sec: 9996.53 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:44:08,443 epoch 3 - iter 144/723 - loss 0.17881968 - time (sec): 3.48 - samples/sec: 10130.79 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:44:10,191 epoch 3 - iter 216/723 - loss 0.17487318 - time (sec): 5.23 - samples/sec: 10106.32 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:44:11,999 epoch 3 - iter 288/723 - loss 0.17461682 - time (sec): 7.03 - samples/sec: 10051.48 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:44:13,731 epoch 3 - iter 360/723 - loss 0.17716236 - time (sec): 8.77 - samples/sec: 9860.01 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:44:15,479 epoch 3 - iter 432/723 - loss 0.17710955 - time (sec): 10.52 - samples/sec: 9924.64 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:44:17,337 epoch 3 - iter 504/723 - loss 0.17624132 - time (sec): 12.37 - samples/sec: 9914.86 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:44:19,041 epoch 3 - iter 576/723 - loss 0.17758052 - time (sec): 14.08 - samples/sec: 9885.75 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:44:20,885 epoch 3 - iter 648/723 - loss 0.17810054 - time (sec): 15.92 - samples/sec: 9918.69 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:44:22,686 epoch 3 - iter 720/723 - loss 0.17299620 - time (sec): 17.72 - samples/sec: 9915.02 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:44:22,744 ----------------------------------------------------------------------------------------------------
2023-10-18 22:44:22,744 EPOCH 3 done: loss 0.1729 - lr: 0.000039
2023-10-18 22:44:24,496 DEV : loss 0.1987297236919403 - f1-score (micro avg) 0.4341
2023-10-18 22:44:24,511 saving best model
2023-10-18 22:44:24,547 ----------------------------------------------------------------------------------------------------
2023-10-18 22:44:26,337 epoch 4 - iter 72/723 - loss 0.17048445 - time (sec): 1.79 - samples/sec: 9924.04 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:44:28,064 epoch 4 - iter 144/723 - loss 0.16489378 - time (sec): 3.52 - samples/sec: 9691.52 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:44:29,826 epoch 4 - iter 216/723 - loss 0.16847935 - time (sec): 5.28 - samples/sec: 9868.72 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:44:31,639 epoch 4 - iter 288/723 - loss 0.16098872 - time (sec): 7.09 - samples/sec: 9874.07 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:44:33,422 epoch 4 - iter 360/723 - loss 0.15924263 - time (sec): 8.87 - samples/sec: 9975.87 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:44:35,582 epoch 4 - iter 432/723 - loss 0.15846004 - time (sec): 11.04 - samples/sec: 9631.71 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:44:37,332 epoch 4 - iter 504/723 - loss 0.15818452 - time (sec): 12.78 - samples/sec: 9620.61 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:44:39,105 epoch 4 - iter 576/723 - loss 0.15964872 - time (sec): 14.56 - samples/sec: 9659.36 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:44:40,775 epoch 4 - iter 648/723 - loss 0.15875186 - time (sec): 16.23 - samples/sec: 9759.65 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:44:42,510 epoch 4 - iter 720/723 - loss 0.15791226 - time (sec): 17.96 - samples/sec: 9779.75 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:44:42,579 ----------------------------------------------------------------------------------------------------
2023-10-18 22:44:42,579 EPOCH 4 done: loss 0.1577 - lr: 0.000033
2023-10-18 22:44:44,343 DEV : loss 0.1854134351015091 - f1-score (micro avg) 0.4725
2023-10-18 22:44:44,358 saving best model
2023-10-18 22:44:44,393 ----------------------------------------------------------------------------------------------------
2023-10-18 22:44:46,147 epoch 5 - iter 72/723 - loss 0.15250101 - time (sec): 1.75 - samples/sec: 9698.80 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:44:47,950 epoch 5 - iter 144/723 - loss 0.14050617 - time (sec): 3.56 - samples/sec: 9665.70 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:44:49,689 epoch 5 - iter 216/723 - loss 0.14132619 - time (sec): 5.30 - samples/sec: 9760.16 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:44:51,437 epoch 5 - iter 288/723 - loss 0.14512759 - time (sec): 7.04 - samples/sec: 9740.23 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:44:53,267 epoch 5 - iter 360/723 - loss 0.14365984 - time (sec): 8.87 - samples/sec: 9847.64 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:44:55,063 epoch 5 - iter 432/723 - loss 0.14185439 - time (sec): 10.67 - samples/sec: 9917.51 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:44:56,764 epoch 5 - iter 504/723 - loss 0.14221405 - time (sec): 12.37 - samples/sec: 9966.98 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:44:58,584 epoch 5 - iter 576/723 - loss 0.14589706 - time (sec): 14.19 - samples/sec: 9975.33 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:45:00,273 epoch 5 - iter 648/723 - loss 0.14657905 - time (sec): 15.88 - samples/sec: 9945.36 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:45:02,031 epoch 5 - iter 720/723 - loss 0.14462396 - time (sec): 17.64 - samples/sec: 9949.50 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:45:02,097 ----------------------------------------------------------------------------------------------------
2023-10-18 22:45:02,097 EPOCH 5 done: loss 0.1448 - lr: 0.000028
2023-10-18 22:45:03,883 DEV : loss 0.19970497488975525 - f1-score (micro avg) 0.4683
2023-10-18 22:45:03,898 ----------------------------------------------------------------------------------------------------
2023-10-18 22:45:05,529 epoch 6 - iter 72/723 - loss 0.13300970 - time (sec): 1.63 - samples/sec: 10306.00 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:45:07,306 epoch 6 - iter 144/723 - loss 0.13427013 - time (sec): 3.41 - samples/sec: 10181.61 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:45:09,059 epoch 6 - iter 216/723 - loss 0.14175302 - time (sec): 5.16 - samples/sec: 10128.20 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:45:11,268 epoch 6 - iter 288/723 - loss 0.14007196 - time (sec): 7.37 - samples/sec: 9608.57 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:45:13,103 epoch 6 - iter 360/723 - loss 0.14231502 - time (sec): 9.20 - samples/sec: 9787.59 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:45:14,810 epoch 6 - iter 432/723 - loss 0.14212426 - time (sec): 10.91 - samples/sec: 9730.58 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:45:16,542 epoch 6 - iter 504/723 - loss 0.13962804 - time (sec): 12.64 - samples/sec: 9770.83 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:45:18,333 epoch 6 - iter 576/723 - loss 0.13709184 - time (sec): 14.43 - samples/sec: 9707.71 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:45:20,140 epoch 6 - iter 648/723 - loss 0.13760070 - time (sec): 16.24 - samples/sec: 9701.92 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:45:21,942 epoch 6 - iter 720/723 - loss 0.13693784 - time (sec): 18.04 - samples/sec: 9735.43 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:45:22,001 ----------------------------------------------------------------------------------------------------
2023-10-18 22:45:22,001 EPOCH 6 done: loss 0.1371 - lr: 0.000022
2023-10-18 22:45:23,768 DEV : loss 0.17635029554367065 - f1-score (micro avg) 0.5417
2023-10-18 22:45:23,783 saving best model
2023-10-18 22:45:23,819 ----------------------------------------------------------------------------------------------------
2023-10-18 22:45:25,649 epoch 7 - iter 72/723 - loss 0.13294814 - time (sec): 1.83 - samples/sec: 10339.07 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:45:27,545 epoch 7 - iter 144/723 - loss 0.13526354 - time (sec): 3.73 - samples/sec: 10157.15 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:45:29,304 epoch 7 - iter 216/723 - loss 0.13203783 - time (sec): 5.48 - samples/sec: 9903.66 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:45:31,034 epoch 7 - iter 288/723 - loss 0.12916293 - time (sec): 7.21 - samples/sec: 10012.15 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:45:32,876 epoch 7 - iter 360/723 - loss 0.13060908 - time (sec): 9.06 - samples/sec: 10009.03 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:45:34,619 epoch 7 - iter 432/723 - loss 0.13087688 - time (sec): 10.80 - samples/sec: 9869.43 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:45:36,389 epoch 7 - iter 504/723 - loss 0.13106381 - time (sec): 12.57 - samples/sec: 9907.31 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:45:38,154 epoch 7 - iter 576/723 - loss 0.13065185 - time (sec): 14.33 - samples/sec: 9829.80 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:45:39,993 epoch 7 - iter 648/723 - loss 0.13200382 - time (sec): 16.17 - samples/sec: 9781.93 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:45:41,800 epoch 7 - iter 720/723 - loss 0.13012946 - time (sec): 17.98 - samples/sec: 9753.91 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:45:41,867 ----------------------------------------------------------------------------------------------------
2023-10-18 22:45:41,867 EPOCH 7 done: loss 0.1300 - lr: 0.000017
2023-10-18 22:45:43,672 DEV : loss 0.16803206503391266 - f1-score (micro avg) 0.5641
2023-10-18 22:45:43,688 saving best model
2023-10-18 22:45:43,726 ----------------------------------------------------------------------------------------------------
2023-10-18 22:45:45,910 epoch 8 - iter 72/723 - loss 0.11783336 - time (sec): 2.18 - samples/sec: 8607.37 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:45:47,689 epoch 8 - iter 144/723 - loss 0.12170558 - time (sec): 3.96 - samples/sec: 9010.59 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:45:49,461 epoch 8 - iter 216/723 - loss 0.11782802 - time (sec): 5.73 - samples/sec: 9443.99 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:45:51,350 epoch 8 - iter 288/723 - loss 0.12191216 - time (sec): 7.62 - samples/sec: 9517.60 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:45:53,137 epoch 8 - iter 360/723 - loss 0.12221038 - time (sec): 9.41 - samples/sec: 9527.29 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:45:54,901 epoch 8 - iter 432/723 - loss 0.12407488 - time (sec): 11.17 - samples/sec: 9594.08 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:45:56,653 epoch 8 - iter 504/723 - loss 0.12340825 - time (sec): 12.93 - samples/sec: 9523.39 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:45:58,562 epoch 8 - iter 576/723 - loss 0.12607957 - time (sec): 14.84 - samples/sec: 9548.59 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:46:00,328 epoch 8 - iter 648/723 - loss 0.12516568 - time (sec): 16.60 - samples/sec: 9544.46 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:46:02,073 epoch 8 - iter 720/723 - loss 0.12319951 - time (sec): 18.35 - samples/sec: 9581.62 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:46:02,134 ----------------------------------------------------------------------------------------------------
2023-10-18 22:46:02,134 EPOCH 8 done: loss 0.1237 - lr: 0.000011
2023-10-18 22:46:03,907 DEV : loss 0.17866294085979462 - f1-score (micro avg) 0.5468
2023-10-18 22:46:03,922 ----------------------------------------------------------------------------------------------------
2023-10-18 22:46:05,776 epoch 9 - iter 72/723 - loss 0.11630442 - time (sec): 1.85 - samples/sec: 10123.52 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:46:07,603 epoch 9 - iter 144/723 - loss 0.12178045 - time (sec): 3.68 - samples/sec: 9993.49 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:46:09,414 epoch 9 - iter 216/723 - loss 0.12086617 - time (sec): 5.49 - samples/sec: 9985.67 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:46:11,238 epoch 9 - iter 288/723 - loss 0.12242202 - time (sec): 7.32 - samples/sec: 9992.32 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:46:12,954 epoch 9 - iter 360/723 - loss 0.12264054 - time (sec): 9.03 - samples/sec: 9923.28 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:46:14,808 epoch 9 - iter 432/723 - loss 0.12187392 - time (sec): 10.89 - samples/sec: 9894.73 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:46:16,543 epoch 9 - iter 504/723 - loss 0.12199981 - time (sec): 12.62 - samples/sec: 9903.21 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:46:18,250 epoch 9 - iter 576/723 - loss 0.12101534 - time (sec): 14.33 - samples/sec: 9879.96 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:46:20,086 epoch 9 - iter 648/723 - loss 0.11898037 - time (sec): 16.16 - samples/sec: 9874.85 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:46:21,858 epoch 9 - iter 720/723 - loss 0.11963415 - time (sec): 17.94 - samples/sec: 9794.81 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:46:21,910 ----------------------------------------------------------------------------------------------------
2023-10-18 22:46:21,910 EPOCH 9 done: loss 0.1196 - lr: 0.000006
2023-10-18 22:46:24,057 DEV : loss 0.17154528200626373 - f1-score (micro avg) 0.5598
2023-10-18 22:46:24,072 ----------------------------------------------------------------------------------------------------
2023-10-18 22:46:25,834 epoch 10 - iter 72/723 - loss 0.11308311 - time (sec): 1.76 - samples/sec: 9908.86 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:46:27,647 epoch 10 - iter 144/723 - loss 0.10151669 - time (sec): 3.57 - samples/sec: 9898.04 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:46:29,413 epoch 10 - iter 216/723 - loss 0.11242870 - time (sec): 5.34 - samples/sec: 9963.54 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:46:30,991 epoch 10 - iter 288/723 - loss 0.11534369 - time (sec): 6.92 - samples/sec: 10160.00 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:46:32,768 epoch 10 - iter 360/723 - loss 0.11447912 - time (sec): 8.70 - samples/sec: 10218.57 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:46:34,561 epoch 10 - iter 432/723 - loss 0.11486553 - time (sec): 10.49 - samples/sec: 10128.00 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:46:36,315 epoch 10 - iter 504/723 - loss 0.11442936 - time (sec): 12.24 - samples/sec: 10009.55 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:46:38,113 epoch 10 - iter 576/723 - loss 0.11480242 - time (sec): 14.04 - samples/sec: 10031.80 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:46:39,940 epoch 10 - iter 648/723 - loss 0.11678930 - time (sec): 15.87 - samples/sec: 9929.44 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:46:41,735 epoch 10 - iter 720/723 - loss 0.11851043 - time (sec): 17.66 - samples/sec: 9937.05 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:46:41,799 ----------------------------------------------------------------------------------------------------
2023-10-18 22:46:41,800 EPOCH 10 done: loss 0.1182 - lr: 0.000000
2023-10-18 22:46:43,583 DEV : loss 0.17256775498390198 - f1-score (micro avg) 0.5637
2023-10-18 22:46:43,629 ----------------------------------------------------------------------------------------------------
2023-10-18 22:46:43,629 Loading model from best epoch ...
2023-10-18 22:46:43,712 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 22:46:45,085
Results:
- F-score (micro) 0.5855
- F-score (macro) 0.4134
- Accuracy 0.4273

By class:
              precision    recall  f1-score   support

         LOC     0.6505    0.6747    0.6624       458
         PER     0.5891    0.5145    0.5493       482
         ORG     1.0000    0.0145    0.0286        69

   micro avg     0.6221    0.5530    0.5855      1009
   macro avg     0.7465    0.4012    0.4134      1009
weighted avg     0.6451    0.5530    0.5650      1009

2023-10-18 22:46:45,086 ----------------------------------------------------------------------------------------------------