2023-10-17 15:10:51,606 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:51,608 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:10:51,608 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:51,608 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 15:10:51,608 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:51,608 Train: 3575 sentences
2023-10-17 15:10:51,609 (train_with_dev=False, train_with_test=False)
2023-10-17 15:10:51,609 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:51,609 Training Params:
2023-10-17 15:10:51,609 - learning_rate: "3e-05"
2023-10-17 15:10:51,609 - mini_batch_size: "8"
2023-10-17 15:10:51,609 - max_epochs: "10"
2023-10-17 15:10:51,609 - shuffle: "True"
2023-10-17 15:10:51,609 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:51,609 Plugins:
2023-10-17 15:10:51,609 - TensorboardLogger
2023-10-17 15:10:51,609 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 15:10:51,609 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:51,609 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:10:51,610 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:10:51,610 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:51,610 Computation:
2023-10-17 15:10:51,610 - compute on device: cuda:0
2023-10-17 15:10:51,610 - embedding storage: none
2023-10-17 15:10:51,610 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:51,610 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 15:10:51,610 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:51,610 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:51,610 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 15:10:56,808 epoch 1 - iter 44/447 - loss 3.46567148 - time (sec): 5.20 - samples/sec: 1711.38 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:11:00,812 epoch 1 - iter 88/447 - loss 2.59735198 - time (sec): 9.20 - samples/sec: 1852.89 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:11:04,862 epoch 1 - iter 132/447 - loss 1.95725490 - time (sec): 13.25 - samples/sec: 1922.63 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:11:08,852 epoch 1 - iter 176/447 - loss 1.57353600 - time (sec): 17.24 - samples/sec: 2001.18 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:11:13,098 epoch 1 - iter 220/447 - loss 1.32853362 - time (sec): 21.49 - samples/sec: 2016.42 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:11:17,371 epoch 1 - iter 264/447 - loss 1.16250705 - time (sec): 25.76 - samples/sec: 2010.43 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:11:21,615 epoch 1 - iter 308/447 - loss 1.04918854 - time (sec): 30.00 - samples/sec: 1997.94 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:11:25,906 epoch 1 - iter 352/447 - loss 0.96192982 - time (sec): 34.29 - samples/sec: 1983.35 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:11:30,459 epoch 1 - iter 396/447 - loss 0.87615452 - time (sec): 38.85 - samples/sec: 1991.18 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:11:34,926 epoch 1 - iter 440/447 - loss 0.81319104 - time (sec): 43.31 - samples/sec: 1971.57 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:35,605 ----------------------------------------------------------------------------------------------------
2023-10-17 15:11:35,605 EPOCH 1 done: loss 0.8054 - lr: 0.000029
2023-10-17 15:11:41,993 DEV : loss 0.17819160223007202 - f1-score (micro avg) 0.5895
2023-10-17 15:11:42,045 saving best model
2023-10-17 15:11:42,637 ----------------------------------------------------------------------------------------------------
2023-10-17 15:11:46,828 epoch 2 - iter 44/447 - loss 0.21747760 - time (sec): 4.19 - samples/sec: 2038.97 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:11:51,276 epoch 2 - iter 88/447 - loss 0.19904981 - time (sec): 8.64 - samples/sec: 1919.57 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:55,382 epoch 2 - iter 132/447 - loss 0.19320002 - time (sec): 12.74 - samples/sec: 1949.21 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:59,652 epoch 2 - iter 176/447 - loss 0.17629144 - time (sec): 17.01 - samples/sec: 1973.11 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:12:03,802 epoch 2 - iter 220/447 - loss 0.16743674 - time (sec): 21.16 - samples/sec: 1979.03 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:12:07,900 epoch 2 - iter 264/447 - loss 0.16450320 - time (sec): 25.26 - samples/sec: 2014.54 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:12:12,274 epoch 2 - iter 308/447 - loss 0.16199472 - time (sec): 29.63 - samples/sec: 2010.23 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:12:16,738 epoch 2 - iter 352/447 - loss 0.15732773 - time (sec): 34.10 - samples/sec: 2008.44 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:12:21,021 epoch 2 - iter 396/447 - loss 0.15418639 - time (sec): 38.38 - samples/sec: 1989.87 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:12:25,441 epoch 2 - iter 440/447 - loss 0.14935900 - time (sec): 42.80 - samples/sec: 1991.28 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:12:26,065 ----------------------------------------------------------------------------------------------------
2023-10-17 15:12:26,066 EPOCH 2 done: loss 0.1497 - lr: 0.000027
2023-10-17 15:12:37,010 DEV : loss 0.121131531894207 - f1-score (micro avg) 0.717
2023-10-17 15:12:37,066 saving best model
2023-10-17 15:12:38,529 ----------------------------------------------------------------------------------------------------
2023-10-17 15:12:43,063 epoch 3 - iter 44/447 - loss 0.07330655 - time (sec): 4.53 - samples/sec: 2087.82 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:47,404 epoch 3 - iter 88/447 - loss 0.08406998 - time (sec): 8.87 - samples/sec: 2094.54 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:51,445 epoch 3 - iter 132/447 - loss 0.08303447 - time (sec): 12.91 - samples/sec: 2060.80 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:55,553 epoch 3 - iter 176/447 - loss 0.08452492 - time (sec): 17.02 - samples/sec: 2014.40 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:12:59,615 epoch 3 - iter 220/447 - loss 0.07986201 - time (sec): 21.08 - samples/sec: 2019.64 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:13:03,617 epoch 3 - iter 264/447 - loss 0.08080735 - time (sec): 25.08 - samples/sec: 2014.92 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:13:07,691 epoch 3 - iter 308/447 - loss 0.08052641 - time (sec): 29.16 - samples/sec: 2012.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:13:11,889 epoch 3 - iter 352/447 - loss 0.08174841 - time (sec): 33.36 - samples/sec: 2034.42 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:13:16,242 epoch 3 - iter 396/447 - loss 0.08169126 - time (sec): 37.71 - samples/sec: 2030.32 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:13:21,151 epoch 3 - iter 440/447 - loss 0.08007725 - time (sec): 42.62 - samples/sec: 2005.07 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:13:21,853 ----------------------------------------------------------------------------------------------------
2023-10-17 15:13:21,854 EPOCH 3 done: loss 0.0812 - lr: 0.000023
2023-10-17 15:13:32,439 DEV : loss 0.15675058960914612 - f1-score (micro avg) 0.7139
2023-10-17 15:13:32,498 ----------------------------------------------------------------------------------------------------
2023-10-17 15:13:36,551 epoch 4 - iter 44/447 - loss 0.05960453 - time (sec): 4.05 - samples/sec: 2032.10 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:13:40,410 epoch 4 - iter 88/447 - loss 0.05853170 - time (sec): 7.91 - samples/sec: 2075.57 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:13:44,159 epoch 4 - iter 132/447 - loss 0.05869183 - time (sec): 11.66 - samples/sec: 2053.57 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:13:48,179 epoch 4 - iter 176/447 - loss 0.05390845 - time (sec): 15.68 - samples/sec: 2080.79 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:13:52,777 epoch 4 - iter 220/447 - loss 0.05440630 - time (sec): 20.28 - samples/sec: 2117.44 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:13:56,642 epoch 4 - iter 264/447 - loss 0.05213091 - time (sec): 24.14 - samples/sec: 2122.45 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:14:00,645 epoch 4 - iter 308/447 - loss 0.05379991 - time (sec): 28.14 - samples/sec: 2127.14 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:14:04,887 epoch 4 - iter 352/447 - loss 0.05183150 - time (sec): 32.39 - samples/sec: 2119.64 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:14:09,178 epoch 4 - iter 396/447 - loss 0.05387250 - time (sec): 36.68 - samples/sec: 2101.13 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:14:13,147 epoch 4 - iter 440/447 - loss 0.05274153 - time (sec): 40.65 - samples/sec: 2095.95 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:14:13,788 ----------------------------------------------------------------------------------------------------
2023-10-17 15:14:13,789 EPOCH 4 done: loss 0.0527 - lr: 0.000020
2023-10-17 15:14:24,686 DEV : loss 0.1432371437549591 - f1-score (micro avg) 0.7652
2023-10-17 15:14:24,741 saving best model
2023-10-17 15:14:26,201 ----------------------------------------------------------------------------------------------------
2023-10-17 15:14:30,362 epoch 5 - iter 44/447 - loss 0.03425498 - time (sec): 4.15 - samples/sec: 2131.34 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:14:34,382 epoch 5 - iter 88/447 - loss 0.04218378 - time (sec): 8.17 - samples/sec: 2117.81 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:14:38,597 epoch 5 - iter 132/447 - loss 0.03552931 - time (sec): 12.39 - samples/sec: 2128.68 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:14:42,572 epoch 5 - iter 176/447 - loss 0.03486655 - time (sec): 16.36 - samples/sec: 2084.69 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:14:46,916 epoch 5 - iter 220/447 - loss 0.03783913 - time (sec): 20.71 - samples/sec: 2092.19 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:14:50,784 epoch 5 - iter 264/447 - loss 0.03764549 - time (sec): 24.58 - samples/sec: 2092.49 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:14:54,875 epoch 5 - iter 308/447 - loss 0.03645640 - time (sec): 28.67 - samples/sec: 2087.45 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:14:58,976 epoch 5 - iter 352/447 - loss 0.03547440 - time (sec): 32.77 - samples/sec: 2087.15 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:15:02,988 epoch 5 - iter 396/447 - loss 0.03520422 - time (sec): 36.78 - samples/sec: 2085.61 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:15:07,284 epoch 5 - iter 440/447 - loss 0.03609226 - time (sec): 41.07 - samples/sec: 2079.54 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:15:07,916 ----------------------------------------------------------------------------------------------------
2023-10-17 15:15:07,916 EPOCH 5 done: loss 0.0361 - lr: 0.000017
2023-10-17 15:15:19,049 DEV : loss 0.17960673570632935 - f1-score (micro avg) 0.7676
2023-10-17 15:15:19,119 saving best model
2023-10-17 15:15:19,772 ----------------------------------------------------------------------------------------------------
2023-10-17 15:15:23,817 epoch 6 - iter 44/447 - loss 0.01947292 - time (sec): 4.04 - samples/sec: 2052.75 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:15:28,029 epoch 6 - iter 88/447 - loss 0.02262571 - time (sec): 8.25 - samples/sec: 2033.87 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:15:32,196 epoch 6 - iter 132/447 - loss 0.02552581 - time (sec): 12.42 - samples/sec: 2048.82 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:15:36,588 epoch 6 - iter 176/447 - loss 0.02416652 - time (sec): 16.81 - samples/sec: 2033.48 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:15:41,376 epoch 6 - iter 220/447 - loss 0.02289846 - time (sec): 21.60 - samples/sec: 1992.25 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:15:45,381 epoch 6 - iter 264/447 - loss 0.02205499 - time (sec): 25.61 - samples/sec: 1982.77 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:15:49,840 epoch 6 - iter 308/447 - loss 0.02142818 - time (sec): 30.06 - samples/sec: 2011.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:15:53,927 epoch 6 - iter 352/447 - loss 0.02163330 - time (sec): 34.15 - samples/sec: 2018.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:15:58,088 epoch 6 - iter 396/447 - loss 0.02092477 - time (sec): 38.31 - samples/sec: 2018.93 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:16:02,149 epoch 6 - iter 440/447 - loss 0.02118031 - time (sec): 42.37 - samples/sec: 2012.21 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:16:02,784 ----------------------------------------------------------------------------------------------------
2023-10-17 15:16:02,784 EPOCH 6 done: loss 0.0214 - lr: 0.000013
2023-10-17 15:16:13,256 DEV : loss 0.19122734665870667 - f1-score (micro avg) 0.7825
2023-10-17 15:16:13,317 saving best model
2023-10-17 15:16:14,798 ----------------------------------------------------------------------------------------------------
2023-10-17 15:16:19,322 epoch 7 - iter 44/447 - loss 0.01563041 - time (sec): 4.52 - samples/sec: 2112.02 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:16:23,326 epoch 7 - iter 88/447 - loss 0.01611607 - time (sec): 8.52 - samples/sec: 2092.41 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:16:27,501 epoch 7 - iter 132/447 - loss 0.01689545 - time (sec): 12.70 - samples/sec: 2094.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:16:32,005 epoch 7 - iter 176/447 - loss 0.01747896 - time (sec): 17.20 - samples/sec: 2094.26 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:16:36,046 epoch 7 - iter 220/447 - loss 0.01716074 - time (sec): 21.24 - samples/sec: 2096.32 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:16:40,073 epoch 7 - iter 264/447 - loss 0.01541518 - time (sec): 25.27 - samples/sec: 2078.44 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:16:44,301 epoch 7 - iter 308/447 - loss 0.01645470 - time (sec): 29.50 - samples/sec: 2057.27 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:16:48,361 epoch 7 - iter 352/447 - loss 0.01683787 - time (sec): 33.56 - samples/sec: 2062.46 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:16:52,388 epoch 7 - iter 396/447 - loss 0.01616763 - time (sec): 37.58 - samples/sec: 2062.57 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:16:56,416 epoch 7 - iter 440/447 - loss 0.01561036 - time (sec): 41.61 - samples/sec: 2048.57 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:16:57,057 ----------------------------------------------------------------------------------------------------
2023-10-17 15:16:57,058 EPOCH 7 done: loss 0.0154 - lr: 0.000010
2023-10-17 15:17:07,982 DEV : loss 0.20203104615211487 - f1-score (micro avg) 0.791
2023-10-17 15:17:08,037 saving best model
2023-10-17 15:17:09,651 ----------------------------------------------------------------------------------------------------
2023-10-17 15:17:14,128 epoch 8 - iter 44/447 - loss 0.00477927 - time (sec): 4.48 - samples/sec: 1823.23 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:17:18,206 epoch 8 - iter 88/447 - loss 0.01009409 - time (sec): 8.55 - samples/sec: 1909.09 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:17:22,409 epoch 8 - iter 132/447 - loss 0.01037176 - time (sec): 12.76 - samples/sec: 1898.29 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:17:26,995 epoch 8 - iter 176/447 - loss 0.00910357 - time (sec): 17.34 - samples/sec: 1899.72 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:17:31,092 epoch 8 - iter 220/447 - loss 0.00883076 - time (sec): 21.44 - samples/sec: 1947.85 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:17:35,078 epoch 8 - iter 264/447 - loss 0.00841667 - time (sec): 25.42 - samples/sec: 1967.05 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:17:39,106 epoch 8 - iter 308/447 - loss 0.00937271 - time (sec): 29.45 - samples/sec: 1973.97 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:17:43,218 epoch 8 - iter 352/447 - loss 0.01054465 - time (sec): 33.56 - samples/sec: 1977.60 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:17:47,869 epoch 8 - iter 396/447 - loss 0.01105462 - time (sec): 38.22 - samples/sec: 1983.69 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:17:52,348 epoch 8 - iter 440/447 - loss 0.01082183 - time (sec): 42.69 - samples/sec: 1996.49 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:17:52,961 ----------------------------------------------------------------------------------------------------
2023-10-17 15:17:52,962 EPOCH 8 done: loss 0.0107 - lr: 0.000007
2023-10-17 15:18:03,854 DEV : loss 0.20561201870441437 - f1-score (micro avg) 0.7844
2023-10-17 15:18:03,914 ----------------------------------------------------------------------------------------------------
2023-10-17 15:18:08,009 epoch 9 - iter 44/447 - loss 0.00258760 - time (sec): 4.09 - samples/sec: 1856.78 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:18:12,008 epoch 9 - iter 88/447 - loss 0.00437519 - time (sec): 8.09 - samples/sec: 1977.19 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:18:16,094 epoch 9 - iter 132/447 - loss 0.00474581 - time (sec): 12.18 - samples/sec: 1996.66 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:18:20,052 epoch 9 - iter 176/447 - loss 0.00461352 - time (sec): 16.14 - samples/sec: 2023.33 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:18:24,813 epoch 9 - iter 220/447 - loss 0.00511861 - time (sec): 20.90 - samples/sec: 2058.16 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:18:28,845 epoch 9 - iter 264/447 - loss 0.00514456 - time (sec): 24.93 - samples/sec: 2081.35 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:18:32,841 epoch 9 - iter 308/447 - loss 0.00607397 - time (sec): 28.93 - samples/sec: 2074.45 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:18:37,079 epoch 9 - iter 352/447 - loss 0.00657719 - time (sec): 33.16 - samples/sec: 2052.89 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:18:41,085 epoch 9 - iter 396/447 - loss 0.00653218 - time (sec): 37.17 - samples/sec: 2061.49 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:18:45,228 epoch 9 - iter 440/447 - loss 0.00638953 - time (sec): 41.31 - samples/sec: 2055.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:18:45,868 ----------------------------------------------------------------------------------------------------
2023-10-17 15:18:45,868 EPOCH 9 done: loss 0.0063 - lr: 0.000003
2023-10-17 15:18:56,866 DEV : loss 0.21265345811843872 - f1-score (micro avg) 0.7963
2023-10-17 15:18:56,921 saving best model
2023-10-17 15:18:58,425 ----------------------------------------------------------------------------------------------------
2023-10-17 15:19:02,563 epoch 10 - iter 44/447 - loss 0.00272302 - time (sec): 4.13 - samples/sec: 2064.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:19:06,593 epoch 10 - iter 88/447 - loss 0.00325765 - time (sec): 8.16 - samples/sec: 2021.55 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:19:11,047 epoch 10 - iter 132/447 - loss 0.00362277 - time (sec): 12.61 - samples/sec: 2084.14 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:19:15,000 epoch 10 - iter 176/447 - loss 0.00528389 - time (sec): 16.57 - samples/sec: 2062.23 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:19:19,112 epoch 10 - iter 220/447 - loss 0.00509353 - time (sec): 20.68 - samples/sec: 2078.14 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:19:23,347 epoch 10 - iter 264/447 - loss 0.00465258 - time (sec): 24.91 - samples/sec: 2077.33 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:19:27,449 epoch 10 - iter 308/447 - loss 0.00478916 - time (sec): 29.01 - samples/sec: 2067.22 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:19:31,720 epoch 10 - iter 352/447 - loss 0.00507025 - time (sec): 33.29 - samples/sec: 2067.93 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:19:35,681 epoch 10 - iter 396/447 - loss 0.00469527 - time (sec): 37.25 - samples/sec: 2054.64 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:19:39,828 epoch 10 - iter 440/447 - loss 0.00471393 - time (sec): 41.39 - samples/sec: 2062.13 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:19:40,443 ----------------------------------------------------------------------------------------------------
2023-10-17 15:19:40,444 EPOCH 10 done: loss 0.0047 - lr: 0.000000
2023-10-17 15:19:51,679 DEV : loss 0.21538716554641724 - f1-score (micro avg) 0.7969
2023-10-17 15:19:51,731 saving best model
2023-10-17 15:19:53,070 ----------------------------------------------------------------------------------------------------
2023-10-17 15:19:53,072 Loading model from best epoch ...
2023-10-17 15:19:55,890 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-17 15:20:01,461 Results:
- F-score (micro) 0.7606
- F-score (macro) 0.6789
- Accuracy 0.6344

By class:
              precision    recall  f1-score   support

         loc     0.8482    0.8624    0.8552       596
        pers     0.6778    0.7898    0.7295       333
         org     0.5143    0.5455    0.5294       132
        prod     0.6000    0.5000    0.5455        66
        time     0.7347    0.7347    0.7347        49

   micro avg     0.7415    0.7806    0.7606      1176
   macro avg     0.6750    0.6865    0.6789      1176
weighted avg     0.7438    0.7806    0.7607      1176

2023-10-17 15:20:01,461 ----------------------------------------------------------------------------------------------------
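
Note on the averages in the final results table: they can be re-derived from the per-class rows. The following is a minimal Python sketch, not part of the training run. It assumes the usual definitions (macro F1 as the unweighted mean of the class F1 scores, weighted F1 as the support-weighted mean, micro F1 as the harmonic mean of micro precision and micro recall). The dictionary `by_class` and the helper `harmonic_f1` are illustrative names, and all numbers are copied from the table, so the results agree with the log only up to rounding of the 4-decimal figures.

```python
# Per-class rows copied from the "By class" table:
# label -> (precision, recall, f1, support)
by_class = {
    "loc": (0.8482, 0.8624, 0.8552, 596),
    "pers": (0.6778, 0.7898, 0.7295, 333),
    "org": (0.5143, 0.5455, 0.5294, 132),
    "prod": (0.6000, 0.5000, 0.5455, 66),
    "time": (0.7347, 0.7347, 0.7347, 49),
}


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall (0.0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Total number of gold entities across all classes.
total_support = sum(row[3] for row in by_class.values())

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by class support.
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall
# reported in the "micro avg" row.
micro_f1 = harmonic_f1(0.7415, 0.7806)

print(f"support={total_support}")
print(f"macro={macro_f1:.4f} weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

Because the inputs are already rounded, the last digit of a recomputed value may differ from the logged one; comparisons should use a small tolerance rather than exact equality.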