2023-10-17 17:44:48,166 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:48,168 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 17:44:48,168 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:48,168 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 17:44:48,168 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:48,168 Train: 3575 sentences
2023-10-17 17:44:48,168 (train_with_dev=False, train_with_test=False)
2023-10-17 17:44:48,168 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:48,168 Training Params:
2023-10-17 17:44:48,168 - learning_rate: "5e-05"
2023-10-17 17:44:48,168 - mini_batch_size: "8"
2023-10-17 17:44:48,168 - max_epochs: "10"
2023-10-17 17:44:48,169 - shuffle: "True"
2023-10-17 17:44:48,169 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:48,169 Plugins:
2023-10-17 17:44:48,169 - TensorboardLogger
2023-10-17 17:44:48,169 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:44:48,169 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:48,169 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:44:48,169 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:44:48,169 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:48,169 Computation:
2023-10-17 17:44:48,169 - compute on device: cuda:0
2023-10-17 17:44:48,169 - embedding storage: none
2023-10-17 17:44:48,169 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:48,169 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 17:44:48,169 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:48,170 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:48,170 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:44:52,654 epoch 1 - iter 44/447 - loss 3.53247519 - time (sec): 4.48 - samples/sec: 1962.45 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:44:57,042 epoch 1 - iter 88/447 - loss 2.31874710 - time (sec): 8.87 - samples/sec: 1971.29 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:45:01,148 epoch 1 - iter 132/447 - loss 1.75427682 - time (sec): 12.98 - samples/sec: 1973.78 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:45:05,253 epoch 1 - iter 176/447 - loss 1.42658690 - time (sec): 17.08 - samples/sec: 1990.33 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:45:09,368 epoch 1 - iter 220/447 - loss 1.21599396 - time (sec): 21.20 - samples/sec: 1982.09 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:45:13,538 epoch 1 - iter 264/447 - loss 1.06151430 - time (sec): 25.37 - samples/sec: 1993.82 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:45:17,496 epoch 1 - iter 308/447 - loss 0.94782877 - time (sec): 29.32 - samples/sec: 2013.04 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:45:21,524 epoch 1 - iter 352/447 - loss 0.86170498 - time (sec): 33.35 - samples/sec: 2021.61 - lr: 0.000039 - momentum: 0.000000
2023-10-17 17:45:26,668 epoch 1 - iter 396/447 - loss 0.77972033 - time (sec): 38.50 - samples/sec: 2006.47 - lr: 0.000044 - momentum: 0.000000
2023-10-17 17:45:30,614 epoch 1 - iter 440/447 - loss 0.72870630 - time (sec): 42.44 - samples/sec: 2009.37 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:45:31,257 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:31,257 EPOCH 1 done: loss 0.7216 - lr: 0.000049
2023-10-17 17:45:37,295 DEV : loss 0.17193150520324707 - f1-score (micro avg) 0.6114
2023-10-17 17:45:37,350 saving best model
2023-10-17 17:45:37,922 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:42,340 epoch 2 - iter 44/447 - loss 0.19212755 - time (sec): 4.42 - samples/sec: 1736.07 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:45:46,547 epoch 2 - iter 88/447 - loss 0.18829141 - time (sec): 8.62 - samples/sec: 1909.97 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:45:50,480 epoch 2 - iter 132/447 - loss 0.17009265 - time (sec): 12.56 - samples/sec: 1969.56 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:45:54,515 epoch 2 - iter 176/447 - loss 0.16569554 - time (sec): 16.59 - samples/sec: 2009.28 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:45:58,711 epoch 2 - iter 220/447 - loss 0.15747248 - time (sec): 20.79 - samples/sec: 2044.40 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:46:02,697 epoch 2 - iter 264/447 - loss 0.15702120 - time (sec): 24.77 - samples/sec: 2040.25 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:46:06,765 epoch 2 - iter 308/447 - loss 0.15884489 - time (sec): 28.84 - samples/sec: 2057.88 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:46:10,920 epoch 2 - iter 352/447 - loss 0.15406260 - time (sec): 33.00 - samples/sec: 2059.91 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:46:15,205 epoch 2 - iter 396/447 - loss 0.14959137 - time (sec): 37.28 - samples/sec: 2071.69 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:46:19,041 epoch 2 - iter 440/447 - loss 0.14633762 - time (sec): 41.12 - samples/sec: 2075.89 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:46:19,629 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:19,629 EPOCH 2 done: loss 0.1456 - lr: 0.000045
2023-10-17 17:46:30,392 DEV : loss 0.14541815221309662 - f1-score (micro avg) 0.6911
2023-10-17 17:46:30,444 saving best model
2023-10-17 17:46:31,832 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:35,959 epoch 3 - iter 44/447 - loss 0.09885066 - time (sec): 4.12 - samples/sec: 1899.38 - lr: 0.000044 - momentum: 0.000000
2023-10-17 17:46:39,789 epoch 3 - iter 88/447 - loss 0.08670680 - time (sec): 7.95 - samples/sec: 1930.35 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:46:43,693 epoch 3 - iter 132/447 - loss 0.08996882 - time (sec): 11.86 - samples/sec: 1948.35 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:46:47,693 epoch 3 - iter 176/447 - loss 0.08967241 - time (sec): 15.86 - samples/sec: 1982.97 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:46:51,668 epoch 3 - iter 220/447 - loss 0.08753443 - time (sec): 19.83 - samples/sec: 2017.86 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:46:55,783 epoch 3 - iter 264/447 - loss 0.08959986 - time (sec): 23.95 - samples/sec: 2032.83 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:47:00,057 epoch 3 - iter 308/447 - loss 0.08503972 - time (sec): 28.22 - samples/sec: 2050.68 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:47:04,697 epoch 3 - iter 352/447 - loss 0.08384251 - time (sec): 32.86 - samples/sec: 2062.53 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:47:08,798 epoch 3 - iter 396/447 - loss 0.08475909 - time (sec): 36.96 - samples/sec: 2064.75 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:47:12,845 epoch 3 - iter 440/447 - loss 0.08292084 - time (sec): 41.01 - samples/sec: 2081.99 - lr: 0.000039 - momentum: 0.000000
2023-10-17 17:47:13,443 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:13,444 EPOCH 3 done: loss 0.0828 - lr: 0.000039
2023-10-17 17:47:24,567 DEV : loss 0.18581056594848633 - f1-score (micro avg) 0.734
2023-10-17 17:47:24,619 saving best model
2023-10-17 17:47:25,193 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:29,381 epoch 4 - iter 44/447 - loss 0.06148268 - time (sec): 4.19 - samples/sec: 2073.49 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:47:33,657 epoch 4 - iter 88/447 - loss 0.05203947 - time (sec): 8.46 - samples/sec: 2178.64 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:47:37,580 epoch 4 - iter 132/447 - loss 0.05294970 - time (sec): 12.38 - samples/sec: 2117.16 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:47:41,665 epoch 4 - iter 176/447 - loss 0.05512343 - time (sec): 16.47 - samples/sec: 2105.44 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:47:45,882 epoch 4 - iter 220/447 - loss 0.05440061 - time (sec): 20.69 - samples/sec: 2074.33 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:47:49,882 epoch 4 - iter 264/447 - loss 0.05339531 - time (sec): 24.69 - samples/sec: 2058.85 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:47:54,034 epoch 4 - iter 308/447 - loss 0.05563142 - time (sec): 28.84 - samples/sec: 2085.27 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:47:58,609 epoch 4 - iter 352/447 - loss 0.05395382 - time (sec): 33.41 - samples/sec: 2068.53 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:48:03,017 epoch 4 - iter 396/447 - loss 0.05405361 - time (sec): 37.82 - samples/sec: 2040.32 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:48:07,090 epoch 4 - iter 440/447 - loss 0.05319170 - time (sec): 41.89 - samples/sec: 2035.85 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:48:07,746 ----------------------------------------------------------------------------------------------------
2023-10-17 17:48:07,747 EPOCH 4 done: loss 0.0530 - lr: 0.000033
2023-10-17 17:48:19,334 DEV : loss 0.16320064663887024 - f1-score (micro avg) 0.7421
2023-10-17 17:48:19,391 saving best model
2023-10-17 17:48:20,773 ----------------------------------------------------------------------------------------------------
2023-10-17 17:48:24,936 epoch 5 - iter 44/447 - loss 0.03339435 - time (sec): 4.16 - samples/sec: 2013.95 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:48:28,768 epoch 5 - iter 88/447 - loss 0.03083486 - time (sec): 7.99 - samples/sec: 2075.68 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:48:32,818 epoch 5 - iter 132/447 - loss 0.02828933 - time (sec): 12.04 - samples/sec: 2079.53 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:48:37,063 epoch 5 - iter 176/447 - loss 0.03313995 - time (sec): 16.29 - samples/sec: 2083.18 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:48:40,905 epoch 5 - iter 220/447 - loss 0.03539016 - time (sec): 20.13 - samples/sec: 2106.25 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:48:45,061 epoch 5 - iter 264/447 - loss 0.03731012 - time (sec): 24.28 - samples/sec: 2098.95 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:48:49,291 epoch 5 - iter 308/447 - loss 0.03763343 - time (sec): 28.51 - samples/sec: 2108.65 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:48:53,141 epoch 5 - iter 352/447 - loss 0.03771950 - time (sec): 32.36 - samples/sec: 2119.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:48:57,037 epoch 5 - iter 396/447 - loss 0.03610494 - time (sec): 36.26 - samples/sec: 2110.09 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:49:00,960 epoch 5 - iter 440/447 - loss 0.03458475 - time (sec): 40.18 - samples/sec: 2111.19 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:49:01,787 ----------------------------------------------------------------------------------------------------
2023-10-17 17:49:01,787 EPOCH 5 done: loss 0.0343 - lr: 0.000028
2023-10-17 17:49:12,745 DEV : loss 0.2096388190984726 - f1-score (micro avg) 0.7754
2023-10-17 17:49:12,801 saving best model
2023-10-17 17:49:13,365 ----------------------------------------------------------------------------------------------------
2023-10-17 17:49:17,391 epoch 6 - iter 44/447 - loss 0.02071263 - time (sec): 4.02 - samples/sec: 2067.62 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:49:21,287 epoch 6 - iter 88/447 - loss 0.02045657 - time (sec): 7.92 - samples/sec: 2084.19 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:49:25,310 epoch 6 - iter 132/447 - loss 0.02395735 - time (sec): 11.94 - samples/sec: 2103.18 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:49:29,245 epoch 6 - iter 176/447 - loss 0.02314660 - time (sec): 15.88 - samples/sec: 2120.89 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:49:33,267 epoch 6 - iter 220/447 - loss 0.02399633 - time (sec): 19.90 - samples/sec: 2074.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:49:37,652 epoch 6 - iter 264/447 - loss 0.02289993 - time (sec): 24.28 - samples/sec: 2079.34 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:49:41,757 epoch 6 - iter 308/447 - loss 0.02175241 - time (sec): 28.39 - samples/sec: 2078.16 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:49:46,034 epoch 6 - iter 352/447 - loss 0.02292294 - time (sec): 32.67 - samples/sec: 2060.30 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:49:50,018 epoch 6 - iter 396/447 - loss 0.02332494 - time (sec): 36.65 - samples/sec: 2058.40 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:49:54,610 epoch 6 - iter 440/447 - loss 0.02230257 - time (sec): 41.24 - samples/sec: 2069.91 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:49:55,243 ----------------------------------------------------------------------------------------------------
2023-10-17 17:49:55,243 EPOCH 6 done: loss 0.0220 - lr: 0.000022
2023-10-17 17:50:06,845 DEV : loss 0.21218937635421753 - f1-score (micro avg) 0.7601
2023-10-17 17:50:06,908 ----------------------------------------------------------------------------------------------------
2023-10-17 17:50:11,253 epoch 7 - iter 44/447 - loss 0.01285933 - time (sec): 4.34 - samples/sec: 2001.74 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:50:15,325 epoch 7 - iter 88/447 - loss 0.01388987 - time (sec): 8.42 - samples/sec: 1998.53 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:50:19,308 epoch 7 - iter 132/447 - loss 0.01336076 - time (sec): 12.40 - samples/sec: 2011.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:50:23,382 epoch 7 - iter 176/447 - loss 0.01261394 - time (sec): 16.47 - samples/sec: 2054.87 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:50:27,928 epoch 7 - iter 220/447 - loss 0.01097990 - time (sec): 21.02 - samples/sec: 2053.75 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:50:32,045 epoch 7 - iter 264/447 - loss 0.01111927 - time (sec): 25.14 - samples/sec: 2023.76 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:50:36,497 epoch 7 - iter 308/447 - loss 0.01188959 - time (sec): 29.59 - samples/sec: 2017.85 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:50:40,689 epoch 7 - iter 352/447 - loss 0.01124010 - time (sec): 33.78 - samples/sec: 2016.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:50:44,831 epoch 7 - iter 396/447 - loss 0.01118474 - time (sec): 37.92 - samples/sec: 2034.64 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:50:48,860 epoch 7 - iter 440/447 - loss 0.01185488 - time (sec): 41.95 - samples/sec: 2029.22 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:50:49,541 ----------------------------------------------------------------------------------------------------
2023-10-17 17:50:49,541 EPOCH 7 done: loss 0.0124 - lr: 0.000017
2023-10-17 17:51:00,437 DEV : loss 0.24467261135578156 - f1-score (micro avg) 0.7769
2023-10-17 17:51:00,497 saving best model
2023-10-17 17:51:01,918 ----------------------------------------------------------------------------------------------------
2023-10-17 17:51:05,955 epoch 8 - iter 44/447 - loss 0.00687414 - time (sec): 4.03 - samples/sec: 1946.11 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:51:10,038 epoch 8 - iter 88/447 - loss 0.00560644 - time (sec): 8.12 - samples/sec: 1995.96 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:51:14,085 epoch 8 - iter 132/447 - loss 0.00512284 - time (sec): 12.16 - samples/sec: 2000.44 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:51:18,460 epoch 8 - iter 176/447 - loss 0.00669502 - time (sec): 16.54 - samples/sec: 1975.13 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:51:23,370 epoch 8 - iter 220/447 - loss 0.00700551 - time (sec): 21.45 - samples/sec: 1936.87 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:51:27,445 epoch 8 - iter 264/447 - loss 0.00799623 - time (sec): 25.52 - samples/sec: 1967.09 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:51:31,651 epoch 8 - iter 308/447 - loss 0.00762820 - time (sec): 29.73 - samples/sec: 1973.87 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:51:35,768 epoch 8 - iter 352/447 - loss 0.00779306 - time (sec): 33.85 - samples/sec: 1982.90 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:51:39,774 epoch 8 - iter 396/447 - loss 0.00777938 - time (sec): 37.85 - samples/sec: 1994.88 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:51:44,165 epoch 8 - iter 440/447 - loss 0.00736500 - time (sec): 42.24 - samples/sec: 2019.04 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:51:44,797 ----------------------------------------------------------------------------------------------------
2023-10-17 17:51:44,798 EPOCH 8 done: loss 0.0073 - lr: 0.000011
2023-10-17 17:51:55,888 DEV : loss 0.24660176038742065 - f1-score (micro avg) 0.7942
2023-10-17 17:51:55,947 saving best model
2023-10-17 17:51:57,324 ----------------------------------------------------------------------------------------------------
2023-10-17 17:52:01,426 epoch 9 - iter 44/447 - loss 0.00110632 - time (sec): 4.10 - samples/sec: 2054.78 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:52:05,847 epoch 9 - iter 88/447 - loss 0.00403816 - time (sec): 8.52 - samples/sec: 2019.00 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:52:10,085 epoch 9 - iter 132/447 - loss 0.00297756 - time (sec): 12.76 - samples/sec: 2008.52 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:52:14,224 epoch 9 - iter 176/447 - loss 0.00426497 - time (sec): 16.90 - samples/sec: 1986.65 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:52:18,201 epoch 9 - iter 220/447 - loss 0.00546162 - time (sec): 20.87 - samples/sec: 2011.65 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:52:22,301 epoch 9 - iter 264/447 - loss 0.00547408 - time (sec): 24.97 - samples/sec: 2013.82 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:52:26,521 epoch 9 - iter 308/447 - loss 0.00608106 - time (sec): 29.19 - samples/sec: 2030.28 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:52:30,568 epoch 9 - iter 352/447 - loss 0.00557877 - time (sec): 33.24 - samples/sec: 2013.26 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:52:35,013 epoch 9 - iter 396/447 - loss 0.00549500 - time (sec): 37.68 - samples/sec: 2008.03 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:52:39,171 epoch 9 - iter 440/447 - loss 0.00576947 - time (sec): 41.84 - samples/sec: 2020.27 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:52:40,081 ----------------------------------------------------------------------------------------------------
2023-10-17 17:52:40,081 EPOCH 9 done: loss 0.0058 - lr: 0.000006
2023-10-17 17:52:51,569 DEV : loss 0.25680792331695557 - f1-score (micro avg) 0.7843
2023-10-17 17:52:51,627 ----------------------------------------------------------------------------------------------------
2023-10-17 17:52:55,624 epoch 10 - iter 44/447 - loss 0.00263929 - time (sec): 3.99 - samples/sec: 2143.95 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:52:59,919 epoch 10 - iter 88/447 - loss 0.00194551 - time (sec): 8.29 - samples/sec: 2079.76 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:53:04,432 epoch 10 - iter 132/447 - loss 0.00150227 - time (sec): 12.80 - samples/sec: 2103.00 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:53:08,350 epoch 10 - iter 176/447 - loss 0.00167795 - time (sec): 16.72 - samples/sec: 2102.13 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:53:12,303 epoch 10 - iter 220/447 - loss 0.00167062 - time (sec): 20.67 - samples/sec: 2102.03 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:53:16,389 epoch 10 - iter 264/447 - loss 0.00153587 - time (sec): 24.76 - samples/sec: 2086.20 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:53:20,663 epoch 10 - iter 308/447 - loss 0.00191345 - time (sec): 29.03 - samples/sec: 2079.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:53:24,664 epoch 10 - iter 352/447 - loss 0.00211922 - time (sec): 33.04 - samples/sec: 2074.21 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:53:28,695 epoch 10 - iter 396/447 - loss 0.00212426 - time (sec): 37.07 - samples/sec: 2075.22 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:53:33,014 epoch 10 - iter 440/447 - loss 0.00225616 - time (sec): 41.39 - samples/sec: 2062.25 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:53:33,689 ----------------------------------------------------------------------------------------------------
2023-10-17 17:53:33,690 EPOCH 10 done: loss 0.0025 - lr: 0.000000
2023-10-17 17:53:45,285 DEV : loss 0.264967143535614 - f1-score (micro avg) 0.7935
2023-10-17 17:53:45,874 ----------------------------------------------------------------------------------------------------
2023-10-17 17:53:45,876 Loading model from best epoch ...
2023-10-17 17:53:48,322 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-17 17:53:54,591
Results:
- F-score (micro) 0.7531
- F-score (macro) 0.6677
- Accuracy 0.6203
By class:
              precision    recall  f1-score   support

         loc     0.8596    0.8523    0.8559       596
        pers     0.6992    0.7538    0.7254       333
         org     0.4615    0.5455    0.5000       132
        prod     0.5806    0.5455    0.5625        66
        time     0.7174    0.6735    0.6947        49

   micro avg     0.7414    0.7653    0.7531      1176
   macro avg     0.6637    0.6741    0.6677      1176
weighted avg     0.7479    0.7653    0.7558      1176
2023-10-17 17:53:54,591 ----------------------------------------------------------------------------------------------------
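The averaged rows of the final report can be cross-checked against the per-class rows. The following is a minimal sketch, not part of the training run: the `REPORT` dictionary and the helper names `f1`, `macro_f1` and `weighted_f1` are hypothetical and introduced only for illustration, with the numbers copied by hand from the "By class" table. Because the logged values are rounded to four decimals, recomputed figures should only be expected to agree to roughly three decimal places.

```python
# Per-class rows copied from the "By class" report in this log:
# label -> (precision, recall, f1-score, support)
REPORT = {
    "loc": (0.8596, 0.8523, 0.8559, 596),
    "pers": (0.6992, 0.7538, 0.7254, 333),
    "org": (0.4615, 0.5455, 0.5000, 132),
    "prod": (0.5806, 0.5455, 0.5625, 66),
    "time": (0.7174, 0.6735, 0.6947, 49),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(report: dict) -> float:
    """Unweighted mean of the per-class F1 scores."""
    return sum(row[2] for row in report.values()) / len(report)


def weighted_f1(report: dict) -> float:
    """Mean of the per-class F1 scores, weighted by class support."""
    total_support = sum(row[3] for row in report.values())
    return sum(row[2] * row[3] for row in report.values()) / total_support


if __name__ == "__main__":
    # Micro F1 recomputed from the logged micro-averaged precision and recall.
    print(f"micro F1    ~ {f1(0.7414, 0.7653):.4f}")
    print(f"macro F1    ~ {macro_f1(REPORT):.4f}")
    print(f"weighted F1 ~ {weighted_f1(REPORT):.4f}")
```

The macro average treats all five entity types equally, which is why it sits well below the micro average here: the low-scoring `org` and `prod` classes count as much as the much larger `loc` class.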