2023-10-17 12:07:24,738 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,739 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,739 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,739 Train: 7936 sentences
2023-10-17 12:07:24,739 (train_with_dev=False, train_with_test=False)
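
The corpus and model summary above correspond to a standard Flair fine-tuning setup. A minimal reproduction sketch, assuming the Flair 0.13-era API: the encoder name is inferred from the training base path further below, the pooling/CRF options from its suffixes, and the single corpus is used directly here instead of the MultiCorpus wrapper shown in the log.

from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

corpus = NER_ICDAR_EUROPEANA(language="fr")          # 7936 train / 992 dev / 992 test sentences
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # ELECTRA-style encoder, matching the printout above
    layers="-1",               # "layers-1" in the base path
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,  # 13-tag BIOES dictionary (O, S/B/E/I for PER, LOC, ORG)
    tag_type="ner",
    use_crf=False,              # "crfFalse" in the base path
    use_rnn=False,
)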
2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,739 Training Params:
2023-10-17 12:07:24,739 - learning_rate: "3e-05"
2023-10-17 12:07:24,739 - mini_batch_size: "8"
2023-10-17 12:07:24,739 - max_epochs: "10"
2023-10-17 12:07:24,739 - shuffle: "True"
2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,739 Plugins:
2023-10-17 12:07:24,739 - TensorboardLogger
2023-10-17 12:07:24,739 - LinearScheduler | warmup_fraction: '0.1'
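
With warmup_fraction 0.1 over 10 epochs of 992 mini-batches (9920 scheduler steps), the warmup spans roughly the first epoch, which matches the lr column below: the rate climbs to 3e-05 during epoch 1 and then decays linearly to 0 by the end of epoch 10. A small illustrative sketch of that schedule (an approximation, not Flair's exact scheduler code):

# One scheduler step per mini-batch; step counts taken from this log.
def linear_schedule_lr(step, base_lr=3e-05, steps_per_epoch=992, max_epochs=10, warmup_fraction=0.1):
    total_steps = steps_per_epoch * max_epochs          # 9920
    warmup_steps = int(warmup_fraction * total_steps)   # 992, roughly the whole first epoch
    if step < warmup_steps:
        return base_lr * step / warmup_steps             # ramp up: ~3e-06 at iter 99 of epoch 1
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay to 0 by epoch 10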
2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,740 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:07:24,740 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,740 Computation:
2023-10-17 12:07:24,740 - compute on device: cuda:0
2023-10-17 12:07:24,740 - embedding storage: none
2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,740 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
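
Continuing the reproduction sketch above, the parameters listed here map onto a single fine_tune call. This is a hedged sketch: ModelTrainer.fine_tune applies a linear schedule with 0.1 warmup by default, and the exact TensorBoard plugin wiring is omitted as an assumption.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=3e-05,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
)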
2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,740 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 12:07:30,633 epoch 1 - iter 99/992 - loss 2.81599692 - time (sec): 5.89 - samples/sec: 2704.36 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:07:37,189 epoch 1 - iter 198/992 - loss 1.62981031 - time (sec): 12.45 - samples/sec: 2621.75 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:07:43,470 epoch 1 - iter 297/992 - loss 1.19195552 - time (sec): 18.73 - samples/sec: 2606.95 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:07:49,903 epoch 1 - iter 396/992 - loss 0.94531236 - time (sec): 25.16 - samples/sec: 2599.13 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:07:56,328 epoch 1 - iter 495/992 - loss 0.78791628 - time (sec): 31.59 - samples/sec: 2619.30 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:08:02,769 epoch 1 - iter 594/992 - loss 0.69398148 - time (sec): 38.03 - samples/sec: 2604.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:08:09,093 epoch 1 - iter 693/992 - loss 0.61610072 - time (sec): 44.35 - samples/sec: 2617.66 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:08:15,034 epoch 1 - iter 792/992 - loss 0.56155201 - time (sec): 50.29 - samples/sec: 2618.04 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:08:20,806 epoch 1 - iter 891/992 - loss 0.51541319 - time (sec): 56.06 - samples/sec: 2632.79 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:08:26,516 epoch 1 - iter 990/992 - loss 0.47731115 - time (sec): 61.78 - samples/sec: 2650.78 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:08:26,615 ----------------------------------------------------------------------------------------------------
2023-10-17 12:08:26,615 EPOCH 1 done: loss 0.4767 - lr: 0.000030
2023-10-17 12:08:29,992 DEV : loss 0.09053196012973785 - f1-score (micro avg) 0.7046
2023-10-17 12:08:30,020 saving best model
2023-10-17 12:08:30,478 ----------------------------------------------------------------------------------------------------
2023-10-17 12:08:36,955 epoch 2 - iter 99/992 - loss 0.12503785 - time (sec): 6.47 - samples/sec: 2406.01 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:08:42,937 epoch 2 - iter 198/992 - loss 0.11258958 - time (sec): 12.46 - samples/sec: 2541.06 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:08:48,784 epoch 2 - iter 297/992 - loss 0.11476512 - time (sec): 18.30 - samples/sec: 2635.49 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:08:54,675 epoch 2 - iter 396/992 - loss 0.11579427 - time (sec): 24.19 - samples/sec: 2669.33 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:09:00,719 epoch 2 - iter 495/992 - loss 0.11485782 - time (sec): 30.24 - samples/sec: 2701.86 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:09:06,415 epoch 2 - iter 594/992 - loss 0.11334714 - time (sec): 35.93 - samples/sec: 2722.81 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:09:12,303 epoch 2 - iter 693/992 - loss 0.10946511 - time (sec): 41.82 - samples/sec: 2733.25 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:09:18,263 epoch 2 - iter 792/992 - loss 0.10780487 - time (sec): 47.78 - samples/sec: 2750.25 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:09:24,067 epoch 2 - iter 891/992 - loss 0.10649763 - time (sec): 53.59 - samples/sec: 2759.92 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:09:29,795 epoch 2 - iter 990/992 - loss 0.10582706 - time (sec): 59.31 - samples/sec: 2760.34 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:09:29,910 ----------------------------------------------------------------------------------------------------
2023-10-17 12:09:29,910 EPOCH 2 done: loss 0.1058 - lr: 0.000027
2023-10-17 12:09:34,536 DEV : loss 0.08198774605989456 - f1-score (micro avg) 0.7348
2023-10-17 12:09:34,563 saving best model
2023-10-17 12:09:35,254 ----------------------------------------------------------------------------------------------------
2023-10-17 12:09:41,960 epoch 3 - iter 99/992 - loss 0.07989941 - time (sec): 6.70 - samples/sec: 2509.07 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:09:48,280 epoch 3 - iter 198/992 - loss 0.07549915 - time (sec): 13.02 - samples/sec: 2531.43 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:09:54,730 epoch 3 - iter 297/992 - loss 0.07291906 - time (sec): 19.47 - samples/sec: 2557.87 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:10:00,729 epoch 3 - iter 396/992 - loss 0.07510826 - time (sec): 25.47 - samples/sec: 2542.56 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:10:06,998 epoch 3 - iter 495/992 - loss 0.07247563 - time (sec): 31.74 - samples/sec: 2560.55 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:10:13,353 epoch 3 - iter 594/992 - loss 0.07191519 - time (sec): 38.10 - samples/sec: 2559.42 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:10:20,186 epoch 3 - iter 693/992 - loss 0.07226050 - time (sec): 44.93 - samples/sec: 2567.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:10:26,714 epoch 3 - iter 792/992 - loss 0.07266178 - time (sec): 51.46 - samples/sec: 2563.10 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:10:32,557 epoch 3 - iter 891/992 - loss 0.07256785 - time (sec): 57.30 - samples/sec: 2570.93 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:10:38,603 epoch 3 - iter 990/992 - loss 0.07318580 - time (sec): 63.35 - samples/sec: 2583.60 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:10:38,731 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:38,732 EPOCH 3 done: loss 0.0731 - lr: 0.000023
2023-10-17 12:10:42,404 DEV : loss 0.08630654215812683 - f1-score (micro avg) 0.7676
2023-10-17 12:10:42,428 saving best model
2023-10-17 12:10:42,965 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:49,431 epoch 4 - iter 99/992 - loss 0.04416682 - time (sec): 6.46 - samples/sec: 2587.20 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:10:55,552 epoch 4 - iter 198/992 - loss 0.05047964 - time (sec): 12.58 - samples/sec: 2567.57 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:11:01,648 epoch 4 - iter 297/992 - loss 0.05348240 - time (sec): 18.68 - samples/sec: 2544.86 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:11:08,069 epoch 4 - iter 396/992 - loss 0.05389510 - time (sec): 25.10 - samples/sec: 2573.75 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:11:14,268 epoch 4 - iter 495/992 - loss 0.05314738 - time (sec): 31.30 - samples/sec: 2590.35 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:11:20,252 epoch 4 - iter 594/992 - loss 0.05290348 - time (sec): 37.28 - samples/sec: 2600.92 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:11:26,215 epoch 4 - iter 693/992 - loss 0.05269183 - time (sec): 43.25 - samples/sec: 2627.62 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:11:32,343 epoch 4 - iter 792/992 - loss 0.05275766 - time (sec): 49.37 - samples/sec: 2642.76 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:11:38,671 epoch 4 - iter 891/992 - loss 0.05126323 - time (sec): 55.70 - samples/sec: 2648.76 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:11:44,771 epoch 4 - iter 990/992 - loss 0.05192802 - time (sec): 61.80 - samples/sec: 2648.78 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:11:44,890 ----------------------------------------------------------------------------------------------------
2023-10-17 12:11:44,890 EPOCH 4 done: loss 0.0519 - lr: 0.000020
2023-10-17 12:11:48,459 DEV : loss 0.14457282423973083 - f1-score (micro avg) 0.7541
2023-10-17 12:11:48,482 ----------------------------------------------------------------------------------------------------
2023-10-17 12:11:54,656 epoch 5 - iter 99/992 - loss 0.04255254 - time (sec): 6.17 - samples/sec: 2746.18 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:12:00,775 epoch 5 - iter 198/992 - loss 0.03721103 - time (sec): 12.29 - samples/sec: 2715.66 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:12:06,602 epoch 5 - iter 297/992 - loss 0.03770170 - time (sec): 18.12 - samples/sec: 2739.09 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:12:12,537 epoch 5 - iter 396/992 - loss 0.04002274 - time (sec): 24.05 - samples/sec: 2729.65 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:12:18,393 epoch 5 - iter 495/992 - loss 0.04097590 - time (sec): 29.91 - samples/sec: 2731.15 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:12:24,013 epoch 5 - iter 594/992 - loss 0.04084997 - time (sec): 35.53 - samples/sec: 2731.78 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:12:30,275 epoch 5 - iter 693/992 - loss 0.04184421 - time (sec): 41.79 - samples/sec: 2728.74 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:12:36,500 epoch 5 - iter 792/992 - loss 0.04039337 - time (sec): 48.02 - samples/sec: 2724.95 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:12:42,422 epoch 5 - iter 891/992 - loss 0.04016748 - time (sec): 53.94 - samples/sec: 2723.89 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:12:48,687 epoch 5 - iter 990/992 - loss 0.04010609 - time (sec): 60.20 - samples/sec: 2718.44 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:12:48,815 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:48,815 EPOCH 5 done: loss 0.0400 - lr: 0.000017
2023-10-17 12:12:52,461 DEV : loss 0.1545393168926239 - f1-score (micro avg) 0.7457
2023-10-17 12:12:52,486 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:58,240 epoch 6 - iter 99/992 - loss 0.03053762 - time (sec): 5.75 - samples/sec: 2808.37 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:13:04,393 epoch 6 - iter 198/992 - loss 0.03077824 - time (sec): 11.91 - samples/sec: 2752.27 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:13:10,152 epoch 6 - iter 297/992 - loss 0.03119674 - time (sec): 17.66 - samples/sec: 2751.63 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:13:16,163 epoch 6 - iter 396/992 - loss 0.03144636 - time (sec): 23.68 - samples/sec: 2771.61 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:13:22,454 epoch 6 - iter 495/992 - loss 0.03099838 - time (sec): 29.97 - samples/sec: 2759.57 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:13:28,307 epoch 6 - iter 594/992 - loss 0.03087116 - time (sec): 35.82 - samples/sec: 2772.51 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:13:34,052 epoch 6 - iter 693/992 - loss 0.03073382 - time (sec): 41.56 - samples/sec: 2759.70 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:13:39,994 epoch 6 - iter 792/992 - loss 0.03046479 - time (sec): 47.51 - samples/sec: 2760.03 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:13:46,007 epoch 6 - iter 891/992 - loss 0.03053390 - time (sec): 53.52 - samples/sec: 2751.13 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:13:52,001 epoch 6 - iter 990/992 - loss 0.03080662 - time (sec): 59.51 - samples/sec: 2750.77 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:13:52,137 ----------------------------------------------------------------------------------------------------
2023-10-17 12:13:52,138 EPOCH 6 done: loss 0.0308 - lr: 0.000013
2023-10-17 12:13:55,802 DEV : loss 0.16906479001045227 - f1-score (micro avg) 0.7563
2023-10-17 12:13:55,829 ----------------------------------------------------------------------------------------------------
2023-10-17 12:14:02,579 epoch 7 - iter 99/992 - loss 0.03633557 - time (sec): 6.75 - samples/sec: 2451.08 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:14:08,575 epoch 7 - iter 198/992 - loss 0.02602492 - time (sec): 12.74 - samples/sec: 2612.37 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:14:14,831 epoch 7 - iter 297/992 - loss 0.02587295 - time (sec): 19.00 - samples/sec: 2642.57 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:14:20,871 epoch 7 - iter 396/992 - loss 0.02406875 - time (sec): 25.04 - samples/sec: 2668.88 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:14:26,876 epoch 7 - iter 495/992 - loss 0.02415969 - time (sec): 31.04 - samples/sec: 2694.00 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:14:32,581 epoch 7 - iter 594/992 - loss 0.02344326 - time (sec): 36.75 - samples/sec: 2705.12 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:14:38,429 epoch 7 - iter 693/992 - loss 0.02349137 - time (sec): 42.60 - samples/sec: 2700.29 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:14:44,666 epoch 7 - iter 792/992 - loss 0.02373935 - time (sec): 48.83 - samples/sec: 2687.24 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:14:50,524 epoch 7 - iter 891/992 - loss 0.02317591 - time (sec): 54.69 - samples/sec: 2685.02 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:14:56,875 epoch 7 - iter 990/992 - loss 0.02379480 - time (sec): 61.04 - samples/sec: 2681.55 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:14:56,989 ----------------------------------------------------------------------------------------------------
2023-10-17 12:14:56,989 EPOCH 7 done: loss 0.0239 - lr: 0.000010
2023-10-17 12:15:00,763 DEV : loss 0.19923286139965057 - f1-score (micro avg) 0.7584
2023-10-17 12:15:00,789 ----------------------------------------------------------------------------------------------------
2023-10-17 12:15:07,085 epoch 8 - iter 99/992 - loss 0.01766621 - time (sec): 6.29 - samples/sec: 2638.94 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:15:13,503 epoch 8 - iter 198/992 - loss 0.01948794 - time (sec): 12.71 - samples/sec: 2649.48 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:15:19,541 epoch 8 - iter 297/992 - loss 0.02060868 - time (sec): 18.75 - samples/sec: 2637.99 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:15:25,392 epoch 8 - iter 396/992 - loss 0.02079230 - time (sec): 24.60 - samples/sec: 2664.26 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:15:31,525 epoch 8 - iter 495/992 - loss 0.01946266 - time (sec): 30.73 - samples/sec: 2681.18 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:15:37,627 epoch 8 - iter 594/992 - loss 0.01905439 - time (sec): 36.84 - samples/sec: 2670.25 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:15:43,462 epoch 8 - iter 693/992 - loss 0.01877632 - time (sec): 42.67 - samples/sec: 2664.42 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:15:49,740 epoch 8 - iter 792/992 - loss 0.01826040 - time (sec): 48.95 - samples/sec: 2673.89 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:15:55,638 epoch 8 - iter 891/992 - loss 0.01771491 - time (sec): 54.85 - samples/sec: 2679.21 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:16:01,869 epoch 8 - iter 990/992 - loss 0.01711630 - time (sec): 61.08 - samples/sec: 2679.87 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:16:01,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:16:01,999 EPOCH 8 done: loss 0.0171 - lr: 0.000007
2023-10-17 12:16:05,753 DEV : loss 0.22457100450992584 - f1-score (micro avg) 0.7595
2023-10-17 12:16:05,780 ----------------------------------------------------------------------------------------------------
2023-10-17 12:16:11,813 epoch 9 - iter 99/992 - loss 0.00974185 - time (sec): 6.03 - samples/sec: 2591.16 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:16:17,738 epoch 9 - iter 198/992 - loss 0.01453937 - time (sec): 11.96 - samples/sec: 2666.01 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:16:24,024 epoch 9 - iter 297/992 - loss 0.01576870 - time (sec): 18.24 - samples/sec: 2645.28 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:16:30,258 epoch 9 - iter 396/992 - loss 0.01526670 - time (sec): 24.48 - samples/sec: 2658.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:16:36,310 epoch 9 - iter 495/992 - loss 0.01459426 - time (sec): 30.53 - samples/sec: 2658.40 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:16:42,624 epoch 9 - iter 594/992 - loss 0.01510246 - time (sec): 36.84 - samples/sec: 2670.70 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:16:48,677 epoch 9 - iter 693/992 - loss 0.01477998 - time (sec): 42.90 - samples/sec: 2684.08 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:16:54,722 epoch 9 - iter 792/992 - loss 0.01392476 - time (sec): 48.94 - samples/sec: 2680.28 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:17:00,939 epoch 9 - iter 891/992 - loss 0.01406745 - time (sec): 55.16 - samples/sec: 2677.23 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:17:07,036 epoch 9 - iter 990/992 - loss 0.01370319 - time (sec): 61.25 - samples/sec: 2673.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:17:07,145 ----------------------------------------------------------------------------------------------------
2023-10-17 12:17:07,145 EPOCH 9 done: loss 0.0137 - lr: 0.000003
2023-10-17 12:17:10,873 DEV : loss 0.2310420423746109 - f1-score (micro avg) 0.7613
2023-10-17 12:17:10,897 ----------------------------------------------------------------------------------------------------
2023-10-17 12:17:17,103 epoch 10 - iter 99/992 - loss 0.01103320 - time (sec): 6.20 - samples/sec: 2698.89 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:17:23,243 epoch 10 - iter 198/992 - loss 0.01140684 - time (sec): 12.34 - samples/sec: 2673.86 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:17:29,396 epoch 10 - iter 297/992 - loss 0.01229238 - time (sec): 18.50 - samples/sec: 2699.56 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:17:35,321 epoch 10 - iter 396/992 - loss 0.01136142 - time (sec): 24.42 - samples/sec: 2692.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:17:41,585 epoch 10 - iter 495/992 - loss 0.01008313 - time (sec): 30.69 - samples/sec: 2704.21 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:17:47,624 epoch 10 - iter 594/992 - loss 0.01014504 - time (sec): 36.72 - samples/sec: 2718.86 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:17:53,556 epoch 10 - iter 693/992 - loss 0.01023189 - time (sec): 42.66 - samples/sec: 2728.14 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:17:59,429 epoch 10 - iter 792/992 - loss 0.01009792 - time (sec): 48.53 - samples/sec: 2724.82 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:18:05,512 epoch 10 - iter 891/992 - loss 0.01058406 - time (sec): 54.61 - samples/sec: 2706.44 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:18:11,517 epoch 10 - iter 990/992 - loss 0.01066306 - time (sec): 60.62 - samples/sec: 2701.38 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:18:11,627 ----------------------------------------------------------------------------------------------------
2023-10-17 12:18:11,628 EPOCH 10 done: loss 0.0106 - lr: 0.000000
2023-10-17 12:18:15,865 DEV : loss 0.23796458542346954 - f1-score (micro avg) 0.7672
2023-10-17 12:18:16,334 ----------------------------------------------------------------------------------------------------
2023-10-17 12:18:16,336 Loading model from best epoch ...
2023-10-17 12:18:17,913 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 12:18:21,757
Results:
- F-score (micro) 0.7771
- F-score (macro) 0.697
- Accuracy 0.6629
By class:
              precision    recall  f1-score   support

         LOC     0.8358    0.8626    0.8490       655
         PER     0.6897    0.8072    0.7438       223
         ORG     0.4494    0.5591    0.4982       127

   micro avg     0.7452    0.8119    0.7771      1005
   macro avg     0.6583    0.7429    0.6970      1005
weighted avg     0.7545    0.8119    0.7813      1005
2023-10-17 12:18:21,758 ----------------------------------------------------------------------------------------------------
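
The final micro F1 is consistent with the reported precision and recall: 2 * 0.7452 * 0.8119 / (0.7452 + 0.8119) ≈ 0.7771. A hedged usage sketch for the saved tagger follows; the local best-model.pt path comes from this log, and the example sentence is an assumption, not part of the corpus.

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written at "saving best model" above (a Hub repo id would also work).
tagger = SequenceTagger.load("best-model.pt")

sentence = Sentence("Le Figaro est un journal fondé à Paris .")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value)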