2023-10-17 12:07:24,738 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,739 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,739 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,739 Train: 7936 sentences
2023-10-17 12:07:24,739 (train_with_dev=False, train_with_test=False)
2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,739 Training Params:
2023-10-17 12:07:24,739 - learning_rate: "3e-05"
2023-10-17 12:07:24,739 - mini_batch_size: "8"
2023-10-17 12:07:24,739 - max_epochs: "10"
2023-10-17 12:07:24,739 - shuffle: "True"
2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,739 Plugins:
2023-10-17 12:07:24,739 - TensorboardLogger
2023-10-17 12:07:24,739 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,740 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:07:24,740 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,740 Computation:
2023-10-17 12:07:24,740 - compute on device: cuda:0
2023-10-17 12:07:24,740 - embedding storage: none
2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,740 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:24,740 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 12:07:30,633 epoch 1 - iter 99/992 - loss 2.81599692 - time (sec): 5.89 - samples/sec: 2704.36 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:07:37,189 epoch 1 - iter 198/992 - loss 1.62981031 - time (sec): 12.45 - samples/sec: 2621.75 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:07:43,470 epoch 1 - iter 297/992 - loss 1.19195552 - time (sec): 18.73 - samples/sec: 2606.95 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:07:49,903 epoch 1 - iter 396/992 - loss 0.94531236 - time (sec): 25.16 - samples/sec: 2599.13 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:07:56,328 epoch 1 - iter 495/992 - loss 0.78791628 - time (sec): 31.59 - samples/sec: 2619.30 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:08:02,769 epoch 1 - iter 594/992 - loss 0.69398148 - time (sec): 38.03 - samples/sec: 2604.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:08:09,093 epoch 1 - iter 693/992 - loss 0.61610072 - time (sec): 44.35 - samples/sec: 2617.66 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:08:15,034 epoch 1 - iter 792/992 - loss 0.56155201 - time (sec): 50.29 - samples/sec: 2618.04 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:08:20,806 epoch 1 - iter 891/992 - loss 0.51541319 - time (sec): 56.06 - samples/sec: 2632.79 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:08:26,516 epoch 1 - iter 990/992 - loss 0.47731115 - time (sec): 61.78 - samples/sec: 2650.78 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:08:26,615 ----------------------------------------------------------------------------------------------------
2023-10-17 12:08:26,615 EPOCH 1 done: loss 0.4767 - lr: 0.000030
2023-10-17 12:08:29,992 DEV : loss 0.09053196012973785 - f1-score (micro avg) 0.7046
2023-10-17 12:08:30,020 saving best model
2023-10-17 12:08:30,478 ----------------------------------------------------------------------------------------------------
2023-10-17 12:08:36,955 epoch 2 - iter 99/992 - loss 0.12503785 - time (sec): 6.47 - samples/sec: 2406.01 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:08:42,937 epoch 2 - iter 198/992 - loss 0.11258958 - time (sec): 12.46 - samples/sec: 2541.06 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:08:48,784 epoch 2 - iter 297/992 - loss 0.11476512 - time (sec): 18.30 - samples/sec: 2635.49 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:08:54,675 epoch 2 - iter 396/992 - loss 0.11579427 - time (sec): 24.19 - samples/sec: 2669.33 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:09:00,719 epoch 2 - iter 495/992 - loss 0.11485782 - time (sec): 30.24 - samples/sec: 2701.86 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:09:06,415 epoch 2 - iter 594/992 - loss 0.11334714 - time (sec): 35.93 - samples/sec: 2722.81 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:09:12,303 epoch 2 - iter 693/992 - loss 0.10946511 - time (sec): 41.82 - samples/sec: 2733.25 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:09:18,263 epoch 2 - iter 792/992 - loss 0.10780487 - time (sec): 47.78 - samples/sec: 2750.25 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:09:24,067 epoch 2 - iter 891/992 - loss 0.10649763 - time (sec): 53.59 - samples/sec: 2759.92 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:09:29,795 epoch 2 - iter 990/992 - loss 0.10582706 - time (sec): 59.31 - samples/sec: 2760.34 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:09:29,910 ----------------------------------------------------------------------------------------------------
2023-10-17 12:09:29,910 EPOCH 2 done: loss 0.1058 - lr: 0.000027
2023-10-17 12:09:34,536 DEV : loss 0.08198774605989456 - f1-score (micro avg) 0.7348
2023-10-17 12:09:34,563 saving best model
2023-10-17 12:09:35,254 ----------------------------------------------------------------------------------------------------
2023-10-17 12:09:41,960 epoch 3 - iter 99/992 - loss 0.07989941 - time (sec): 6.70 - samples/sec: 2509.07 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:09:48,280 epoch 3 - iter 198/992 - loss 0.07549915 - time (sec): 13.02 - samples/sec: 2531.43 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:09:54,730 epoch 3 - iter 297/992 - loss 0.07291906 - time (sec): 19.47 - samples/sec: 2557.87 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:10:00,729 epoch 3 - iter 396/992 - loss 0.07510826 - time (sec): 25.47 - samples/sec: 2542.56 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:10:06,998 epoch 3 - iter 495/992 - loss 0.07247563 - time (sec): 31.74 - samples/sec: 2560.55 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:10:13,353 epoch 3 - iter 594/992 - loss 0.07191519 - time (sec): 38.10 - samples/sec: 2559.42 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:10:20,186 epoch 3 - iter 693/992 - loss 0.07226050 - time (sec): 44.93 - samples/sec: 2567.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:10:26,714 epoch 3 - iter 792/992 - loss 0.07266178 - time (sec): 51.46 - samples/sec: 2563.10 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:10:32,557 epoch 3 - iter 891/992 - loss 0.07256785 - time (sec): 57.30 - samples/sec: 2570.93 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:10:38,603 epoch 3 - iter 990/992 - loss 0.07318580 - time (sec): 63.35 - samples/sec: 2583.60 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:10:38,731 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:38,732 EPOCH 3 done: loss 0.0731 - lr: 0.000023
2023-10-17 12:10:42,404 DEV : loss 0.08630654215812683 - f1-score (micro avg) 0.7676
2023-10-17 12:10:42,428 saving best model
2023-10-17 12:10:42,965 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:49,431 epoch 4 - iter 99/992 - loss 0.04416682 - time (sec): 6.46 - samples/sec: 2587.20 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:10:55,552 epoch 4 - iter 198/992 - loss 0.05047964 - time (sec): 12.58 - samples/sec: 2567.57 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:11:01,648 epoch 4 - iter 297/992 - loss 0.05348240 - time (sec): 18.68 - samples/sec: 2544.86 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:11:08,069 epoch 4 - iter 396/992 - loss 0.05389510 - time (sec): 25.10 - samples/sec: 2573.75 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:11:14,268 epoch 4 - iter 495/992 - loss 0.05314738 - time (sec): 31.30 - samples/sec: 2590.35 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:11:20,252 epoch 4 - iter 594/992 - loss 0.05290348 - time (sec): 37.28 - samples/sec: 2600.92 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:11:26,215 epoch 4 - iter 693/992 - loss 0.05269183 - time (sec): 43.25 - samples/sec: 2627.62 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:11:32,343 epoch 4 - iter 792/992 - loss 0.05275766 - time (sec): 49.37 - samples/sec: 2642.76 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:11:38,671 epoch 4 - iter 891/992 - loss 0.05126323 - time (sec): 55.70 - samples/sec: 2648.76 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:11:44,771 epoch 4 - iter 990/992 - loss 0.05192802 - time (sec): 61.80 - samples/sec: 2648.78 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:11:44,890 ----------------------------------------------------------------------------------------------------
2023-10-17 12:11:44,890 EPOCH 4 done: loss 0.0519 - lr: 0.000020
2023-10-17 12:11:48,459 DEV : loss 0.14457282423973083 - f1-score (micro avg) 0.7541
2023-10-17 12:11:48,482 ----------------------------------------------------------------------------------------------------
2023-10-17 12:11:54,656 epoch 5 - iter 99/992 - loss 0.04255254 - time (sec): 6.17 - samples/sec: 2746.18 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:12:00,775 epoch 5 - iter 198/992 - loss 0.03721103 - time (sec): 12.29 - samples/sec: 2715.66 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:12:06,602 epoch 5 - iter 297/992 - loss 0.03770170 - time (sec): 18.12 - samples/sec: 2739.09 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:12:12,537 epoch 5 - iter 396/992 - loss 0.04002274 - time (sec): 24.05 - samples/sec: 2729.65 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:12:18,393 epoch 5 - iter 495/992 - loss 0.04097590 - time (sec): 29.91 - samples/sec: 2731.15 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:12:24,013 epoch 5 - iter 594/992 - loss 0.04084997 - time (sec): 35.53 - samples/sec: 2731.78 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:12:30,275 epoch 5 - iter 693/992 - loss 0.04184421 - time (sec): 41.79 - samples/sec: 2728.74 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:12:36,500 epoch 5 - iter 792/992 - loss 0.04039337 - time (sec): 48.02 - samples/sec: 2724.95 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:12:42,422 epoch 5 - iter 891/992 - loss 0.04016748 - time (sec): 53.94 - samples/sec: 2723.89 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:12:48,687 epoch 5 - iter 990/992 - loss 0.04010609 - time (sec): 60.20 - samples/sec: 2718.44 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:12:48,815 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:48,815 EPOCH 5 done: loss 0.0400 - lr: 0.000017
2023-10-17 12:12:52,461 DEV : loss 0.1545393168926239 - f1-score (micro avg) 0.7457
2023-10-17 12:12:52,486 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:58,240 epoch 6 - iter 99/992 - loss 0.03053762 - time (sec): 5.75 - samples/sec: 2808.37 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:13:04,393 epoch 6 - iter 198/992 - loss 0.03077824 - time (sec): 11.91 - samples/sec: 2752.27 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:13:10,152 epoch 6 - iter 297/992 - loss 0.03119674 - time (sec): 17.66 - samples/sec: 2751.63 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:13:16,163 epoch 6 - iter 396/992 - loss 0.03144636 - time (sec): 23.68 - samples/sec: 2771.61 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:13:22,454 epoch 6 - iter 495/992 - loss 0.03099838 - time (sec): 29.97 - samples/sec: 2759.57 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:13:28,307 epoch 6 - iter 594/992 - loss 0.03087116 - time (sec): 35.82 - samples/sec: 2772.51 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:13:34,052 epoch 6 - iter 693/992 - loss 0.03073382 - time (sec): 41.56 - samples/sec: 2759.70 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:13:39,994 epoch 6 - iter 792/992 - loss 0.03046479 - time (sec): 47.51 - samples/sec: 2760.03 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:13:46,007 epoch 6 - iter 891/992 - loss 0.03053390 - time (sec): 53.52 - samples/sec: 2751.13 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:13:52,001 epoch 6 - iter 990/992 - loss 0.03080662 - time (sec): 59.51 - samples/sec: 2750.77 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:13:52,137 ----------------------------------------------------------------------------------------------------
2023-10-17 12:13:52,138 EPOCH 6 done: loss 0.0308 - lr: 0.000013
2023-10-17 12:13:55,802 DEV : loss 0.16906479001045227 - f1-score (micro avg) 0.7563
2023-10-17 12:13:55,829 ----------------------------------------------------------------------------------------------------
2023-10-17 12:14:02,579 epoch 7 - iter 99/992 - loss 0.03633557 - time (sec): 6.75 - samples/sec: 2451.08 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:14:08,575 epoch 7 - iter 198/992 - loss 0.02602492 - time (sec): 12.74 - samples/sec: 2612.37 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:14:14,831 epoch 7 - iter 297/992 - loss 0.02587295 - time (sec): 19.00 - samples/sec: 2642.57 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:14:20,871 epoch 7 - iter 396/992 - loss 0.02406875 - time (sec): 25.04 - samples/sec: 2668.88 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:14:26,876 epoch 7 - iter 495/992 - loss 0.02415969 - time (sec): 31.04 - samples/sec: 2694.00 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:14:32,581 epoch 7 - iter 594/992 - loss 0.02344326 - time (sec): 36.75 - samples/sec: 2705.12 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:14:38,429 epoch 7 - iter 693/992 - loss 0.02349137 - time (sec): 42.60 - samples/sec: 2700.29 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:14:44,666 epoch 7 - iter 792/992 - loss 0.02373935 - time (sec): 48.83 - samples/sec: 2687.24 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:14:50,524 epoch 7 - iter 891/992 - loss 0.02317591 - time (sec): 54.69 - samples/sec: 2685.02 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:14:56,875 epoch 7 - iter 990/992 - loss 0.02379480 - time (sec): 61.04 - samples/sec: 2681.55 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:14:56,989 ----------------------------------------------------------------------------------------------------
2023-10-17 12:14:56,989 EPOCH 7 done: loss 0.0239 - lr: 0.000010
2023-10-17 12:15:00,763 DEV : loss 0.19923286139965057 - f1-score (micro avg) 0.7584
2023-10-17 12:15:00,789 ----------------------------------------------------------------------------------------------------
2023-10-17 12:15:07,085 epoch 8 - iter 99/992 - loss 0.01766621 - time (sec): 6.29 - samples/sec: 2638.94 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:15:13,503 epoch 8 - iter 198/992 - loss 0.01948794 - time (sec): 12.71 - samples/sec: 2649.48 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:15:19,541 epoch 8 - iter 297/992 - loss 0.02060868 - time (sec): 18.75 - samples/sec: 2637.99 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:15:25,392 epoch 8 - iter 396/992 - loss 0.02079230 - time (sec): 24.60 - samples/sec: 2664.26 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:15:31,525 epoch 8 - iter 495/992 - loss 0.01946266 - time (sec): 30.73 - samples/sec: 2681.18 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:15:37,627 epoch 8 - iter 594/992 - loss 0.01905439 - time (sec): 36.84 - samples/sec: 2670.25 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:15:43,462 epoch 8 - iter 693/992 - loss 0.01877632 - time (sec): 42.67 - samples/sec: 2664.42 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:15:49,740 epoch 8 - iter 792/992 - loss 0.01826040 - time (sec): 48.95 - samples/sec: 2673.89 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:15:55,638 epoch 8 - iter 891/992 - loss 0.01771491 - time (sec): 54.85 - samples/sec: 2679.21 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:16:01,869 epoch 8 - iter 990/992 - loss 0.01711630 - time (sec): 61.08 - samples/sec: 2679.87 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:16:01,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:16:01,999 EPOCH 8 done: loss 0.0171 - lr: 0.000007
2023-10-17 12:16:05,753 DEV : loss 0.22457100450992584 - f1-score (micro avg) 0.7595
2023-10-17 12:16:05,780 ----------------------------------------------------------------------------------------------------
2023-10-17 12:16:11,813 epoch 9 - iter 99/992 - loss 0.00974185 - time (sec): 6.03 - samples/sec: 2591.16 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:16:17,738 epoch 9 - iter 198/992 - loss 0.01453937 - time (sec): 11.96 - samples/sec: 2666.01 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:16:24,024 epoch 9 - iter 297/992 - loss 0.01576870 - time (sec): 18.24 - samples/sec: 2645.28 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:16:30,258 epoch 9 - iter 396/992 - loss 0.01526670 - time (sec): 24.48 - samples/sec: 2658.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:16:36,310 epoch 9 - iter 495/992 - loss 0.01459426 - time (sec): 30.53 - samples/sec: 2658.40 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:16:42,624 epoch 9 - iter 594/992 - loss 0.01510246 - time (sec): 36.84 - samples/sec: 2670.70 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:16:48,677 epoch 9 - iter 693/992 - loss 0.01477998 - time (sec): 42.90 - samples/sec: 2684.08 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:16:54,722 epoch 9 - iter 792/992 - loss 0.01392476 - time (sec): 48.94 - samples/sec: 2680.28 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:17:00,939 epoch 9 - iter 891/992 - loss 0.01406745 - time (sec): 55.16 - samples/sec: 2677.23 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:17:07,036 epoch 9 - iter 990/992 - loss 0.01370319 - time (sec): 61.25 - samples/sec: 2673.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:17:07,145 ----------------------------------------------------------------------------------------------------
2023-10-17 12:17:07,145 EPOCH 9 done: loss 0.0137 - lr: 0.000003
2023-10-17 12:17:10,873 DEV : loss 0.2310420423746109 - f1-score (micro avg) 0.7613
2023-10-17 12:17:10,897 ----------------------------------------------------------------------------------------------------
2023-10-17 12:17:17,103 epoch 10 - iter 99/992 - loss 0.01103320 - time (sec): 6.20 - samples/sec: 2698.89 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:17:23,243 epoch 10 - iter 198/992 - loss 0.01140684 - time (sec): 12.34 - samples/sec: 2673.86 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:17:29,396 epoch 10 - iter 297/992 - loss 0.01229238 - time (sec): 18.50 - samples/sec: 2699.56 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:17:35,321 epoch 10 - iter 396/992 - loss 0.01136142 - time (sec): 24.42 - samples/sec: 2692.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:17:41,585 epoch 10 - iter 495/992 - loss 0.01008313 - time (sec): 30.69 - samples/sec: 2704.21 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:17:47,624 epoch 10 - iter 594/992 - loss 0.01014504 - time (sec): 36.72 - samples/sec: 2718.86 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:17:53,556 epoch 10 - iter 693/992 - loss 0.01023189 - time (sec): 42.66 - samples/sec: 2728.14 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:17:59,429 epoch 10 - iter 792/992 - loss 0.01009792 - time (sec): 48.53 - samples/sec: 2724.82 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:18:05,512 epoch 10 - iter 891/992 - loss 0.01058406 - time (sec): 54.61 - samples/sec: 2706.44 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:18:11,517 epoch 10 - iter 990/992 - loss 0.01066306 - time (sec): 60.62 - samples/sec: 2701.38 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:18:11,627 ----------------------------------------------------------------------------------------------------
2023-10-17 12:18:11,628 EPOCH 10 done: loss 0.0106 - lr: 0.000000
2023-10-17 12:18:15,865 DEV : loss 0.23796458542346954 - f1-score (micro avg) 0.7672
2023-10-17 12:18:16,334 ----------------------------------------------------------------------------------------------------
2023-10-17 12:18:16,336 Loading model from best epoch ...
2023-10-17 12:18:17,913 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 12:18:21,757 Results:
- F-score (micro) 0.7771
- F-score (macro) 0.697
- Accuracy 0.6629

By class:
              precision    recall  f1-score   support

         LOC     0.8358    0.8626    0.8490       655
         PER     0.6897    0.8072    0.7438       223
         ORG     0.4494    0.5591    0.4982       127

   micro avg     0.7452    0.8119    0.7771      1005
   macro avg     0.6583    0.7429    0.6970      1005
weighted avg     0.7545    0.8119    0.7813      1005

2023-10-17 12:18:21,758 ----------------------------------------------------------------------------------------------------
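The per-iteration learning rates in the log match the LinearScheduler plugin's one-cycle shape: linear warmup over the first 10% of the 10 × 992 = 9,920 total steps (992 steps, i.e. exactly epoch 1) up to the peak of 3e-05, then linear decay to zero by the final step. A minimal sketch of that schedule (the function name and step bookkeeping are illustrative, not Flair's internal implementation):

```python
def linear_warmup_lr(step: int, total_steps: int, peak_lr: float,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: lr grows linearly from 0 to peak_lr
        return peak_lr * step / warmup_steps
    # decay phase: lr shrinks linearly, reaching 0 at the last step
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 992  # 10 epochs x 992 mini-batches of size 8
print(linear_warmup_lr(99, total, 3e-05))    # ~0.000003, as logged at epoch 1, iter 99
print(linear_warmup_lr(1982, total, 3e-05))  # ~0.000027, as logged at epoch 2, iter 990
```

Reading the logged `lr` column against this curve explains why the best dev score appears early: the model spends epochs 2-10 on a steadily shrinking learning rate.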
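The summary scores in the final evaluation follow from the per-class table in the standard way: micro F1 is the harmonic mean of the micro-averaged precision and recall, macro F1 is the unweighted mean of the per-class F1 scores, and the weighted average weights each class by its support. A quick sanity check, recomputed from the (already rounded) table entries, so agreement is to roughly three decimals:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, support) per class, from the final evaluation table
by_class = {"LOC": (0.8358, 0.8626, 655),
            "PER": (0.6897, 0.8072, 223),
            "ORG": (0.4494, 0.5591, 127)}

class_f1 = {name: f1(p, r) for name, (p, r, _) in by_class.items()}
macro_f1 = sum(class_f1.values()) / len(class_f1)
micro_f1 = f1(0.7452, 0.8119)  # micro-averaged precision/recall from the log
total_support = sum(n for _, _, n in by_class.values())  # 1005 entities
weighted_f1 = sum(f1(p, r) * n for p, r, n in by_class.values()) / total_support

# matches the logged 0.7771 / 0.697 / 0.7813 up to rounding
print(micro_f1, macro_f1, weighted_f1)
```

The gap between micro (0.7771) and macro (0.6970) F1 is driven almost entirely by the small, hard ORG class (F1 0.4982 on 127 entities).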