MalyO2 committed on
Commit 4ff1976 · verified · 1 Parent(s): 1ec2292

MalyO2/detr_finetune_aug_no_scheduler

README.md CHANGED
@@ -16,23 +16,19 @@ should probably proofread and complete it, then remove this comment. -->
16
 
17
  This model is a fine-tuned version of [facebook/detr-resnet-50-dc5](https://huggingface.co/facebook/detr-resnet-50-dc5) on the None dataset.
18
  It achieves the following results on the evaluation set:
19
- - Loss: 0.7887
20
- - Map: 0.55
21
- - Map 50: 0.6825
22
- - Map 75: 0.5932
23
  - Map Small: 0.0
24
- - Map Medium: 0.5352
25
- - Map Large: 0.7531
26
- - Mar 1: 0.1882
27
- - Mar 10: 0.6735
28
- - Mar 100: 0.7588
29
  - Mar Small: 0.0
30
- - Mar Medium: 0.7158
31
- - Mar Large: 0.9385
32
- - Map Object: -1.0
33
- - Mar 100 Object: -1.0
34
- - Map Balloon: 0.55
35
- - Mar 100 Balloon: 0.7588
36
 
37
  ## Model description
38
 
@@ -51,31 +47,59 @@ More information needed
51
  ### Training hyperparameters
52
 
53
  The following hyperparameters were used during training:
54
- - learning_rate: 3e-05
55
  - train_batch_size: 4
56
  - eval_batch_size: 4
57
  - seed: 42
58
  - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
59
  - lr_scheduler_type: linear
60
- - training_steps: 125
61
  - mixed_precision_training: Native AMP
62
 
63
  ### Training results
64
 
65
- | Training Loss | Epoch | Step | Validation Loss | Map | Map 50 | Map 75 | Map Small | Map Medium | Map Large | Mar 1 | Mar 10 | Mar 100 | Mar Small | Mar Medium | Mar Large | Map Object | Mar 100 Object | Map Balloon | Mar 100 Balloon |
66
- |:-------------:|:------:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:----------:|:---------:|:------:|:------:|:-------:|:---------:|:----------:|:---------:|:----------:|:--------------:|:-----------:|:---------------:|
67
- | 2.1236 | 0.7692 | 10 | 1.3396 | 0.0768 | 0.1002 | 0.0897 | 0.0 | 0.0966 | 0.1387 | 0.0765 | 0.3735 | 0.5647 | 0.0 | 0.3789 | 0.9231 | -1.0 | -1.0 | 0.0768 | 0.5647 |
68
- | 1.5088 | 1.5385 | 20 | 1.2730 | 0.1472 | 0.1875 | 0.1691 | 0.0 | 0.1297 | 0.2723 | 0.1059 | 0.3647 | 0.6618 | 0.0 | 0.5684 | 0.9 | -1.0 | -1.0 | 0.1472 | 0.6618 |
69
- | 1.3182 | 2.3077 | 30 | 1.2273 | 0.1816 | 0.2322 | 0.1918 | 0.0 | 0.2368 | 0.3423 | 0.1088 | 0.3941 | 0.6647 | 0.0 | 0.6053 | 0.8538 | -1.0 | -1.0 | 0.1816 | 0.6647 |
70
- | 1.365 | 3.0769 | 40 | 1.0452 | 0.2476 | 0.3019 | 0.2823 | 0.0 | 0.3035 | 0.4146 | 0.1118 | 0.4882 | 0.7559 | 0.0 | 0.7158 | 0.9308 | -1.0 | -1.0 | 0.2476 | 0.7559 |
71
- | 1.2013 | 3.8462 | 50 | 0.9825 | 0.3006 | 0.3891 | 0.3233 | 0.0 | 0.3747 | 0.496 | 0.1324 | 0.5265 | 0.7324 | 0.0 | 0.6737 | 0.9308 | -1.0 | -1.0 | 0.3006 | 0.7324 |
72
- | 1.3605 | 4.6154 | 60 | 0.9307 | 0.3655 | 0.4809 | 0.4024 | 0.0 | 0.3706 | 0.5922 | 0.1324 | 0.5471 | 0.7294 | 0.0 | 0.6684 | 0.9308 | -1.0 | -1.0 | 0.3655 | 0.7294 |
73
- | 1.0117 | 5.3846 | 70 | 0.8867 | 0.3834 | 0.5044 | 0.4222 | 0.0 | 0.4086 | 0.5963 | 0.1294 | 0.5882 | 0.7324 | 0.0 | 0.6737 | 0.9308 | -1.0 | -1.0 | 0.3834 | 0.7324 |
74
- | 1.1224 | 6.1538 | 80 | 0.8413 | 0.478 | 0.6138 | 0.5427 | 0.0 | 0.472 | 0.7053 | 0.1676 | 0.6265 | 0.7529 | 0.0 | 0.7053 | 0.9385 | -1.0 | -1.0 | 0.478 | 0.7529 |
75
- | 1.0109 | 6.9231 | 90 | 0.8210 | 0.5281 | 0.6515 | 0.5817 | 0.0 | 0.5391 | 0.7497 | 0.1559 | 0.6441 | 0.7735 | 0.0 | 0.7316 | 0.9538 | -1.0 | -1.0 | 0.5281 | 0.7735 |
76
- | 1.0771 | 7.6923 | 100 | 0.8153 | 0.5506 | 0.6859 | 0.604 | 0.0 | 0.5638 | 0.7373 | 0.1794 | 0.6618 | 0.7676 | 0.0 | 0.7263 | 0.9462 | -1.0 | -1.0 | 0.5506 | 0.7676 |
77
- | 0.9122 | 8.4615 | 110 | 0.7948 | 0.5551 | 0.6839 | 0.6097 | 0.0 | 0.5603 | 0.7503 | 0.1853 | 0.6618 | 0.7824 | 0.0 | 0.7526 | 0.9462 | -1.0 | -1.0 | 0.5551 | 0.7824 |
78
- | 0.9918 | 9.2308 | 120 | 0.7887 | 0.55 | 0.6825 | 0.5932 | 0.0 | 0.5352 | 0.7531 | 0.1882 | 0.6735 | 0.7588 | 0.0 | 0.7158 | 0.9385 | -1.0 | -1.0 | 0.55 | 0.7588 |
79
 
80
 
81
  ### Framework versions
 
16
 
17
  This model is a fine-tuned version of [facebook/detr-resnet-50-dc5](https://huggingface.co/facebook/detr-resnet-50-dc5) on the None dataset.
18
  It achieves the following results on the evaluation set:
19
+ - Loss: 0.5836
20
+ - Map: 0.5257
21
+ - Map 50: 0.6508
22
+ - Map 75: 0.6241
23
  - Map Small: 0.0
24
+ - Map Medium: 0.4752
25
+ - Map Large: 0.7513
26
+ - Mar 1: 0.1853
27
+ - Mar 10: 0.6
28
+ - Mar 100: 0.7147
29
  - Mar Small: 0.0
30
+ - Mar Medium: 0.6684
31
+ - Mar Large: 0.8923
32
 
33
  ## Model description
34
 
 
47
  ### Training hyperparameters
48
 
49
  The following hyperparameters were used during training:
50
+ - learning_rate: 5e-05
51
  - train_batch_size: 4
52
  - eval_batch_size: 4
53
  - seed: 42
54
  - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
55
  - lr_scheduler_type: linear
56
+ - training_steps: 400
57
  - mixed_precision_training: Native AMP
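The hyperparameters above combine `lr_scheduler_type: linear` with `training_steps: 400` and a peak `learning_rate` of 5e-05, and the config logged in `wandb/debug.log` records `warmup_steps: 0`. A minimal sketch of the learning-rate curve this implies, assuming the usual linear warmup followed by linear decay to zero. The function name `linear_lr` is hypothetical and this is not the training script itself:

```python
def linear_lr(step, base_lr=5e-05, total_steps=400, warmup_steps=0):
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps (unused here: warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 100, 200, 300, 400):
    print(step, linear_lr(step))
```

Under these assumptions the rate is already half its peak by step 200 and reaches zero at step 400.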
58
 
59
  ### Training results
60
 
61
+ | Training Loss | Epoch | Step | Validation Loss | Map | Map 50 | Map 75 | Map Small | Map Medium | Map Large | Mar 1 | Mar 10 | Mar 100 | Mar Small | Mar Medium | Mar Large |
62
+ |:-------------:|:-------:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:----------:|:---------:|:------:|:------:|:-------:|:---------:|:----------:|:---------:|
63
+ | 4.1002 | 0.7692 | 10 | 4.1741 | 0.0003 | 0.001 | 0.0003 | 0.0 | 0.0062 | 0.0002 | 0.0 | 0.0 | 0.0441 | 0.0 | 0.0474 | 0.0462 |
64
+ | 1.772 | 1.5385 | 20 | 1.4577 | 0.0298 | 0.05 | 0.0286 | 0.0 | 0.0185 | 0.0656 | 0.0294 | 0.1206 | 0.4882 | 0.0 | 0.3421 | 0.7769 |
65
+ | 1.5665 | 2.3077 | 30 | 1.3869 | 0.0339 | 0.0549 | 0.0351 | 0.0 | 0.0407 | 0.0516 | 0.0029 | 0.0824 | 0.6059 | 0.0 | 0.5158 | 0.8308 |
66
+ | 2.0258 | 3.0769 | 40 | 1.2246 | 0.0561 | 0.0797 | 0.0593 | 0.0 | 0.0398 | 0.1166 | 0.0265 | 0.1206 | 0.6441 | 0.0 | 0.5789 | 0.8385 |
67
+ | 1.5082 | 3.8462 | 50 | 1.1988 | 0.0477 | 0.0869 | 0.0542 | 0.0 | 0.0927 | 0.063 | 0.0235 | 0.0853 | 0.6471 | 0.0 | 0.6316 | 0.7692 |
68
+ | 1.3716 | 4.6154 | 60 | 1.1917 | 0.0549 | 0.1014 | 0.0602 | 0.0 | 0.0902 | 0.0761 | 0.0588 | 0.1618 | 0.5971 | 0.0 | 0.5421 | 0.7692 |
69
+ | 1.2398 | 5.3846 | 70 | 1.0554 | 0.1329 | 0.1674 | 0.1485 | 0.0 | 0.1462 | 0.1957 | 0.0765 | 0.1882 | 0.7294 | 0.0 | 0.7474 | 0.8154 |
70
+ | 1.401 | 6.1538 | 80 | 0.9179 | 0.1176 | 0.1821 | 0.1315 | 0.0 | 0.0835 | 0.2295 | 0.0529 | 0.1794 | 0.7294 | 0.0 | 0.7211 | 0.8538 |
71
+ | 2.0328 | 6.9231 | 90 | 0.9198 | 0.1361 | 0.2109 | 0.1554 | 0.0 | 0.0937 | 0.2424 | 0.0559 | 0.2088 | 0.6882 | 0.0 | 0.6368 | 0.8692 |
72
+ | 1.6358 | 7.6923 | 100 | 0.9298 | 0.2252 | 0.2898 | 0.2523 | 0.0 | 0.2279 | 0.3487 | 0.1059 | 0.3176 | 0.6882 | 0.0 | 0.6263 | 0.8846 |
73
+ | 0.8849 | 8.4615 | 110 | 0.8894 | 0.1893 | 0.2435 | 0.2248 | 0.0 | 0.1438 | 0.3337 | 0.0971 | 0.2265 | 0.7265 | 0.0 | 0.7263 | 0.8385 |
74
+ | 1.1906 | 9.2308 | 120 | 0.8505 | 0.2105 | 0.2704 | 0.2598 | 0.0 | 0.1879 | 0.3317 | 0.1324 | 0.2706 | 0.6853 | 0.0 | 0.6474 | 0.8462 |
75
+ | 1.0404 | 10.0 | 130 | 0.7320 | 0.2508 | 0.2998 | 0.29 | 0.0 | 0.2031 | 0.4149 | 0.1588 | 0.2971 | 0.7471 | 0.0 | 0.7421 | 0.8692 |
76
+ | 1.1534 | 10.7692 | 140 | 0.7996 | 0.2832 | 0.374 | 0.3479 | 0.0 | 0.2502 | 0.411 | 0.1676 | 0.3647 | 0.6647 | 0.0 | 0.6263 | 0.8231 |
77
+ | 1.1725 | 11.5385 | 150 | 0.7990 | 0.3115 | 0.4464 | 0.3745 | 0.0 | 0.2972 | 0.4147 | 0.1294 | 0.3735 | 0.6588 | 0.0 | 0.6158 | 0.8231 |
78
+ | 0.891 | 12.3077 | 160 | 0.9007 | 0.2856 | 0.3519 | 0.3449 | 0.0 | 0.2607 | 0.3788 | 0.1029 | 0.3529 | 0.6735 | 0.0 | 0.6263 | 0.8462 |
79
+ | 1.1 | 13.0769 | 170 | 0.7376 | 0.2642 | 0.3608 | 0.3377 | 0.0 | 0.2281 | 0.4018 | 0.1176 | 0.3676 | 0.7176 | 0.0 | 0.7 | 0.8538 |
80
+ | 1.2631 | 13.8462 | 180 | 0.7162 | 0.306 | 0.4363 | 0.3899 | 0.0 | 0.2997 | 0.3933 | 0.1412 | 0.45 | 0.7059 | 0.0 | 0.7053 | 0.8154 |
81
+ | 1.0496 | 14.6154 | 190 | 0.7276 | 0.2811 | 0.3866 | 0.3483 | 0.0 | 0.3061 | 0.3685 | 0.1471 | 0.3882 | 0.7235 | 0.0 | 0.7316 | 0.8231 |
82
+ | 0.8883 | 15.3846 | 200 | 0.6855 | 0.3373 | 0.4578 | 0.4385 | 0.0 | 0.3441 | 0.4654 | 0.15 | 0.4824 | 0.7412 | 0.0 | 0.7579 | 0.8308 |
83
+ | 0.8471 | 16.1538 | 210 | 0.6733 | 0.4351 | 0.5932 | 0.5367 | 0.0 | 0.3702 | 0.6215 | 0.15 | 0.5412 | 0.7206 | 0.0 | 0.7158 | 0.8385 |
84
+ | 0.9084 | 16.9231 | 220 | 0.6526 | 0.4279 | 0.5632 | 0.4848 | 0.0 | 0.4011 | 0.572 | 0.1824 | 0.5647 | 0.7294 | 0.0 | 0.7105 | 0.8692 |
85
+ | 0.8872 | 17.6923 | 230 | 0.6218 | 0.4376 | 0.5753 | 0.5274 | 0.0 | 0.3879 | 0.6215 | 0.1559 | 0.5853 | 0.7382 | 0.0 | 0.7263 | 0.8692 |
86
+ | 0.9739 | 18.4615 | 240 | 0.6590 | 0.4494 | 0.6293 | 0.505 | 0.0 | 0.3889 | 0.65 | 0.1471 | 0.5853 | 0.7029 | 0.0 | 0.6895 | 0.8308 |
87
+ | 0.7596 | 19.2308 | 250 | 0.6367 | 0.4625 | 0.6229 | 0.5322 | 0.0 | 0.4106 | 0.6581 | 0.1529 | 0.5853 | 0.7118 | 0.0 | 0.7053 | 0.8308 |
88
+ | 0.7124 | 20.0 | 260 | 0.6601 | 0.4619 | 0.6411 | 0.5327 | 0.0 | 0.39 | 0.6852 | 0.1559 | 0.5765 | 0.6794 | 0.0 | 0.6421 | 0.8385 |
89
+ | 0.8369 | 20.7692 | 270 | 0.6363 | 0.4736 | 0.64 | 0.5738 | 0.0 | 0.3993 | 0.737 | 0.1559 | 0.5853 | 0.6853 | 0.0 | 0.6474 | 0.8462 |
90
+ | 0.8608 | 21.5385 | 280 | 0.6304 | 0.496 | 0.6406 | 0.5583 | 0.0 | 0.4484 | 0.6973 | 0.1588 | 0.5912 | 0.7 | 0.0 | 0.6579 | 0.8692 |
91
+ | 0.6174 | 22.3077 | 290 | 0.6825 | 0.4808 | 0.6714 | 0.5569 | 0.0 | 0.4264 | 0.6738 | 0.1529 | 0.5765 | 0.6735 | 0.0 | 0.6158 | 0.8615 |
92
+ | 0.5903 | 23.0769 | 300 | 0.6037 | 0.5187 | 0.6804 | 0.6126 | 0.0 | 0.4604 | 0.709 | 0.1824 | 0.6118 | 0.7206 | 0.0 | 0.6842 | 0.8846 |
93
+ | 0.6325 | 23.8462 | 310 | 0.6373 | 0.529 | 0.6819 | 0.6246 | 0.0 | 0.4489 | 0.7601 | 0.1765 | 0.5941 | 0.7088 | 0.0 | 0.6579 | 0.8923 |
94
+ | 0.8569 | 24.6154 | 320 | 0.6131 | 0.5382 | 0.6684 | 0.6357 | 0.0 | 0.4862 | 0.7382 | 0.1794 | 0.6147 | 0.7294 | 0.0 | 0.7 | 0.8846 |
95
+ | 0.7056 | 25.3846 | 330 | 0.5700 | 0.5244 | 0.6545 | 0.6089 | 0.0 | 0.4891 | 0.6871 | 0.1824 | 0.6176 | 0.75 | 0.0 | 0.7421 | 0.8769 |
96
+ | 0.5988 | 26.1538 | 340 | 0.5738 | 0.5437 | 0.7119 | 0.651 | 0.0 | 0.5362 | 0.6823 | 0.1853 | 0.6206 | 0.7529 | 0.0 | 0.7579 | 0.8615 |
97
+ | 0.5209 | 26.9231 | 350 | 0.6136 | 0.5153 | 0.6944 | 0.6047 | 0.0 | 0.4772 | 0.7054 | 0.1824 | 0.5882 | 0.7059 | 0.0 | 0.6789 | 0.8538 |
98
+ | 0.6547 | 27.6923 | 360 | 0.6338 | 0.5166 | 0.6645 | 0.6224 | 0.0 | 0.4842 | 0.7072 | 0.1882 | 0.5971 | 0.7088 | 0.0 | 0.6842 | 0.8538 |
99
+ | 0.6324 | 28.4615 | 370 | 0.6083 | 0.5143 | 0.6543 | 0.6279 | 0.0 | 0.4683 | 0.729 | 0.1853 | 0.6 | 0.7118 | 0.0 | 0.6789 | 0.8692 |
100
+ | 0.6323 | 29.2308 | 380 | 0.5748 | 0.529 | 0.6552 | 0.637 | 0.0 | 0.48 | 0.7529 | 0.1853 | 0.6088 | 0.7206 | 0.0 | 0.6842 | 0.8846 |
101
+ | 0.4509 | 30.0 | 390 | 0.5758 | 0.5311 | 0.652 | 0.6325 | 0.0 | 0.4923 | 0.7454 | 0.1882 | 0.6206 | 0.7324 | 0.0 | 0.7053 | 0.8846 |
102
+ | 0.8259 | 30.7692 | 400 | 0.5836 | 0.5257 | 0.6508 | 0.6241 | 0.0 | 0.4752 | 0.7513 | 0.1853 | 0.6 | 0.7147 | 0.0 | 0.6684 | 0.8923 |
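The headline metrics in the new README match the last row of this table (step 400), although a few earlier rows score slightly better. A minimal sketch for picking the best evaluation row, assuming the values are copied by hand from the table. `best_by` is a hypothetical helper, not part of any library:

```python
# (step, validation_loss, mAP) copied from selected rows of the training-results table
rows = [
    (300, 0.6037, 0.5187),
    (320, 0.6131, 0.5382),
    (330, 0.5700, 0.5244),
    (340, 0.5738, 0.5437),
    (390, 0.5758, 0.5311),
    (400, 0.5836, 0.5257),
]


def best_by(rows, column, higher_is_better):
    """Return the row with the best value in `column`."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


best_map = best_by(rows, column=2, higher_is_better=True)
best_loss = best_by(rows, column=1, higher_is_better=False)
print("highest mAP:", best_map)
print("lowest validation loss:", best_loss)
```

With these rows, step 340 has the highest mAP (0.5437) and step 330 the lowest validation loss (0.5700). The logged config sets `load_best_model_at_end` to `False`, so neither is selected automatically.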
103
 
104
 
105
  ### Framework versions
wandb/debug-internal.log CHANGED
@@ -14,3 +14,5 @@
14
  {"time":"2024-11-27T18:49:16.641816205Z","level":"INFO","msg":"Resuming system monitor"}
15
  {"time":"2024-11-27T18:49:17.996060396Z","level":"INFO","msg":"Pausing system monitor"}
16
  {"time":"2024-11-27T18:49:18.259742288Z","level":"INFO","msg":"Resuming system monitor"}
17
+ {"time":"2024-11-27T19:02:06.908291212Z","level":"INFO","msg":"Pausing system monitor"}
18
+ {"time":"2024-11-27T19:02:07.755593607Z","level":"INFO","msg":"Resuming system monitor"}
wandb/debug.log CHANGED
@@ -40,3 +40,6 @@ config: {'batch_size': 4, 'learning_rate': 0.0003, 'num_epochs': 10}
40
  2024-11-27 18:49:18,647 INFO MainThread:2090 [wandb_run.py:_config_callback():1387] config_cb None None {'use_timm_backbone': True, 'backbone_config': None, 'num_channels': 3, 'num_queries': 100, 'd_model': 256, 'encoder_ffn_dim': 2048, 'encoder_layers': 6, 'encoder_attention_heads': 8, 'decoder_ffn_dim': 2048, 'decoder_layers': 6, 'decoder_attention_heads': 8, 'dropout': 0.1, 'attention_dropout': 0.0, 'activation_dropout': 0.0, 'activation_function': 'relu', 'init_std': 0.02, 'init_xavier_std': 1.0, 'encoder_layerdrop': 0.0, 'decoder_layerdrop': 0.0, 'num_hidden_layers': 6, 'auxiliary_loss': False, 'position_embedding_type': 'sine', 'backbone': 'resnet50', 'use_pretrained_backbone': True, 'backbone_kwargs': {'output_stride': 16, 'out_indices': [1, 2, 3, 4], 'in_chans': 3}, 'dilation': True, 'class_cost': 1, 'bbox_cost': 5, 'giou_cost': 2, 'mask_loss_coefficient': 1, 'dice_loss_coefficient': 1, 'bbox_loss_coefficient': 5, 'giou_loss_coefficient': 2, 'eos_coefficient': 0.1, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': None, 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': True, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': True, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': 
None, 'architectures': ['DetrForObjectDetection'], 'finetuning_task': None, 'id2label': {0: 'object', 1: 'balloon'}, 'label2id': {'object': 0, 'balloon': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': None, 'pad_token_id': None, 'eos_token_id': None, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'facebook/detr-resnet-50-dc5', '_attn_implementation_autoset': True, 'transformers_version': '4.46.3', 'classifier_dropout': 0.0, 'max_position_embeddings': 1024, 'model_type': 'detr', 'scale_embedding': False, 'output_dir': '.', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'eval_strategy': 'steps', 'prediction_loss_only': False, 'per_device_train_batch_size': 4, 'per_device_eval_batch_size': 4, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'eval_delay': 0, 'torch_empty_cache_steps': None, 'learning_rate': 5e-05, 'weight_decay': 0.0001, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 3.0, 'max_steps': 400, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 0, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': './runs/Nov27_18-49-17_f5b68522d064', 'logging_strategy': 'steps', 'logging_first_step': False, 'logging_steps': 5, 'logging_nan_inf_filter': True, 'save_strategy': 'steps', 'save_steps': 10, 'save_total_limit': 2, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'restore_callback_states_from_checkpoint': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': 
False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 10, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '.', 'disable_tqdm': False, 'remove_unused_columns': False, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'adamw_torch', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'include_for_metrics': [], 'eval_do_concat_batches': True, 'fp16_backend': 'auto', 'evaluation_strategy': 'steps', 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 
'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None, 'batch_eval_metrics': True, 'eval_on_start': False, 'use_liger_kernel': False, 'eval_use_gather_object': False, 'average_tokens_across_devices': False}
41
  2024-11-27 18:49:18,653 INFO MainThread:2090 [wandb_config.py:__setitem__():154] config set model/num_parameters = 41501895 - <bound method Run._config_callback of <wandb.sdk.wandb_run.Run object at 0x7f507463aad0>>
42
  2024-11-27 18:49:18,653 INFO MainThread:2090 [wandb_run.py:_config_callback():1387] config_cb model/num_parameters 41501895 None
43
+ 2024-11-27 19:02:06,907 INFO MainThread:2090 [jupyter.py:save_ipynb():387] not saving jupyter notebook
44
+ 2024-11-27 19:02:06,907 INFO MainThread:2090 [wandb_init.py:_pause_backend():444] pausing backend
45
+ 2024-11-27 19:02:07,754 INFO MainThread:2090 [wandb_init.py:_resume_backend():449] resuming backend
wandb/run-20241127_184914-lig8s4o3/files/output.log CHANGED
@@ -6,3 +6,7 @@
6
  self.scaler = torch.cuda.amp.GradScaler(**kwargs)
7
  max_steps is given, it will override any value given in num_train_epochs
8
  wandb: WARNING The `run_name` is currently set to the same value as `TrainingArguments.output_dir`. If this was not intended, please specify a different run name by setting the `TrainingArguments.run_name` parameter.
9
+ The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
10
+ Token is valid (permission: write).
11
+ Your token has been saved to /root/.cache/huggingface/token
12
+ Login successful
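The message `max_steps is given, it will override any value given in num_train_epochs` explains why the README table runs to roughly 30.77 epochs even though the logged config has `num_train_epochs: 3.0`. A rough sketch of the arithmetic, assuming a single device and `gradient_accumulation_steps: 1` as in the logged config. The variable names are illustrative only:

```python
batch_size = 4      # per_device_train_batch_size in the logged config
max_steps = 400     # max_steps in the logged config

# The first README table row pairs step 10 with epoch 0.7692.
steps_per_epoch = round(10 / 0.7692)
total_epochs = max_steps / steps_per_epoch

# dataloader_drop_last is False, so steps_per_epoch = ceil(num_examples / batch_size).
min_examples = (steps_per_epoch - 1) * batch_size + 1
max_examples = steps_per_epoch * batch_size

print(steps_per_epoch, round(total_epochs, 4), (min_examples, max_examples))
# prints: 13 30.7692 (49, 52)
```

Under these assumptions the training split holds between 49 and 52 examples, and 400 steps cover the 30.7692 epochs shown in the last table row.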
wandb/run-20241127_184914-lig8s4o3/logs/debug-internal.log CHANGED
@@ -14,3 +14,5 @@
14
  {"time":"2024-11-27T18:49:16.641816205Z","level":"INFO","msg":"Resuming system monitor"}
15
  {"time":"2024-11-27T18:49:17.996060396Z","level":"INFO","msg":"Pausing system monitor"}
16
  {"time":"2024-11-27T18:49:18.259742288Z","level":"INFO","msg":"Resuming system monitor"}
17
+ {"time":"2024-11-27T19:02:06.908291212Z","level":"INFO","msg":"Pausing system monitor"}
18
+ {"time":"2024-11-27T19:02:07.755593607Z","level":"INFO","msg":"Resuming system monitor"}
wandb/run-20241127_184914-lig8s4o3/logs/debug.log CHANGED
@@ -40,3 +40,6 @@ config: {'batch_size': 4, 'learning_rate': 0.0003, 'num_epochs': 10}
40
  2024-11-27 18:49:18,647 INFO MainThread:2090 [wandb_run.py:_config_callback():1387] config_cb None None {'use_timm_backbone': True, 'backbone_config': None, 'num_channels': 3, 'num_queries': 100, 'd_model': 256, 'encoder_ffn_dim': 2048, 'encoder_layers': 6, 'encoder_attention_heads': 8, 'decoder_ffn_dim': 2048, 'decoder_layers': 6, 'decoder_attention_heads': 8, 'dropout': 0.1, 'attention_dropout': 0.0, 'activation_dropout': 0.0, 'activation_function': 'relu', 'init_std': 0.02, 'init_xavier_std': 1.0, 'encoder_layerdrop': 0.0, 'decoder_layerdrop': 0.0, 'num_hidden_layers': 6, 'auxiliary_loss': False, 'position_embedding_type': 'sine', 'backbone': 'resnet50', 'use_pretrained_backbone': True, 'backbone_kwargs': {'output_stride': 16, 'out_indices': [1, 2, 3, 4], 'in_chans': 3}, 'dilation': True, 'class_cost': 1, 'bbox_cost': 5, 'giou_cost': 2, 'mask_loss_coefficient': 1, 'dice_loss_coefficient': 1, 'bbox_loss_coefficient': 5, 'giou_loss_coefficient': 2, 'eos_coefficient': 0.1, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': None, 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': True, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': True, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': 
None, 'architectures': ['DetrForObjectDetection'], 'finetuning_task': None, 'id2label': {0: 'object', 1: 'balloon'}, 'label2id': {'object': 0, 'balloon': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': None, 'pad_token_id': None, 'eos_token_id': None, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'facebook/detr-resnet-50-dc5', '_attn_implementation_autoset': True, 'transformers_version': '4.46.3', 'classifier_dropout': 0.0, 'max_position_embeddings': 1024, 'model_type': 'detr', 'scale_embedding': False, 'output_dir': '.', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'eval_strategy': 'steps', 'prediction_loss_only': False, 'per_device_train_batch_size': 4, 'per_device_eval_batch_size': 4, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'eval_delay': 0, 'torch_empty_cache_steps': None, 'learning_rate': 5e-05, 'weight_decay': 0.0001, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 3.0, 'max_steps': 400, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 0, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': './runs/Nov27_18-49-17_f5b68522d064', 'logging_strategy': 'steps', 'logging_first_step': False, 'logging_steps': 5, 'logging_nan_inf_filter': True, 'save_strategy': 'steps', 'save_steps': 10, 'save_total_limit': 2, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'restore_callback_states_from_checkpoint': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': 
False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 10, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '.', 'disable_tqdm': False, 'remove_unused_columns': False, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'adamw_torch', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'include_for_metrics': [], 'eval_do_concat_batches': True, 'fp16_backend': 'auto', 'evaluation_strategy': 'steps', 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 
'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None, 'batch_eval_metrics': True, 'eval_on_start': False, 'use_liger_kernel': False, 'eval_use_gather_object': False, 'average_tokens_across_devices': False}
41
  2024-11-27 18:49:18,653 INFO MainThread:2090 [wandb_config.py:__setitem__():154] config set model/num_parameters = 41501895 - <bound method Run._config_callback of <wandb.sdk.wandb_run.Run object at 0x7f507463aad0>>
42
  2024-11-27 18:49:18,653 INFO MainThread:2090 [wandb_run.py:_config_callback():1387] config_cb model/num_parameters 41501895 None
43
+ 2024-11-27 19:02:06,907 INFO MainThread:2090 [jupyter.py:save_ipynb():387] not saving jupyter notebook
44
+ 2024-11-27 19:02:06,907 INFO MainThread:2090 [wandb_init.py:_pause_backend():444] pausing backend
45
+ 2024-11-27 19:02:07,754 INFO MainThread:2090 [wandb_init.py:_resume_backend():449] resuming backend