MalyO2/detr_finetune_aug_no_scheduler

Files changed (6)

README.md +55 -31
wandb/debug-internal.log +2 -0
wandb/debug.log +3 -0
wandb/run-20241127_184914-lig8s4o3/files/output.log +4 -0
wandb/run-20241127_184914-lig8s4o3/logs/debug-internal.log +2 -0
wandb/run-20241127_184914-lig8s4o3/logs/debug.log +3 -0

README.md CHANGED

@@ -16,23 +16,19 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [facebook/detr-resnet-50-dc5](https://huggingface.co/facebook/detr-resnet-50-dc5) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.7887
-- Map: 0.55
-- Map 50: 0.6825
-- Map 75: 0.5932
 - Map Small: 0.0
-- Map Medium: 0.5352
-- Map Large: 0.7531
-- Mar 1: 0.1882
-- Mar 10: 0.6735
-- Mar 100: 0.7588
 - Mar Small: 0.0
-- Mar Medium: 0.7158
-- Mar Large: 0.9385
-- Map Object: -1.0
-- Mar 100 Object: -1.0
-- Map Balloon: 0.55
-- Mar 100 Balloon: 0.7588
 ## Model description
@@ -51,31 +47,59 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 3e-05
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- training_steps: 125
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch  | Step | Validation Loss | Map    | Map 50 | Map 75 | Map Small | Map Medium | Map Large | Mar 1  | Mar 10 | Mar 100 | Mar Small | Mar Medium | Mar Large | Map Object | Mar 100 Object | Map Balloon | Mar 100 Balloon |
-|:-------------:|:------:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:----------:|:---------:|:------:|:------:|:-------:|:---------:|:----------:|:---------:|:----------:|:--------------:|:-----------:|:---------------:|
-| 2.1236        | 0.7692 | 10   | 1.3396          | 0.0768 | 0.1002 | 0.0897 | 0.0       | 0.0966     | 0.1387    | 0.0765 | 0.3735 | 0.5647  | 0.0       | 0.3789     | 0.9231    | -1.0       | -1.0           | 0.0768      | 0.5647          |
-| 1.5088        | 1.5385 | 20   | 1.2730          | 0.1472 | 0.1875 | 0.1691 | 0.0       | 0.1297     | 0.2723    | 0.1059 | 0.3647 | 0.6618  | 0.0       | 0.5684     | 0.9       | -1.0       | -1.0           | 0.1472      | 0.6618          |
-| 1.3182        | 2.3077 | 30   | 1.2273          | 0.1816 | 0.2322 | 0.1918 | 0.0       | 0.2368     | 0.3423    | 0.1088 | 0.3941 | 0.6647  | 0.0       | 0.6053     | 0.8538    | -1.0       | -1.0           | 0.1816      | 0.6647          |
-| 1.365         | 3.0769 | 40   | 1.0452          | 0.2476 | 0.3019 | 0.2823 | 0.0       | 0.3035     | 0.4146    | 0.1118 | 0.4882 | 0.7559  | 0.0       | 0.7158     | 0.9308    | -1.0       | -1.0           | 0.2476      | 0.7559          |
-| 1.2013        | 3.8462 | 50   | 0.9825          | 0.3006 | 0.3891 | 0.3233 | 0.0       | 0.3747     | 0.496     | 0.1324 | 0.5265 | 0.7324  | 0.0       | 0.6737     | 0.9308    | -1.0       | -1.0           | 0.3006      | 0.7324          |
-| 1.3605        | 4.6154 | 60   | 0.9307          | 0.3655 | 0.4809 | 0.4024 | 0.0       | 0.3706     | 0.5922    | 0.1324 | 0.5471 | 0.7294  | 0.0       | 0.6684     | 0.9308    | -1.0       | -1.0           | 0.3655      | 0.7294          |
-| 1.0117        | 5.3846 | 70   | 0.8867          | 0.3834 | 0.5044 | 0.4222 | 0.0       | 0.4086     | 0.5963    | 0.1294 | 0.5882 | 0.7324  | 0.0       | 0.6737     | 0.9308    | -1.0       | -1.0           | 0.3834      | 0.7324          |
-| 1.1224        | 6.1538 | 80   | 0.8413          | 0.478  | 0.6138 | 0.5427 | 0.0       | 0.472      | 0.7053    | 0.1676 | 0.6265 | 0.7529  | 0.0       | 0.7053     | 0.9385    | -1.0       | -1.0           | 0.478       | 0.7529          |
-| 1.0109        | 6.9231 | 90   | 0.8210          | 0.5281 | 0.6515 | 0.5817 | 0.0       | 0.5391     | 0.7497    | 0.1559 | 0.6441 | 0.7735  | 0.0       | 0.7316     | 0.9538    | -1.0       | -1.0           | 0.5281      | 0.7735          |
-| 1.0771        | 7.6923 | 100  | 0.8153          | 0.5506 | 0.6859 | 0.604  | 0.0       | 0.5638     | 0.7373    | 0.1794 | 0.6618 | 0.7676  | 0.0       | 0.7263     | 0.9462    | -1.0       | -1.0           | 0.5506      | 0.7676          |
-| 0.9122        | 8.4615 | 110  | 0.7948          | 0.5551 | 0.6839 | 0.6097 | 0.0       | 0.5603     | 0.7503    | 0.1853 | 0.6618 | 0.7824  | 0.0       | 0.7526     | 0.9462    | -1.0       | -1.0           | 0.5551      | 0.7824          |
-| 0.9918        | 9.2308 | 120  | 0.7887          | 0.55   | 0.6825 | 0.5932 | 0.0       | 0.5352     | 0.7531    | 0.1882 | 0.6735 | 0.7588  | 0.0       | 0.7158     | 0.9385    | -1.0       | -1.0           | 0.55        | 0.7588          |
 ### Framework versions

 This model is a fine-tuned version of [facebook/detr-resnet-50-dc5](https://huggingface.co/facebook/detr-resnet-50-dc5) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.5836
+- Map: 0.5257
+- Map 50: 0.6508
+- Map 75: 0.6241
 - Map Small: 0.0
+- Map Medium: 0.4752
+- Map Large: 0.7513
+- Mar 1: 0.1853
+- Mar 10: 0.6
+- Mar 100: 0.7147
 - Mar Small: 0.0
+- Mar Medium: 0.6684
+- Mar Large: 0.8923
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 5e-05
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+- training_steps: 400
 - mixed_precision_training: Native AMP
 ### Training results
+| Training Loss | Epoch   | Step | Validation Loss | Map    | Map 50 | Map 75 | Map Small | Map Medium | Map Large | Mar 1  | Mar 10 | Mar 100 | Mar Small | Mar Medium | Mar Large |
+|:-------------:|:-------:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:----------:|:---------:|:------:|:------:|:-------:|:---------:|:----------:|:---------:|
+| 4.1002        | 0.7692  | 10   | 4.1741          | 0.0003 | 0.001  | 0.0003 | 0.0       | 0.0062     | 0.0002    | 0.0    | 0.0    | 0.0441  | 0.0       | 0.0474     | 0.0462    |
+| 1.772         | 1.5385  | 20   | 1.4577          | 0.0298 | 0.05   | 0.0286 | 0.0       | 0.0185     | 0.0656    | 0.0294 | 0.1206 | 0.4882  | 0.0       | 0.3421     | 0.7769    |
+| 1.5665        | 2.3077  | 30   | 1.3869          | 0.0339 | 0.0549 | 0.0351 | 0.0       | 0.0407     | 0.0516    | 0.0029 | 0.0824 | 0.6059  | 0.0       | 0.5158     | 0.8308    |
+| 2.0258        | 3.0769  | 40   | 1.2246          | 0.0561 | 0.0797 | 0.0593 | 0.0       | 0.0398     | 0.1166    | 0.0265 | 0.1206 | 0.6441  | 0.0       | 0.5789     | 0.8385    |
+| 1.5082        | 3.8462  | 50   | 1.1988          | 0.0477 | 0.0869 | 0.0542 | 0.0       | 0.0927     | 0.063     | 0.0235 | 0.0853 | 0.6471  | 0.0       | 0.6316     | 0.7692    |
+| 1.3716        | 4.6154  | 60   | 1.1917          | 0.0549 | 0.1014 | 0.0602 | 0.0       | 0.0902     | 0.0761    | 0.0588 | 0.1618 | 0.5971  | 0.0       | 0.5421     | 0.7692    |
+| 1.2398        | 5.3846  | 70   | 1.0554          | 0.1329 | 0.1674 | 0.1485 | 0.0       | 0.1462     | 0.1957    | 0.0765 | 0.1882 | 0.7294  | 0.0       | 0.7474     | 0.8154    |
+| 1.401         | 6.1538  | 80   | 0.9179          | 0.1176 | 0.1821 | 0.1315 | 0.0       | 0.0835     | 0.2295    | 0.0529 | 0.1794 | 0.7294  | 0.0       | 0.7211     | 0.8538    |
+| 2.0328        | 6.9231  | 90   | 0.9198          | 0.1361 | 0.2109 | 0.1554 | 0.0       | 0.0937     | 0.2424    | 0.0559 | 0.2088 | 0.6882  | 0.0       | 0.6368     | 0.8692    |
+| 1.6358        | 7.6923  | 100  | 0.9298          | 0.2252 | 0.2898 | 0.2523 | 0.0       | 0.2279     | 0.3487    | 0.1059 | 0.3176 | 0.6882  | 0.0       | 0.6263     | 0.8846    |
+| 0.8849        | 8.4615  | 110  | 0.8894          | 0.1893 | 0.2435 | 0.2248 | 0.0       | 0.1438     | 0.3337    | 0.0971 | 0.2265 | 0.7265  | 0.0       | 0.7263     | 0.8385    |
+| 1.1906        | 9.2308  | 120  | 0.8505          | 0.2105 | 0.2704 | 0.2598 | 0.0       | 0.1879     | 0.3317    | 0.1324 | 0.2706 | 0.6853  | 0.0       | 0.6474     | 0.8462    |
+| 1.0404        | 10.0    | 130  | 0.7320          | 0.2508 | 0.2998 | 0.29   | 0.0       | 0.2031     | 0.4149    | 0.1588 | 0.2971 | 0.7471  | 0.0       | 0.7421     | 0.8692    |
+| 1.1534        | 10.7692 | 140  | 0.7996          | 0.2832 | 0.374  | 0.3479 | 0.0       | 0.2502     | 0.411     | 0.1676 | 0.3647 | 0.6647  | 0.0       | 0.6263     | 0.8231    |
+| 1.1725        | 11.5385 | 150  | 0.7990          | 0.3115 | 0.4464 | 0.3745 | 0.0       | 0.2972     | 0.4147    | 0.1294 | 0.3735 | 0.6588  | 0.0       | 0.6158     | 0.8231    |
+| 0.891         | 12.3077 | 160  | 0.9007          | 0.2856 | 0.3519 | 0.3449 | 0.0       | 0.2607     | 0.3788    | 0.1029 | 0.3529 | 0.6735  | 0.0       | 0.6263     | 0.8462    |
+| 1.1           | 13.0769 | 170  | 0.7376          | 0.2642 | 0.3608 | 0.3377 | 0.0       | 0.2281     | 0.4018    | 0.1176 | 0.3676 | 0.7176  | 0.0       | 0.7        | 0.8538    |
+| 1.2631        | 13.8462 | 180  | 0.7162          | 0.306  | 0.4363 | 0.3899 | 0.0       | 0.2997     | 0.3933    | 0.1412 | 0.45   | 0.7059  | 0.0       | 0.7053     | 0.8154    |
+| 1.0496        | 14.6154 | 190  | 0.7276          | 0.2811 | 0.3866 | 0.3483 | 0.0       | 0.3061     | 0.3685    | 0.1471 | 0.3882 | 0.7235  | 0.0       | 0.7316     | 0.8231    |
+| 0.8883        | 15.3846 | 200  | 0.6855          | 0.3373 | 0.4578 | 0.4385 | 0.0       | 0.3441     | 0.4654    | 0.15   | 0.4824 | 0.7412  | 0.0       | 0.7579     | 0.8308    |
+| 0.8471        | 16.1538 | 210  | 0.6733          | 0.4351 | 0.5932 | 0.5367 | 0.0       | 0.3702     | 0.6215    | 0.15   | 0.5412 | 0.7206  | 0.0       | 0.7158     | 0.8385    |
+| 0.9084        | 16.9231 | 220  | 0.6526          | 0.4279 | 0.5632 | 0.4848 | 0.0       | 0.4011     | 0.572     | 0.1824 | 0.5647 | 0.7294  | 0.0       | 0.7105     | 0.8692    |
+| 0.8872        | 17.6923 | 230  | 0.6218          | 0.4376 | 0.5753 | 0.5274 | 0.0       | 0.3879     | 0.6215    | 0.1559 | 0.5853 | 0.7382  | 0.0       | 0.7263     | 0.8692    |
+| 0.9739        | 18.4615 | 240  | 0.6590          | 0.4494 | 0.6293 | 0.505  | 0.0       | 0.3889     | 0.65      | 0.1471 | 0.5853 | 0.7029  | 0.0       | 0.6895     | 0.8308    |
+| 0.7596        | 19.2308 | 250  | 0.6367          | 0.4625 | 0.6229 | 0.5322 | 0.0       | 0.4106     | 0.6581    | 0.1529 | 0.5853 | 0.7118  | 0.0       | 0.7053     | 0.8308    |
+| 0.7124        | 20.0    | 260  | 0.6601          | 0.4619 | 0.6411 | 0.5327 | 0.0       | 0.39       | 0.6852    | 0.1559 | 0.5765 | 0.6794  | 0.0       | 0.6421     | 0.8385    |
+| 0.8369        | 20.7692 | 270  | 0.6363          | 0.4736 | 0.64   | 0.5738 | 0.0       | 0.3993     | 0.737     | 0.1559 | 0.5853 | 0.6853  | 0.0       | 0.6474     | 0.8462    |
+| 0.8608        | 21.5385 | 280  | 0.6304          | 0.496  | 0.6406 | 0.5583 | 0.0       | 0.4484     | 0.6973    | 0.1588 | 0.5912 | 0.7     | 0.0       | 0.6579     | 0.8692    |
+| 0.6174        | 22.3077 | 290  | 0.6825          | 0.4808 | 0.6714 | 0.5569 | 0.0       | 0.4264     | 0.6738    | 0.1529 | 0.5765 | 0.6735  | 0.0       | 0.6158     | 0.8615    |
+| 0.5903        | 23.0769 | 300  | 0.6037          | 0.5187 | 0.6804 | 0.6126 | 0.0       | 0.4604     | 0.709     | 0.1824 | 0.6118 | 0.7206  | 0.0       | 0.6842     | 0.8846    |
+| 0.6325        | 23.8462 | 310  | 0.6373          | 0.529  | 0.6819 | 0.6246 | 0.0       | 0.4489     | 0.7601    | 0.1765 | 0.5941 | 0.7088  | 0.0       | 0.6579     | 0.8923    |
+| 0.8569        | 24.6154 | 320  | 0.6131          | 0.5382 | 0.6684 | 0.6357 | 0.0       | 0.4862     | 0.7382    | 0.1794 | 0.6147 | 0.7294  | 0.0       | 0.7        | 0.8846    |
+| 0.7056        | 25.3846 | 330  | 0.5700          | 0.5244 | 0.6545 | 0.6089 | 0.0       | 0.4891     | 0.6871    | 0.1824 | 0.6176 | 0.75    | 0.0       | 0.7421     | 0.8769    |
+| 0.5988        | 26.1538 | 340  | 0.5738          | 0.5437 | 0.7119 | 0.651  | 0.0       | 0.5362     | 0.6823    | 0.1853 | 0.6206 | 0.7529  | 0.0       | 0.7579     | 0.8615    |
+| 0.5209        | 26.9231 | 350  | 0.6136          | 0.5153 | 0.6944 | 0.6047 | 0.0       | 0.4772     | 0.7054    | 0.1824 | 0.5882 | 0.7059  | 0.0       | 0.6789     | 0.8538    |
+| 0.6547        | 27.6923 | 360  | 0.6338          | 0.5166 | 0.6645 | 0.6224 | 0.0       | 0.4842     | 0.7072    | 0.1882 | 0.5971 | 0.7088  | 0.0       | 0.6842     | 0.8538    |
+| 0.6324        | 28.4615 | 370  | 0.6083          | 0.5143 | 0.6543 | 0.6279 | 0.0       | 0.4683     | 0.729     | 0.1853 | 0.6    | 0.7118  | 0.0       | 0.6789     | 0.8692    |
+| 0.6323        | 29.2308 | 380  | 0.5748          | 0.529  | 0.6552 | 0.637  | 0.0       | 0.48       | 0.7529    | 0.1853 | 0.6088 | 0.7206  | 0.0       | 0.6842     | 0.8846    |
+| 0.4509        | 30.0    | 390  | 0.5758          | 0.5311 | 0.652  | 0.6325 | 0.0       | 0.4923     | 0.7454    | 0.1882 | 0.6206 | 0.7324  | 0.0       | 0.7053     | 0.8846    |
+| 0.8259        | 30.7692 | 400  | 0.5836          | 0.5257 | 0.6508 | 0.6241 | 0.0       | 0.4752     | 0.7513    | 0.1853 | 0.6    | 0.7147  | 0.0       | 0.6684     | 0.8923    |
 ### Framework versions
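
Note on the new results table: the summary metrics at the top of the updated README match the last row (step 400), and the logged config reports `load_best_model_at_end: False`, so the reported numbers are not necessarily the best ones seen during training. The following is a minimal sketch, using a handful of rows copied from the table above, of how the strongest evaluation step could be located. The helper name `best_row` is hypothetical and is not part of the training code.

```python
# A few (step, validation_loss, map) rows copied from the new results table.
ROWS = [
    (300, 0.6037, 0.5187),
    (320, 0.6131, 0.5382),
    (330, 0.5700, 0.5244),
    (340, 0.5738, 0.5437),
    (390, 0.5758, 0.5311),
    (400, 0.5836, 0.5257),
]


def best_row(rows, key_index, higher_is_better):
    """Return the row that is best under the chosen column (hypothetical helper)."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[key_index])


best_map = best_row(ROWS, key_index=2, higher_is_better=True)
best_loss = best_row(ROWS, key_index=1, higher_is_better=False)
print("highest Map:", best_map)
print("lowest validation loss:", best_loss)
```

On these rows the highest Map is at step 340 and the lowest validation loss is at step 330, both slightly ahead of the final step 400 that the summary reports.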

wandb/debug-internal.log CHANGED

@@ -14,3 +14,5 @@
 {"time":"2024-11-27T18:49:16.641816205Z","level":"INFO","msg":"Resuming system monitor"}
 {"time":"2024-11-27T18:49:17.996060396Z","level":"INFO","msg":"Pausing system monitor"}
 {"time":"2024-11-27T18:49:18.259742288Z","level":"INFO","msg":"Resuming system monitor"}
+{"time":"2024-11-27T19:02:06.908291212Z","level":"INFO","msg":"Pausing system monitor"}
+{"time":"2024-11-27T19:02:07.755593607Z","level":"INFO","msg":"Resuming system monitor"}

wandb/debug.log CHANGED

@@ -40,3 +40,6 @@ config: {'batch_size': 4, 'learning_rate': 0.0003, 'num_epochs': 10}
 2024-11-27 18:49:18,647 INFO    MainThread:2090 [wandb_run.py:_config_callback():1387] config_cb None None {'use_timm_backbone': True, 'backbone_config': None, 'num_channels': 3, 'num_queries': 100, 'd_model': 256, 'encoder_ffn_dim': 2048, 'encoder_layers': 6, 'encoder_attention_heads': 8, 'decoder_ffn_dim': 2048, 'decoder_layers': 6, 'decoder_attention_heads': 8, 'dropout': 0.1, 'attention_dropout': 0.0, 'activation_dropout': 0.0, 'activation_function': 'relu', 'init_std': 0.02, 'init_xavier_std': 1.0, 'encoder_layerdrop': 0.0, 'decoder_layerdrop': 0.0, 'num_hidden_layers': 6, 'auxiliary_loss': False, 'position_embedding_type': 'sine', 'backbone': 'resnet50', 'use_pretrained_backbone': True, 'backbone_kwargs': {'output_stride': 16, 'out_indices': [1, 2, 3, 4], 'in_chans': 3}, 'dilation': True, 'class_cost': 1, 'bbox_cost': 5, 'giou_cost': 2, 'mask_loss_coefficient': 1, 'dice_loss_coefficient': 1, 'bbox_loss_coefficient': 5, 'giou_loss_coefficient': 2, 'eos_coefficient': 0.1, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': None, 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': True, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': True, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': 
None, 'architectures': ['DetrForObjectDetection'], 'finetuning_task': None, 'id2label': {0: 'object', 1: 'balloon'}, 'label2id': {'object': 0, 'balloon': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': None, 'pad_token_id': None, 'eos_token_id': None, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'facebook/detr-resnet-50-dc5', '_attn_implementation_autoset': True, 'transformers_version': '4.46.3', 'classifier_dropout': 0.0, 'max_position_embeddings': 1024, 'model_type': 'detr', 'scale_embedding': False, 'output_dir': '.', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'eval_strategy': 'steps', 'prediction_loss_only': False, 'per_device_train_batch_size': 4, 'per_device_eval_batch_size': 4, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'eval_delay': 0, 'torch_empty_cache_steps': None, 'learning_rate': 5e-05, 'weight_decay': 0.0001, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 3.0, 'max_steps': 400, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 0, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': './runs/Nov27_18-49-17_f5b68522d064', 'logging_strategy': 'steps', 'logging_first_step': False, 'logging_steps': 5, 'logging_nan_inf_filter': True, 'save_strategy': 'steps', 'save_steps': 10, 'save_total_limit': 2, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'restore_callback_states_from_checkpoint': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': 
False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 10, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '.', 'disable_tqdm': False, 'remove_unused_columns': False, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'adamw_torch', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'include_for_metrics': [], 'eval_do_concat_batches': True, 'fp16_backend': 'auto', 'evaluation_strategy': 'steps', 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 
'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None, 'batch_eval_metrics': True, 'eval_on_start': False, 'use_liger_kernel': False, 'eval_use_gather_object': False, 'average_tokens_across_devices': False}
 2024-11-27 18:49:18,653 INFO    MainThread:2090 [wandb_config.py:__setitem__():154] config set model/num_parameters = 41501895 - <bound method Run._config_callback of <wandb.sdk.wandb_run.Run object at 0x7f507463aad0>>
 2024-11-27 18:49:18,653 INFO    MainThread:2090 [wandb_run.py:_config_callback():1387] config_cb model/num_parameters 41501895 None
+2024-11-27 19:02:06,907 INFO    MainThread:2090 [jupyter.py:save_ipynb():387] not saving jupyter notebook
+2024-11-27 19:02:06,907 INFO    MainThread:2090 [wandb_init.py:_pause_backend():444] pausing backend
+2024-11-27 19:02:07,754 INFO    MainThread:2090 [wandb_init.py:_resume_backend():449] resuming backend
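
Note on the logged config: it reports `lr_scheduler_type: 'linear'`, `warmup_steps: 0`, `learning_rate: 5e-05` and `max_steps: 400`, even though the repository name says "no_scheduler". The sketch below re-implements the usual linear warmup and decay rule in plain Python to show what learning rate those values imply at a given step. It is an illustration under that assumption, not the library's own code, and the function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=5e-05, max_steps=400, warmup_steps=0):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero.

    Defaults are the values from the logged config; the formula is an assumed
    re-implementation of a standard linear schedule, not code from this run.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


for step in (0, 100, 200, 400):
    print(step, linear_lr(step))
```

With no warmup, the rate starts at the full 5e-05, falls to half of that by step 200, and reaches zero at step 400.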

wandb/run-20241127_184914-lig8s4o3/files/output.log CHANGED

@@ -6,3 +6,7 @@
   self.scaler = torch.cuda.amp.GradScaler(**kwargs)
 max_steps is given, it will override any value given in num_train_epochs
 wandb: WARNING The `run_name` is currently set to the same value as `TrainingArguments.output_dir`. If this was not intended, please specify a different run name by setting the `TrainingArguments.run_name` parameter.
+The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
+Token is valid (permission: write).
+Your token has been saved to /root/.cache/huggingface/token
+Login successful
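
Note on the `max_steps` message above: the logged config has `num_train_epochs: 3.0` and `max_steps: 400`, and the results table in README.md ends at epoch 30.7692, so the step limit is what decided the run length. The arithmetic can be checked with a short sketch. It assumes a single device (the logs do not state the device count) together with the logged `per_device_train_batch_size: 4`, `gradient_accumulation_steps: 1` and `dataloader_drop_last: False`; the inferred dataset size is an estimate, not a value from the source.

```python
import math

# From the results table: step 130 is logged at epoch 10.0.
STEPS_PER_EPOCH = 130 / 10.0
MAX_STEPS = 400
BATCH_SIZE = 4  # per-device batch size; single device is an assumption

# Epochs actually covered when training stops at MAX_STEPS.
epochs_run = MAX_STEPS / STEPS_PER_EPOCH

# Training-set sizes consistent with 13 batches per epoch when the last
# partial batch is kept: ceil(n / BATCH_SIZE) == 13.
candidates = [n for n in range(1, 200) if math.ceil(n / BATCH_SIZE) == STEPS_PER_EPOCH]

print(round(epochs_run, 4))
print(candidates[0], candidates[-1])
```

Under these assumptions the run covers about 30.77 epochs, and the training split holds between 49 and 52 images.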

wandb/run-20241127_184914-lig8s4o3/logs/debug-internal.log CHANGED

@@ -14,3 +14,5 @@
 {"time":"2024-11-27T18:49:16.641816205Z","level":"INFO","msg":"Resuming system monitor"}
 {"time":"2024-11-27T18:49:17.996060396Z","level":"INFO","msg":"Pausing system monitor"}
 {"time":"2024-11-27T18:49:18.259742288Z","level":"INFO","msg":"Resuming system monitor"}
+{"time":"2024-11-27T19:02:06.908291212Z","level":"INFO","msg":"Pausing system monitor"}
+{"time":"2024-11-27T19:02:07.755593607Z","level":"INFO","msg":"Resuming system monitor"}

wandb/run-20241127_184914-lig8s4o3/logs/debug.log CHANGED

@@ -40,3 +40,6 @@ config: {'batch_size': 4, 'learning_rate': 0.0003, 'num_epochs': 10}
 2024-11-27 18:49:18,647 INFO    MainThread:2090 [wandb_run.py:_config_callback():1387] config_cb None None {'use_timm_backbone': True, 'backbone_config': None, 'num_channels': 3, 'num_queries': 100, 'd_model': 256, 'encoder_ffn_dim': 2048, 'encoder_layers': 6, 'encoder_attention_heads': 8, 'decoder_ffn_dim': 2048, 'decoder_layers': 6, 'decoder_attention_heads': 8, 'dropout': 0.1, 'attention_dropout': 0.0, 'activation_dropout': 0.0, 'activation_function': 'relu', 'init_std': 0.02, 'init_xavier_std': 1.0, 'encoder_layerdrop': 0.0, 'decoder_layerdrop': 0.0, 'num_hidden_layers': 6, 'auxiliary_loss': False, 'position_embedding_type': 'sine', 'backbone': 'resnet50', 'use_pretrained_backbone': True, 'backbone_kwargs': {'output_stride': 16, 'out_indices': [1, 2, 3, 4], 'in_chans': 3}, 'dilation': True, 'class_cost': 1, 'bbox_cost': 5, 'giou_cost': 2, 'mask_loss_coefficient': 1, 'dice_loss_coefficient': 1, 'bbox_loss_coefficient': 5, 'giou_loss_coefficient': 2, 'eos_coefficient': 0.1, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': None, 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': True, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': True, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': 
None, 'architectures': ['DetrForObjectDetection'], 'finetuning_task': None, 'id2label': {0: 'object', 1: 'balloon'}, 'label2id': {'object': 0, 'balloon': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': None, 'pad_token_id': None, 'eos_token_id': None, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'facebook/detr-resnet-50-dc5', '_attn_implementation_autoset': True, 'transformers_version': '4.46.3', 'classifier_dropout': 0.0, 'max_position_embeddings': 1024, 'model_type': 'detr', 'scale_embedding': False, 'output_dir': '.', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'eval_strategy': 'steps', 'prediction_loss_only': False, 'per_device_train_batch_size': 4, 'per_device_eval_batch_size': 4, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'eval_delay': 0, 'torch_empty_cache_steps': None, 'learning_rate': 5e-05, 'weight_decay': 0.0001, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 3.0, 'max_steps': 400, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 0, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': './runs/Nov27_18-49-17_f5b68522d064', 'logging_strategy': 'steps', 'logging_first_step': False, 'logging_steps': 5, 'logging_nan_inf_filter': True, 'save_strategy': 'steps', 'save_steps': 10, 'save_total_limit': 2, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'restore_callback_states_from_checkpoint': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': 
False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 10, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '.', 'disable_tqdm': False, 'remove_unused_columns': False, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'adamw_torch', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'include_for_metrics': [], 'eval_do_concat_batches': True, 'fp16_backend': 'auto', 'evaluation_strategy': 'steps', 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 
'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None, 'batch_eval_metrics': True, 'eval_on_start': False, 'use_liger_kernel': False, 'eval_use_gather_object': False, 'average_tokens_across_devices': False}
 2024-11-27 18:49:18,653 INFO    MainThread:2090 [wandb_config.py:__setitem__():154] config set model/num_parameters = 41501895 - <bound method Run._config_callback of <wandb.sdk.wandb_run.Run object at 0x7f507463aad0>>
 2024-11-27 18:49:18,653 INFO    MainThread:2090 [wandb_run.py:_config_callback():1387] config_cb model/num_parameters 41501895 None

+2024-11-27 19:02:06,907 INFO    MainThread:2090 [jupyter.py:save_ipynb():387] not saving jupyter notebook
+2024-11-27 19:02:06,907 INFO    MainThread:2090 [wandb_init.py:_pause_backend():444] pausing backend
+2024-11-27 19:02:07,754 INFO    MainThread:2090 [wandb_init.py:_resume_backend():449] resuming backend