|
2024-04-24 15:43:39,468 INFO StreamThr :1848599 [internal.py:wandb_internal():86] W&B internal server running at pid: 1848599, started at: 2024-04-24 15:43:39.467078 |
|
2024-04-24 15:43:39,469 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: status |
|
2024-04-24 15:43:39,473 INFO WriterThread:1848599 [datastore.py:open_for_write():85] open: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/run-mwp0iutr.wandb |
|
2024-04-24 15:43:39,476 DEBUG SenderThread:1848599 [sender.py:send():382] send: header |
|
2024-04-24 15:43:39,521 DEBUG SenderThread:1848599 [sender.py:send():382] send: run |
|
2024-04-24 15:43:39,793 INFO SenderThread:1848599 [dir_watcher.py:__init__():211] watching files in: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files |
|
2024-04-24 15:43:39,793 INFO SenderThread:1848599 [sender.py:_start_run_threads():1136] run started: mwp0iutr with start time 1713973419.470656 |
|
2024-04-24 15:43:39,798 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: check_version |
|
2024-04-24 15:43:39,799 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: check_version |
|
2024-04-24 15:43:39,851 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: run_start |
|
2024-04-24 15:43:39,908 DEBUG HandlerThread:1848599 [system_info.py:__init__():32] System info init |
|
2024-04-24 15:43:39,908 DEBUG HandlerThread:1848599 [system_info.py:__init__():47] System info init done |
|
2024-04-24 15:43:39,908 INFO HandlerThread:1848599 [system_monitor.py:start():194] Starting system monitor |
|
2024-04-24 15:43:39,908 INFO SystemMonitor:1848599 [system_monitor.py:_start():158] Starting system asset monitoring threads |
|
2024-04-24 15:43:39,909 INFO HandlerThread:1848599 [system_monitor.py:probe():214] Collecting system info |
|
2024-04-24 15:43:39,909 INFO SystemMonitor:1848599 [interfaces.py:start():190] Started cpu monitoring |
|
2024-04-24 15:43:39,909 INFO SystemMonitor:1848599 [interfaces.py:start():190] Started disk monitoring |
|
2024-04-24 15:43:39,910 INFO SystemMonitor:1848599 [interfaces.py:start():190] Started gpu monitoring |
|
2024-04-24 15:43:39,911 INFO SystemMonitor:1848599 [interfaces.py:start():190] Started memory monitoring |
|
2024-04-24 15:43:39,911 INFO SystemMonitor:1848599 [interfaces.py:start():190] Started network monitoring |
|
2024-04-24 15:43:39,965 DEBUG HandlerThread:1848599 [system_info.py:probe():196] Probing system |
|
2024-04-24 15:43:39,967 DEBUG HandlerThread:1848599 [system_info.py:_probe_git():181] Probing git |
|
2024-04-24 15:43:39,987 DEBUG HandlerThread:1848599 [system_info.py:_probe_git():189] Probing git done |
|
2024-04-24 15:43:39,987 DEBUG HandlerThread:1848599 [system_info.py:probe():244] Probing system done |
|
2024-04-24 15:43:39,987 DEBUG HandlerThread:1848599 [system_monitor.py:probe():223] {'os': 'Linux-5.15.0-1048-aws-x86_64-with-glibc2.31', 'python': '3.11.5', 'heartbeatAt': '2024-04-24T15:43:39.965097', 'startedAt': '2024-04-24T15:43:39.449266', 'docker': None, 'cuda': None, 'args': ('config_full.yaml',), 'state': 'running', 'program': '/fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/run_sft.py', 'codePathLocal': 'run_sft.py', 'codePath': 'run_sft.py', 'git': {'remote': 'https://huggingface.co/sanchit-gandhi/distil-zephyr-1.5b-ssft-ultrachat', 'commit': 'cbea69c6b95c970317a1e47c3f614b55b33f8ed9'}, 'email': None, 'root': '/fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat', 'host': 'ip-26-0-162-233', 'username': 'sanchit', 'executable': '/fsx/sanchit/miniconda3/envs/venv/bin/python', 'cpu_count': 96, 'cpu_count_logical': 96, 'cpu_freq': {'current': 2721.9698645833337, 'min': 0.0, 'max': 0.0}, 'cpu_freq_per_core': [{'current': 3590.538, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 3595.996, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 3597.59, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 3399.936, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 3598.273, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 3597.284, 'min': 0.0, 'max': 0.0}, {'current': 3036.337, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 3597.887, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 3598.442, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}, {'current': 2650.0, 'min': 0.0, 'max': 0.0}], 'disk': {'/': {'total': 290.7472343444824, 'used': 59.263893127441406}}, 'gpu': 'NVIDIA H100 80GB HBM3', 'gpu_count': 8, 'gpu_devices': [{'name': 'NVIDIA H100 80GB HBM3', 'memory_total': 85520809984}, {'name': 'NVIDIA H100 80GB HBM3', 'memory_total': 85520809984}, {'name': 'NVIDIA H100 80GB HBM3', 'memory_total': 85520809984}, {'name': 'NVIDIA H100 80GB HBM3', 'memory_total': 85520809984}, {'name': 'NVIDIA H100 80GB HBM3', 'memory_total': 85520809984}, {'name': 'NVIDIA H100 80GB HBM3', 'memory_total': 85520809984}, {'name': 'NVIDIA H100 80GB HBM3', 'memory_total': 85520809984}, {'name': 'NVIDIA H100 80GB HBM3', 'memory_total': 85520809984}], 'memory': {'total': 1999.9855270385742}} |
|
2024-04-24 15:43:39,988 INFO HandlerThread:1848599 [system_monitor.py:probe():224] Finished collecting system info |
|
2024-04-24 15:43:39,988 INFO HandlerThread:1848599 [system_monitor.py:probe():227] Publishing system info |
|
2024-04-24 15:43:39,988 DEBUG HandlerThread:1848599 [system_info.py:_save_pip():52] Saving list of pip packages installed into the current environment |
|
2024-04-24 15:43:39,989 DEBUG HandlerThread:1848599 [system_info.py:_save_pip():68] Saving pip packages done |
|
2024-04-24 15:43:39,990 DEBUG HandlerThread:1848599 [system_info.py:_save_conda():75] Saving list of conda packages installed into the current environment |
|
2024-04-24 15:43:40,795 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_created():271] file/dir created: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/conda-environment.yaml |
|
2024-04-24 15:43:40,796 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_created():271] file/dir created: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/requirements.txt |
|
2024-04-24 15:43:45,799 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_modified():288] file/dir modified: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/conda-environment.yaml |
|
2024-04-24 15:43:45,805 DEBUG HandlerThread:1848599 [system_info.py:_save_conda():87] Saving conda packages done |
|
2024-04-24 15:43:45,807 INFO HandlerThread:1848599 [system_monitor.py:probe():229] Finished publishing system info |
|
2024-04-24 15:43:45,857 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: status_report |
|
2024-04-24 15:43:45,857 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: keepalive |
|
2024-04-24 15:43:45,858 DEBUG SenderThread:1848599 [sender.py:send():382] send: files |
|
2024-04-24 15:43:45,858 INFO SenderThread:1848599 [sender.py:_save_file():1392] saving file wandb-metadata.json with policy now |
|
2024-04-24 15:43:45,864 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: stop_status |
|
2024-04-24 15:43:45,865 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: stop_status |
|
2024-04-24 15:43:45,867 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: internal_messages |
|
2024-04-24 15:43:45,993 DEBUG SenderThread:1848599 [sender.py:send():382] send: telemetry |
|
2024-04-24 15:43:45,993 DEBUG SenderThread:1848599 [sender.py:send():382] send: config |
|
2024-04-24 15:43:45,993 DEBUG SenderThread:1848599 [sender.py:send():382] send: metric |
|
2024-04-24 15:43:45,994 DEBUG SenderThread:1848599 [sender.py:send():382] send: telemetry |
|
2024-04-24 15:43:45,994 DEBUG SenderThread:1848599 [sender.py:send():382] send: metric |
|
2024-04-24 15:43:45,994 WARNING SenderThread:1848599 [sender.py:send_metric():1343] Seen metric with glob (shouldn't happen) |
|
2024-04-24 15:43:45,994 DEBUG SenderThread:1848599 [sender.py:send():382] send: telemetry |
|
2024-04-24 15:43:46,179 INFO wandb-upload_0:1848599 [upload_job.py:push():131] Uploaded file /tmp/tmphsb5r9cdwandb/sgr8lmob-wandb-metadata.json |
|
2024-04-24 15:43:46,800 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_created():271] file/dir created: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/wandb-metadata.json |
|
2024-04-24 15:43:46,801 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_created():271] file/dir created: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/output.log |
|
2024-04-24 15:43:48,803 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_modified():288] file/dir modified: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/output.log |
|
2024-04-24 15:43:50,251 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: status_report |
|
2024-04-24 15:43:52,783 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: partial_history |
|
2024-04-24 15:43:52,785 DEBUG SenderThread:1848599 [sender.py:send():382] send: metric |
|
2024-04-24 15:43:52,785 DEBUG SenderThread:1848599 [sender.py:send():382] send: metric |
|
2024-04-24 15:43:52,786 DEBUG SenderThread:1848599 [sender.py:send():382] send: metric |
|
2024-04-24 15:43:52,786 DEBUG SenderThread:1848599 [sender.py:send():382] send: metric |
|
2024-04-24 15:43:52,786 DEBUG SenderThread:1848599 [sender.py:send():382] send: history |
|
2024-04-24 15:43:52,786 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: summary_record |
|
2024-04-24 15:43:52,788 INFO SenderThread:1848599 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end |
|
2024-04-24 15:43:52,807 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_created():271] file/dir created: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/wandb-summary.json |
|
2024-04-24 15:43:54,212 DEBUG SenderThread:1848599 [sender.py:send():382] send: exit |
|
2024-04-24 15:43:54,212 INFO SenderThread:1848599 [sender.py:send_exit():589] handling exit code: 1 |
|
2024-04-24 15:43:54,212 INFO SenderThread:1848599 [sender.py:send_exit():591] handling runtime: 14 |
|
2024-04-24 15:43:54,213 INFO SenderThread:1848599 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end |
|
2024-04-24 15:43:54,213 INFO SenderThread:1848599 [sender.py:send_exit():597] send defer |
|
2024-04-24 15:43:54,213 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:43:54,213 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 0 |
|
2024-04-24 15:43:54,214 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:43:54,214 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 0 |
|
2024-04-24 15:43:54,214 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 1 |
|
2024-04-24 15:43:54,214 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:43:54,214 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 1 |
|
2024-04-24 15:43:54,214 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:43:54,214 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 1 |
|
2024-04-24 15:43:54,214 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 2 |
|
2024-04-24 15:43:54,214 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:43:54,214 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 2 |
|
2024-04-24 15:43:54,214 INFO HandlerThread:1848599 [system_monitor.py:finish():203] Stopping system monitor |
|
2024-04-24 15:43:54,214 DEBUG SystemMonitor:1848599 [system_monitor.py:_start():172] Starting system metrics aggregation loop |
|
2024-04-24 15:43:54,215 DEBUG SystemMonitor:1848599 [system_monitor.py:_start():179] Finished system metrics aggregation loop |
|
2024-04-24 15:43:54,215 DEBUG SystemMonitor:1848599 [system_monitor.py:_start():183] Publishing last batch of metrics |
|
2024-04-24 15:43:54,215 INFO HandlerThread:1848599 [interfaces.py:finish():202] Joined cpu monitor |
|
2024-04-24 15:43:54,217 INFO HandlerThread:1848599 [interfaces.py:finish():202] Joined disk monitor |
|
2024-04-24 15:43:54,810 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_modified():288] file/dir modified: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/wandb-summary.json |
|
2024-04-24 15:43:54,810 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_modified():288] file/dir modified: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/output.log |
|
2024-04-24 15:43:56,812 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_modified():288] file/dir modified: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/output.log |
|
2024-04-24 15:43:57,141 INFO HandlerThread:1848599 [interfaces.py:finish():202] Joined gpu monitor |
|
2024-04-24 15:43:57,142 INFO HandlerThread:1848599 [interfaces.py:finish():202] Joined memory monitor |
|
2024-04-24 15:43:57,142 INFO HandlerThread:1848599 [interfaces.py:finish():202] Joined network monitor |
|
2024-04-24 15:43:57,142 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: poll_exit |
|
2024-04-24 15:43:57,143 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: status_report |
|
2024-04-24 15:43:57,143 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:43:57,143 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 2 |
|
2024-04-24 15:43:57,143 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 3 |
|
2024-04-24 15:43:57,144 DEBUG SenderThread:1848599 [sender.py:send():382] send: stats |
|
2024-04-24 15:43:57,144 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:43:57,144 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: poll_exit |
|
2024-04-24 15:43:57,145 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 3 |
|
2024-04-24 15:43:57,146 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:43:57,146 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 3 |
|
2024-04-24 15:43:57,146 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 4 |
|
2024-04-24 15:43:57,146 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:43:57,146 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 4 |
|
2024-04-24 15:43:57,147 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:43:57,147 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 4 |
|
2024-04-24 15:43:57,147 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 5 |
|
2024-04-24 15:43:57,147 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:43:57,147 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 5 |
|
2024-04-24 15:43:57,147 DEBUG SenderThread:1848599 [sender.py:send():382] send: summary |
|
2024-04-24 15:43:57,149 INFO SenderThread:1848599 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end |
|
2024-04-24 15:43:57,149 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:43:57,149 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 5 |
|
2024-04-24 15:43:57,149 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 6 |
|
2024-04-24 15:43:57,149 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:43:57,149 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 6 |
|
2024-04-24 15:43:57,149 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:43:57,149 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 6 |
|
2024-04-24 15:43:57,152 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: status_report |
|
2024-04-24 15:43:57,275 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 7 |
|
2024-04-24 15:43:57,275 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:43:57,275 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 7 |
|
2024-04-24 15:43:57,275 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:43:57,275 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 7 |
|
2024-04-24 15:43:57,814 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_modified():288] file/dir modified: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/config.yaml |
|
2024-04-24 15:43:57,814 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_modified():288] file/dir modified: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/wandb-summary.json |
|
2024-04-24 15:43:58,791 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 8 |
|
2024-04-24 15:43:58,792 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:43:58,792 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 8 |
|
2024-04-24 15:43:58,792 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:43:58,792 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 8 |
|
2024-04-24 15:43:58,792 INFO SenderThread:1848599 [job_builder.py:build():298] Attempting to build job artifact |
|
2024-04-24 15:43:58,794 INFO SenderThread:1848599 [job_builder.py:_get_source_type():428] is repo sourced job |
|
2024-04-24 15:43:58,815 INFO Thread-12 :1848599 [dir_watcher.py:_on_file_modified():288] file/dir modified: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/output.log |
|
2024-04-24 15:43:58,832 INFO SenderThread:1848599 [job_builder.py:build():404] adding wandb-job metadata file |
|
2024-04-24 15:43:58,858 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 9 |
|
2024-04-24 15:43:58,859 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:43:58,859 DEBUG SenderThread:1848599 [sender.py:send():382] send: artifact |
|
2024-04-24 15:43:58,859 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 9 |
|
2024-04-24 15:43:59,524 INFO wandb-upload_0:1848599 [upload_job.py:push():89] Uploaded file /admin/home/sanchit/.local/share/wandb/artifacts/staging/tmp1vajxumh |
|
2024-04-24 15:43:59,530 INFO wandb-upload_1:1848599 [upload_job.py:push():89] Uploaded file /admin/home/sanchit/.local/share/wandb/artifacts/staging/tmp824ipvc5 |
|
2024-04-24 15:44:00,093 INFO SenderThread:1848599 [sender.py:send_artifact():1470] sent artifact job-https___huggingface.co_sanchit-gandhi_distil-zephyr-1.5b-ssft-ultrachat_run_sft.py - {'id': 'QXJ0aWZhY3Q6ODA4NTQyNDIx', 'state': 'PENDING', 'artifactSequence': {'id': 'QXJ0aWZhY3RDb2xsZWN0aW9uOjE2NjI0NzU4Nw==', 'latestArtifact': None}} |
|
2024-04-24 15:44:00,093 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:44:00,093 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 9 |
|
2024-04-24 15:44:00,093 INFO SenderThread:1848599 [dir_watcher.py:finish():358] shutting down directory watcher |
|
2024-04-24 15:44:00,213 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: keepalive |
|
2024-04-24 15:44:00,816 INFO SenderThread:1848599 [dir_watcher.py:finish():388] scan: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files |
|
2024-04-24 15:44:00,817 INFO SenderThread:1848599 [dir_watcher.py:finish():402] scan save: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/conda-environment.yaml conda-environment.yaml |
|
2024-04-24 15:44:00,817 INFO SenderThread:1848599 [dir_watcher.py:finish():402] scan save: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/wandb-summary.json wandb-summary.json |
|
2024-04-24 15:44:00,817 INFO SenderThread:1848599 [dir_watcher.py:finish():402] scan save: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/output.log output.log |
|
2024-04-24 15:44:00,821 INFO SenderThread:1848599 [dir_watcher.py:finish():402] scan save: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/config.yaml config.yaml |
|
2024-04-24 15:44:00,824 INFO SenderThread:1848599 [dir_watcher.py:finish():402] scan save: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/requirements.txt requirements.txt |
|
2024-04-24 15:44:00,826 INFO SenderThread:1848599 [dir_watcher.py:finish():402] scan save: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/wandb-metadata.json wandb-metadata.json |
|
2024-04-24 15:44:00,826 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 10 |
|
2024-04-24 15:44:00,828 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:44:00,828 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 10 |
|
2024-04-24 15:44:00,828 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:44:00,828 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 10 |
|
2024-04-24 15:44:00,828 INFO SenderThread:1848599 [file_pusher.py:finish():175] shutting down file pusher |
|
2024-04-24 15:44:01,006 INFO wandb-upload_0:1848599 [upload_job.py:push():131] Uploaded file /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/conda-environment.yaml |
|
2024-04-24 15:44:01,059 INFO wandb-upload_1:1848599 [upload_job.py:push():131] Uploaded file /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/wandb-summary.json |
|
2024-04-24 15:44:01,161 INFO wandb-upload_2:1848599 [upload_job.py:push():131] Uploaded file /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/output.log |
|
2024-04-24 15:44:01,169 INFO wandb-upload_3:1848599 [upload_job.py:push():131] Uploaded file /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/config.yaml |
|
2024-04-24 15:44:01,184 INFO wandb-upload_4:1848599 [upload_job.py:push():131] Uploaded file /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/files/requirements.txt |
|
2024-04-24 15:44:01,384 INFO Thread-11 (_thread_body):1848599 [sender.py:transition_state():617] send defer: 11 |
|
2024-04-24 15:44:01,385 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:44:01,385 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 11 |
|
2024-04-24 15:44:01,385 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:44:01,385 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 11 |
|
2024-04-24 15:44:01,385 INFO SenderThread:1848599 [file_pusher.py:join():181] waiting for file pusher |
|
2024-04-24 15:44:01,385 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 12 |
|
2024-04-24 15:44:01,385 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:44:01,385 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 12 |
|
2024-04-24 15:44:01,385 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:44:01,385 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 12 |
|
2024-04-24 15:44:01,386 INFO SenderThread:1848599 [file_stream.py:finish():595] file stream finish called |
|
2024-04-24 15:44:01,445 INFO SenderThread:1848599 [file_stream.py:finish():599] file stream finish is done |
|
2024-04-24 15:44:01,445 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 13 |
|
2024-04-24 15:44:01,445 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:44:01,445 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 13 |
|
2024-04-24 15:44:01,445 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:44:01,445 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 13 |
|
2024-04-24 15:44:01,445 INFO SenderThread:1848599 [sender.py:transition_state():617] send defer: 14 |
|
2024-04-24 15:44:01,446 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: defer |
|
2024-04-24 15:44:01,446 DEBUG SenderThread:1848599 [sender.py:send():382] send: final |
|
2024-04-24 15:44:01,446 INFO HandlerThread:1848599 [handler.py:handle_request_defer():172] handle defer: 14 |
|
2024-04-24 15:44:01,446 DEBUG SenderThread:1848599 [sender.py:send():382] send: footer |
|
2024-04-24 15:44:01,446 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: defer |
|
2024-04-24 15:44:01,446 INFO SenderThread:1848599 [sender.py:send_request_defer():613] handle sender defer: 14 |
|
2024-04-24 15:44:01,447 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: poll_exit |
|
2024-04-24 15:44:01,447 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: poll_exit |
|
2024-04-24 15:44:01,447 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: server_info |
|
2024-04-24 15:44:01,447 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: get_summary |
|
2024-04-24 15:44:01,448 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: server_info |
|
2024-04-24 15:44:01,449 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: sampled_history |
|
2024-04-24 15:44:01,449 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: internal_messages |
|
2024-04-24 15:44:01,450 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: job_info |
|
2024-04-24 15:44:01,507 DEBUG SenderThread:1848599 [sender.py:send_request():409] send_request: job_info |
|
2024-04-24 15:44:01,508 INFO MainThread:1848599 [wandb_run.py:_footer_history_summary_info():3837] rendering history |
|
2024-04-24 15:44:01,508 INFO MainThread:1848599 [wandb_run.py:_footer_history_summary_info():3869] rendering summary |
|
2024-04-24 15:44:01,508 INFO MainThread:1848599 [wandb_run.py:_footer_sync_info():3796] logging synced files |
|
2024-04-24 15:44:01,508 DEBUG HandlerThread:1848599 [handler.py:handle_request():146] handle_request: shutdown |
|
2024-04-24 15:44:01,508 INFO HandlerThread:1848599 [handler.py:finish():866] shutting down handler |
|
2024-04-24 15:44:02,450 INFO WriterThread:1848599 [datastore.py:close():294] close: /fsx/sanchit/distil-zephyr-1.5b-ssft-ultrachat/wandb/run-20240424_154339-mwp0iutr/run-mwp0iutr.wandb |
|
2024-04-24 15:44:02,508 INFO SenderThread:1848599 [sender.py:finish():1548] shutting down sender |
|
2024-04-24 15:44:02,508 INFO SenderThread:1848599 [file_pusher.py:finish():175] shutting down file pusher |
|
2024-04-24 15:44:02,508 INFO SenderThread:1848599 [file_pusher.py:join():181] waiting for file pusher |
|
|