TyurinYuriRost committed on
Commit 136678f · verified · 1 Parent(s): a9f6378

Update README.md

Files changed (1)
  1. README.md +34 -7
README.md CHANGED
@@ -21,17 +21,44 @@ model-index:
21
  verified: false
22
  ---
23
 
24
- # **PPO** Agent playing **LunarLander-v2**
25
- This is a trained model of a **PPO** agent playing **LunarLander-v2**
26
- using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).
27
 
28
  ## Usage (with Stable-baselines3)
29
- TODO: Add your code
30
 
31
 
32
  ```python
33
- from stable_baselines3 import ...
34
  from huggingface_sb3 import load_from_hub
35
 
36
- ...
37
- ```
 
21
  verified: false
22
  ---
23
 
24
+ # PPO Agent playing LunarLander-v2
25
+
26
+ This is a trained model of a **PPO** agent playing **LunarLander-v2** using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).
27
 
28
  ## Usage (with Stable-baselines3)
29
+
30
+ To use this model, you need to have `stable-baselines3` and `huggingface_sb3` installed. You can install them using pip:
31
+
32
+ ```bash
33
+ pip install stable-baselines3 huggingface_sb3 gymnasium
+ ```
34
 
35
 
36
  ```python
 
37
  from huggingface_sb3 import load_from_hub
38
+ from stable_baselines3 import PPO
39
+ import gymnasium as gym
40
+
41
+ # Identifier for the repository and model file name
42
+ repo_id = "TyurinYuriRost/ppo-LunarLander-v2"
43
+ filename = "ppo-LunarLander-v2.zip"
44
+
45
+ # Load the model checkpoint from Hugging Face Hub
46
+ checkpoint = load_from_hub(repo_id=repo_id, filename=filename)
47
+
48
+ # Load the PPO model
49
+ model = PPO.load(checkpoint)
50
+
51
+ # Create the environment for evaluation
52
+ env = gym.make("LunarLander-v3", render_mode="human")
53
+ obs, info = env.reset()
54
+
55
+ # Visualize the model's performance
56
+ for _ in range(1000):
57
+     action, _states = model.predict(obs)
58
+     obs, reward, terminated, truncated, info = env.step(action)
59
+     env.render()
60
+     if terminated or truncated:
61
+         obs, info = env.reset()
62
 
63
+ # Close the environment
64
+ env.close()
+ ```
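
The usage snippet in this README steps the environment for a fixed 1000 steps and only renders. If per-episode returns are wanted instead, a small helper along the following lines may be useful. This is a minimal sketch, not part of the commit: `run_episodes` and `StubEnv` are hypothetical names introduced here, and the stub stands in for `gym.make(...)` so the example runs without Box2D or a downloaded checkpoint. It assumes the Gymnasium API, where `reset()` returns `(obs, info)` and `step()` returns five values.

```python
from typing import Any, Callable, List, Tuple


def run_episodes(env: Any, policy: Callable[[Any], Any], n_episodes: int = 5) -> List[float]:
    """Roll out `policy` for `n_episodes` and return each episode's total reward.

    `env` is assumed to follow the Gymnasium API: reset() -> (obs, info) and
    step(action) -> (obs, reward, terminated, truncated, info).
    """
    returns = []
    for _ in range(n_episodes):
        obs, _info = env.reset()
        total, done = 0.0, False
        while not done:
            obs, reward, terminated, truncated, _info = env.step(policy(obs))
            total += float(reward)
            # An episode ends on either a terminal state or a time-limit truncation.
            done = terminated or truncated
        returns.append(total)
    return returns


class StubEnv:
    """Hypothetical stand-in environment: reward 1.0 per step, truncated after 3 steps."""

    def reset(self) -> Tuple[int, dict]:
        self.t = 0
        return 0, {}

    def step(self, action: Any) -> Tuple[int, float, bool, bool, dict]:
        self.t += 1
        return self.t, 1.0, False, self.t >= 3, {}


episode_returns = run_episodes(StubEnv(), policy=lambda obs: 0, n_episodes=2)
print(episode_returns)  # [3.0, 3.0]
```

With the loaded model from the snippet above, the policy argument could plausibly be `lambda obs: model.predict(obs, deterministic=True)[0]`, with `env` created by `gym.make` without `render_mode="human"` so that evaluation is not slowed down by rendering.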