igorcheb committed on
Commit
d64d8a0
1 Parent(s): efb8a0a

Update README.md

Files changed (1)
  1. README.md +20 -18
README.md CHANGED
@@ -30,21 +30,24 @@ Numbers on X axis are average over 40 episodes, each lasting for about 500 times
  Learning rate decay schedule: <code>torch.optim.lr_scheduler.StepLR(opt, step_size=4000, gamma=0.7)</code>
 
  Minimal code to use the agent:</br>
- <pre><code>
- import gym</br>
- </br>
- env_name = 'LunarLanderContinuous-v2'</br>
- env = gym.make(env_name)</br>
- agent = torch.load('best_models/best_reinforce_lunar_lander_cont_model_269.402.pt')</br>
- render = True</br>
- observation = env.reset()</br>
- while True:</br>
- if render:</br>
- env.render()</br>
- action = agent.act(observation)</br>
- observation, reward, done, info = env.step(action)</br>
- </br>
- if done:</br>
- break</br>
- env.close()</br>
- </code></pre>
+ ```
+ import gym
+ import torch
+ from agent_class import ParameterisedPolicy
+
+ env_name = 'LunarLanderContinuous-v2'
+ env = gym.make(env_name)
+ agent = torch.load('best_reinforce_lunar_lander_cont_model_269.402.pt')
+ render = True
+
+ observation = env.reset()
+ while True:
+     if render:
+         env.render()
+     action = agent.act(observation)
+     observation, reward, done, info = env.step(action)
+
+     if done:
+         break
+ env.close()
+ ```
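
The `StepLR` schedule referenced in the README multiplies the learning rate by `gamma` once every `step_size` scheduler steps. A minimal pure-Python sketch of the resulting decay curve, using the README's `step_size=4000` and `gamma=0.7` (the `stepped_lr` helper and the `1e-3` base rate are illustrative assumptions, not values from the repository):

```python
def stepped_lr(base_lr: float, step: int, step_size: int = 4000, gamma: float = 0.7) -> float:
    """Learning rate after `step` scheduler steps under a StepLR-style schedule:
    the rate is multiplied by `gamma` once every `step_size` steps."""
    # Integer division counts how many decay boundaries have been crossed.
    return base_lr * gamma ** (step // step_size)

# With the README's settings, the rate drops by a factor of 0.7 at steps 4000, 8000, ...
print(stepped_lr(1e-3, 0))      # base rate, unchanged
print(stepped_lr(1e-3, 4000))   # one decay applied
print(stepped_lr(1e-3, 8000))   # two decays applied
```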