
Tag Archive for reinforcement-learning

Google's Dreamer AI project: Looking to the past to predict the future

Recently, researchers from Google's DeepMind, working with the University of Toronto, introduced an AI project called "Dreamer" to test how effective reinforcement learning is for artificial intelligence. Dreamer's design focuses on using what the machine has studied and learned in the past to choose how to act on a problem, based on its "guesses" about future outcomes.

Swiss robot dog does not fall when kicked, and if it does fall it can flip itself back onto its feet

Swiss engineers say they have found a way to help four-legged robots escape their fate of being... tormented by humans.

Reinforcement Learning – Temporal Difference (Driving example from Sutton and Barto) [closed]


To which value does the objective converge: $\text{target} = (r + \gamma \, Q(s', *)) / 2$

Conditions:
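Since the conditions are not listed here, a minimal sketch can only assume the simplest case: a single state that loops back to itself with a constant reward `r`, a scalar estimate `q`, and a fixed step size `alpha`. These are all assumptions made for illustration and are not taken from the question. Under them, the update `q += alpha * (target - q)` with the halved target settles where `q = (r + gamma * q) / 2`, that is, at `r / (2 - gamma)`, rather than at the usual `r / (1 - gamma)`.

```python
def halved_td_fixed_point(r=1.0, gamma=0.9, alpha=0.1, steps=2000):
    """Iterate a TD-style update with the halved target on a one-state self-loop.

    Assumed toy setting (not from the original question): one state, one action,
    constant reward r, so Q(s', *) is just the current scalar estimate q.
    """
    q = 0.0
    for _ in range(steps):
        target = (r + gamma * q) / 2   # halved bootstrapped target
        q += alpha * (target - q)      # move the estimate toward the target
    return q


r, gamma = 1.0, 0.9
q = halved_td_fixed_point(r, gamma)
closed_form = r / (2 - gamma)          # solves q = (r + gamma * q) / 2
print(q, closed_form)
```

In this toy case the numerical estimate and the closed form agree; whether the same holds under the question's actual conditions depends on details the excerpt does not show.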

Tkinter Python program seems to work without a mainloop, why?

With this small Python code:
import time

def toto(num=-1):
    # Print a counter once per second, ten times.
    for i in range(10):
        print('toto:', num, ' :: loop:', i, flush=True)
        time.sleep(1)

Training an agent in a custom Gymnasium environment with stable_baselines3 changes the environment

I customized a Gymnasium environment and trained an agent on it with stable_baselines3, but the learning process changes my environment.

What should I do with my Deep RL agent if the agent keeps going down the same wrong path?

I am trying to write a Deep RL agent for MegaMan NES using stable-baselines3 and OpenAI Gym Retro. I have tried a lot of things. I used reward shaping and tried many hyperparameter settings with PPO, but no matter what I do, my agent keeps going to the right and running into the enemies that fly towards him. I modified all the hyperparameters, including batch_size, n_steps, gamma, learning_rate, ent_coef, clip_range, n_epochs, gae_lambda, max_grad_norm and vf_coef.
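For orientation, the names listed above are keyword arguments of the `PPO` constructor in stable-baselines3. The sketch below only gathers them into a plain dictionary so a run's settings can be logged and compared. The helper `make_ppo_kwargs` is hypothetical, the values are the library defaults as far as I know, and the raised `ent_coef` of 0.01 is an illustrative assumption (a larger entropy bonus rewards less deterministic policies), not a setting taken from the question or a guaranteed fix.

```python
def make_ppo_kwargs(ent_coef=0.01):
    """Collect the PPO hyperparameters named in the question in one place.

    Hypothetical helper. Values are assumed stable-baselines3 defaults,
    except ent_coef, which is an illustrative non-default value.
    """
    return {
        "learning_rate": 3e-4,
        "n_steps": 2048,        # rollout length per environment before each update
        "batch_size": 64,       # minibatch size for each gradient step
        "n_epochs": 10,         # passes over each rollout buffer
        "gamma": 0.99,          # discount factor
        "gae_lambda": 0.95,     # GAE bias/variance trade-off
        "clip_range": 0.2,      # PPO clipping parameter
        "ent_coef": ent_coef,   # entropy bonus weight (library default is 0.0)
        "vf_coef": 0.5,         # value-loss weight
        "max_grad_norm": 0.5,   # gradient clipping threshold
    }


ppo_kwargs = make_ppo_kwargs()
print(sorted(ppo_kwargs))

# Assumed usage, with `env` being an already constructed environment:
#   from stable_baselines3 import PPO
#   model = PPO("CnnPolicy", env, **ppo_kwargs)
```

Keeping the settings in one dictionary makes it easier to change a single value per run and see which change, if any, alters the agent's behaviour.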
