Reducing Q-Value Estimation Bias via Mutual Estimation and Softmax Operation in MADRL

Zheng Li

Xinkai Chen

Jiaqing Fu

Ning Xie and Tingting Zhao

Abstract

As electronic game technology has developed, games have come to feature more units, richer unit attributes, more complex mechanics, and more diverse team strategies. Multi-agent deep reinforcement learning (MADRL) has performed strongly in team games of this kind, achieving results that surpass professional human players. However, reinforcement learning algorithms based on Q-value estimation often suffer from Q-value overestimation, which can seriously degrade AI performance in multi-agent scenarios. We propose a multi-agent mutual evaluation method and a multi-agent softmax method to reduce Q-value estimation bias in multi-agent scenarios, and we test both in the multi-agent particle environment and in a multi-agent tank environment that we constructed. The tank environment strikes a good balance between experimental verification efficiency and multi-agent game task simulation, and it can be easily extended to different multi-agent cooperation or competition tasks. We hope it will be adopted in multi-agent deep reinforcement learning research.
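To make the overestimation problem concrete, the following is a minimal single-agent sketch, assuming the common Boltzmann softmax operator with an inverse temperature `beta`. The function names, `beta`, and the toy noise model are illustrative assumptions; the abstract does not specify the authors' exact multi-agent formulation, so this should not be read as their method.

```python
import numpy as np


def softmax_value(q_values, beta=1.0):
    """Softmax-weighted average of Q-values (assumed Boltzmann form)."""
    q = np.asarray(q_values, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(beta * (q - q.max()))
    weights /= weights.sum()
    return float(np.dot(weights, q))


def max_value(q_values):
    """Standard max operator used in the usual Q-learning target."""
    return float(np.max(q_values))


# Toy setting (an assumption for illustration): every action has a true
# Q-value of 0, but each estimate carries independent Gaussian noise.
rng = np.random.default_rng(0)
noisy_q = rng.normal(loc=0.0, scale=1.0, size=(10_000, 5))

# Average target value under each operator; the true value is 0,
# so any positive average is pure estimation bias.
max_bias = float(noisy_q.max(axis=1).mean())
soft_bias = float(np.mean([softmax_value(row, beta=1.0) for row in noisy_q]))

print(f"max operator bias:     {max_bias:.3f}")
print(f"softmax operator bias: {soft_bias:.3f}")
```

In this toy setting the max operator picks up the largest noise term on every sample, whereas the softmax-weighted average never exceeds the maximum, so its average bias is smaller. Setting `beta` to 0 recovers the plain mean, and a large `beta` approaches the max operator.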

Keywords

reinforcement learning - game AI - multi-agent Q-network mutual estimation - softmax Bellman operation - reinforcement learning environment

ISSUE

Volume: 17 Issue: 1 (2024)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

DOI

https://doi.org/10.3390/a17010036
