Improving the Attack Performance of a Robot Soccer Team Using Reinforcement Learning

Authors

Computer Engineering Department, Faculty of Engineering, Yazd University, Yazd, Iran

Abstract

Because it is impossible to anticipate every state that agents may encounter in a large, dynamic multi-agent system, machine learning methods are useful tools for controlling agent behavior. Simulated robot soccer is a well-known multi-agent benchmark for evaluating machine learning algorithms. In this paper, the QV-learning algorithm (a well-known reinforcement learning algorithm) is used to improve the attack performance of a 2D robot soccer simulation team. The reinforcement signal depends on the outcome of the attack: the players involved receive a positive reward when they reach the ball in front of the goal and a negative reward when they lose the ball. To improve performance, the reinforcement signal is divided among the agents in proportion to their expertness (knowledge). Here, an agent's expertness in a state is defined as the difference between the highest and the lowest action value in that state. The simulation results show that using expertness improves both training speed and performance.
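The two ideas named in the abstract can be illustrated with a minimal sketch. It assumes tabular value functions stored as plain Python lists; the function names, the step sizes, and the equal split used when no agent has any expertness yet are illustrative assumptions, not details taken from the paper. The update rule is the standard tabular QV-learning rule, in which both V and Q bootstrap from the state value of the next state.

```python
def expertness(action_values):
    """Expertness in one state: highest action value minus lowest."""
    return max(action_values) - min(action_values)


def divide_reward(team_reward, action_values_per_agent):
    """Split a team reward in proportion to each agent's expertness.

    action_values_per_agent holds, for each agent, the action values
    of the state that agent is currently in.
    """
    scores = [expertness(q) for q in action_values_per_agent]
    total = sum(scores)
    n = len(scores)
    if total == 0:
        # Assumption: with no expertness anywhere, share the reward equally.
        return [team_reward / n] * n
    return [team_reward * e / total for e in scores]


def qv_update(Q, V, s, a, r, s_next,
              alpha=0.1, beta=0.1, gamma=0.9, terminal=False):
    """One tabular QV-learning step; Q and V are updated in place."""
    # Both tables use the same target, built from V of the next state.
    target = r if terminal else r + gamma * V[s_next]
    V[s] += beta * (target - V[s])
    Q[s][a] += alpha * (target - Q[s][a])


# Two attackers: expertness 3.0 and 1.0, so a reward of +1 splits 3:1.
shares = divide_reward(1.0, [[0.0, 1.0, 3.0], [0.5, 0.5, 1.5]])
print(shares)  # [0.75, 0.25]
```

Each agent would then pass its own share as `r` to `qv_update`, so agents whose action values are already well separated in the current state receive a larger part of the positive or negative signal.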

Keywords

