Deep Reinforcement Learning Algorithms for Ship Navigation in Restricted Waters

Authors

  • Jonathas Marcelo Pereira Figueiredo, Universidade de São Paulo. Escola Politécnica
  • Rodrigo Pereira Abou Rejaili, Universidade de São Paulo. Escola Politécnica

DOI:

https://doi.org/10.11606/issn.2526-8260.mecatrone.2018.151953

Keywords:

Reinforcement learning, Navigation, Neural networks, Deep learning

Abstract

Reinforcement learning has not been fully explored for the automated control of ship maneuvering in restricted waters, even though such algorithms can yield more robust and efficient control. This paper presents the use of the Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) methods with a numerical simulator for ship maneuvers to develop control laws. Both methods proved to be efficient in navigational control through a channel. A comparison of the response and control behavior resulting from each method is presented.
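As a rough illustration of how the two methods named in the abstract differ at the control interface, the sketch below contrasts a DQN-style choice among discrete commands with a DDPG-style continuous command. It is not taken from the paper: the rudder levels, the 35-degree limit, the function names, and the use of plain Gaussian exploration noise are all assumptions made for the example.

```python
import random

# Hypothetical discrete rudder commands in degrees (assumed, not from the paper).
RUDDER_LEVELS = [-35.0, -17.5, 0.0, 17.5, 35.0]


def dqn_select_action(q_values, epsilon, rng):
    """DQN-style epsilon-greedy choice: returns an index into a discrete action set."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniform random action
    return max(range(len(q_values)), key=lambda i: q_values[i])  # exploit: argmax Q


def dqn_target(reward, next_q_values, gamma, done):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def ddpg_select_action(actor_output, noise_std, max_rudder, rng):
    """DDPG-style continuous command: deterministic actor output plus
    exploration noise (Gaussian here, as a simplifying assumption),
    clipped to the actuator limits."""
    noisy = actor_output + rng.gauss(0.0, noise_std)
    return max(-max_rudder, min(max_rudder, noisy))


if __name__ == "__main__":
    rng = random.Random(0)
    idx = dqn_select_action([0.1, 0.9, 0.3, 0.2, 0.0], epsilon=0.0, rng=rng)
    print("DQN rudder command (deg):", RUDDER_LEVELS[idx])
    print("DDPG rudder command (deg):", ddpg_select_action(4.2, 1.0, 35.0, rng))
```

The discrete agent can only pick one of a fixed set of rudder angles, while the continuous agent can output any angle within the limits, which is one reason the resulting control behavior of the two methods can differ.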

Author biographies

  • Jonathas Marcelo Pereira Figueiredo, Universidade de São Paulo. Escola Politécnica

    Graduating in Mechatronics Engineering at Escola Politécnica of the University of São Paulo, Brazil. Graduated in Automated Systems and Information Engineering at ENSE3 Grenoble-INP, France.

  • Rodrigo Pereira Abou Rejaili, Universidade de São Paulo. Escola Politécnica

    Graduating in Mechatronics Engineering at Escola Politécnica of the University of São Paulo, Brazil. Graduated in Robotics and Embedded Systems Engineering at ENSTA ParisTech, France, and Machine Learning and Data Science at University Paris Sud, France.

Published

2018-12-29

Section

Articles

How to cite

Deep Reinforcement Learning Algorithms for Ship Navigation in Restricted Waters. (2018). Mecatrone, 3(1). https://doi.org/10.11606/issn.2526-8260.mecatrone.2018.151953