AI, Robotics, and Markov Decision Processes: Enhancing Precision and Autonomy in Healthcare Systems
Abstract
Robotic systems are increasingly integrated into healthcare to enhance precision, autonomy, and efficiency. This study systematically reviews decision-making systems and control architectures for autonomous and social robots in hospitals, with a specific focus on Markov Decision Processes (MDPs) and their variants. ScienceDirect, SpringerLink, and IEEE databases were searched systematically for publications from the last three decades. Studies were included if they described action-selection or decision-making methods for autonomous or semiautonomous healthcare robots. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology was applied to identify, screen, and analyze relevant publications. The review identifies five major application areas of MDP-based decision-making in healthcare: (1) surgical robotics, predominantly using Completely Observable Markov Decision Processes (COMDPs); (2) rehabilitation, where Partially Observable Markov Decision Processes (POMDPs) combined with deep reinforcement learning are common; (3) telemedicine, using COMDP frameworks with multi-agent coordination; (4) elderly care, leveraging POMDPs with human feedback; and (5) emergency response, applying multi-robot COMDPs enhanced with Bayesian updates. Emerging trends include hybrid COMDP–POMDP approaches and integration with machine learning for real-world deployment. MDP-based decision-making systems show strong potential to improve autonomy and adaptability in healthcare robotics. COMDPs are effective in structured environments such as surgery, where the robot can observe the relevant state directly, whereas POMDPs are increasingly preferred in human-centered and uncertain contexts, where the state must be inferred from noisy observations. Key challenges remain, including a lack of standardized benchmarks, limited clinical validation, and computational complexity. Addressing these gaps will be essential for the safe, efficient, and ethical deployment of robotic systems in healthcare.
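To make the COMDP/POMDP distinction concrete, the following minimal sketch shows the Bayesian belief update that a POMDP-based robot performs when it cannot observe the state directly. It is an illustration under stated assumptions, not a method taken from the reviewed studies: the two hidden patient states ("engaged", "fatigued"), the single action, the two observations, and all probabilities are hypothetical.

```python
def belief_update(belief, transition, observation_model, action, observation):
    """One Bayes-filter step over a discrete hidden state.

    belief[s]                        : current probability of hidden state s
    transition[a][s][s2]             : P(s2 | s, a)
    observation_model[a][s2][o]      : P(o | s2, a)
    Returns the posterior belief after taking `action` and seeing `observation`.
    """
    n = len(belief)
    # Prediction: push the current belief through the transition model.
    predicted = [
        sum(belief[s] * transition[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction: weight each predicted state by the observation likelihood.
    unnormalized = [
        observation_model[action][s2][observation] * predicted[s2]
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical model: states 0 = "engaged", 1 = "fatigued"; one action (index 0);
# observations 0 = "fast response", 1 = "slow response". All numbers are made up.
TRANSITION = [[[0.9, 0.1],
               [0.2, 0.8]]]
OBSERVATION_MODEL = [[[0.8, 0.2],
                      [0.3, 0.7]]]

prior = [0.5, 0.5]
posterior = belief_update(prior, TRANSITION, OBSERVATION_MODEL, action=0, observation=1)
print(posterior)
```

In a COMDP this step is unnecessary because the policy maps the observed state straight to an action. In a POMDP the policy acts on the belief vector instead, which is one reason POMDP planning is computationally heavier.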
Keywords:
Markov decision processes, Healthcare robotics, Human-robot interaction, Partially observable Markov decision processes, Decision support systems, Deep reinforcement learning

