WSEAS Transactions on Environment and Development


Print ISSN: 1790-5079
E-ISSN: 2224-3496

Volume 14, 2018

Notice: As of 2014 and for the forthcoming years, the publication frequency/periodicity of WSEAS Journals has been adapted to the 'continuously updated' model. This means that instead of being separated into issues, new papers are added on a continuous basis, allowing a more regular flow and shorter publication times. Papers appear in reverse chronological order, so the most recent one is on top.





A Novel Realizer of Conversational Behavior for Affective and Personalized Human Machine Interaction - EVA U-Realizer

AUTHORS: Izidor Mlakar, Zdravko Kačič, Matej Borko, Matej Rojc


ABSTRACT: In order to engage with users on a more personal level, natural human-machine interaction is starting to incorporate virtual or even physical entities resembling human interlocutors. Through such human-like entities, multimodal interaction models try to adapt to the user's context and to the context of the conversational situation. These multimodal processes tend to follow the 'rules' and cues established in face-to-face interaction among humans. In this light, the integration and exploitation of embodied conversational agents (ECAs) for human-like interaction seems only natural. The ECA's artificial body and articulation capabilities are already close to those found in real humans. In terms of skin, face, hands, and body posture, these virtual entities are designed to look and behave as realistically as possible. Furthermore, ECAs aim to imitate as many features of human face-to-face dialog as possible and to integrate them into the interaction in a tightly synchronized manner. One of the essential functions in face-to-face interaction is the ability to reproduce synchronized verbal and co-verbal signals coupled into conversational behavior. Other signals, such as social cues, attitude (emotions), personality, eye blinks, and spontaneous head movements, are equally important and have to be blended into multimodal expressions. However, designing a realistic entity that behaves as humans do is a daunting task. Modern 3D environments and 3D modeling tools, such as Maya, Daz3D, Blender, Panda3D, and Unity, have opened up completely new possibilities for designing virtual entities that appear almost (if not completely) like real-life persons. However, modern 3D technology mostly covers the design and deployment side of such realism, while the realism of the behavior itself, its diversity, and its dynamic nature are generally not the focus of 3D modeling frameworks. Most of the animation prepared in 3D frameworks is planned and designed in advance, for well-scripted and well-designed situations. It therefore offers limited diversity and limited capacity to adapt to a new set of parameters. As a result, 3D frameworks have limited capacity to handle the highly dynamic and interchangeable contexts present in human interaction. In this paper we therefore outline EVA U-Realizer, the second generation of our proprietary behavior realization module. The goal of the realizer is to combine the diversity, responsiveness, and adaptiveness required for facilitating conversational responses with the animation capacities and realism provided by 3D modeling frameworks. To this end, the presented novel realizer is built on top of the Unity 3D game engine's core. The proposed realizer considerably extends the capabilities of the animation engine itself by providing an interpreter and executor for incorporating dynamic, real-time generated conversational artefacts, similar to those found in conversations among humans.

KEYWORDS: embodied conversational agents, personalized interaction, co-verbal behavior, behavior realizer, animation, virtual reality, mixed reality, multimodal interaction
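To illustrate the kind of task a behavior realizer performs, the sketch below shows a minimal, hypothetical scheduling step: aligning co-verbal gestures (e.g. head nods, beats, eye blinks) to speech timing before playback. This is not the EVA U-Realizer's actual interface; the class names, fields, and timing logic are illustrative assumptions written in Python for brevity, whereas the realizer described in the paper operates inside the Unity 3D engine.

# Hypothetical sketch of speech/gesture alignment in a behavior realizer.
# All names and parameters are illustrative assumptions, not the EVA U-Realizer API.

from dataclasses import dataclass
from typing import List


@dataclass
class SpeechSegment:
    word: str
    start: float   # seconds from utterance start (e.g. from TTS timing)
    end: float


@dataclass
class GestureSpec:
    name: str          # e.g. "beat", "head_nod", "eye_blink"
    anchor_word: int   # index of the word the gesture stroke should align with
    prep_time: float   # preparation phase duration (s)
    hold_time: float   # post-stroke hold duration (s)


@dataclass
class AnimationCommand:
    clip: str
    start: float
    duration: float


def schedule(speech: List[SpeechSegment], gestures: List[GestureSpec]) -> List[AnimationCommand]:
    """Align each gesture stroke with its anchor word and emit timed animation commands."""
    commands = []
    for g in gestures:
        anchor = speech[g.anchor_word]
        commands.append(AnimationCommand(
            clip=g.name,
            start=max(0.0, anchor.start - g.prep_time),
            duration=g.prep_time + (anchor.end - anchor.start) + g.hold_time,
        ))
    return sorted(commands, key=lambda c: c.start)


if __name__ == "__main__":
    words = [SpeechSegment("hello", 0.0, 0.4), SpeechSegment("there", 0.45, 0.8)]
    plan = [GestureSpec("head_nod", anchor_word=1, prep_time=0.2, hold_time=0.1)]
    for cmd in schedule(words, plan):
        print(cmd)

In a realizer built on a game engine, the resulting timed commands would be handed to the engine's animation layer for blending with ongoing idle and facial motion; the point of the sketch is only the alignment of gesture phases to verbal timing, which the abstract identifies as an essential function of conversational behavior realization.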


WSEAS Transactions on Environment and Development, ISSN / E-ISSN: 1790-5079 / 2224-3496, Volume 14, 2018, Art. #9, pp. 87-101


Copyright © 2018 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0
