Human behaviour has been an interesting area of research for centuries, but the trends and techniques for understanding human responses have changed with time. With the emergence of robotics, scientists all over the world have been keen to understand how humans behave with robots. A robot receptionist is deployed at the main gate of Arfa Technology Park, where hundreds of visitors interact with it for various reasons. The interaction records are used to analyse and study human-robot interaction, and different experiments are performed to draw conclusions.
The receptionist robot uses mid-air input to interact with users. The robot gives instructions and relevant information to users and responds according to the situation. The prototype talks to people in Urdu and attracts large audiences. The mid-air input functionalities include:
Cursor Control: The cursor is controlled remotely using hand detection and tracking. A grip gesture is used to click at the current position.
Highlight Gripping: The currently selected button is highlighted based on the coordinates of the detected hand and the user's easy-to-move region. A grip gesture is used to click the selected button.
Swipe: A swipe gesture is used to navigate through the options and buttons.
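The cursor-control and grip-click behaviour described above can be sketched as follows. This is a minimal illustration, not the robot's actual implementation: the landmark indices loosely follow MediaPipe's hand model as an assumption (the source does not name the tracker), and the threshold value is hypothetical.

```python
import math

# Hypothetical landmark indices, loosely following MediaPipe's hand model:
# index 0 is the wrist/palm base; 4, 8, 12, 16, 20 are the fingertips.
PALM = 0
FINGERTIPS = [4, 8, 12, 16, 20]

def is_grip(landmarks, threshold=0.12):
    """Return True when every fingertip is close to the palm,
    i.e. the hand is closed into a grip (click) gesture."""
    px, py = landmarks[PALM]
    for tip in FINGERTIPS:
        tx, ty = landmarks[tip]
        if math.hypot(tx - px, ty - py) > threshold:
            return False
    return True

def hand_to_cursor(hand_x, hand_y, screen_w, screen_h):
    """Map normalised hand coordinates (0..1) to screen pixels,
    so the tracked hand drives the on-screen cursor."""
    return int(hand_x * screen_w), int(hand_y * screen_h)
```

In a real pipeline the landmark dictionary would be refreshed every frame from the hand tracker, and a click would fire on the open-to-closed transition rather than on every closed frame.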
The robot has the capability to lip-sync all Urdu characters while interacting with users. Implementation of mood detection is currently in progress. The robot also offers games played with hand gestures if the user is bored. The nature of the project revolves around computer vision, human-robot interaction, social robotics and affective computing.
Faces are detected using the Viola-Jones algorithm, one of the most basic methods and close to how humans learn. The algorithm is trained on sample pictures of a specific object in different conditions, much like a child learns to see and name things.
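A core building block of Viola-Jones is the integral image (summed-area table), which lets the detector evaluate rectangular Haar-like features in constant time. The sketch below shows just that building block in pure Python; it is an illustration of the technique, not the robot's detection code.

```python
def integral_image(img):
    """Build the summed-area table used by Viola-Jones:
    ii[y][x] = sum of all pixels at or above-left of (y, x)."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of pixels in an inclusive rectangle with only 4 table
    lookups, regardless of the rectangle's size."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total
```

A Haar-like feature is then just the difference of two such rectangle sums, which is why the detector can scan thousands of windows per frame.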
Facial Expression Recognition:
The detected faces are processed using the Chehra 3D head pose estimator for Matlab and the Chehra Matlab fitting model to estimate head pose and extract facial landmarks. These landmarks are tracked and fed, along with the pitch, yaw and roll values, into a learning model that predicts the user's basic expressions.
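The final classification step can be sketched as below. The source does not specify the learning model, so this uses a nearest-centroid classifier purely as an assumed stand-in: a feature vector built from landmark measurements and pose angles is compared against per-expression centroids learned from training data.

```python
import math

def nearest_expression(features, centroids):
    """Predict the expression whose training centroid is closest
    (Euclidean distance) to the current feature vector.

    features  -- e.g. [mouth_openness, brow_distance, ..., pitch, yaw, roll]
    centroids -- dict mapping expression label -> mean feature vector
    """
    best_label, best_dist = None, float("inf")
    for label, centroid in centroids.items():
        d = math.dist(features, centroid)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```

The feature names above are hypothetical; in practice the vector would be whatever landmark distances and pose values the Chehra tracker provides per frame.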
A customised algorithm is implemented which uses a string to generate an ordered array of viseme images and a parallel array of durations. The lip-sync is created from these two arrays together with the robotic voice from the text-to-speech module.
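The string-to-arrays step can be illustrated as follows. The real system maps Urdu characters to visemes; the table here uses a few Latin characters, and all file names and durations are hypothetical placeholders.

```python
# Hypothetical viseme table: each character maps to a mouth-shape image
# and a display duration in milliseconds. The actual system covers all
# Urdu characters; Latin letters are used here purely for illustration.
VISEME_TABLE = {
    "a": ("viseme_open.png", 120),
    "b": ("viseme_closed.png", 80),
    "o": ("viseme_round.png", 120),
}
DEFAULT_VISEME = ("viseme_rest.png", 60)

def lip_sync_arrays(text):
    """Turn a string into parallel arrays of viseme images and
    durations, to be played back alongside the TTS audio."""
    images, durations = [], []
    for ch in text.lower():
        img, ms = VISEME_TABLE.get(ch, DEFAULT_VISEME)
        images.append(img)
        durations.append(ms)
    return images, durations
```

Playback then simply shows `images[i]` for `durations[i]` milliseconds while the synthesised voice speaks, keeping the mouth shapes aligned with the audio.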
Hand Gesture game:
A puzzle game was implemented in which the user arranges the puzzle pieces using swipe gestures.
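Swipe recognition for the game can be sketched from a short history of tracked hand positions. This is a minimal sketch under the assumption that hand x-coordinates are normalised to 0..1; the travel threshold is a hypothetical tuning value.

```python
def detect_swipe(x_history, min_travel=0.3):
    """Classify a horizontal swipe from a buffer of normalised hand
    x-positions (0..1, oldest first). Returns 'left', 'right', or None
    if the hand did not travel far enough to count as a swipe."""
    if len(x_history) < 2:
        return None
    travel = x_history[-1] - x_history[0]
    if travel > min_travel:
        return "right"
    if travel < -min_travel:
        return "left"
    return None
```

The game loop would feed this the last few frames of hand positions and, on a detected swipe, move the selected puzzle piece in that direction.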
Interactive Pen Game:
The same game can also be controlled using colour detection. The selectable puzzle piece is highlighted based on the estimated region of a specific colour. A button on the pen illuminates it in a distinct colour, which is detected in order to select a puzzle piece and move it in the required direction.
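The colour-region estimation can be sketched as a threshold on a hue channel followed by a centroid computation. This is an assumed minimal version: the original's colour space and thresholds are not specified, and the hue band here is hypothetical.

```python
import numpy as np

def color_region_centroid(hue, lo, hi):
    """Estimate the pen-tip position: keep pixels whose hue falls in
    the pen's colour band [lo, hi], then return the centroid of that
    mask as (row, col), or None if no pixel matches."""
    mask = (hue >= lo) & (hue <= hi)
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return None
    return float(ys.mean()), float(xs.mean())
```

Mapping the centroid to the grid of puzzle pieces then tells the game which piece the pen is pointing at, and its frame-to-frame motion gives the direction of the move.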