
Bio for Dr. Divakaran and an abstract for his presentation to QUEST this week

Title: Multimodal Analysis of Human Behavior for Social Interaction Training


We present a suite of multimodal techniques for assessing human behavior with cameras and microphones. These techniques drive the sensing module of an interactive simulation trainer in which the trainee has lifelike interaction with a virtual character so as to learn social interaction skills. We recognize facial expressions, gaze behaviors, gestures, postures, speech, and paralinguistics in real time and transmit the results to the simulation environment, which reacts to the trainee’s behavior in a manner that serves the overall pedagogical purpose. We will describe the techniques developed and the results obtained for each of the behavioral cues, which are comparable to the state of the art, and identify avenues for further research. Behavior sensing in social interactions poses key challenges for each of the cues, including the large number of possible behaviors, the high variability in execution of the same behavior within and across individuals, and the need for real-time execution. Furthermore, we face the challenge of appropriately fusing the multimodal cues so as to arrive at a comprehensive assessment of the behavior at multiple time scales. We also explore going beyond extraction of verbs to extraction of the associated adverbs that capture the intensity/valence of the behavior. We will present a video demonstration of the end-to-end simulation trainer. If time permits, we will cover other training applications such as intelligent tutoring systems and stress-resiliency training systems.
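To make the fusion challenge concrete, here is a minimal sketch of one common approach, late fusion of per-cue confidence scores over a sliding time window. All class and parameter names (`CueFuser`, the cue list, the weights) are illustrative assumptions for exposition, not the actual SRI system described in the talk.

```python
# Hypothetical late-fusion sketch: each modality (gaze, gesture, speech, ...)
# emits a per-frame confidence score in [0, 1]; scores are smoothed over a
# sliding window and combined as a weighted average. Window length sets the
# time scale of the assessment.
from collections import deque


class CueFuser:
    """Fuses per-frame behavioral cue scores over a sliding window."""

    def __init__(self, cues, window=5, weights=None):
        self.cues = list(cues)
        # Equal weights by default; a real system would learn these.
        self.weights = weights or {c: 1.0 for c in self.cues}
        self.history = {c: deque(maxlen=window) for c in self.cues}

    def update(self, frame_scores):
        """Record one frame's score for each cue present in frame_scores."""
        for cue, score in frame_scores.items():
            self.history[cue].append(score)

    def fused_score(self):
        """Weighted average of per-cue window means; 0.0 if no data yet."""
        total, total_w = 0.0, 0.0
        for cue in self.cues:
            h = self.history[cue]
            if not h:
                continue  # cue not observed yet; skip it
            total += self.weights[cue] * (sum(h) / len(h))
            total_w += self.weights[cue]
        return total / total_w if total_w else 0.0


# Example: fuse three cues over a 3-frame window.
fuser = CueFuser(["gaze", "gesture", "speech"], window=3)
fuser.update({"gaze": 0.8, "gesture": 0.4, "speech": 0.6})
score = fuser.fused_score()  # equal-weight average of the three cues
```

Running the same fuser at several window lengths (e.g. a few frames versus several seconds) gives assessments at multiple time scales, as the abstract describes; the adverb/intensity extraction mentioned above would correspond to retaining the magnitude of the scores rather than thresholding them.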


Ajay Divakaran, PhD, is a Technical Manager at SRI International Sarnoff. He has developed several innovative technologies for multimodal systems for both commercial and government programs over the past 16 years. He currently leads SRI Sarnoff’s projects on modeling and analysis of human behavior for the DARPA SSIM program, the ONR Stress Resiliency project, the Army “Master Trainer” intelligent tutoring project, audio analysis for event detection in open-source video in the IARPA Aladdin program, and people, vehicle, and vessel tracking for ONR and JIEDDO-DHS, among others. He worked at Mitsubishi Electric Research Labs for ten years, where he was the lead inventor of the world’s first sports-highlights-playback-enabled DVR, as well as a manager overseeing a wide variety of product applications of machine learning. He was elevated to Fellow of the IEEE in 2011 for his contributions to multimedia content analysis. He developed techniques for recognition of agitated speech for his work on sports highlights, and established a sound experimental and theoretical framework for human perception of action in video sequences as lead inventor of the MPEG-7 video standard motion activity descriptor. He serves on the technical program committees of key multimedia conferences, served as an associate editor of the IEEE Transactions on Multimedia from 2007 to 2011, and has two books, over 100 publications, and over 40 issued patents to his credit. He received his Ph.D. degree in Electrical Engineering from Rensselaer Polytechnic Institute in 1993.
