Links to electronic copies of most of my publications can be found on my academia.edu page.
The common theme that ties my various research interests together is a desire to understand and model human and animal behavior, either with computer graphics and animation or on mechanical devices like humanoid robots. My perspective has moved towards a systems-based approach rather than looking at isolated components. However, in order to do research at this wider, holistic level, I often had to work on individual components first to make sure they were modeled correctly. This has enabled me to pursue interests in human-machine interaction, which often requires working with systems of components that interact with each other in interesting ways.
The following are research projects that I have worked on, and continue to be interested in, from my graduate school years at the University of Toronto to my current position at the Honda Research Institute USA. The most recent research activity is listed first.
Augmented Reality for the Car
The automobile industry is quickly moving towards cars connected to the internet. Although this means more information is available to the driver, there is a real danger of driver distraction. We are exploring the use of augmented reality windshields to create ways of enhancing the driver's experience while minimizing distraction. Not only do we work on the technical algorithms for fast computer vision and localization, but we also follow a user-centered, design-thinking approach to develop appropriate applications, which we evaluate with a configurable driving simulator to refine our designs before field tests.
My team wanted to continue our theme of non-verbal expression by allowing our humanoid robot to pick up on emotional cues given by facial expression. We reasoned that if the robot can build up a model of its human partner's emotional state, we can detect conditions like frustration or happiness. We had to overcome several problems: dealing with the small image sizes of faces from the robot's cameras, handling moving faces as people generally don't stand still, and determining how to pick up accurate facial expressions while the person was talking. For the latter case, some facial expression recognition algorithms depend on the mouth shape. However, when we talk, the mouth shape is not stable and can lead to false categorizations of expression. We used a multi-modal model that varies the feature masks used to weight features from different parts of the face, depending on whether the person is talking or not. We used Mitchel Benovoy's biologically-inspired models for robust recognition over large distances.
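To make the masking idea concrete, here is a minimal sketch in Python; the region names, weights and feature sizes are illustrative assumptions, not our actual implementation:

    import numpy as np

    # Illustrative sketch: weight facial regions differently depending on whether
    # the person is talking, so the unstable mouth does not dominate classification.
    MASKS = {
        "silent":  {"brows": 1.0, "eyes": 1.0, "mouth": 1.0},
        "talking": {"brows": 1.0, "eyes": 1.0, "mouth": 0.2},  # down-weight the mouth
    }

    def weighted_features(region_features, is_talking):
        """Concatenate per-region feature vectors after applying the active mask."""
        mask = MASKS["talking" if is_talking else "silent"]
        return np.concatenate([mask[name] * feats for name, feats in region_features.items()])

    # Example: mouth features are attenuated while the person speaks.
    features = {"brows": np.random.rand(8), "eyes": np.random.rand(16), "mouth": np.random.rand(12)}
    x = weighted_features(features, is_talking=True)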
This topic is a very exciting interest of mine as it combines motor control, linguistics and planning. Our goal is to automatically and intelligently determine what gestures are co-produced with various speech utterances. The choice of gesture can be driven by many processes and needs, such as the need for emphasis, affective (emotional) states, the content of the thought to be presented, and personal style. The gesture models we produce exhibit phenomena at all levels of gesture expression, from low-level beats, deictics, iconics and metaphorics to high-level emblems. The model has probabilistic elements at various stages to produce non-repetitive behavior, so you don't get the unnatural precision that often comes with robot displays of motion. This is continuing work which I am actively exploring further.
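As a rough illustration of where the probabilistic elements enter (the distributions and parameter names below are assumptions for illustration, not the actual model), the amplitude and timing of a beat gesture can be resampled on every production so repeated utterances never yield identical motion:

    import random

    # Illustrative sketch: jitter a beat gesture's amplitude, duration and stroke
    # onset each time it is produced, so repeated productions are never identical.
    def sample_beat_gesture(base_amplitude=1.0, base_duration=0.4):
        amplitude = random.gauss(base_amplitude, 0.1 * base_amplitude)
        duration = max(0.2, random.gauss(base_duration, 0.05))
        onset = random.uniform(0.0, 0.08)   # small offset relative to the stressed syllable
        return {"amplitude": amplitude, "duration": duration, "onset": onset}

    # Two consecutive productions of the "same" beat differ slightly.
    print(sample_beat_gesture())
    print(sample_beat_gesture())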
As part of the search for useful applications for humanoid robots, we collaborated with Dr. Sandra Okita from Columbia University to design humanoid robots as potential learning partners (not teacher substitutes) for children. In our earlier work, we found that children aged 4-6 were especially open to interaction with humanoid robots, but were often confused if the robot did not respond correctly to social cues such as expectant eye gazes. This has driven new research activity in producing better models for interaction, including cognitive models for attentive behaviors and gesture in communication. New methodologies for conducting human-robot experiments and measurements were developed to allow us to capture and analyze the interaction with multiple modalities and viewpoints, and at different time scales. To help keep us honest, we perform comparative pre- and post-testing on students to determine if interaction with our robot produces any learning effects.
To test our ideas on interaction (perception, behavioral models and expression), we focus on test scenarios that serve as research platforms for improving the quality of interaction and evaluating that interaction with people. Our first platform, which has undergone several iterations in its various components, was the Memory Game, the venerable card game where people (and robots) attempt to select pairs of identical cards from a set of face-down cards. The challenge here is to use the robot's on-board perception to identify cards and to balance keeping track of the game with monitoring turn-taking and the fact that humans can make mistakes (or cheat).
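As a toy sketch of the bookkeeping involved (not the actual robot software), the game state, turn-taking and error detection can be kept in a small structure fed by the robot's card perception:

    # Toy sketch: track face-down cards and turns, and flag flips that are
    # inconsistent with the rules (mistakes or cheating). The card identities
    # are assumed to come from the robot's on-board perception.
    class MemoryGame:
        def __init__(self, card_ids):
            self.face_down = set(card_ids)     # cards still in play
            self.turn = "human"                # alternates between "human" and "robot"

        def resolve_flip(self, card_a, card_b, identities):
            """identities maps a card id to the symbol perceived on its face."""
            if card_a not in self.face_down or card_b not in self.face_down:
                return "invalid"               # flipping a claimed card: mistake or cheat
            if identities[card_a] == identities[card_b]:
                self.face_down -= {card_a, card_b}
                return "match"                 # the current player keeps the turn
            self.turn = "robot" if self.turn == "human" else "human"
            return "no_match"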
We realized that a big problem is deciding how to filter the sheer amount of sensory information into actionable information that a robot can respond to. Our idea was to build a 3-layer model that starts with low-level attention mechanisms in the visual and auditory modalities, which can quickly isolate regions of interest (ROIs). These ROIs are then fed to a second layer, which we call mid-layer detection, where specialized detectors for faces or objects reside. Finally, this high-level semantic information is sparsely stored in the panoramic attention layer, so named because it is stored in an ego-centric panoramic view from the agent's (robot's) perspective.
Sarvadevabhatla, R. K., and V. Ng-Thow-Hing, Panoramic attention for humanoid robots, in 9th IEEE-RAS International Conference on Humanoid Robots, pp. 215-222, Paris, France, 2009.
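A highly simplified sketch of this three-layer flow is below; the class and function names are my own shorthand, not code from the paper:

    # Simplified sketch: layer 1 (bottom-up attention) proposes ROIs, layer 2
    # (mid-layer detectors) labels them, and layer 3 stores the results sparsely,
    # indexed by ego-centric pan angle, in the panoramic attention layer.
    class PanoramicAttention:
        def __init__(self, bins=36):
            self.bins = bins        # discretize 360 degrees of pan into bins
            self.memory = {}        # sparse storage: bin index -> latest detection

        def store(self, pan_angle_deg, detection):
            bin_index = int((pan_angle_deg % 360) / (360 / self.bins))
            self.memory[bin_index] = detection

    def process_frame(frame, saliency, detectors, panorama, current_pan_deg):
        for roi in saliency(frame):                      # layer 1: regions of interest
            for detect in detectors:                     # layer 2: faces, objects, ...
                result = detect(frame, roi)
                if result is not None:
                    panorama.store(current_pan_deg + roi["bearing"], result)  # layer 3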
Intelligent Systems for Humanoid Robots
At my current position at the Honda Research Institute USA, I work on multi-agent intelligent systems for modeling human-robot interactions and complex tasks on the 2000 Honda ASIMO humanoid robot. Our goal is to develop autonomous robots using well-designed and reusable interaction models and a variety of perceptual, decision-making and motor control components. I am currently the project leader for human-robot interaction and intelligent systems integration. We've written several major software systems for handling environment maps, sensor fusion, task organization, robot control, and inter-agent communication.
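To give a flavor of the inter-agent communication piece (a toy publish/subscribe sketch in Python, not the actual middleware we use), perception, decision-making and motor-control agents can exchange messages by topic:

    from collections import defaultdict

    # Toy publish/subscribe sketch: agents exchange messages through named topics.
    class MessageBus:
        def __init__(self):
            self.subscribers = defaultdict(list)    # topic -> list of callbacks

        def subscribe(self, topic, callback):
            self.subscribers[topic].append(callback)

        def publish(self, topic, message):
            for callback in self.subscribers[topic]:
                callback(message)

    bus = MessageBus()
    bus.subscribe("detected_objects", lambda msg: print("planner received:", msg))
    bus.publish("detected_objects", {"label": "cup", "position": (0.4, 0.1, 0.9)})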
This project was a collaboration with John Hutchinson (Royal Veterinary College, University of London) and Frank "Clay" Anderson (Stanford University). It originated during a cafe chat around 2003 at a Starbucks in Mountain View, CA. I was developing mass models for my work in Digital Human Modeling at Honda, and thought that by combining these mass models with the B-spline solid model I used for muscle, we could create a very versatile shape primitive for estimating the mass properties of body tissue in animals, both extant and extinct. Four years later, in 2007, we finally published our work, with the mass set model applied to a Tyrannosaurus rex skeleton and validated with an ostrich carcass. I really enjoyed this project, and it was done in my off-hours (Honda's not really into dinosaurs), so it was truly a labour of love.
Ng-Thow-Hing, V., F. C. Anderson, and J. R. Hutchinson, Mass Sets for Interactive Computation of Inertial Parameters for Rigid and Deformable Body Segments, in IX International Symposium on Computer Simulation in Biomechanics, July 2-4, Sydney, Australia, 2003.
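As a back-of-the-envelope illustration of what mass sets are used for (a deliberately crude point-sampling scheme, not the B-spline solid formulation in the paper), mass, center of mass and inertia follow from integrating an assumed tissue density over volume elements:

    import numpy as np

    # Crude illustration: estimate inertial parameters of a body segment by summing
    # over sampled volume elements with an assumed uniform tissue density.
    def mass_properties(points, cell_volume, density=1000.0):
        """points: (N, 3) sample centers inside the segment; density in kg/m^3."""
        cell_mass = density * cell_volume
        total_mass = cell_mass * len(points)
        center_of_mass = points.mean(axis=0)
        r = points - center_of_mass
        r2 = (r ** 2).sum(axis=1)
        # Inertia tensor about the center of mass, treating each cell as a point mass.
        inertia = cell_mass * (np.eye(3) * r2[:, None, None] - r[:, :, None] * r[:, None, :]).sum(axis=0)
        return total_mass, center_of_mass, inertia

    # Example: 10,000 samples filling a 0.3 m cube of soft tissue.
    samples = np.random.rand(10000, 3) * 0.3
    mass, com, I = mass_properties(samples, cell_volume=0.3**3 / 10000)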
Motion Planning for Robots
Motion planning is a very important step prior to the actual execution of motion on a robot. In order to design motion trajectories to send to a robot's joints, the trajectories that accomplish the task goal must avoid self-collisions and collisions with the environment, and respect kinematic joint limits. I've worked on methods for planning tasks that involve switching between different modalities of motion (like walking and pushing). My main goal is to develop good modes of manipulation for a robot to accomplish higher-level complex tasks. This work is done at the Honda Research Institute USA with my former interns, Kris Hauser and Evan Drumwright.
Drumwright, E., V. Ng-Thow-Hing. Toward Interactive Reaching in Static Environments for Humanoid Robots, in IROS 2006, Beijing, China, 2006.
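A minimal sketch of the kind of feasibility test every candidate trajectory must pass is shown below; the collision checkers are placeholders supplied by the caller, not our actual implementations:

    # Minimal sketch: a candidate joint trajectory is valid only if every waypoint
    # respects the joint limits and is free of self- and environment collisions.
    def trajectory_is_valid(waypoints, joint_limits, in_self_collision, in_env_collision):
        for q in waypoints:                                  # q: tuple of joint angles
            for angle, (lo, hi) in zip(q, joint_limits):
                if not lo <= angle <= hi:
                    return False                             # kinematic joint limit violated
            if in_self_collision(q) or in_env_collision(q):
                return False                                 # collision along the path
        return True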
Complex Task Modeling for Humanoid Robots
The control algorithms for various motor tasks on a robot can vary widely depending on the goals of each task. They can range from simple joint angle trajectories to achieve certain poses, to pointing commands that require task-space control variables as well as perceptual information about objects in the environment. Evan Drumwright and I developed the Task Matrix, a framework that unifies these task programs under a simple, parameterized, robot-independent abstract interface. The Task Matrix handles concurrency and conflict resolution, and allows complex tasks to be assembled from simpler ones. Recently, the Task Matrix was made to work with the humanoid robot ASIMO. This project originated at the Honda Research Institute, and Evan has continued the work in his PhD thesis. I am also actively developing the Task Matrix for my own research at Honda.
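A rough sketch of what a robot-independent task interface with resource-based conflict checking might look like is given below; the naming is mine and is not the Task Matrix API:

    # Rough sketch: each task declares the resources (e.g., kinematic chains) it
    # needs, so a scheduler can run non-conflicting tasks concurrently and compose
    # complex tasks from simpler ones.
    class Task:
        def __init__(self, name, resources):
            self.name = name
            self.resources = set(resources)     # e.g., {"right_arm", "head"}

        def conflicts_with(self, other):
            return bool(self.resources & other.resources)

    point = Task("point_at_object", {"right_arm"})
    gaze = Task("look_at_person", {"head"})
    wave = Task("wave", {"right_arm"})

    assert not point.conflicts_with(gaze)       # can run concurrently
    assert point.conflicts_with(wave)           # must be sequenced or resolved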
Wei Shao and I worked on a joint component model to more accurately capture the complexity of human joints. With this model, we could build complex joints, such as those in the human spine, shoulder and knee. These joints could be animated at real-time rates and used in interactive applications. I also worked with Jianbo Peng on developing automatic methods for building subject-specific skeletons from motion-capture data.
Ng-Thow-Hing, V., and W. Shao, Modular Components for Detailed Kinematic Modelling of Joints, International Society of Biomechanics XIXth Congress, July 6-11, 2003.
Ng-Thow-Hing, V., and J. Peng, Automated Subject-specific Scaling of Skeletons to Motion-Captured Markers, in IV World Congress of Biomechanics, poster presentation, Calgary Aug. 4-9, 2002.
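As a simple illustration of the subject-specific scaling idea (generic bone and marker names, not our actual pipeline), segment lengths measured from motion-capture markers can drive per-bone scale factors applied to a template skeleton:

    import numpy as np

    # Simple illustration: scale each template bone by the ratio of the subject's
    # marker-derived segment length to the template's segment length.
    def scale_factors(template_lengths, marker_positions, segment_markers):
        """segment_markers maps a bone name to its (proximal, distal) marker names."""
        factors = {}
        for bone, (proximal, distal) in segment_markers.items():
            subject_length = np.linalg.norm(marker_positions[proximal] - marker_positions[distal])
            factors[bone] = subject_length / template_lengths[bone]
        return factors

    # Example with hypothetical names and measurements (in meters).
    factors = scale_factors(
        template_lengths={"femur": 0.43},
        marker_positions={"hip": np.array([0.0, 0.9, 0.0]), "knee": np.array([0.0, 0.44, 0.0])},
        segment_markers={"femur": ("hip", "knee")},
    )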
Musculo-tendon Modeling for Humans and Animals
Although this topic was my first research passion, it still intrigues and excites me. My current work with humanoid robots makes it difficult to get back into it, but that hasn't stopped me from developing new ideas in whatever spare time I have left these days. I continue to collaborate with my co-authors from my grad school days, as they have taken my original models and extended their application to other soft tissues. When I first started this topic, directly considering anatomy when modeling animals in computer graphics and animation was still unheard of. Computer graphics was much better at mechanical devices like cars and robots. Now we've almost got it right, but there is still something missing, and I hope continued work will be done on building better underlying anatomical rigs for computer graphics creatures.
Ng-Thow-Hing, V. and E. Fiume, Application-specific Muscle Representations, in Proceedings of Graphics Interface 2002, (editors) W. Stürzlinger and M. McCool, pages 107-115, 2002.
Ng-Thow-Hing, V., and E. Fiume, B-spline Solids as physical and geometric muscle models for musculoskeletal systems, in Proceedings of the VIIth International Symposium on Computer Simulation in Biomechanics, pages 68-71, 1999. Winner of the Andrzej Komor New Investigator Award.
During my PhD (1994-2000) at the University of Toronto, I co-developed with Petros Faloutsos a physics-based animation system called DANCE (Dynamic Animation and Control Environment). DANCE could be extended with plug-ins and featured abstract interfaces for controllers, numerical integrators and physical models. It could be used as a physics-simulation playground for testing out a variety of ideas, from virtual stuntmen to biomechanical models of muscle. This was a formative experience for me, as it was with this project that I really started building plug-in-based frameworks with abstract component interfaces in my software.
DANCE has since been extended and improved by Ari Shapiro and has its own home page at UCLA.
Ng-Thow-Hing, V., and P. Faloutsos, Dynamic Animation and Control Environment (DANCE), Siggraph Technical Sketch, in Siggraph Conference Abstracts and Applications, page 198, 2000.
Ng-Thow-Hing, V., and P. Faloutsos, DANCE: dynamic animation and control environment, in Graphics Interface ’99 Poster Abstracts, poster presentation, pages 31-32, 1999.
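To give a flavor of the plug-in idea behind DANCE (a toy sketch in Python; DANCE itself is a C++ system and these names are illustrative), controllers and numerical integrators sit behind small abstract interfaces so they can be swapped freely:

    # Toy sketch of the plug-in idea: simulations are assembled from interchangeable
    # controllers and integrators that honor small abstract interfaces.
    class Controller:
        def torque(self, state, t):
            raise NotImplementedError

    class Integrator:
        def step(self, state, derivative, dt):
            raise NotImplementedError

    class PDController(Controller):
        """Drives a 1-DOF joint toward a target angle."""
        def __init__(self, kp, kd, target):
            self.kp, self.kd, self.target = kp, kd, target

        def torque(self, state, t):
            pos, vel = state
            return self.kp * (self.target - pos) - self.kd * vel

    class ExplicitEuler(Integrator):
        def step(self, state, derivative, dt):
            return tuple(x + dx * dt for x, dx in zip(state, derivative))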