Activity recognition

From Wikipedia, the free encyclopedia

Activity recognition aims to recognize the actions and goals of one or more agents from a series of observations on the agents' actions and the environmental conditions. Since the 1980s, this research field has captured the attention of several computer science communities due to its strength in providing personalized support for many different applications and its connection to many different fields of study such as medicine, human-computer interaction, or sociology.

To understand activity recognition better, consider the following elderly assistance scenario. An elderly man wakes up at dawn in his apartment, where he stays alone. He lights the stove to make a pot of tea, switches on the toaster oven, and takes some bread and jelly from the cupboard. After he takes his morning medication, a computer-generated voice gently reminds him to turn off the toaster. Later that day, his daughter accesses a secure website where she scans a check-list, which was created by a sensor network in her father's apartment. She finds that her father is eating normally, taking his medicine on schedule, and continuing to manage his daily life on his own.

Due to its many-faceted nature, different fields may refer to activity recognition as plan recognition, goal recognition, intent recognition, behavior recognition, location estimation and location-based services.

Types of activity recognition

Sensor-based, single-user activity recognition

Sensor-based activity recognition integrates the emerging area of sensor networks with novel data mining and machine learning techniques to model a wide range of human activities.[1][2] Mobile devices such as smartphones provide enough sensor data and computing power to recognize physical activity and, from it, to estimate energy consumption during everyday life. Researchers in sensor-based activity recognition believe that by enabling ubiquitous computers and sensors to monitor the behavior of agents (with their consent), these computers will be better suited to act on our behalf.
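
As a rough illustration of how raw sensor readings on a mobile device might be turned into something a recognizer can use, the sketch below slides a window over accelerometer samples and summarizes each window. The function names, the threshold and the synthetic samples are assumptions made for this example; it does not estimate energy consumption and is not taken from the cited work.

```python
import math
from statistics import mean, pstdev


def magnitude(sample):
    """Euclidean norm of one (x, y, z) accelerometer sample."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def window_features(samples, window_size, step):
    """Slide a fixed-size window over the samples and summarize each one.

    Returns a list of (mean_magnitude, std_magnitude) tuples.
    """
    mags = [magnitude(s) for s in samples]
    features = []
    for start in range(0, len(mags) - window_size + 1, step):
        window = mags[start:start + window_size]
        features.append((mean(window), pstdev(window)))
    return features


def label_window(feature, still_threshold=0.05):
    """Hypothetical rule: low variation means 'still', otherwise 'moving'."""
    _, std = feature
    return "still" if std < still_threshold else "moving"


# Synthetic data: four samples at rest (gravity only), then four while moving.
samples = [(0.0, 0.0, 1.0)] * 4 + [
    (0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.0, 2.0), (0.0, 0.0, 0.5),
]
feats = window_features(samples, window_size=4, step=4)
labels = [label_window(f) for f in feats]
```

In practice the hand-written threshold would be replaced by a classifier trained on labelled windows.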

Levels of sensor-based activity recognition

Sensor-based activity recognition is a challenging task because the input is inherently noisy. Statistical modeling has therefore been the main approach, organized in layers in which recognition is conducted at several intermediate levels and the results are connected. At the lowest level, where the sensor data are collected, statistical learning concerns how to find the detailed locations of agents from the received signal data. At an intermediate level, statistical inference is concerned with recognizing individuals' activities from the inferred location sequences and the environmental conditions at the lower levels. At the highest level, a major concern is to find the overall goal or subgoals of an agent from the activity sequences through a mixture of logical and statistical reasoning. Scientific conferences where activity recognition work based on wearable and environmental sensors often appears include ISWC and UbiComp.

Sensor-based, multi-user activity recognition

Recognizing activities for multiple users using on-body sensors first appeared in the work by ORL using active badge systems[3] in the early 1990s. Other sensor technology, such as acceleration sensors, was used for identifying group activity patterns in office scenarios.[4] Activities of multiple users in intelligent environments are addressed by Gu et al.[5] In this work, they investigate the fundamental problem of recognizing activities for multiple users from sensor readings in a home environment, and propose a novel pattern mining approach to recognize both single-user and multi-user activities in a unified solution.

Sensor-based group activity recognition

Recognition of group activities is fundamentally different from single- or multi-user activity recognition in that the goal is to recognize the behavior of the group as an entity, rather than the activities of the individual members within it.[6] Group behavior is emergent in nature, meaning that the properties of the behavior of the group are fundamentally different from the properties of the behavior of the individuals within it, or any sum of that behavior.[7] The main challenges are in modeling the behavior of the individual group members, as well as the roles of the individuals within the group dynamic[8] and their relationship to the emergent behavior of the group in parallel.[9] Challenges that must still be addressed include quantification of the behavior and roles of individuals who join the group, integration of explicit models for role description into inference algorithms, and scalability evaluations for very large groups and crowds. Group activity recognition has applications in crowd management and response in emergency situations, as well as in social networking and Quantified Self applications.[10]

Approaches of activity recognition

Activity recognition through logic and reasoning

Logic-based approaches keep track of all logically consistent explanations of the observed actions. Thus, all possible and consistent plans or goals must be considered. Kautz[11] provided a formal theory of plan recognition, describing it as a logical inference process of circumscription. Actions and plans are uniformly referred to as goals, and a recognizer's knowledge is represented by a set of first-order logic statements, called an event hierarchy, which defines abstraction, decomposition and functional relationships between types of events.

Kautz's general framework for plan recognition has exponential time complexity in the worst case, measured in the size of the input hierarchy. Lesh and Etzioni[12] went one step further and presented methods for scaling up goal recognition computationally. In contrast to Kautz's approach, where the plan library is explicitly represented, Lesh and Etzioni's approach enables automatic plan-library construction from domain primitives. Furthermore, they introduced compact representations and efficient algorithms for goal recognition on large plan libraries.

Inconsistent plans and goals are repeatedly pruned when new actions arrive. They also presented methods for adapting a goal recognizer to handle individual idiosyncratic behavior, given a sample of an individual's recent behavior. Pollack et al. described a direct argumentation model that can reason about the relative strength of several kinds of arguments for belief and intention description.
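
The idea of keeping every consistent goal and pruning the rest as actions arrive can be sketched as follows. This is a deliberately simplified illustration, not Kautz's event hierarchy or Lesh and Etzioni's algorithm: the plan library, the goal names and the `prune` and `recognize` helpers are all hypothetical, and a goal is treated as "consistent" simply if the observed action belongs to its action set.

```python
# Hypothetical plan library: each goal maps to the actions its plans may contain.
PLAN_LIBRARY = {
    "make_tea": {"fill_kettle", "light_stove", "get_teabag", "pour_water"},
    "make_toast": {"get_bread", "switch_on_toaster", "get_jelly"},
    "make_breakfast": {
        "fill_kettle", "light_stove", "get_teabag", "pour_water",
        "get_bread", "switch_on_toaster", "get_jelly",
    },
}


def prune(candidates, action, library=PLAN_LIBRARY):
    """Keep only the goals whose plans can still explain the new action."""
    return {goal for goal in candidates if action in library[goal]}


def recognize(observations, library=PLAN_LIBRARY):
    """Return the set of goals consistent with every observed action."""
    candidates = set(library)
    for action in observations:
        candidates = prune(candidates, action, library)
    return candidates


after_one = recognize(["light_stove"])
after_two = recognize(["light_stove", "get_bread"])
```

After the first action two goals remain, and the sketch has no way to rank them; only the second action narrows the set to one.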

A serious problem of logic-based approaches is their inability or inherent infeasibility to represent uncertainty. They offer no mechanism for preferring one consistent explanation to another and are incapable of deciding whether one particular plan is more likely than another, as long as both are consistent with the observed actions. Logic-based methods also lack the ability to learn.

Another approach to logic-based activity recognition is to use stream reasoning based on Answer Set Programming.[13] It has been applied to recognising activities for health-related applications,[14] using weak constraints to model a degree of ambiguity or uncertainty.

Activity recognition through probabilistic reasoning

Probability theory and statistical learning models have more recently been applied in activity recognition to reason about actions, plans and goals under uncertainty.[15] Several approaches in the literature explicitly represent uncertainty in reasoning about an agent's plans and goals.
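
A minimal sketch of what representing uncertainty over goals can look like is a Bayesian update: each candidate goal has a prior, each observed action has a likelihood under that goal, and the posterior ranks the goals. The numbers, the goal names and the assumption that actions are conditionally independent given the goal are all made up for illustration; this is not the model of Charniak and Goldman.

```python
def posterior(prior, likelihood, observations, floor=1e-6):
    """P(goal | observations), assuming actions are independent given the goal."""
    scores = {}
    for goal, p in prior.items():
        for action in observations:
            # Unseen actions get a small floor probability instead of zero.
            p *= likelihood[goal].get(action, floor)
        scores[goal] = p
    total = sum(scores.values())
    return {goal: score / total for goal, score in scores.items()}


# Hypothetical priors and per-goal action likelihoods.
prior = {"make_tea": 0.5, "make_breakfast": 0.5}
likelihood = {
    "make_tea": {"light_stove": 0.4, "get_bread": 0.05},
    "make_breakfast": {"light_stove": 0.2, "get_bread": 0.3},
}
post = posterior(prior, likelihood, ["light_stove", "get_bread"])
```

Both goals remain possible after the two observations, but the posterior states that one is three times as likely as the other.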

Using sensor data as input, Hodges and Pollack designed machine learning-based systems for identifying individuals as they perform routine daily activities such as making coffee.[16] The Intel Research (Seattle) Lab and the University of Washington at Seattle have done important work on using sensors to detect human plans.[17][18][19][20] Some of this work infers user transportation modes from readings of radio-frequency identifiers (RFID) and global positioning systems (GPS).

Temporal probabilistic models have been shown to perform well in activity recognition and generally outperform non-temporal models.[21] Generative models such as the hidden Markov model (HMM) and the more generally formulated dynamic Bayesian network (DBN) are popular choices for modelling activities from sensor data.[22][23][24] Discriminative models such as conditional random fields (CRF) are also commonly applied and give good performance in activity recognition.[25][26]
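
To make the HMM case concrete, the sketch below decodes the most probable activity sequence from a sequence of sensor events with the Viterbi algorithm. The two activities, the two sensors and all probabilities are invented for this example and are not drawn from any of the cited datasets.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max(
                (best[-1][r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
            column[s] = (prob, prev)
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))


# Hypothetical model: hidden activities emit binary sensor events.
states = ["sleeping", "cooking"]
start_p = {"sleeping": 0.6, "cooking": 0.4}
trans_p = {
    "sleeping": {"sleeping": 0.7, "cooking": 0.3},
    "cooking": {"sleeping": 0.2, "cooking": 0.8},
}
emit_p = {
    "sleeping": {"bed_sensor": 0.9, "stove_sensor": 0.1},
    "cooking": {"bed_sensor": 0.2, "stove_sensor": 0.8},
}
events = ["bed_sensor", "bed_sensor", "stove_sensor", "stove_sensor"]
decoded = viterbi(events, states, start_p, trans_p, emit_p)
```

For longer sequences the products should be replaced by sums of log probabilities to avoid numerical underflow.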

Generative and discriminative models both have their pros and cons, and the ideal choice depends on the area of application. A dataset, together with implementations of a number of popular models (HMM, CRF) for activity recognition, is available online.

Conventional temporal probabilistic models such as the hidden Markov model (HMM) and conditional random fields (CRF) directly model the correlations between the activities and the observed sensor data. In recent years, increasing evidence has supported the use of hierarchical models, which take into account the rich hierarchical structure that exists in human behavioral data.[22][27][28] The core idea is that the model does not directly correlate the activities with the sensor data, but instead breaks each activity into sub-activities (sometimes referred to as actions) and models the underlying correlations accordingly. For example, the activity of preparing spaghetti can be broken down into the sub-activities or actions of cutting vegetables, frying the vegetables in a pan and serving the dish on a plate. Examples of such hierarchical models are layered hidden Markov models (LHMMs)[27] and the hierarchical hidden Markov model (HHMM), which have been shown to significantly outperform their non-hierarchical counterparts in activity recognition.[22]
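
The decomposition into sub-activities can be illustrated with a two-stage sketch: sensor events are first mapped to actions, and each activity is then scored by how well its own Markov chain over actions explains the resulting sequence. This is a much-simplified stand-in, not an LHMM or HHMM; the sensor names, action names, probabilities and the deterministic first stage are all assumptions made for the example.

```python
import math

# Stage 1 (assumed): each sensor event maps to its most likely sub-activity (action).
SENSOR_TO_ACTION = {
    "knife_drawer": "cut_vegetables",
    "stove": "fry_vegetables",
    "plate_cupboard": "serve",
    "kettle": "boil_water",
    "cup_cupboard": "pour",
}

# Stage 2 (assumed): each activity has its own Markov chain over actions.
ACTIVITY_MODELS = {
    "prepare_spaghetti": {
        "start": {"cut_vegetables": 0.8, "fry_vegetables": 0.2},
        "trans": {
            "cut_vegetables": {"fry_vegetables": 0.9, "serve": 0.1},
            "fry_vegetables": {"serve": 1.0},
        },
    },
    "make_tea": {
        "start": {"boil_water": 1.0},
        "trans": {"boil_water": {"pour": 1.0}},
    },
}

FLOOR = 1e-6  # probability assigned to steps a model has never seen


def log_likelihood(actions, model):
    """Log probability of an action sequence under one activity's Markov chain."""
    score = math.log(model["start"].get(actions[0], FLOOR))
    for prev, cur in zip(actions, actions[1:]):
        score += math.log(model["trans"].get(prev, {}).get(cur, FLOOR))
    return score


def recognize_activity(sensor_events):
    """Map events to actions, then pick the activity that best explains them."""
    actions = [SENSOR_TO_ACTION[e] for e in sensor_events]
    return max(ACTIVITY_MODELS, key=lambda a: log_likelihood(actions, ACTIVITY_MODELS[a]))


cooking = recognize_activity(["knife_drawer", "stove", "plate_cupboard"])
tea = recognize_activity(["kettle", "cup_cupboard"])
```

A real hierarchical model would learn both levels jointly and keep the action layer probabilistic rather than using a fixed lookup table.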

Data mining based approach to activity recognition

In contrast to traditional machine learning approaches, an approach based on data mining has recently been proposed. In the work of Gu et al.,[29] the problem of activity recognition is formulated as a pattern-based classification problem. They proposed a data mining approach based on discriminative patterns, which describe significant changes between any two activity classes of data, to recognize sequential, interleaved and concurrent activities in a unified solution. Gilbert et al.[30] use 2D corners in both space and time. These are grouped spatially and temporally using a hierarchical process, with an increasing search area. At each stage of the hierarchy, the most distinctive and descriptive features are learned efficiently through data mining (Apriori rule).
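
One common way to quantify how strongly a pattern discriminates between two classes is its growth rate: the ratio of its support in one class to its support in the other. The sketch below computes it for sets of sensor events. It uses the textbook definition with hypothetical data and is not claimed to reproduce the exact procedure of Gu et al.

```python
def support(pattern, transactions):
    """Fraction of transactions that contain every item in the pattern."""
    if not transactions:
        return 0.0
    return sum(pattern <= t for t in transactions) / len(transactions)


def growth_rate(pattern, background, target):
    """Support of the pattern in the target class relative to the background class."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_bg == 0:
        return float("inf") if s_tg > 0 else 0.0
    return s_tg / s_bg


# Hypothetical sensor-event sets observed during two activity classes.
cooking = [
    {"stove", "pan"}, {"stove", "pan", "knife"}, {"stove", "kettle"}, {"pan", "knife"},
]
cleaning = [
    {"sponge", "tap"}, {"tap", "pan"}, {"sponge", "pan"}, {"tap", "stove"},
]

weak = growth_rate(frozenset({"pan"}), cleaning, cooking)
strong = growth_rate(frozenset({"stove", "pan"}), cleaning, cooking)
```

A pattern that appears in both classes gets a modest growth rate, while one that never occurs in the background class gets an infinite one and is a strong indicator of the target activity.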

Sensors used in activity recognition

Vision-based activity recognition

Tracking and understanding the behavior of agents through videos taken by various cameras is an important and challenging problem. The primary technique employed is computer vision. Vision-based activity recognition has found many applications, such as human-computer interaction, user interface design, robot learning, and surveillance. Scientific conferences where vision-based activity recognition work often appears include ICCV and CVPR.

In vision-based activity recognition, a great deal of work has been done. Researchers have attempted a number of methods such as optical flow, Kalman filtering, Hidden Markov models, etc., under different modalities such as single camera, stereo, and infrared. In addition, researchers have considered multiple aspects on this topic, including single pedestrian tracking, group tracking, and detecting dropped objects.
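
Of the methods listed above, Kalman filtering is compact enough to sketch in a few lines. The example below smooths noisy one-dimensional position measurements, such as the horizontal image coordinate of a tracked pedestrian, using a scalar filter with a random-walk motion model. The variances and initial values are arbitrary assumptions; practical trackers normally use a multi-dimensional state that includes velocity.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk motion model.

    Returns the filtered position estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed to stay put, but uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# A target that sits at position 5.0 while the filter starts from 0.0.
track = kalman_1d([5.0] * 20, process_var=0.01, measurement_var=1.0)
```

The estimates move steadily from the initial guess toward the measured position, with the gain shrinking as the filter becomes more certain.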

Recently, some researchers have used RGBD cameras such as the Microsoft Kinect to detect human activities. Depth cameras add an extra dimension, depth, which a normal 2D camera fails to provide. Sensory information from these depth cameras has been used to generate real-time skeleton models of humans in different body positions. This skeleton information provides meaningful features that researchers have used to train models of human activities, which are later used to recognize unknown activities.[31]

Levels of vision-based activity recognition

In vision-based activity recognition, the computational process is often divided into four steps, namely human detection, human tracking, human activity recognition and then a high-level activity evaluation.

Automatic gait recognition

Main article: Gait recognition

One way to identify specific people is by how they walk. Gait-recognition software can be used to record a person's gait or gait profile in a database for the purpose of recognizing that person later, even if they are wearing a disguise.

Wi-Fi-based activity recognition

When activity recognition is performed indoors and in cities using the widely available Wi-Fi signals and 802.11 access points, there is much noise and uncertainty. These uncertainties are modeled using a dynamic Bayesian network model by Yin et al.[32] A multiple-goal model that can reason about a user's interleaving goals is presented by Chai and Yang,[33] where a deterministic state transition model is applied. An improved model that handles concurrent and interleaving activities with a probabilistic approach is proposed by Hu and Yang.[34] A user action discovery model is presented by Yin et al.,[35] where the Wi-Fi signals are segmented to produce possible actions.

A fundamental problem in Wi-Fi-based activity recognition is to estimate the user locations. Two important issues are how to reduce the human labelling effort and how to cope with the changing signal profiles when the environment changes. Yin et al.[36] dealt with the second issue by transferring the labelled knowledge between time periods. Chai and Yang[37] proposed a hidden Markov model-based method to extend labelled knowledge by leveraging the unlabelled user traces. J. Pan et al.[38] proposed to perform location estimation through online co-localization, and S. Pan et al.[39] proposed to apply multi-view learning for migrating the labelled data to a new time period.
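
For orientation, the simplest form of Wi-Fi location estimation is fingerprinting: signal strengths are recorded at labelled locations, and a new reading is assigned the label of the closest stored fingerprint. The sketch below is this plain nearest-neighbor baseline, not any of the methods cited above; the access point names, signal values and the -100 dBm default for unheard access points are assumptions. The labelled `radio_map` is exactly the data whose collection cost and drift over time those methods try to reduce.

```python
import math


def rssi_distance(a, b, missing=-100.0):
    """Euclidean distance between two RSSI readings keyed by access point ID."""
    aps = set(a) | set(b)
    return math.sqrt(sum((a.get(ap, missing) - b.get(ap, missing)) ** 2 for ap in aps))


def locate(reading, radio_map, k=1):
    """Return the most common location label among the k closest fingerprints."""
    ranked = sorted(radio_map, key=lambda fp: rssi_distance(reading, fp[0]))
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


# Hypothetical labelled fingerprints: (RSSI in dBm per access point, location).
radio_map = [
    ({"ap1": -40, "ap2": -70}, "kitchen"),
    ({"ap1": -75, "ap2": -45}, "bedroom"),
    ({"ap1": -60, "ap2": -60}, "hallway"),
]

first = locate({"ap1": -42, "ap2": -68}, radio_map)
second = locate({"ap1": -72, "ap2": -50}, radio_map)
```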

Applications of activity recognition

Many different applications have been studied by researchers in activity recognition; examples include assisting the sick and disabled. For example, Pollack et al.[40] show that by automatically monitoring human activities, home-based rehabilitation can be provided for people suffering from traumatic brain injuries. One can find applications ranging from security-related applications and logistics support to location-based services.



  1. ^ Tanzeem Choudhury, Gaetano Borriello, et al. The Mobile Sensing Platform: An Embedded System for Activity Recognition. Appears in the IEEE Pervasive Magazine - Special Issue on Activity-Based Computing, April 2008.
  2. ^ Nishkam Ravi, Nikhil Dandekar, Preetham Mysore, Michael Littman. Activity Recognition from Accelerometer Data. Proceedings of the Seventeenth Conference on Innovative Applications of Artificial Intelligence (IAAI/AAAI 2005).
  3. ^ Want R., Hopper A., Falcao V., Gibbons J.: The Active Badge Location System, ACM Transactions on Information Systems, Vol. 40, No. 1, pp. 91-102, January 1992
  4. ^ Bieber G., Kirste T., Untersuchung des gruppendynamischen Aktivitaetsverhaltes im Office-Umfeld, 7. Berliner Werkstatt Mensch-Maschine-Systeme, Berlin, Germany, 2007
  5. ^ Tao Gu, Zhanqing Wu, Liang Wang, Xianping Tao, and Jian Lu. Mining Emerging Patterns for Recognizing Activities of Multiple Users in Pervasive Computing. In Proc. of the 6th International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services (MobiQuitous '09), Toronto, Canada, July 13–16, 2009.
  6. ^ Dawud Gordon, Jan-Hendrik Hanne, Martin Berchtold, Ali Asghar Nazari Shirehjini, Michael Beigl: Towards Collaborative Group Activity Recognition Using Mobile Devices, Mobile Networks and Applications 18(3), 2013, p. 326-340
  7. ^ Lewin, K. Field theory in social science: selected theoretical papers. Social science paperbacks. Harper, New York, 1951.
  8. ^ Hirano, T., and Maekawa, T. A hybrid unsupervised/supervised model for group activity recognition. In Proceedings of the 2013 International Symposium on Wearable Computers, ISWC ’13, ACM (New York, NY, USA, 2013), 21–24
  9. ^ Brdiczka, O., Maisonnasse, J., Reignier, P., and Crowley, J. L. Detecting small group activities from multimodal observations. Applied Intelligence 30, 1 (July 2007), 47–57.
  10. ^ Dawud Gordon, Group Activity Recognition Using Wearable Sensing Devices, Dissertation, Karlsruhe Institute of Technology, 2014
  11. ^ H. Kautz. "A formal theory of plan recognition". In PhD thesis, University of Rochester, 1987.
  12. ^ N. Lesh and O. Etzioni. "A sound and fast goal recognizer". In Proceedings of the International Joint Conference on Artificial Intelligence, 1995.
  13. ^ Do, Thang; Seng W. Loke; Fei Liu (2011). "Answer Set Programming for Stream Reasoning". Advances in Artificial Intelligence, Lecture Notes in Computer Science 6657: 104–109. 
  14. ^ Do, Thang; Seng W. Loke; Fei Liu (2012). "HealthyLife: an Activity Recognition System with Smartphone using Logic-Based Stream Reasoning" (PDF). Proceedings of the 9th International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, (Mobiquitous 2012). 
  15. ^ E. Charniak and R.P. Goldman. "A Bayesian model of plan recognition". Artificial Intelligence, 64:53–79, 1993.
  16. ^ M.R. Hodges and M.E. Pollack. "An 'object-use fingerprint': The use of electronic sensors for human identification". In Proceedings of the 9th International Conference on Ubiquitous Computing, 2007.
  17. ^ Mike Perkowitz, Matthai Philipose, Donald J. Patterson, and Kenneth P. Fishkin. "Mining models of human activities from the web". In Proceedings of the Thirteenth International World Wide Web Conference (WWW 2004), pages 573–582, May 2004.
  18. ^ Matthai Philipose, Kenneth P. Fishkin, Mike Perkowitz, Donald J. Patterson, Dieter Fox, Henry Kautz, and Dirk Hähnel. "Inferring activities from interactions with objects". In IEEE Pervasive Computing, pages 50–57, October 2004.
  19. ^ Dieter Fox Lin Liao, Donald J. Patterson and Henry A. Kautz. "Learning and inferring transportation routines". Artif. Intell., 171(5-6):311–331, 2007.
  20. ^ Piyathilaka, L.; Kodagoda, S., "Gaussian mixture based HMM for human daily activity recognition using 3D skeleton features," Industrial Electronics and Applications (ICIEA), 2013 8th IEEE Conference on, pp. 567-572, 19–21 June 2013
  21. ^ TLM van Kasteren, Gwenn Englebienne, BJA Kröse. "Human activity recognition from wireless sensor network data: Benchmark and software." Activity Recognition in Pervasive Intelligent Environments, 165-186, Atlantis Press
  22. ^ a b c TLM van Kasteren, Gwenn Englebienne, Ben Kröse. "Hierarchical Activity Recognition Using Automatically Clustered Actions", 2011, Ambient Intelligence, 82-91, Springer Berlin/Heidelberg
  23. ^ Daniel Wilson and Chris Atkeson. Simultaneous tracking and activity recognition (star) using many anonymous binary sensors. In Proceedings of the 3rd international conference on Pervasive Computing, Pervasive, pages 62–79, Munich, Germany, 2005.
  24. ^ Nuria Oliver, Barbara Rosario and Alex Pentland "A Bayesian Computer Vision System for Modeling Human Interactions" Appears in PAMI Special Issue on Visual Surveillance and Monitoring, Aug 00
  25. ^ TLM van Kasteren, Athanasios Noulas, Gwenn Englebienne, Ben Kröse. "Accurate activity recognition in a home setting", 2008/9/21, Proceedings of the 10th international conference on Ubiquitous computing, 1-9, ACM
  26. ^ Derek Hao Hu, Sinno Jialin Pan, Vincent Wenchen Zheng, Nathan Nan Liu, and Qiang Yang. Real world activity recognition with multiple goals. In Proceedings of the 10th international conference on Ubiquitous computing, Ubicomp, pages 30–39, New York, NY, USA, 2008. ACM.
  27. ^ a b Nuria Oliver, Ashutosh Garg, and Eric Horvitz. Layered representations for learning and inferring office activity from multiple sensory channels. Comput. Vis. Image Underst., 96(2):163–180, 2004.
  28. ^ Amarnag Subramanya, Alvin Raj, Jeff Bilmes, and Dieter Fox. Hierarchical models for activity recognition. In Proceedings of the international conference on Multimedia Signal Processing, MMSP, Victoria, CA, October 2006.
  29. ^ Tao Gu, Zhanqing Wu, Xianping Tao, Hung Keng Pung, and Jian Lu. epSICAR: An Emerging Patterns based Approach to Sequential, Interleaved and Concurrent Activity Recognition. In Proc. of the 7th Annual IEEE International Conference on Pervasive Computing and Communications (Percom '09), Galveston, Texas, March 9–13, 2009.
  30. ^ Gilbert A, Illingworth J, Bowden R. Action Recognition using Mined Hierarchical Compound Features. IEEE Transactions on Pattern Analysis and Machine Intelligence
  31. ^ Piyathilaka, L.; Kodagoda, S., "Gaussian mixture based HMM for human daily activity recognition using 3D skeleton features," Industrial Electronics and Applications (ICIEA), 2013 8th IEEE Conference on, pp. 567-572, 19–21 June 2013
  32. ^ Jie Yin, Xiaoyong Chai and Qiang Yang, "High-level Goal Recognition in a Wireless LAN". In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04), San Jose, CA USA, July 2004. Pages 578-584
  33. ^ Xiaoyong Chai and Qiang Yang, "Multiple-Goal Recognition From Low-level Signals". Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI 2005), Pittsburgh, PA USA, July 2005. Pages 3-8.
  34. ^ Derek Hao Hu, Qiang Yang. "CIGAR: Concurrent and Interleaving Goal and Activity Recognition", to appear in AAAI 2008
  35. ^ Jie Yin, Dou Shen, Qiang Yang and Ze-nian Li "Activity Recognition through Goal-Based Segmentation". Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI 2005), Pittsburgh, PA USA, July 2005. Pages 28-33.
  36. ^ Jie Yin, Qiang Yang and Lionel Ni. "Adaptive Temporal Radio Maps for Indoor Location Estimation". In Proceedings of the 3rd Annual IEEE International Conference on Pervasive Computing and Communications (IEEE PerCom 2005), Kauai Island, Hawaii, March, 2005. Pages 85-94.
  37. ^ Xiaoyong Chai and Qiang Yang. "Reducing the Calibration Effort for Location Estimation Using Unlabeled Samples". In Proceedings of the 3rd Annual IEEE International Conference on Pervasive Computing and Communications, (IEEE PerCom 2005) Kauai Island, Hawaii, March 2005. Pages 95--104.
  38. ^ Jeffrey Junfeng Pan, Qiang Yang and Sinno Jialin Pan. "Online Co-Localization in Indoor Wireless Networks". In Proceedings of the 22nd AAAI Conference on Artificial Intelligence (AAAI'07) Vancouver, British Columbia, Canada. July 2007. 1102-1107
  39. ^ Sinno Jialin Pan, James T. Kwok, Qiang Yang, Jeffrey Junfeng Pan. "Adaptive localization in a dynamic WiFi environment through multi-view learning". In Proceedings of the 22nd AAAI Conference on Artificial Intelligence (AAAI'07) Vancouver, British Columbia, Canada. July 2007. 1108-1113
  40. ^ Pollack, M.E., and et al., L. E. B. 2003. "Autominder: an intelligent cognitive orthotic system for people with memory impairment". Robotics and Autonomous Systems 44(3-4):273–282.