Modality (human–computer interaction)
In the context of human–computer interaction, a modality is the classification of a single independent channel of sensory input/output between a computer and a human. A system is designated unimodal if it has only one modality implemented, and multimodal if it has more than one. When multiple modalities are available for some tasks or aspects of a task, the system is said to have overlapping modalities. If multiple modalities are available for a task, the system is said to have redundant modalities. Multiple modalities can be used in combination to provide complementary methods that may be redundant but convey information more effectively. Modalities generally take two forms: human–computer modalities (input to the computer) and computer–human modalities (output to the human).
Computers utilize a wide range of technologies to communicate and send information to humans:
- Common modalities
- Uncommon modalities
Any human sense can be used as a computer-to-human modality. However, seeing and hearing are the most commonly employed, since they can transmit information faster than other modalities: 250 to 300 and 150 to 160 words per minute (wpm), respectively. Though not commonly implemented as a computer–human modality, tactition can achieve an average of 125 wpm through the use of a refreshable Braille display. Other, more common forms of tactition are smartphone and game controller vibrations.
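To make the rates above concrete, the following minimal sketch computes how long each modality would need to convey a message of a given length. The rates are the figures quoted in the paragraph; the names `RATES_WPM` and `transfer_minutes` are illustrative assumptions, and real throughput depends on the reader, the content, and the device.

```python
from typing import Tuple

# Rates in words per minute, as quoted in the text above.
# Each entry is a (low, high) range; tactition is a single average value.
RATES_WPM = {
    "vision": (250, 300),
    "audition": (150, 160),
    "tactition": (125, 125),  # refreshable Braille display
}

def transfer_minutes(words: int, modality: str) -> Tuple[float, float]:
    """Return the (fastest, slowest) time in minutes to convey `words` words."""
    low, high = RATES_WPM[modality]
    # The highest rate gives the shortest time, the lowest rate the longest.
    return words / high, words / low
```

Under these figures, a 1,000-word message takes roughly 3.3 to 4 minutes to read, about 6.3 to 6.7 minutes to hear, and 8 minutes to read in Braille.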
Computers can be equipped with various types of input devices and sensors to allow them to receive information from humans. Common input devices are often interchangeable if they have a standardized method of communication with the computer and afford practical adjustments to the user. Certain modalities can provide a richer interaction depending on the context, and having options for implementation allows for more robust systems.
- Simple modalities
- Complex modalities
With the increasing popularity of smartphones, the general public is becoming more comfortable with the more complex modalities. Speech recognition was a major selling point of the iPhone 4S and subsequent Apple products, with the introduction of Siri. This technology gives users an alternative way to communicate with computers when typing is less desirable. However, in a loud environment, the audition modality is less effective. This exemplifies how the strengths of a modality vary with the situation. Other complex modalities, such as computer vision in the form of Microsoft's Kinect or similar technologies, can make sophisticated tasks easier to communicate to a computer, especially those involving three-dimensional movement.
Using multiple modalities
Having multiple modalities in a system gives users more affordances and can contribute to a more robust system. Having more modalities also allows for greater accessibility for users who work more effectively with certain modalities. Multiple modalities can be used as a backup when certain forms of communication are not possible. This is especially true of redundant modalities, in which two or more modalities are used to communicate the same information. Certain combinations of modalities can add to the expressiveness of a computer–human or human–computer interaction, because each modality may be more effective than the others at expressing one form or aspect of information.
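The backup role of redundant modalities can be sketched as a simple preference-ordered fallback. This is a minimal illustration, not a description of any particular system: the `Modality` class, the `choose_modality` function, the context keys, and the 70 dB noise threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

@dataclass
class Modality:
    name: str                       # hypothetical label, e.g. "speech"
    usable: Callable[[dict], bool]  # decides from context whether the channel works

def choose_modality(preferred: Sequence[Modality], context: dict) -> Optional[Modality]:
    """Return the first modality, in preference order, that is usable; else None."""
    for modality in preferred:
        if modality.usable(context):
            return modality
    return None

# Hypothetical input modalities. The 70 dB cutoff is an assumed value
# standing in for "too loud for speech recognition".
speech = Modality("speech", lambda ctx: ctx.get("noise_db", 0) < 70)
keyboard = Modality("keyboard", lambda ctx: not ctx.get("hands_free", False))
```

With this setup, `choose_modality([speech, keyboard], {"noise_db": 85})` falls back to the keyboard, because speech is ruled out by the noise level.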
There are six types of cooperation between modalities, and they help define how a combination or fusion of modalities works together to convey information more effectively.
- Equivalence: information is presented in multiple ways and can be interpreted as the same information
- Specialization: when a specific kind of information is always processed through the same modality
- Redundancy: multiple modalities process the same information
- Complementarity: multiple modalities take separate information and merge it
- Transfer: a modality produces information that another modality consumes
- Concurrency: multiple modalities take in separate information that is not merged
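The six types listed above can be written down as an enumeration. The sketch below also includes a deliberately simplified helper that distinguishes only three of them (redundancy, complementarity, and concurrency) using two yes/no questions. That two-flag decision rule is an assumption made for illustration, and the names `Cooperation` and `classify_parallel_inputs` are hypothetical.

```python
from enum import Enum

class Cooperation(Enum):
    EQUIVALENCE = "same information presented in multiple ways"
    SPECIALIZATION = "one kind of information always uses the same modality"
    REDUNDANCY = "multiple modalities process the same information"
    COMPLEMENTARITY = "separate information from multiple modalities is merged"
    TRANSFER = "one modality produces information another consumes"
    CONCURRENCY = "separate information from multiple modalities is not merged"

def classify_parallel_inputs(same_information: bool, merged: bool) -> Cooperation:
    """Simplified rule covering only three of the six types (an assumption)."""
    if same_information:
        return Cooperation.REDUNDANCY
    # Separate information: the types differ in whether the inputs are merged.
    return Cooperation.COMPLEMENTARITY if merged else Cooperation.CONCURRENCY
```

For instance, a spoken command combined with a pointing gesture that supplies the target would count as complementarity under this rule, since the two inputs carry different information and are merged into one instruction.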
Complementary-redundant systems are those that use multiple sensors to form one understanding or dataset; the more effectively the information can be combined without duplicating data, the more effectively the modalities cooperate. Having multiple modalities for communication is common, particularly in smartphones, and their implementations often work together towards the same goal, for example, gyroscopes and accelerometers working together to track movement.
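One common way to combine gyroscope and accelerometer readings is a complementary filter. The source does not say which fusion method any particular device uses, so the following is only a sketch under assumptions: a single tilt axis, readings already converted to rad/s and consistent acceleration units, and an assumed blend factor `alpha` of 0.98. The function name `fuse_tilt` is hypothetical.

```python
import math

def fuse_tilt(prev_angle, gyro_rate, accel_x, accel_z, dt, alpha=0.98):
    """One complementary-filter step; returns the new tilt estimate in radians."""
    # Gyroscope: integrate angular rate (rad/s). Smooth, but drifts over time.
    gyro_angle = prev_angle + gyro_rate * dt
    # Accelerometer: tilt implied by the direction of gravity. Noisy, but drift-free.
    accel_angle = math.atan2(accel_x, accel_z)
    # Blend: trust the gyroscope short-term and the accelerometer long-term.
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle
```

Called once per sensor sample, the estimate follows the gyroscope over short intervals while being pulled slowly towards the accelerometer's reading, so each sensor compensates for the other's weakness.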