This activity was launched in 2002. The Multimodal Interaction Framework Working Group has already produced:
- the Multimodal Interaction Framework, which provides a general framework for multimodal interaction and identifies the kinds of markup languages being considered.
- A set of use cases.
- A set of core requirements, describing the fundamental requirements that future specifications must address.
The following XML specifications (currently in advanced Working Draft state) already address various parts of the core requirements:
- EMMA (Extensible Multi-Modal Annotations): a data exchange format for the interface between input processors and interaction management systems. It will define the means for recognizers to annotate application-specific data with information such as confidence scores, time stamps, input mode (e.g. key strokes, speech or pen), alternative recognition hypotheses, and partial recognition results.
- InkML – an XML language for digital ink traces: an XML data exchange format for ink entered with an electronic pen or stylus as part of a multimodal system.
- Multimodal architecture: A loosely coupled architecture for the multimodal interaction framework that focuses on providing a general means for components to communicate with each other, plus basic infrastructure for application control and platform services.
- Emotion Markup Language: EmotionML will provide representations of emotions and related states for technological applications.
- Multimodal Interaction
- VoiceXML – the W3C's standard XML format for specifying interactive voice dialogues between a human and a computer.
- SSML – Speech Synthesis Markup Language
- CCXML – Call Control eXtensible Markup Language
- SCXML – an XML language that provides a generic state-machine-based execution environment
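
To make the EMMA description above more concrete, the following is a minimal Python sketch of the kind of annotations a recognizer might attach to application-specific data: a confidence score, time stamps, an input mode, alternative hypotheses, and a partial-result flag. It is not EMMA syntax and does not use any EMMA library; the class `Interpretation`, its field names, and the helper `best_final` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Interpretation:
    """One recognition hypothesis with EMMA-style annotations.

    All field names here are illustrative assumptions, not EMMA vocabulary.
    """
    value: Any              # application-specific data produced by the recognizer
    confidence: float       # recognizer confidence, assumed to lie in 0.0 .. 1.0
    mode: str               # input mode, e.g. "speech", "pen" or "keys"
    start_ms: int           # time stamp: start of the input, in milliseconds
    end_ms: int             # time stamp: end of the input, in milliseconds
    partial: bool = False   # True for a partial (not yet final) result


def best_final(hypotheses: List[Interpretation]) -> Optional[Interpretation]:
    """Return the highest-confidence non-partial hypothesis, or None if there is none."""
    finals = [h for h in hypotheses if not h.partial]
    return max(finals, key=lambda h: h.confidence, default=None)


# Two alternative final hypotheses and one partial result for the same utterance.
hypotheses = [
    Interpretation({"destination": "Boston"}, 0.75, "speech", 1000, 2400),
    Interpretation({"destination": "Austin"}, 0.68, "speech", 1000, 2400),
    Interpretation({"destination": "Bos"}, 0.90, "speech", 1000, 1600, partial=True),
]

chosen = best_final(hypotheses)
```

In this sketch an interaction manager would act on `chosen` and keep the remaining hypotheses available, for example to offer a correction if the user rejects the first choice. Skipping partial results when choosing is a design assumption of the sketch, not something the specification summary above requires.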