The purpose of this project is to build the infrastructure for multi-modal applications that use both VoiceXML and InkXML, and to implement an appropriate example application. Portals for developing VoiceXML applications are already available and, in fact, our own facility is under development at Pace University. Therefore, we need to develop only an InkXML facility and the integration environment.
We anticipate that integrating voice and ink data will lead to friendlier and more efficient applications. For some types of input, such as mailing addresses, current speech recognition engines do not provide sufficient speaker-independent, continuous-speech recognition accuracy, whereas handwriting systems can be more accurate on such input. Hence, we propose to integrate voice and ink data to enhance the efficiency and accuracy of applications that can benefit from this multi-modal approach.
The proposed architecture and further details of the system are described in the paper referenced below. When the user calls the gateway VoiceXML/InkXML server, the application prototype prompts the user to, for example, fill in the fields of a form. The user can respond with speech data, ink data, or touch-tone digits. For speech and ink input, the recognition process is performed on the VoiceXML/InkXML gateway; in the future this might even be done on the client side by a smart phone. Once recognition is complete, the field values of the form are collected and submitted to the Web server, and the dialog can continue or terminate.
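As a minimal sketch of the voice side of this dialog, the VoiceXML fragment below defines a form with two fields that are prompted, filled, and then submitted to a Web server. The server URL and field names are hypothetical illustrations, not part of the referenced system; the InkXML counterpart is described in the paper below and is omitted here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="address">
    <!-- Each field prompts the caller and waits for recognized input. -->
    <field name="city">
      <prompt>Please say the city of your mailing address.</prompt>
    </field>
    <field name="street">
      <prompt>Please say the street address.</prompt>
    </field>
    <!-- Once all fields are filled, submit the values to the Web server.
         The URL is a placeholder, not an actual project endpoint. -->
    <filled>
      <submit next="http://example.org/process" namelist="city street"/>
    </filled>
  </form>
</vxml>
```

In the proposed architecture, a field such as the street address could instead be collected as ink data when handwriting recognition is expected to be more accurate than speech recognition for that input.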
Z. Trabelsi, S.-H. Cha, D. Desai, and C. Tappert, "A Voice and Ink XML Multimodal Architecture for Mobile e-Commerce Systems," to be presented at the 2nd ACM Mobile Commerce Workshop, Atlanta, September 2002.