An Intention Understanding Algorithm Based on Multimodal Information Fusion