Neural Enquirer: Learning to Query Tables

Neural Enquirer is a fully neural, end-to-end differentiable network that learns to execute compositional queries on knowledge-base tables. It not only produces distributional representations of the query and the knowledge base, but also executes compositional queries as a series of differentiable operations, with intermediate results stored in multiple layers of memory.

[arXiv e-prints]

TAQA: n-Tuple Assertion-based Question Answering

TAQA is a novel open KB-QA system that leverages the rich semantics of an n-tuple open knowledge base to answer questions with complex semantic constraints.

An n-tuple open KB represents factual knowledge as n-tuple natural-language assertions (e.g., <Barack Obama; graduated; from Harvard Law School, in 1992>).
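As a rough illustration of the assertion format above, the sketch below parses the example tuple into a small record and looks up one of its arguments. The `Assertion` class, `parse_assertion`, and `find_args` are hypothetical names introduced here, and splitting arguments on commas is a simplifying assumption; none of this is TAQA's actual storage or query code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Assertion:
    """One n-tuple assertion: a subject, a relation, and any number of arguments."""
    subject: str
    relation: str
    args: Tuple[str, ...]


def parse_assertion(text: str) -> Assertion:
    # Drop the angle brackets, split fields on ';', then split arguments on ','.
    # Assumption: arguments themselves contain no commas or semicolons.
    fields = [f.strip() for f in text.strip().strip("<>").split(";")]
    subject, relation = fields[0], fields[1]
    args = tuple(a.strip() for f in fields[2:] for a in f.split(",") if a.strip())
    return Assertion(subject, relation, args)


def find_args(kb: List[Assertion], subject: str, relation: str, prefix: str) -> List[str]:
    # Return arguments introduced by a given preposition, e.g. "in" for a time constraint.
    return [a for t in kb
            if t.subject == subject and t.relation == relation
            for a in t.args if a.startswith(prefix + " ")]


kb = [parse_assertion("<Barack Obama; graduated; from Harvard Law School, in 1992>")]
print(find_args(kb, "Barack Obama", "graduated", "in"))
```

Keeping the extra arguments ("from ...", "in ...") alongside the subject and relation is what lets a question with an additional constraint, such as a time or place, be matched against a single assertion.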

TAQA answers questions using a cascaded pipeline of paraphrasing, question parsing, KB querying and answer ranking, and significantly outperforms state-of-the-art KB-QA systems.

[Project Page] | [CIKM Paper] | [Slides]

Query Rewriting via Paraphrasing

The mismatch between user-issued queries and Web documents is a fundamental problem in Information Retrieval. To improve search relevance, query rewriting methods rewrite the original query and use the rewritten query for search. We propose a novel query rewriting method based on paraphrase templates. The templates are learned from semantically similar queries mined from massive click-through logs. CYK parsing is then employed to rewrite each hierarchical span of the original query using the learned templates. The model is optimized towards NDCG via Minimum Error Rate Training. Compared with previous approaches, this generative query rewriting model can alter user-issued queries at both the lexical and the syntactic level.
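For readers unfamiliar with the tuning objective, here is a minimal sketch of NDCG computed over a ranked list of graded relevance labels. It uses the common exponential-gain, log2-discount formulation, which is an assumption: the text does not say which NDCG variant was used, and the labels in the example call are hypothetical.

```python
import math
from typing import List


def dcg(gains: List[float], k: int) -> float:
    # Discounted cumulative gain over the top-k results (rank 1 has discount log2(2) = 1).
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg(gains: List[float], k: int) -> float:
    # Normalize by the DCG of the same labels sorted into the ideal order.
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance labels for the results returned by one rewritten query.
print(ndcg([1, 3, 0], k=3))
```

A rewrite that moves highly relevant documents towards the top of the result list raises this score, which is why it can serve as the target when tuning the rewriting model's weights.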

This project has shipped in the Microsoft Bing Search/Ads production systems to improve search relevance.

Chinese New Word Detection and Tagging on Twitter

Proposed the task of detecting new Chinese words on Twitter and tagging each new word with existing words that are semantically similar to it. Employed SVD and PLSA as semantic metrics, achieving 95% accuracy in new word detection and 82% in tagging.
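The tagging step can be pictured as a nearest-neighbour search over word vectors. The sketch below is a simplification: it compares raw context-count vectors with cosine similarity and skips the SVD/PLSA step the project actually used to build its semantic space. The function names, the English placeholder words, and all counts are hypothetical.

```python
import math
from typing import Dict, List


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    # Cosine similarity between two sparse vectors stored as {context: count}.
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def tag_new_word(new_vec: Dict[str, float],
                 known: Dict[str, Dict[str, float]],
                 top_k: int = 2) -> List[str]:
    # Rank known words by similarity to the new word and keep the top_k as tags.
    ranked = sorted(known, key=lambda w: cosine(new_vec, known[w]), reverse=True)
    return ranked[:top_k]


# Hypothetical context counts for known words and for one newly detected word.
known_words = {
    "phone": {"buy": 2, "screen": 3},
    "tablet": {"screen": 2, "app": 1},
    "noodle": {"eat": 3, "spicy": 1},
}
new_word = {"screen": 2, "app": 2, "buy": 1}
print(tag_new_word(new_word, known_words))
```

Replacing the raw counts with low-dimensional vectors from SVD or topic distributions from PLSA would keep the same ranking logic while smoothing over sparse contexts.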

[DaWaK Paper]