Topic: Augmenting [Visual] Language Model with Knowledge and Tools
Time: September 6, 20:00-21:00 (Beijing Time)
Dr. Hu's ultimate research vision is to develop an AI agent that combines the advantages of neural learning and symbolic reasoning. Such a framework would enable neural models to interact with external symbolic modules, such as knowledge graphs, logical engines, math calculators, and physical/chemical simulators, and would facilitate end-to-end training of a Neural-Symbolic AI system without the need for annotated intermediate programs.
In this talk, Dr. Hu will introduce his research on augmenting visual language models with external symbolic knowledge and tools. First, he will introduce AVIS, an autonomous LLM agent that makes dynamic decisions about which tools to use and performs tree search to solve complex information-seeking visual questions.
Next, he will introduce two of his works on end-to-end knowledge integration:
OREO-LM, which incorporates knowledge graph relational reasoning into a Large Language Model, significantly improving multi-hop question answering using a single model.
REVEAL, which performs end-to-end retrieval-augmented pre-training, teaching a VLM to retrieve useful evidence from multiple knowledge sources.
Finally, he will introduce SciBench, a comprehensive college-level benchmark for evaluating LLMs' problem-solving capabilities, as well as efforts to improve these capabilities.
Postdoctoral Researcher at California Institute of Technology
Visiting Researcher at Google Research
Young Researcher at Shanghai AI Laboratory