
Top Scholars from Nanyang Technological University Discuss Frontier AIGC Technology

AIGC is reshaping traditional content production and opening the door to disruptive gains in production efficiency. "星河TALK" has invited three scholars from Nanyang Technological University, Singapore, namely Chen Change Loy (吕健勤), Ziwei Liu (刘子纬), and Xingang Pan (潘新钢), to discuss the progress of AIGC technology from the perspective of a top research lab.

Time: 14:00-16:30, Thursday, November 23, 2023
Venue: Roadshow Hall, 1st Floor, 模速空间, 180 Longtai Road, Xuhui District, Shanghai

Capacity: 50

Event Overview

On November 23, Shanghai AI Talk (AI TALK), presented jointly as a "星河TALK" session, invites three scholars from Nanyang Technological University, Singapore, Chen Change Loy, Ziwei Liu, and Xingang Pan, to share frontier AIGC research. AIGC is reshaping traditional content production and opening the door to disruptive gains in production efficiency. The three scholars, all from MMLab@NTU, will discuss the progress of AIGC from the perspective of a top research lab, and will give a detailed introduction to the technologies underpinning AIGC: generative adversarial networks (GANs), diffusion models, and large pretrained Transformer models.


This session of AI TALK is guided by the Shanghai Municipal Commission of Economy and Informatization and the Xuhui District People's Government, and hosted by the Shanghai Artificial Intelligence Laboratory (Shanghai AI Laboratory) and the Global Artificial Intelligence Academic Alliance (GAIAA).

 

Event Details

Speakers:

Chen Change Loy, Professor, Nanyang Technological University

Ziwei Liu, Assistant Professor, Nanyang Technological University

Xingang Pan, Assistant Professor, Nanyang Technological University

How to attend: In person. Click "Register" and fill in your registration details; applicants who pass review will be notified by SMS.

 

Speakers and Topics



Chen Change Loy (吕健勤)

Professor, School of Computer Science and Engineering, NTU


He serves as Co-Associate Director of S-Lab, NTU, and Director of MMLab@NTU. He received his PhD (2010) in Computer Science from Queen Mary University of London. Before joining NTU, he was a Research Assistant Professor at the MMLab of The Chinese University of Hong Kong from 2013 to 2018. He is the recipient of the 2019 Nanyang Associate Professorship (Early Career Award) from NTU and is recognized by AMiner as one of the 100 most influential scholars in computer vision. His research interests include computer vision and deep learning, with a focus on image/video restoration, enhancement, and content creation. He serves as an Associate Editor of IJCV, TPAMI, and CVIU, and serves or has served as an Area Chair of top conferences such as CVPR, ICCV, ECCV, NeurIPS, and ICLR. He is a Senior Member of the IEEE.


Harnessing Diffusion Prior for Content Enhancement and Creation

This talk shares the exploration and application of pretrained diffusion models for content enhancement and creation. By leveraging the abundant image priors and robust generative capability of diffusion models, we address diverse applications including face restoration, image super-resolution, and video-to-video translation. The research introduces novel approaches to content enrichment by harnessing the inherent structure of visual data through the diffusion process, without explicit model retraining. This discussion aims to provide insights into the viability of diffusion models as a powerful tool for image and video enhancement tasks, and stimulate further research in exploiting the generative potential of diffusion models.
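To make the recipe concrete, here is a minimal sketch of how a frozen, pretrained diffusion model can serve as a prior for image super-resolution, using the open-source Hugging Face diffusers library. This is an illustration of the general idea only, not the speaker's own code; the checkpoint is a public Stable Diffusion upscaler, and the file names are hypothetical.

```python
# Minimal sketch (not the speaker's code): a frozen, pretrained diffusion
# model used as a prior for 4x image super-resolution via Hugging Face
# `diffusers`. Input/output file names are hypothetical.
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.float16,
).to("cuda")                       # assumes a CUDA GPU is available

low_res = Image.open("low_res.png").convert("RGB")

# The frozen diffusion prior hallucinates plausible high-frequency detail
# conditioned on the low-resolution input; no task-specific retraining.
restored = pipe(prompt="a sharp, detailed photograph", image=low_res).images[0]
restored.save("restored.png")
```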



Ziwei Liu (刘子纬)

Nanyang Assistant Professor at NTU

 

His research revolves around computer vision, machine learning, and computer graphics. He has published extensively in top-tier conferences and journals in these fields, including CVPR, ICCV, ECCV, NeurIPS, ICLR, SIGGRAPH, TPAMI, TOG, and Nature Machine Intelligence, with over 30,000 citations. He is the recipient of the Microsoft Young Fellowship, Hong Kong PhD Fellowship, ICCV Young Researcher Award, HKSTP Best Paper Award, CVPR Best Paper Award Candidate, WAIC Yunfan Award, ICBS Frontiers of Science Award, and MIT Technology Review Innovators Under 35 Asia Pacific. He has won championships in major computer vision competitions, including the DAVIS Video Segmentation Challenge 2017, MSCOCO Instance Segmentation Challenge 2018, FAIR Self-Supervision Challenge 2019, Video Virtual Try-on Challenge 2020, and Computer Vision in the Wild Challenge 2022. He is also the lead contributor to several renowned computer vision benchmarks and software packages, including CelebA, DeepFashion, MMHuman3D, and MMFashion. He serves as an Area Chair of CVPR, ICCV, NeurIPS, and ICLR, and as an Associate Editor of IJCV.


Multi-Modal Generative AI with Foundation Models

Generating photorealistic and controllable visual content has long been a sought-after objective in artificial intelligence (AI), with numerous real-world applications, and it stands as a cornerstone of embodied intelligence. This presentation focuses on his work in AI-driven visual content generation encompassing humans, objects, and scenes, with an emphasis on leveraging the synergy between neural rendering and large multimodal foundation models. The resulting generative AI framework has demonstrated its effectiveness and adaptability across a diverse array of tasks.
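The abstract is high-level, but one concrete pattern behind the "synergy" between rendering and multimodal foundation models is letting a vision-language model score candidate renders against a text description. Below is a hedged sketch using OpenAI's public CLIP checkpoint via the transformers library; the candidate file names and prompt are hypothetical, and this is not the speaker's actual framework.

```python
# Hedged illustration (not the speaker's framework): use a multimodal
# foundation model (CLIP) to rank candidate rendered images against a text
# description. Candidate file names and the prompt are hypothetical.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

candidates = [Image.open(f"render_{i}.png") for i in range(4)]
prompt = "a person waving in a sunlit park"

inputs = processor(text=[prompt], images=candidates,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)

# logits_per_image has shape (num_images, num_texts): higher = better match.
best = out.logits_per_image[:, 0].argmax().item()
print(f"best-matching candidate: render_{best}.png")
```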



Xingang Pan (潘新钢)

Assistant Professor at NTU

 

Prior to joining NTU, he was a postdoctoral researcher at the Max Planck Institute for Informatics, working with Prof. Christian Theobalt. He received his Ph.D. degree in Information Engineering from The Chinese University of Hong Kong, supervised by Prof. Xiaoou Tang. Before that, he received his B.E. degree from Tsinghua University. His research interests lie at the interface of computer vision, machine learning, and computer graphics, with a focus on generative AI. His main works include DragGAN, GAN2Shape, and Deep Generative Prior. He serves as an Area Chair for CVPR 2024.


DragGAN: A New Paradigm for Controllable Image Synthesis

The synthesis of visual content tailored to users' needs often demands precise and flexible control over the pose, shape, expression, and arrangement of generated objects. Existing methods for controlling generative adversarial networks (GANs) typically rely on manually annotated training data or prior 3D models, lacking the desired flexibility, precision, and generality. This presentation introduces a relatively unexplored yet powerful approach to controlling GANs: using a "dragging" technique to precisely relocate any points within an image to specific target points in an interactive, user-driven manner.


The focus will be on DragGAN, which comprises two key components: 1) a feature-based motion supervision mechanism that guides handle points toward their target positions; and 2) a novel point tracking approach that harnesses discriminative GAN features to continually pinpoint the positions of the handle points. Through DragGAN, individuals can deform an image with meticulous control over pixel placement, enabling manipulation of the pose, shape, expression, and arrangement across various categories such as animals, cars, humans, and landscapes. Because these manipulations operate on the learned generative image manifold of a GAN, they often produce realistic outputs even in challenging scenarios, such as hallucinating occluded content or deforming shapes while consistently preserving object rigidity.
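To make the two components concrete, here is a heavily simplified, self-contained PyTorch sketch of the drag loop. A toy convolutional network stands in for the GAN's intermediate feature maps (DragGAN itself optimizes a StyleGAN2 latent code), and all shapes, learning rates, and window sizes are illustrative assumptions rather than values from the paper.

```python
# Simplified sketch of DragGAN's two components (illustrative only):
# 1) motion supervision: nudge the handle point's feature toward the target;
# 2) point tracking: re-locate the handle by nearest-neighbour feature search.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
generator = torch.nn.Sequential(            # toy stand-in for GAN features
    torch.nn.Conv2d(8, 16, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(16, 16, 3, padding=1),
)
generator.requires_grad_(False)             # only the latent is optimized

w = torch.randn(1, 8, 64, 64, requires_grad=True)   # "latent" being optimized
opt = torch.optim.Adam([w], lr=0.05)

handle = [32, 32]     # current handle point (row, col)
target = [32, 44]     # where the user dragged it

with torch.no_grad():                       # feature that identifies the handle
    ref = generator(w)[0, :, handle[0], handle[1]].clone()

for step in range(60):
    r, c = handle
    # --- 1) motion supervision: pull the feature one pixel toward target ---
    feat = generator(w)                                      # (1, C, H, W)
    dr = 0 if target[0] == r else (1 if target[0] > r else -1)
    dc = 0 if target[1] == c else (1 if target[1] > c else -1)
    loss = F.l1_loss(feat[0, :, r + dr, c + dc], feat[0, :, r, c].detach())
    opt.zero_grad(); loss.backward(); opt.step()

    # --- 2) point tracking: nearest neighbour of `ref` in a small window ---
    with torch.no_grad():
        feat = generator(w)
        rs, cs = slice(max(r - 3, 0), r + 4), slice(max(c - 3, 0), c + 4)
        d = (feat[0, :, rs, cs] - ref[:, None, None]).abs().sum(dim=0)
        idx = torch.nonzero(d == d.min())[0]
        handle = [rs.start + idx[0].item(), cs.start + idx[1].item()]
    if handle == target:
        break

print("final handle position:", handle)
```

In the real method, the motion-supervision loss is computed over a patch around each handle point rather than a single pixel, and tracking searches the updated image's feature map for the nearest neighbour of the handle's original feature, which is what this sketch mimics.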


Shanghai AI Talk (上海人工智能大讲堂)

Shanghai AI Talk (上海人工智能大讲堂) is a public lecture series organized under the guidance of the Shanghai Municipal Commission of Economy and Informatization and the Xuhui District People's Government, dedicated to bringing the most cutting-edge insights and thinking in artificial intelligence to the general public. Since its launch in 2019, it has hosted influential AI experts and scholars from China and abroad, as well as industry leaders with exemplary applications that combine AI with their fields.

 

Shanghai AI Laboratory Academic Forum

The Shanghai AI Laboratory Academic Forum comprises two themed series, "星河Talk" and "星启Talk", which respectively invite top professors and outstanding young scholars from around the world to share their research and discuss frontier technologies, both online and offline. Stay tuned for more.
