科学研究

Research

首页 >  科学研究
研究方向

人工智能基础理论

开展人工智能前沿基础理论研究,包括机器学习、强化学习、深度学习、知识计算、因果推理、信息安全等;关注人工智能交叉学科研究,探索数据驱动的科学研究新范式。

人工智能开放平台

构建人工智能新型大数据、算法和算力等平台,全面支撑人工智能基础和应用研究。

人工智能基础软件和基础硬件系统

开展人工智能基础软硬件系统的研发,构建技术生态的软硬件基础,包括新一代人工智能训练框架、编程语言、编译器等基础软件,人工智能芯片、传感器等基础硬件。

人工智能应用

探索人工智能技术在城市、交通、医疗、教育、文旅、金融、制造业等行业的应用,关注新领域,开展共性技术平台的研发。

人工智能核心技术

发展新一代人工智能技术,包括计算机视觉、自然语言处理、语音处理、决策智能、智能机器人、城市计算、计算机图形学、数字孪生等。

人工智能伦理与政策

关注人工智能可能引发的经济、社会、伦理、法律、安全、隐私和数据治理等问题,提出解决方案,提供政策参考。

学术成果

发表会议及期刊:Nature Communications

2024

AI-driven projection tomography with multicore fibre-optic cell rotation

Optical tomography has emerged as a non-invasive imaging method, providing three-dimensional insights into subcellular structures and thereby enabling a deeper understanding of cellular functions, interactions, and processes. Conventional optical tomography methods are constrained by a limited illumination scanning range, leading to anisotropic resolution and incomplete imaging of cellular structures. To overcome this problem, we employ a compact multi-core fibre-optic cell rotator system that facilitates precise optical manipulation of cells within a microfluidic chip, achieving full-angle projection tomography with isotropic resolution. Moreover, we demonstrate an AI-driven tomographic reconstruction workflow, which can be a paradigm shift from conventional computational methods, often demanding manual processing, to a fully autonomous process. The performance of the proposed cell rotation tomography approach is validated through the three-dimensional reconstruction of cell phantoms and HL60 human cancer cells. The versatility of this learning-based tomographic reconstruction workflow paves the way for its broad application across diverse tomographic imaging modalities, including but not limited to flow cytometry tomography and acoustic rotation tomography. Therefore, this AI-driven approach can propel advancements in cell biology, aiding in the inception of pioneering therapeutics, and augmenting early-stage cancer diagnostics.

发表会议及期刊:arXiv

2023

PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm

In contrast to numerous NLP and 2D computer vision foundational models, the learning of a robust and highly generalized 3D foundational model poses considerably greater challenges. This is primarily due to the inherent data variability and the diversity of downstream tasks. In this paper, we introduce a comprehensive 3D pre-training framework designed to facilitate the acquisition of efficient 3D representations, thereby establishing a pathway to 3D foundational models. Motivated by the fact that informative 3D features should be able to encode rich geometry and appearance cues that can be utilized to render realistic images, we propose a novel universal paradigm to learn point cloud representations by differentiable neural rendering, serving as a bridge between 3D and 2D worlds. We train a point cloud encoder within a devised volumetric neural renderer by comparing the rendered images with the real images. Notably, our approach demonstrates the seamless integration of the learned 3D encoder into diverse downstream tasks. These tasks encompass not only high-level challenges such as 3D detection and segmentation but also low-level objectives like 3D reconstruction and image synthesis, spanning both indoor and outdoor scenarios. Besides, we also illustrate the capability of pre-training a 2D backbone using the proposed universal methodology, surpassing conventional pre-training methods by a large margin. For the first time, PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks. The consistent improvements in various settings imply the effectiveness of the proposed method. Code and models will be made available at https://github.com/OpenGVLab/PonderV2.

comm@pjlab.org.cn

上海市徐汇区云锦路701号西岸国际人工智能中心37-38层

沪ICP备2021009351号-1