Publications

Please visit my Google Scholar profile to check out my up-to-date publication list.

# indicates equal contributions; * indicates corresponding authors.

2024

  1. Preprint
    Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
    Hao Wen#, Zehuan Huang#, Yaohui Wang, Xinyuan Chen, Yu Qiao, and Lu Sheng*
    CoRR, 2024
  2. Preprint
    From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation
    Zehuan Huang#, Hongxing Fan#, Lipeng Wang#, and Lu Sheng*
    CoRR, 2024
  3. Preprint
    From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities
    Chaochao Lu, Chen Qian, Guodong Zheng, Hongxing Fan, Hongzhi Gao, Jie Zhang, Jing Shao, Jingyi Deng, Jinlan Fu, Kexin Huang, and 26 more authors
    CoRR (authors listed in alphabetical order), 2024
  4. Preprint
    MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control
    Enshen Zhou, Yiran Qin, Zhenfei Yin, Yuzhou Huang, Ruimao Zhang, Lu Sheng, Yu Qiao, and Jing Shao
    CoRR, 2024
  5. Preprint
    Assessment of Multimodal Large Language Models in Alignment with Human Values
    Zhelun Shi, Zhipin Wang, Hongxing Fan, Zaibin Zhang, Lijun Li, Yongting Zhang, Zhenfei Yin, Lu Sheng, Yu Qiao, and Jing Shao
    CoRR, 2024
  6. Preprint
    RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents
    Zeren Chen, Zhelun Shi, Xiaoya Lu, Lehan He, Sucheng Qian, Haoshu Fang, Zhenfei Yin, Wanli Ouyang, Jing Shao, Yu Qiao, and 2 more authors
    CoRR, 2024
  7. Fast-BEV: A Fast and Strong Bird’s-Eye View Perception Baseline
    Yangguang Li, Bin Huang, Zeren Chen, Yufeng Cui, Feng Liang, Mingzhu Shen, Fenggang Liu, Enze Xie, Lu Sheng*, Wanli Ouyang, and 1 more author
    IEEE Trans. Pattern Anal. Mach. Intell., early access, 2024
  8. 3D Reconstruction from a Single Sketch via View-dependent Depth Sampling
    Chenjian Gao#, Xilin Wang#, Qian Yu*, Lu Sheng, Jing Zhang, Xiaoguang Han, Yi-Zhe Song, and Dong Xu
    IEEE Trans. Pattern Anal. Mach. Intell., early access, 2024
  9. IJCAI
    Self-Supervised Monocular Depth Estimation in the Dark: Towards Data Distribution Compensation
    Haolin Yang, Chaoqiang Zhao, Lu Sheng, and Yang Tang
    In 33rd International Joint Conference on Artificial Intelligence, 2024
  10. CVPR
    EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion
    Zehuan Huang#, Hao Wen#, Junting Dong#, Yaohui Wang, Yangguang Li, Xinyuan Chen, Yan-Pei Cao, Ding Liang, Yu Qiao, Bo Dai*, and 1 more author
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
  11. CVPR
    MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception
    Yiran Qin#, Enshen Zhou#, Qichang Liu#, Zhenfei Yin, Lu Sheng*, Ruimao Zhang*, Yu Qiao, and Jing Shao
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
  12. ICLR
    Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE
    Zeren Chen#, Ziqin Wang#, Zhen Wang, Huayang Liu, Zhenfei Yin, Si Liu, Lu Sheng*, Wanli Ouyang, Yu Qiao, and Jing Shao*
    In International Conference on Learning Representations, 2024
  13. AAAI
    Multi-Modality Affinity Inference for Weakly Supervised 3D Semantic Segmentation
    Xiawei Li, Qingyuan Xu, Jing Zhang*, Tianyi Zhang, Qian Yu, Lu Sheng, and Dong Xu
    In Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
  14. AAAI
    Data-Free Generalized Zero-Shot Learning
    Bowen Tang, Jing Zhang*, Long Yan, Qian Yu, Lu Sheng, and Dong Xu
    In Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

  1. Preprint
    ChEF: A Comprehensive Evaluation Framework for Standardized Assessment of Multimodal Large Language Models
    Zhelun Shi, Zhipin Wang, Hongxing Fan, Zhenfei Yin, Lu Sheng, Yu Qiao, and Jing Shao
    CoRR, 2023
  2. NeurIPS
    LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark
    Zhenfei Yin#, Jiong Wang#, Jianjian Cao#, Zhelun Shi#, Dingning Liu, Mukai Li, Xiaoshui Huang, Zhiyong Wang, Lu Sheng, Lei Bai*, and 2 more authors
    In Advances in Neural Information Processing Systems, 2023
  3. CVPR
    Siamese DETR
    Zeren Chen#, Gengshi Huang#, Wei Li, Jianing Teng, Kun Wang, Jing Shao, Chen Change Loy, and Lu Sheng*
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
  4. CVPR
    VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud
    Ziqin Wang, Bowen Cheng, Lichen Zhao, Dong Xu, Yang Tang*, and Lu Sheng*
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (Highlight Poster), 2023
  5. IEEE T-CSVT
    Toward Explainable 3D Grounded Visual Question Answering: A New Benchmark and Strong Baseline
    Lichen Zhao, Daigang Cai, Jing Zhang, Lu Sheng, Dong Xu, Rui Zheng, Yinjie Zhao, Lipeng Wang, and Xibo Fan
    IEEE Trans. Circuits Syst. Video Technol., 2023
  6. Guest Editorial
    Guest Editorial: Special Issue on Machine Learning and Signal Processing
    Qian Yu, Liang Zheng, Lu Sheng, and Dong Xu
    J. Signal Process. Syst., 2023
  7. ACM MM
    Distortion-aware Transformer in 360 Salient Object Detection
    Yinjie Zhao, Lichen Zhao, Qian Yu, Lu Sheng, Jing Zhang, and Dong Xu
    In Proceedings of the 31st ACM International Conference on Multimedia, 2023

2022

  1. VPU: A Video-Based Point Cloud Upsampling Framework
    Kaisiyuan Wang, Lu Sheng, Shuhang Gu, and Dong Xu
    IEEE Trans. Image Process., 2022
  2. ECCV
    Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation
    Ziming Wang*, Xiaoliang Huo*, Zhenghao Chen, Jing Zhang, Lu Sheng#, and Dong Xu
    In European Conference on Computer Vision, 2022
  3. ECCV
    SketchSampler: Sketch-Based 3D Reconstruction via View-Dependent Depth Sampling
    Chenjian Gao, Qian Yu*, Lu Sheng, Yi-Zhe Song, and Dong Xu
    In European Conference on Computer Vision, 2022
  4. ECCV
    X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation
    Yinan He#, Gengshi Huang#, Siyu Chen#, Jianing Teng#, Kun Wang, Zhenfei Yin, Lu Sheng, Ziwei Liu, Yu Qiao*, and Jing Shao
    In European Conference on Computer Vision, 2022
  5. CVPR
    3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
    Daigang Cai, Lichen Zhao, Jing Zhang*, Lu Sheng, and Dong Xu
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (Oral Presentation), 2022
  6. AAAI
    DanceFormer: Music Conditioned 3D Dance Generation with Parametric Motion Transformer
    Buyu Li, Yongchi Zhao, Zhelun Shi, and Lu Sheng*
    In Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

  1. IEEE T-CSVT
    Sequential Point Cloud Upsampling by Exploiting Multi-Scale Temporal Dependency
    Kaisiyuan Wang, Lu Sheng, Shuhang Gu, and Dong Xu
    IEEE Trans. Circuits Syst. Video Technol., 2021
  2. IEEE T-CSVT
    Transformer3D-Det: Improving 3D Object Detection by Vote Refinement
    Lichen Zhao, Jinyang Guo, Dong Xu, and Lu Sheng
    IEEE Trans. Circuits Syst. Video Technol., 2021
  3. PCG-TAL: Progressive Cross-Granularity Cooperation for Temporal Action Localization
    Rui Su, Dong Xu, Lu Sheng, and Wanli Ouyang
    IEEE Trans. Image Process., 2021
  4. IEEE T-MM
    Motion Compensated Virtual View Synthesis Using Novel Particle Cell
    Chi Ho Cheung, Lu Sheng, and King Ngi Ngan
    IEEE Trans. Multim., 2021
  5. CVPR
    ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
    Yinan He#, Bei Gan#, Siyu Chen#, Yichun Zhou#, Guojun Yin, Luchuan Song, Lu Sheng, Jing Shao*, and Ziwei Liu
    In IEEE Conference on Computer Vision and Pattern Recognition (Oral Presentation), 2021
  6. CVPR
    Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point Clouds
    Bowen Cheng, Lu Sheng*, Shaoshuai Shi, Ming Yang, and Dong Xu
    In IEEE Conference on Computer Vision and Pattern Recognition, 2021
  7. ICCV
    3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds
    Lichen Zhao, Daigang Cai, Lu Sheng*, and Dong Xu
    In IEEE/CVF International Conference on Computer Vision (1st place at 3D Object Localization Challenge at the CVPR 2021, 1st Workshop on Language for 3D Scenes), 2021
  8. ICCV
    StyleFormer: Real-time Arbitrary Style Transfer via Parametric Style Composition
    Xiaolei Wu, Zhihao Hu, Lu Sheng, and Dong Xu
    In IEEE/CVF International Conference on Computer Vision, 2021
  9. ACM MM
    VoteHMR: Occlusion-Aware Voting Network for Robust 3D Human Mesh Recovery from Partial Point Clouds
    Guanze Liu, Yu Rong, and Lu Sheng*
    In Proceedings of the 29th ACM International Conference on Multimedia (Oral Presentation), 2021
  10. WACV
    IncreACO: Incrementally Learned Automatic Check-out with Photorealistic Exemplar Augmentation
    Yandan Yang, Lu Sheng, Xiaolong Jiang, Haochen Wang, Dong Xu, and Xianbin Cao
    In IEEE Winter Conference on Applications of Computer Vision, 2021

2020

  1. IJCV
    High-Quality Video Generation from Static Structural Annotations
    Lu Sheng#*, Junting Pan#, Jiaming Guo, Jing Shao, and Chen Change Loy
    Int. J. Comput. Vis., 2020
  2. AAAI
    Morphing and Sampling Network for Dense Point Cloud Completion
    Minghua Liu, Lu Sheng, Sheng Yang, Jing Shao, and Shi-Min Hu
    In The Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
  3. ECCV
    Thinking in Frequency: Face Forgery Detection by Mining Frequency-Aware Clues
    Yuyang Qian, Guojun Yin, Lu Sheng*, Zixuan Chen, and Jing Shao
    In European Conference on Computer Vision, 2020
  4. ECCV
    Powering One-Shot Topological NAS with Stabilized Share-Parameter Proxy
    Ronghao Guo, Chen Lin, Chuming Li, Keyu Tian, Ming Sun, Lu Sheng, and Junjie Yan
    In European Conference on Computer Vision, 2020

2019

  1. Visibility Constrained Generative Model for Depth-Based 3D Facial Pose Tracking
    Lu Sheng, Jianfei Cai, Tat-Jen Cham, Vladimir Pavlovic, and King Ngi Ngan
    IEEE Trans. Pattern Anal. Mach. Intell., 2019
  2. PRL
    Cascaded regression using landmark displacement for 3D face reconstruction
    Fanzi Wu, Songnan Li, Tianhao Zhao, King Ngi Ngan, and Lu Sheng
    Pattern Recognit. Lett., 2019
  3. VRIH
    Bags of tricks for learning depth and camera motion from monocular videos
    Bowen Dong and Lu Sheng
    Virtual Real. Intell. Hardw., 2019
  4. CVPR
    GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving
    Buyu Li, Wanli Ouyang, Lu Sheng, Xingyu Zeng, and Xiaogang Wang
    In IEEE Conference on Computer Vision and Pattern Recognition, 2019
  5. CVPR
    Semantics Disentangling for Text-To-Image Generation
    Guojun Yin, Bin Liu, Lu Sheng*, Nenghai Yu, Xiaogang Wang, and Jing Shao
    In IEEE Conference on Computer Vision and Pattern Recognition (Oral Presentation), 2019
  6. CVPR
    Video Generation From Single Semantic Label Map
    Junting Pan, Chengyu Wang, Xu Jia, Jing Shao, Lu Sheng, Junjie Yan, and Xiaogang Wang
    In IEEE Conference on Computer Vision and Pattern Recognition, 2019
  7. CVPR
    Context and Attribute Grounded Dense Captioning
    Guojun Yin, Lu Sheng, Bin Liu, Nenghai Yu, Xiaogang Wang, and Jing Shao
    In IEEE Conference on Computer Vision and Pattern Recognition, 2019
  8. ICCV
    Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM
    Lu Sheng, Dan Xu, Wanli Ouyang, and Xiaogang Wang
    In IEEE/CVF International Conference on Computer Vision, 2019
  9. ICCV
    Improving Pedestrian Attribute Recognition With Weakly-Supervised Multi-Scale Attribute-Specific Localization
    Chufeng Tang, Lu Sheng, Zhaoxiang Zhang, and Xiaolin Hu
    In IEEE/CVF International Conference on Computer Vision, 2019
  10. ICCV
    CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
    Zihao Wang, Xihui Liu, Hongsheng Li, Lu Sheng, Junjie Yan, Xiaogang Wang, and Jing Shao
    In IEEE/CVF International Conference on Computer Vision, 2019

2018

  1. IEEE T-MM
    Spatio-Temporal Disocclusion Filling Using Novel Sprite Cells
    Chi Ho Cheung, King Ngi Ngan, and Lu Sheng
    IEEE Trans. Multim., 2018
  2. CVPR
    Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition
    Shuyang Sun, Zhanghui Kuang, Lu Sheng, Wanli Ouyang, and Wei Zhang
    In IEEE Conference on Computer Vision and Pattern Recognition, 2018
  3. CVPR
    Exploring Disentangled Feature Representation Beyond Face Identification
    Yu Liu, Fangyin Wei, Jing Shao, Lu Sheng, Junjie Yan, and Xiaogang Wang
    In IEEE Conference on Computer Vision and Pattern Recognition, 2018
  4. CVPR
    Avatar-Net: Multi-Scale Zero-Shot Style Transfer by Feature Decoration
    Lu Sheng, Ziyi Lin, Jing Shao, and Xiaogang Wang
    In IEEE Conference on Computer Vision and Pattern Recognition, 2018
  5. ECCV
    Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition
    Guojun Yin, Lu Sheng, Bin Liu, Nenghai Yu, Xiaogang Wang, Jing Shao, and Chen Change Loy
    In European Conference on Computer Vision, 2018
  6. ACM MM
    Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection
    Yongcheng Liu, Lu Sheng, Jing Shao, Junjie Yan, Shiming Xiang, and Chunhong Pan
    In ACM International Conference on Multimedia, 2018

2017

  1. CVPR
    A Generative Model for Depth-Based Robust 3D Facial Pose Tracking
    Lu Sheng, Jianfei Cai, Tat-Jen Cham, Vladimir Pavlovic, and King Ngi Ngan
    In IEEE Conference on Computer Vision and Pattern Recognition, 2017
  2. ICCV
    HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis
    Xihui Liu, Haiyu Zhao, Maoqing Tian, Lu Sheng, Jing Shao, Shuai Yi, Junjie Yan, and Xiaogang Wang
    In IEEE International Conference on Computer Vision, 2017

2016

  1. Real-Time Head Pose Tracking with Online Face Template Reconstruction
    Songnan Li, King Ngi Ngan, Raveendran Paramesran, and Lu Sheng
    IEEE Trans. Pattern Anal. Mach. Intell., 2016

2015

  1. Online Temporally Consistent Indoor Depth Video Enhancement via Static Structure
    Lu Sheng, King Ngi Ngan, Chern-Loon Lim, and Songnan Li
    IEEE Trans. Image Process., 2015
  2. ICME-W
    A disocclusion filling method using multiple sprites with depth for virtual view synthesis
    Chi Ho Cheung, Lu Sheng, and King Ngi Ngan
    In IEEE International Conference on Multimedia & Expo Workshops, 2015

2014

  1. ACCV
    Accelerating the Distribution Estimation for the Weighted Median/Mode Filters
    Lu Sheng, King Ngi Ngan, and Tak-Wai Hui
    In Asian Conference on Computer Vision, 2014
  2. ICIP
    Temporal depth video enhancement based on intrinsic static structure
    Lu Sheng, King Ngi Ngan, and Songnan Li
    In IEEE International Conference on Image Processing, 2014
  3. ICIP
    Screen-camera calibration using a thread
    Songnan Li, King Ngi Ngan, and Lu Sheng
    In IEEE International Conference on Image Processing, 2014

2013

  1. ICIP
    Depth enhancement based on hybrid geometric hole filling strategy
    Lu Sheng and King Ngi Ngan
    In IEEE International Conference on Image Processing, 2013
  2. ICVS
    A Head Pose Tracking System Using RGB-D Camera
    Songnan Li, King Ngi Ngan, and Lu Sheng
    In International Conference on Computer Vision Systems, 2013