Teng Wang 王腾

Researcher at Tencent
Shenzhen, China

Email: ttengwang@gmail.com
GitHub
Google Scholar

Teng Wang is currently a researcher at Tencent ARC Lab. He obtained his Ph.D. degree from the University of Hong Kong (HKU) in 2024, where he was fortunate to be supervised by Prof. Ping Luo and Prof. Feng Zheng. Before that, he obtained his B.E. and M.E. degrees from Sun Yat-sen University (SYSU) under the supervision of Prof. Huicheng Zheng.

Hiring! We are looking for self-motivated research interns to join the Multimodal Foundation Model team. Please feel free to drop me an email if you are interested.

Research

My research interests include:

  • Computer Vision

  • Multi-Modal Machine Learning (vision, language, audio, etc.)

  • Video Understanding

Selected Publications

* equal contribution

Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang, Teng Wang, Haigang Zhang, Ping Lu, Feng Zheng
European Conference on Computer Vision (ECCV), 2024.

Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer
Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng, Xuelong Li
ACM International Conference on Multimedia (MM), 2024.

Transferable decoding with visual entities for zero-shot image captioning
Junjie Fei*, Teng Wang*, Jinrui Zhang, Zhenyu He, Chengjie Wang, Feng Zheng
International Conference on Computer Vision (ICCV), 2023.

Knowledge-aware prompt tuning for generalizable vision-language models
Baoshuo Kan*, Teng Wang*, Wenpeng Lu, Xiantong Zhen, Weili Guan, Feng Zheng
International Conference on Computer Vision (ICCV), 2023.

Set-level guidance attack: Boosting adversarial transferability of vision-language pre-training models
Dong Lu, Zhiqiang Wang, Teng Wang, Weili Guan, Hongchang Gao, Feng Zheng
International Conference on Computer Vision (ICCV), 2023.

Pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
Chengyue Wu, Teng Wang, Yixiao Ge, Zeyu Lu, Ruisong Zhou, Ying Shan, Ping Luo
International Conference on Machine Learning (ICML), 2023.

Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
Tiantian Geng, Teng Wang, Jinming Duan, Runmin Cong, Feng Zheng
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

Accelerating Vision-Language Pretraining with Free Language Modeling
Teng Wang, Yixiao Ge, Feng Zheng, Ran Cheng, Ying Shan, Xiaohu Qie, Ping Luo
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

Show, Tell and Rephrase: Diverse Video Captioning via Two-Stage Progressive Training
Zhu Liu, Teng Wang, Jinrui Zhang, Feng Zheng, Wenhao Jiang, Ke Lu
IEEE Transactions on Multimedia (TMM), 2022.

VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix
Teng Wang, Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo
International Conference on Machine Learning (ICML), 2022.

End-to-end dense video captioning with parallel decoding
Teng Wang, Ruimao Zhang, Zhichao Lu, Feng Zheng, Ran Cheng, Ping Luo
International Conference on Computer Vision (ICCV), 2021.

Event-centric hierarchical representation for dense video captioning
Teng Wang, Huicheng Zheng, Mingjing Yu, Qian Tian, Haifeng Hu
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2020.

arXiv

UniAV: Unified Audio-Visual Perception for Multi-Task Video Localization
Tiantian Geng, Teng Wang, Yanfu Zhang, Jinming Duan, Weili Guan, Feng Zheng
arXiv, 2024.

Video understanding with large language models: A survey
Yunlong Tang, Jing Bi, Siting Xu, Luchuan Song, Susan Liang, Teng Wang, Daoan Zhang, Jie An, Jingyang Lin, Rongyi Zhu, Ali Vosoughi, Chao Huang, Zeliang Zhang, Feng Zheng, Jianguo Zhang, Ping Luo, Jiebo Luo, Chenliang Xu
arXiv, 2024.

Caption anything: Interactive image description with diverse multimodal controls
Teng Wang*, Jinrui Zhang*, Junjie Fei*, Yixiao Ge, Hao Zheng, Yunlong Tang, Zhe Li, Mingqi Gao, Shanshan Zhao, Ying Shan, Feng Zheng
arXiv, 2023.

Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
Teng Wang*, Jinrui Zhang*, Feng Zheng, Wenhao Jiang, Ran Cheng, Ping Luo
arXiv, 2023.

Academic Service

    Conference reviewer for CVPR, ICCV, ECCV, ICML, NeurIPS
    Journal reviewer for IJCV, IEEE TNNLS, IEEE TIP, IEEE TMM, IEEE TCSVT

Experience

    Research Intern at TikTok, 2023.
    Research Intern at Tencent ARC Lab, 2022.
    Research Intern at Tencent Data Platform, 2021.
    Research Intern at Tencent AI Lab, 2019.

Competitions & Awards