Teng Wang 王腾
Hi! I am Teng Wang (王腾), currently a researcher at Tencent ARC Lab, where I work on algorithm research with a focus on multimodal foundation models and video understanding systems. Before that, I received my Ph.D. in Computer Science from the University of Hong Kong in 2024, advised by Prof. Ping Luo and Prof. Feng Zheng. Prior to my Ph.D., I received my bachelor's and master's degrees from Sun Yat-sen University, advised by Prof. Huicheng Zheng.

Collaboration: We are actively looking for interns and collaborators interested in multimodal foundation models and video understanding systems. If you are interested in vision-language-audio tasks, video understanding, or multimodal reasoning models, feel free to reach out via email!

Recent News
Research Interests
My research interests include multimodal foundation models, video understanding, and vision-language-audio learning.
Selected Publications
* co-first author
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Video understanding with large language models: A survey
UniAV: Unified Audio-Visual Perception for Multi-Task Video Localization
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Caption anything: Interactive image description with diverse multimodal controls
Transferable decoding with visual entities for zero-shot image captioning
Knowledge-aware prompt tuning for generalizable vision-language models
Set-level guidance attack: Boosting adversarial transferability of vision-language pre-training models
Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
Accelerating Vision-Language Pretraining with Free Language Modeling
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix
End-to-end dense video captioning with parallel decoding
Event-centric hierarchical representation for dense video captioning

Academic Service
Journal Reviewer: IJCV, IEEE TNNLS, IEEE TIP, IEEE TMM, IEEE TCSVT

Work Experience
Internships: TikTok (2023), Tencent ARC Lab (2022), Tencent Data Platform (2021), Tencent AI Lab (2019)

Competition Awards
1st Place, Make-up Temporal Video Grounding Track of the PIC Challenge at ACM MM 2022
1st Place, Make-up Dense Video Captioning Track of the PIC Challenge at ACM MM 2022
2nd Place, Generic Event Boundary Captioning Track of the LOVEU Challenge at CVPR 2022
2nd Place, Event Dense-Captioning Track of the ActivityNet Challenge at CVPR 2020, CVPR 2021, and CVPR 2022
3rd Place, TinyAction Challenge at CVPR 2021