Tan-Hanh Pham

I have been a postdoctoral researcher at Harvard Medical School and Massachusetts General Hospital since June 2025, where I focus on applying AI to medical image analysis.

I earned my Ph.D. in Robotics and Deep Learning from the Florida Institute of Technology, USA, in May 2025. Prior to that, I was a research assistant at the Bioinspired System Laboratory after completing my M.Sc. in Smart Vehicle Engineering at Konkuk University, South Korea, in May 2022. I received my B.Sc. in Mechanical Engineering from HCMUTE, Vietnam, in 2019.

CV / Google Scholar / GitHub

Tan-Hanh Pham, Research Fellow
Mass General Hospital, Harvard Medical School

Research Interests

My current research focuses on multimodal reasoning models, 3D/4D medical image analysis, image/video understanding through vision-language models, and RL for robotics.

I am open to collaborations on AI-related projects. If you're interested, feel free to contact me.

News

✅ [Multimodal Chain of Continuous Thought for Latent-Space Reasoning] Pham, T.H. and Chris Ngo, 2025. Multimodal Chain of Continuous Thought for Latent-Space Reasoning in Vision-Language Models.

✨ [EMNLP 2025] (Our paper was accepted to the main conference) Pham, T.H., Hoang-Nam Le, Phu-Vinh Nguyen, Chris Ngo, and Truong-Son Hy. 🎉 SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization.

✨ [June 2025] Pham, T.H., and Ngo, C., 2025. RARL: Improving Medical VLM Reasoning and Generalization with Reinforcement Learning and LoRA under Data and Hardware Constraints.

✨ [CVPR 2025] Pham, T.H., Bui, T.D., Quang, M.L., Pham, T.H., Ngo, C., and Hy, T.S., 2025. 🌟 SilVar-Med: A Speech-Driven Visual Language Model for Explainable Abnormality Detection in Medical Imaging. In Proceedings of the Computer Vision and Pattern Recognition Conference, pp. 2984–2994. 2025.

✨ [May 2025] Pham, T.H., PV Nguyen, DT Hung, BT Duong, VN Thanh, C Ngo, TQ Truong, and TS Hy, 2025. IQBench: How "Smart" Are Vision-Language Models? A Study with Human IQ Tests.

✨ [ACL 2025] Khai Le-Duc, Phuc Phan, Tan-Hanh Pham, Bach Phan Tat, Minh-Huong Ngo, Truong-Son Hy, 2025. MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics.