I am currently on the job market! Feel free to DM me for any opportunities or discussions. You can access my CV here.
I’m a final-year Ph.D. candidate at Princeton University, advised by Prof. Prateek Mittal. I am also very fortunate to work closely with Prof. Ruoxi Jia at Virginia Tech. I am currently a student researcher at Google Research, collaborating with Lin Chen, Mohammadhossein Bateni, and Vahab Mirrokni. Before moving to Princeton, I received my master’s degree from Harvard in 2021, where I worked with Prof. Salil Vadhan. Before that, I was an undergrad in Computer Science and Statistics at the University of Waterloo, where I worked closely with Prof. Florian Kerschbaum.
I’m interested in the broad area of machine learning and artificial intelligence (AI). My Ph.D. research focuses on developing principled data-centric methodologies to build trustworthy AI at scale. I use tools from statistics, optimization, and algorithmic game theory to understand the intricate interactions between data, optimizers, and architectures.
I am honored to be supported by the Apple PhD Fellowship (awarded to 21 Ph.D. students worldwide in 2025), Princeton’s Yan Huo *94 Graduate Fellowship (one of 3 recipients in the department), and Princeton’s Gordon Y.S. Wu Fellowship. My work on scalable data attribution received an ICLR'25 Outstanding Paper Honorable Mention (one of 6 papers recognized among 11,000+ submissions). In 2024, I was recognized as a Rising Star in Data Science.
I led the organization of the ICLR 2025 Workshop on Data Problems for Foundation Models (DATA-FM). I gave a tutorial at NeurIPS'24 with Ruoxi Jia and Ludwig Schmidt on "Advancing Data Selection for Foundation Models: From Heuristics to Principled Methods". The slides are available here.
[04/2025] Deeply honored to receive an ICLR'25 Outstanding Paper Honorable Mention for In-Run Data Shapley (one of just 6 papers recognized among 11,000+ submissions).
[03/2025] Deeply honored to be selected as one of the recipients of the 2025 Apple Scholars in AI/ML PhD fellowship.
[02/2025] Two papers on data attribution for foundation models, In-Run Data Shapley and Data Value Embedding, were both selected for oral presentation (top ~1.5% of submissions) at ICLR 2025. See you in Singapore!
tianhaowang[at]princeton.edu
Engineering Quad B307, Princeton, NJ
Princeton University, Sept 2021 - Present
Harvard University, Aug 2019 - May 2021
MEng in Computational Science and Engineering
University of Waterloo, Sept 2016 - May 2019
B.S. in Computer Science and Statistics