Jiachen (Tianhao) Wang

Ph.D. Student

Princeton University

About Me

I’m a Ph.D. candidate at Princeton University, advised by Prof. Prateek Mittal. I am also very fortunate to work closely with Prof. Ruoxi Jia at Virginia Tech. Before moving to Princeton, I received my master’s degree from Harvard in 2021, where I worked with Prof. Salil Vadhan. Before that, I received my bachelor’s degree in Computer Science and Statistics from the University of Waterloo, where I worked closely with Prof. Florian Kerschbaum.

I am interested in developing theoretical foundations and practical tools for machine learning and data-driven systems from a data-centric perspective. I use tools from statistics, game theory, and economics to analyze the intricate connection between data and society.

I am honored to be supported by the Apple PhD Fellowship (awarded to 21 PhD students worldwide), Princeton University’s Yan Huo *94 Graduate Fellowship (one of three recipients in the department), and Princeton’s Gordon Y.S. Wu Fellowship. I received an ICLR'25 Outstanding Paper Honorable Mention for In-Run Data Shapley (one of just 6 papers recognized among 11,000+ submissions). In 2024, I was recognized as a Rising Star in Data Science.

I lead the organization of the ICLR 2025 workshop on Data Problems for Foundation Models (DATA-FM). I gave a tutorial at NeurIPS 2024 with Ruoxi Jia and Ludwig Schmidt on Advancing Data Selection for Foundation Models: From Heuristics to Principled Methods. The slides are available here.

News

[04/2025] Deeply honored to receive an ICLR'25 Outstanding Paper Honorable Mention for In-Run Data Shapley (one of just 6 papers recognized among 11,000+ submissions).
[03/2025] Deeply honored to be selected as one of the recipients of the 2025 Apple Scholars in AI/ML PhD fellowship.
[02/2025] Two papers on data attribution for foundation models, In-Run Data Shapley and Data Value Embedding, have both been selected for oral presentation (top ~1.5% among submissions) at ICLR 2025. See you in Singapore!

tianhaowang[at]princeton.edu
Engineering Quad B307, Princeton, NJ

Interests

Data-related problems

Education

Princeton University, Sept 2021 - Present
Harvard University, Aug 2019 - May 2021
MEng in Computational Science and Engineering
University of Waterloo, Sept 2016 - May 2019
B.S. in Computer Science and Statistics

Selected Publications

Capturing the Temporal Dependence of Training Data Influence

Jiachen T. Wang, Dawn Song, James Zou, Prateek Mittal, Ruoxi Jia

ICLR'25
Oral Presentation (top ~1.5% among submissions)

Data Shapley in One Training Run

GREATS: Online Selection of High-Quality Data for LLM Training in Every Iteration

Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

Efficient Data Valuation for Weighted Nearest Neighbor Algorithms

DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

Privacy-Preserving In-Context Learning for Large Language Models

Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation

A Randomized Approach for Tight Privacy Accounting

Data Banzhaf: A Robust Data Valuation Framework for Machine Learning

LAVA: Data Valuation without Pre-Specified Learning Algorithms

Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning

Improving Cooperative Game Theory-based Data Valuation via Data Utility Learning

Concurrent Composition of Differential Privacy

DPlis: Boosting Utility of Differentially Private Deep Learning via Randomized Smoothing

RIGA: Covert and Robust White-Box Watermarking of Deep Neural Networks

Improving Robustness to Model Inversion Attacks via Mutual Information Regularization

A Principled Approach to Data Valuation for Federated Learning