Jiachen (Tianhao) Wang

Ph.D. Student

Princeton University

About Me

I’m a Ph.D. candidate at Princeton University, advised by Prof. Prateek Mittal. I am also very fortunate to work closely with Prof. Ruoxi Jia at Virginia Tech. Before moving to Princeton, I received my master’s degree from Harvard in 2021, where I worked with Prof. Salil Vadhan. Before that, I received my bachelor’s degree in Computer Science and Statistics from the University of Waterloo, where I worked closely with Prof. Florian Kerschbaum.

I am interested in developing theoretical foundations and practical tools for machine learning and data-driven systems from a data-centric perspective. I use tools from statistics, game theory, and economics to analyze the intricate connection between data and society.

I am honored to be supported by an Apple PhD Fellowship (awarded to 21 PhD students worldwide), Princeton University’s Yan Huo *94 Graduate Fellowship (one of three recipients in the department), and Princeton’s Gordon Y.S. Wu Fellowship. I received an ICLR'25 Outstanding Paper Honorable Mention for In-Run Data Shapley (one of just 6 papers recognized among 11,000+ submissions). In 2024, I was recognized as a Rising Star in Data Science.

I lead the organization of the ICLR 2025 workshop on Data Problems for Foundation Models (DATA-FM). I gave a tutorial at NeurIPS 2024 with Ruoxi Jia and Ludwig Schmidt on Advancing Data Selection for Foundation Models: From Heuristics to Principled Methods. The slides are available here.

News

[04/2025] Deeply honored to receive an ICLR'25 Outstanding Paper Honorable Mention for In-Run Data Shapley (one of just 6 papers recognized among 11,000+ submissions).
[03/2025] Deeply honored to be selected as one of the recipients of the 2025 Apple Scholars in AI/ML PhD fellowship.
[02/2025] Two papers on data attribution for foundation models, In-Run Data Shapley and Data Value Embedding, were both selected for oral presentation (top ~1.5% of submissions) at ICLR 2025. See you in Singapore!

  tianhaowang[at]princeton.edu
   Engineering Quad B307, Princeton, NJ

Interests

  • Data-related problems

Education

  • Princeton University, Sept 2021 - Present

  • Harvard University, Aug 2019 - May 2021

    MEng in Computational Science and Engineering

  • University of Waterloo, Sept 2016 - May 2019

    B.S. in Computer Science and Statistics

Selected Publications

Capturing the Temporal Dependence of Training Data Influence

Data Shapley in One Training Run

GREATS: Online Selection of High-Quality Data for LLM Training in Every Iteration

Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

Efficient Data Valuation for Weighted Nearest Neighbor Algorithms

DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

Privacy-Preserving In-Context Learning for Large Language Models

Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation

A Randomized Approach for Tight Privacy Accounting

Data Banzhaf: A Robust Data Valuation Framework for Machine Learning

LAVA: Data Valuation without Pre-Specified Learning Algorithms

Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning

Improving Cooperative Game Theory-based Data Valuation via Data Utility Learning

Concurrent Composition of Differential Privacy

DPlis: Boosting Utility of Differentially Private Deep Learning via Randomized Smoothing

RIGA: Covert and Robust White-Box Watermarking of Deep Neural Networks

Improving Robustness to Model Inversion Attacks via Mutual Information Regularization

A Principled Approach to Data Valuation for Federated Learning