Jiachen (Tianhao) Wang

Ph.D. Student

Princeton University

About Me

I’m a third-year Ph.D. student at Princeton University, advised by Prof. Prateek Mittal. I am also very fortunate to work closely with Prof. Ruoxi Jia at Virginia Tech. Before moving to Princeton, I received my master’s degree from Harvard University, where I worked with Prof. Salil Vadhan. Before that, I received my bachelor’s degree in Computer Science and Statistics from the University of Waterloo, where I worked closely with Prof. Florian Kerschbaum.

I am interested in exploring problems in responsible machine learning from a rigorous statistical perspective. Currently, I am working on data valuation with a focus on its applications in foundation models. I use tools and insights from statistics and game theory to analyze the intricate interaction between training data and learning algorithms.

I am supported by Princeton’s Gordon Y. S. Wu Fellowship.

[News (06/2024): Our paper on rethinking the application of data attribution in data selection received an oral presentation at ICML 2024!]

[News (05/2024): Two papers accepted to ICML 2024 (impossibility theorem for Shapley-based data selection; LLM as science tutor)]

[News (01/2024): Our paper on efficient weighted KNN-Shapley was accepted to AISTATS 2024 as an oral presentation!]

   Engineering Quad B307, Princeton, NJ


  • Data Valuation


  • Princeton University, Sept 2021 - present

  • Harvard University, Aug 2019 - May 2021

    MEng in Computational Science and Engineering

  • University of Waterloo, Sept 2016 - May 2019

    B.S. in Computer Science and Statistics

Selected Publications

Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

Efficient Data Valuation for Weighted Nearest Neighbor Algorithms

DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

Privacy-Preserving In-Context Learning for Large Language Models

Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation

A Randomized Approach for Tight Privacy Accounting

Data Banzhaf: A Robust Data Valuation Framework for Machine Learning

LAVA: Data Valuation without Pre-Specified Learning Algorithms

ModelPred: A Framework for Predicting Trained Model from Training Data

Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning


Privacy and Fairness (CS 126) @ Harvard University

Statistics (STAT 231) @ University of Waterloo

Algebra for Honours Mathematics (MATH 135) @ University of Waterloo


Academic Events Organization


TDSC 2020, PoPETS 2021