Jiachen (Tianhao) Wang

Ph.D. Student

Princeton University

About Me

I’m a fourth-year Ph.D. student at Princeton University, advised by Prof. Prateek Mittal. I am also very fortunate to work closely with Prof. Ruoxi Jia at Virginia Tech. Before moving to Princeton, I received my master’s degree from Harvard University, where I worked with Prof. Salil Vadhan. Before that, I received my bachelor’s degree in Computer Science and Statistics from the University of Waterloo, where I worked closely with Prof. Florian Kerschbaum.

I am interested in exploring problems in responsible machine learning from a rigorous statistical perspective. Currently, I am developing principled and scalable data valuation techniques for foundation models. I use tools from statistics and game theory to analyze the intricate interaction between training data and learning algorithms.

I am supported by Princeton’s Gordon Y. S. Wu Fellowship.

[News (09/2024): Deeply honored and humbled to be selected as a Rising Star in Data Science!]

[News (09/2024): Two papers accepted to NeurIPS 2024 (online batch selection, machine unlearning).]

[News (06/2024): Our paper on rethinking the application of data attribution in data selection received an oral presentation at ICML 2024!]

  tianhaowang[at]princeton.edu
   Engineering Quad B307, Princeton, NJ

Interests

  • Data Valuation

Education

  • Princeton University, Sept 2021 -

  • Harvard University, Aug 2019 - May 2021

    MEng in Computational Science and Engineering

  • University of Waterloo, Sept 2016 - May 2019

    B.S. in Computer Science and Statistics

Selected Publications

Compute-efficient LLM Training via Online Batch Selection

Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

Efficient Data Valuation for Weighted Nearest Neighbor Algorithms

DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

Privacy-Preserving In-Context Learning for Large Language Models

Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation

A Randomized Approach for Tight Privacy Accounting

Data Banzhaf: A Robust Data Valuation Framework for Machine Learning

LAVA: Data Valuation without Pre-Specified Learning Algorithms

ModelPred: A Framework for Predicting Trained Model from Training Data

Teaching

Privacy and Fairness (CS 126) @ Harvard University

Statistics (STAT 231) @ University of Waterloo

Algebra for Honours Mathematics (MATH 135) @ University of Waterloo

Services

Academic Events Organization

Reviewer

TDSC 2020, PoPETs 2021