STATS 213
Synthetic Data Generation
Description: Lecture, three hours; discussion, one hour. Requisite: one course from 200B, 201B, 202A, M231A, 231B. Introduction to data-centric approach, i.e., synthetic data generation, for building trustworthy artificial intelligence systems. In general, well-designed synthetic data generation process can remove individual information (e.g., preserving data privacy), inject knowledge (e.g., guaranteeing robustness), or increase diversity (e.g., enhancing fairness) based on raw data sets. Study includes tutorial on modern generative modeling approaches for synthetic data: generative-adversarial-network-based methods, diffusion-process-based methods, and generative-flow-network-based methods. Examination of several use cases of synthetic data in various industries, including financial services, e-commerce, and health care. S/U or letter grading.
Units: 4.0