Intelligent Workflow Diagnostics

Palo Alto Research Center (PARC) Mathematics, 2013-14

Liaison(s): Dr. Eric Huang ’02
Advisor(s): Weiqing Gu
Student(s): Andrew Gibiansky, Yongqian Li, Amanda Llewellyn (PM), Jacob Morris-Knower, Patrick Meehan

Processing big data often involves Extract-Transform-Load (ETL) workflows, whose operations clean, verify, and join multiple data sources into one output. For large input data, debugging these workflows becomes extremely time-consuming. We designed a program, accessible to data scientists, that automates reasoning over, and performance testing of, various parts of the workflow.