Correlation and Root Cause Detection in Bing Livesite Metrics

Microsoft Bing Mathematics, 2017-18

Liaison(s): Debashish Ghosal, Gautam Dewan
Advisor(s): Talithia Williams
Student(s): Grant Belsterling, Preethi Seshadri, Zhaocheng Yi (PM)

Bing is the second most popular search engine in the United States. Bing’s Livesite Engineering team collects system performance metrics to detect incidents, such as outages, that degrade the end-user experience. These incidents cost Bing ad revenue and users. Identifying the root cause of an incident requires sifting through a large number of metrics and narrowing the cause down to one or more key metrics. Our clinic team is working to streamline and automate this process using statistical models and methods.