PhD graduated
Team : NPA
Departure date : 12/31/2012

Supervision : Renata CRUZ TEIXEIRA

Prediction of User Dissatisfaction with Network Application Performance at End-Hosts

Network disruptions can degrade a user's web browsing, interrupt video and audio, or render web sites and services unreachable. Such problems are frustrating to Internet users, who are unaware of the underlying causes yet fully exposed to the resulting service degradation. Ideally, if a user's end system could predict when the user will be dissatisfied with the performance of networked applications, it could launch automated tools to improve the user's experience without user intervention. Example tools include root cause diagnosis, which assists the user in fixing the problem, and resource managers (e.g., for bandwidth or video playout buffers), which tune the allocation of network resources to better serve the user. The first step for such end-host diagnostic or resource management tools is a methodology to automatically predict performance degradations in the network that can affect a user's perception of application performance. Unfortunately, predicting user dissatisfaction with application performance is not as simple as identifying outliers in typical network metrics, such as unusually high round-trip times or loss rates. Understanding user perception requires direct feedback from end users.
This thesis develops a methodology to automatically predict user dissatisfaction with network application performance. We follow an empirical approach. We design HostView to collect network performance data annotated with user feedback at the end-hosts. Designing HostView raises several questions about user privacy, users' (un)willingness to provide feedback, and the performance impact on user machines.
Our first contribution is to present the results of a survey of 400 computer scientists that collects their perspectives on privacy issues and their willingness to provide feedback. Overall, we find that users are willing to run an end-host measurement tool if we address their privacy concerns with features such as data anonymization and a pause button to temporarily stop data logging. We also find that a large portion of users will provide feedback about network performance, but not more than three times per day.
Our second contribution is the design and implementation of HostView. Guided by the survey results, we implement a first prototype of HostView to evaluate the CPU overhead of candidate techniques for collecting network performance data. We then implement a second prototype to tune our algorithm for collecting user feedback so as to minimize user annoyance. We recruit users in a large-scale release of HostView.
Our user population connects from different networking environments (e.g., work, home, or coffee shop), each of which may have different network performance. We therefore investigate whether network performance depends on the networking environment by comparing the distributions of round-trip times (RTTs) and data rates across pairs of environments. Our third contribution is to show that, for most users, RTTs and download data rates differ significantly across networking environments. The application mix determines data rates, but it is the environment that determines RTTs. These results illustrate that statistical differences in network performance for a single user do not always indicate network performance degradations that could cause user dissatisfaction; they may simply reflect different application mixes or networking environments.
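As an illustration of comparing RTT distributions across two networking environments, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, i.e., the largest gap between the two empirical CDFs. This is only an assumed example of a distribution comparison: the abstract does not say which statistical test the thesis uses, and the function name `ks_statistic` and the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return the largest absolute gap between the two empirical CDFs.

    Illustrative only: the choice of the Kolmogorov-Smirnov statistic is an
    assumption, not the test reported in the thesis.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate the gap at every value seen in either sample.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical RTT samples (milliseconds) for one user in two environments.
rtts_work = [5, 6, 7, 8]
rtts_home = [20, 22, 25, 30]
gap = ks_statistic(rtts_work, rtts_home)
```

A statistic near 0 means the two samples have similar distributions, and a value near 1 means they barely overlap. Turning the statistic into a significance decision additionally requires a critical value or p-value, which this sketch omits.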
Finally, our fourth contribution is to develop predictors of user dissatisfaction with network application performance. The main challenges in modeling user dissatisfaction come from the scarcity of user feedback and the fact that poor performance episodes are rare. We develop a methodology to build training sets in the face of these challenges. Then, we show that predictors based on non-linear support vector machines achieve higher true positive rates than predictors based on linear models. Our predictors consistently achieve true positive rates above 0.9. We also quantify the benefits of building per-application predictors over building general predictors that try to anticipate user dissatisfaction across multiple applications.
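For readers unfamiliar with the metric, the true positive rate is the fraction of actual dissatisfaction episodes that a predictor flags: TP / (TP + FN). The sketch below computes it, together with the false positive rate, from binary labels. The function name `detection_rates` and the example labels are hypothetical and are not data from the thesis.

```python
def detection_rates(y_true, y_pred):
    """Return (true_positive_rate, false_positive_rate) for binary labels.

    A label of 1 marks a dissatisfaction episode and 0 marks normal operation.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # episodes caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # episodes missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct silence
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical labels: 2 dissatisfaction episodes among 10 observations.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
tpr, fpr = detection_rates(y_true, y_pred)  # tpr = 0.5, fpr = 0.125
```

Because poor performance episodes are rare, the true positive rate is usually read alongside the false positive rate: a predictor that flags every observation reaches a true positive rate of 1.0 while being of no practical use.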

Defence : 12/18/2012 - 4:00 pm - Jussieu campus, 25-26/105

Jury members :

Mr. Mark Crovella, Boston University [Reviewer]
Mr. James Kurose, University of Massachusetts Amherst [Reviewer]
Mr. Serge Fdida, CNRS and UPMC Sorbonne Universités
Mr. Krishna Gummadi, Max Planck Institute for Software Systems
Mr. Thomas Karagiannis, Microsoft Research
Ms. Renata Teixeira, CNRS and UPMC Sorbonne Universités

2008-2013 Publications