Knowledge of end-to-end network performance is essential to many Internet applications and systems, including traffic engineering, content distribution networks, overlay routing, application-level multicast, and peer-to-peer applications. On the one hand, such knowledge allows service providers to adjust their services according to dynamic network conditions. On the other hand, as many systems are flexible in choosing their communication paths and targets, knowing network performance enables them to optimize services by, for example, intelligent path selection.
In the networking field, end-to-end network performance refers to some property of a network path, measured by metrics such as round-trip time (RTT), available bandwidth (ABW) and packet loss rate (PLR). While much progress has been made in network measurement, a main challenge in acquiring network performance on large-scale networks is that the measurement overhead grows quadratically with the number of network nodes, which renders active probing of all paths infeasible. Thus, a natural idea is to measure a small set of paths and then predict the others, for which there are no direct measurements. This understanding has motivated much research on approaches to network performance prediction.
Commonly, the success of a prediction system is built on its scalability, efficiency, accuracy and practicability. For network performance prediction, two specific requirements have to be met. First, the prediction system should have a decentralized architecture, which allows the natural deployment of the system within a networked application. Second, as different performance metrics are useful for different applications, the prediction system should be general and flexible enough to deal with various metrics in a unified framework.
This thesis presents practical approaches to network performance prediction. There are three main contributions. First, the problem of network performance prediction is formulated as a matrix completion problem, where the matrix contains performance measures between network nodes, some of which are known while the others are unknown and thus to be filled in. This new formulation is advantageous in that it is flexible enough to deal with various metrics in a unified framework, despite their diverse nature. The only requirement is that the matrix to be completed has a low-rank characteristic, which has long been observed in performance matrices constructed from various networks and in various metrics.
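As an illustration of this formulation, the completion problem can be written as a regularized low-rank factorization. The notation below ($X$, $\Omega$, $U$, $V$, $r$, $\ell$, $\lambda$) is assumed for this sketch and is not taken from the text above; it shows one common way to pose such a problem, not necessarily the exact objective used in the thesis.

```latex
% Assumed notation: X is the n-by-n performance matrix, \Omega the set of
% measured node pairs, r the target rank, \ell a loss function chosen to
% suit the metric, and \lambda a regularization weight.
\hat{X} = U V^{T}, \qquad U, V \in \mathbb{R}^{n \times r}, \quad r \ll n
\]
\[
\min_{U, V} \; \sum_{(i,j) \in \Omega} \ell\!\left(x_{ij},\, u_i v_j^{T}\right)
  + \lambda \left( \lVert U \rVert_F^{2} + \lVert V \rVert_F^{2} \right)
```

Here $u_i$ and $v_j$ denote the $i$-th row of $U$ and the $j$-th row of $V$, so the predicted performance from node $i$ to node $j$ is $\hat{x}_{ij} = u_i v_j^{T}$. The low-rank requirement corresponds to choosing $r$ much smaller than $n$.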
Second, the matrix completion problem is solved by a novel approach called Decentralized Matrix Factorization by Stochastic Gradient Descent (DMFSGD). The approach requires neither the explicit construction of matrices nor special nodes such as landmarks and central servers. Instead, by letting network nodes exchange messages with each other, matrix factorization is collaboratively and iteratively achieved at all nodes, with each node equally retrieving a number of measurements. The approach is practical in that it is simple, requires no infrastructure, and is computationally lightweight, involving only vector operations.
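The idea of decentralized factorization can be illustrated with a small simulation. The Python sketch below is not the thesis's implementation. It assumes a squared-error loss, a synthetic exactly low-rank matrix in place of real measurements, and a simple message pattern in which node i probes a neighbor j, receives j's vector, and sends its own vector and the measurement back. All names (`Node`, `sgd_step`, `simulate`) and parameter values (`k`, `lr`, `reg`) are hypothetical.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sgd_step(mine, other, x, lr, reg):
    """One regularized SGD step on (x - mine.other)^2 with respect to `mine`."""
    err = x - dot(mine, other)
    return [m + lr * (err * o - reg * m) for m, o in zip(mine, other)]


class Node:
    """Holds only this node's own two factor vectors, never the full matrix."""

    def __init__(self, rank, rng):
        self.u = [rng.random() for _ in range(rank)]  # used when this node is the source
        self.v = [rng.random() for _ in range(rank)]  # used when this node is the target


def simulate(n=40, rank=3, k=15, steps=40000, lr=0.05, reg=0.01, seed=0):
    """Return mean absolute error on unmeasured pairs before and after training."""
    rng = random.Random(seed)

    # Assumption: a synthetic rank-`rank` ground truth stands in for real measurements.
    a = [[rng.random() for _ in range(rank)] for _ in range(n)]
    b = [[rng.random() for _ in range(rank)] for _ in range(n)]

    def truth(i, j):
        return dot(a[i], b[j])

    nodes = [Node(rank, rng) for _ in range(n)]
    # Every node probes the same number k of randomly chosen neighbors.
    neighbors = [rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
    measured = {(i, j) for i in range(n) for j in neighbors[i]}
    unmeasured = [(i, j) for i in range(n) for j in range(n)
                  if i != j and (i, j) not in measured]

    def mean_abs_error():
        total = sum(abs(truth(i, j) - dot(nodes[i].u, nodes[j].v))
                    for i, j in unmeasured)
        return total / len(unmeasured)

    before = mean_abs_error()
    for _ in range(steps):
        i = rng.randrange(n)
        j = rng.choice(neighbors[i])
        x_ij = truth(i, j)   # node i probes node j
        v_j = nodes[j].v     # carried in j's reply to i
        u_i = nodes[i].u     # carried in i's message to j, with x_ij
        nodes[i].u = sgd_step(u_i, v_j, x_ij, lr, reg)  # local update at i
        nodes[j].v = sgd_step(v_j, u_i, x_ij, lr, reg)  # local update at j
    return before, mean_abs_error()


if __name__ == "__main__":
    before, after = simulate()
    print(f"MAE on unmeasured pairs: before={before:.3f}, after={after:.3f}")
```

Each update touches only two short vectors, which reflects the point that the computation consists of vector operations and that no node needs the full matrix or a central server.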
Third, instead of the conventional representation of exact metric values, this thesis also investigates coarse performance representations, including binary classes (the performance is classified as either ``good'' or ``bad'') and ordinal ratings (the performance is quantized from 1 star to 5 stars). Such measures, which are more qualitative than quantitative, not only fulfill the requirements of many Internet applications but also reduce the measurement cost and enable a unified treatment of various metrics. In addition, as both class and rating measures can be nicely integrated into the matrix completion framework, the same DMFSGD approach is applicable to their prediction, with little modification required.
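To make the coarse representations concrete, the following Python sketch maps a raw metric value to a binary class or to a star rating. The function names and all threshold values are hypothetical and chosen only for illustration; the text above does not specify how thresholds are set. The `lower_is_better` flag is an assumed way of handling metrics such as RTT (smaller is better) and ABW (larger is better) with the same code.

```python
from bisect import bisect_left


def to_class(value, threshold, lower_is_better):
    """Return +1 for "good" and -1 for "bad" relative to a single threshold."""
    good = value <= threshold if lower_is_better else value >= threshold
    return 1 if good else -1


def to_rating(value, thresholds, lower_is_better):
    """Quantize a value into 1..len(thresholds)+1 stars.

    `thresholds` must be sorted in ascending order; four thresholds give
    a 1-to-5 star scale.
    """
    levels = len(thresholds) + 1
    # Count of thresholds strictly below the value, shifted to a 1-based scale.
    stars = bisect_left(thresholds, value) + 1
    return levels + 1 - stars if lower_is_better else stars


# Hypothetical thresholds: RTT in milliseconds, ABW in Mbit/s.
RTT_THRESHOLDS = [25, 50, 100, 200]
ABW_THRESHOLDS = [10, 20, 50, 100]

if __name__ == "__main__":
    print(to_class(80, 100, lower_is_better=True))
    print(to_rating(10, RTT_THRESHOLDS, lower_is_better=True))
    print(to_rating(75, ABW_THRESHOLDS, lower_is_better=False))
```

Once values are mapped this way, the matrix to be completed holds classes or ratings instead of raw measurements, which is what allows metrics with different units to be treated alike.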
The resulting prediction system has been extensively evaluated on various publicly available datasets of two kinds of metrics, namely RTT and ABW. These experiments demonstrate not only the scalability and accuracy of the DMFSGD approach but also its usability in real Internet applications. In addition, the benefits of predicting performance classes and ratings, rather than exact metric values, are demonstrated by a use-case study on peer selection, a function that is commonly required in a number of network applications.