Studying Developer Coordination Patterns in OS Distributions

The success of open source (OS) projects like the Linux kernel and the Apache web server is due not only to their high technical quality, but also to the unique way in which these projects are delivered to end users. OS distributions like Debian, Ubuntu and countless derivatives customize and bundle thousands of OS projects to make them readily available to millions of end users worldwide.

Distributions in fact act as middlemen between “downstream” end users and “upstream” OS projects. This means that distributions do not just deliver software to end users; they also (and especially) filter user feedback for the upstream projects. For example, if users report a bug or have a question about how to use a particular software project, they typically communicate with the distribution first.

However, since distributions typically have only limited manpower and lack the in-depth expertise that the original developers have about a software project, they need to coordinate at some point with the upstream developers. At other times, distributions do not contact the upstream project directly, but first coordinate with the parent distribution they are based on (e.g., Ubuntu’s parent is Debian). Again, the parent distribution can decide to handle a particular piece of feedback itself or to forward it upstream.

Given the many different ways in which end users, OS distributions and upstream OS projects need to coordinate, there is a clear need to analyze the various communication paths and improve their effectiveness. More effective communication paths would decrease the time to fix bugs, increase software quality and user satisfaction, and reduce redundant maintenance effort.

Hence, this project aims to analyze and document the patterns of coordination in OS distributions. In particular, we will mine data from the bug repositories, support forums and mailing lists of the Debian and Ubuntu distributions and of 10 to 20 major OS projects, then analyze this data both quantitatively and qualitatively. The quantitative analysis will allow us to measure how much feedback is processed by a distribution, its parent distribution and the upstream projects. The qualitative analysis will allow us to study how feedback is treated: What are the best and worst practices for propagating feedback to the parent distribution? What kinds of tools and methodologies are needed? What are the possible risks?

The output of the project will be a catalogue of patterns that describe the best practices for coordinating user feedback between end users, distributions and upstream projects. Such a catalogue will benefit various stakeholders. First, end users, distributions and upstream projects will better understand each other’s needs, enabling them to reduce the turnaround time of bugs and support requests. Second, new volunteers joining a distribution or upstream project can get started more quickly. Third, similar ecosystems like Facebook apps or Eclipse plugins can leverage the experience gathered by distributions to improve their own quality.

Faculty Supervisor:

Bram Adams


Javier Rosales Tovar



Computer science




