Supervision: Pascal POIZAT
Dependency Analysis and Evolution in Software Ecosystems
In modern software, small changes can have big consequences, as the left-pad incident showed. The removal of this 10-line function from the npm registry caused thousands of projects that depended on it, directly or indirectly, to break during development or at deployment. The entanglement between software pieces is a well-known source of complexity. It already arises at the lower levels, with dependencies between functions, methods, or classes, but higher levels, such as software ecosystems, add further complexity. The objective of the thesis is to propose new solutions, possibly based on machine learning, that support software organisations in analysing their software ecosystems and evolving them at the dependency level. To this end, the following research questions will be addressed.
RQ1. What are the quality attributes relevant at the dependency level for the evolution of software ecosystems?
RQ2. What are the quality metrics that can be associated with dependency graphs, and how do they relate to the quality attributes in RQ1?
RQ3. Is it possible to define smells / anti-patterns relative to RQ2?
RQ4. Do the solutions to questions RQ1–RQ3 support the heterogeneous nature (e.g., multiple languages, presence of both source and DevOps files) of software ecosystems, and, if not, is it possible to extend them?
RQ5. Do the solutions to questions RQ1–RQ3 support the dynamic nature of new software architectures (e.g., based on micro-services), and, if not, is it possible to extend them?
RQ6. Is it possible to apply the solutions to industrial-scale ecosystems?
RQ7. How do practitioners perceive the solutions?
RQ8. Is it possible to take the human cost (e.g., in relation to developers' expertise) into account in evolution?
RQ9. Can the existing machine learning solutions for code smell detection, technical debt analysis, and quality prediction be lifted to the analysis of dependency evolution in software ecosystems?
RQ10. How can training data be retrieved for the machine learning application?
RQ11. Is it possible to automate dependency evolution?