Socio-Technical Distillation -- From Micro-Level Responsibilities to Macro-Level Architecture Views

Abstract

The development of large software systems requires thousands of individuals to collaborate. This necessitates a logical decomposition of the system into smaller, manageable pieces, augmented by clearly defined ways of appraising and admitting modifications to the code base. While software architectures and integration processes are established means to these ends, neither can be automatically inferred from fundamental technical artefacts such as source code. Rather, they require a priori human involvement, judgement, and abstraction. Yet maintaining formal descriptions of architectures and process specifications is commonly not a primary concern.

I show that open-source projects often already contain well-tended micro-level information on code responsibility, and therefore the required human knowledge. From this information, I automatically derive macro-level views of software architectures, enriched with semantically understandable component identifiers, without direct human involvement. I also show how to visually track the temporal evolution of the derived macro-level architectural views. I argue that my results form a basis for quantitatively judging quality properties of projects. This is exemplified by applying my methodology to a specific use case, where I assess component viability for safety-critical software and other semi-formal certifications.

Finally, I evaluate my methodology using a carefully crafted mixed-method approach, comprising statistical modelling and analysis, expert-based assessment of results, and targeted interviews with key developers.