Workshop: Rediscovering Modularity with Restructure101
Tuesday 13.30 - 16.30
Room: Keyboard Cat
The principles of modularity have been applied to engineering projects since Gorak built the wheel, and Thag the barrow of the world’s first wheelbarrow. Thag’s barrow didn’t care that the wheel was first hewn from rock, and later upgraded to a lighter, wooden one, and the same wheel design was reused for the world’s first chariot.
Analogous abstraction techniques are taught in Software Engineering 101 – information hiding, interfaces, clear responsibility, high internal cohesion, low external coupling, etc. We apply these routinely as we develop and continuously refactor the code encapsulated within classes.
However, when the codebase grows past some size limit (Bob Martin has suggested 50 KLOC), higher-level abstractions are needed to manage its complexity. This limit is usually overshot, and the team is soon drowning in an ocean of classes. It is time to organize the classes into a hierarchy of modules, or watch the team’s frustration continue to rise and productivity plummet.
“Refactoring” aims to make the code more readable, often with fairly invasive editing of code. This tutorial describes strategies for “restructuring”, where the goal is to make the entire code-base easier to understand, with only light impact on the code logic itself. These strategies have been developed while helping many development teams to restructure their codebases.
The Java package construct is used to realize the new structure, though the same strategies can be applied using e.g. namespaces in C#, filesystem directories in C/C++, or even a structure that is defined and maintained in parallel to the physical code.
Cyclic dependencies dramatically increase the overall connectedness of a codebase. The construction of an acyclic compositional structure, or “levelization” (Lakos, Knoernschild), is therefore a key first step in modularization, and it is used as the focus for the tutorial.
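As a rough illustration of what levelization means, the sketch below assigns a dependency level to each item in an acyclic graph: items that use nothing sit at level 0, and every other item sits one level above the highest item it uses. The class name Levelizer, the method levels, and the package names are hypothetical and chosen for this example only; this is not how Restructure101 itself works.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of any tool named in the text.
class Levelizer {

    /**
     * dependsOn maps each item to the items it uses.
     * Level 0 = uses nothing; otherwise 1 + the highest level among the items used.
     * Throws IllegalStateException if the graph contains a cycle, since no
     * levelization exists in that case.
     */
    static Map<String, Integer> levels(Map<String, List<String>> dependsOn) {
        Map<String, Integer> unresolved = new HashMap<>();   // dependencies not yet levelled, per item
        Map<String, List<String>> usedBy = new HashMap<>();  // reverse edges
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            unresolved.putIfAbsent(e.getKey(), 0);
            for (String target : e.getValue()) {
                unresolved.merge(e.getKey(), 1, Integer::sum);
                unresolved.putIfAbsent(target, 0);
                usedBy.computeIfAbsent(target, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        Map<String, Integer> level = new TreeMap<>();
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : unresolved.entrySet()) {
            if (e.getValue() == 0) {
                level.put(e.getKey(), 0);
                ready.add(e.getKey());
            }
        }

        int processed = 0;
        while (!ready.isEmpty()) {
            String item = ready.poll();
            processed++;
            for (String user : usedBy.getOrDefault(item, List.of())) {
                // A user sits at least one level above everything it uses.
                level.merge(user, level.get(item) + 1, Math::max);
                if (unresolved.merge(user, -1, Integer::sum) == 0) {
                    ready.add(user);
                }
            }
        }
        if (processed != unresolved.size()) {
            throw new IllegalStateException("cyclic dependency: no levelization exists");
        }
        return level;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = Map.of(
                "ui", List.of("service", "model"),
                "service", List.of("model"));
        System.out.println(levels(deps));
    }
}
```

If a cycle is present, for example two packages that use each other, the method throws rather than returning levels, which mirrors the point that cycles must be removed before a levelized structure can be built.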
The “top-down” approach aims to retain the existing packaging as far as possible. This involves removing or reversing inter-package dependencies so as to eliminate undesired and “feedback” dependencies. This preserves the team’s familiarity with the package structure, but can be very difficult if the package structure has become excessively tangled.
The alternative is to rebuild the package structure “bottom-up” by identifying cohesive clusters of classes which exhibit relatively low coupling with other clusters, and recursively “wrapping” these into packages. Where the starting package structure is very complex, this can lead to better modularity in a shorter time than the top-down approach.
Both approaches are typically applied to varying degrees; the initial structure is preserved in regions where the inter-package relationships are relatively orderly and the packages encapsulate reasonable design abstractions, and the structure is rebuilt where it is highly complex or provides poor abstraction.
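To make the notion of a “feedback” dependency concrete, here is a minimal sketch that takes an intended top-to-bottom ordering of packages and reports every dependency that points upward against it. The class name FeedbackFinder, the method feedback, and the package names are assumptions made for this example; in practice the intended ordering comes from the team's design, not from code like this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only.
class FeedbackFinder {

    /**
     * layersTopDown lists packages in their intended order, highest level first.
     * dependsOn maps each package to the packages it uses.
     * Returns "from -> to" for each dependency that points from a lower
     * package up to a higher one, i.e. against the intended layering.
     */
    static List<String> feedback(List<String> layersTopDown,
                                 Map<String, List<String>> dependsOn) {
        List<String> result = new ArrayList<>();
        for (int fromIdx = 0; fromIdx < layersTopDown.size(); fromIdx++) {
            String from = layersTopDown.get(fromIdx);
            for (String to : dependsOn.getOrDefault(from, List.of())) {
                int toIdx = layersTopDown.indexOf(to);
                // Packages missing from the layering are ignored rather than flagged.
                if (toIdx >= 0 && toIdx < fromIdx) {
                    result.add(from + " -> " + to);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> layers = List.of("ui", "service", "model");
        Map<String, List<String>> deps = Map.of(
                "ui", List.of("service", "model"),
                "service", List.of("model", "ui"),
                "model", List.of());
        System.out.println(feedback(layers, deps));
    }
}
```

In this made-up example the only dependency reported is the one from service up to ui, which is the kind of dependency the top-down approach would try to remove or reverse.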
Examples of the strategies that will be covered:
1. It is important to identify any large class tangles (sets of cyclically-dependent classes) early on, since it is not possible to create a levelized package structure for these. If the tangle is not too big, and the classes it contains do not span a very wide range of the ideal dependency levels in a codebase (e.g. tiers), the tangle can be isolated in a single package; otherwise it must be broken up into smaller tangles that can be so isolated.
2. The minimum feedback set (MFS) is the smallest set of dependencies within a graph whose removal would make the graph acyclic. Light-weight dependencies (ones that involve relatively few code references) that are also in the MFS are often accidental, and are good candidates for removal or reversal.
3. An item is only held within a tangle if it is both used by and uses other items in the tangle. This makes items with relatively few incoming or outgoing dependencies easier to release from a tangle.
4. Sometimes an item will contain disconnected sets of sub-items such that each set has different dependencies with external items. Dividing such items makes the dependencies finer-grained, and can make disentanglement easier.
5. The least invasive changes include the relocation of static methods between classes, or the relocation of classes between packages. For example, moving classes from the top levels of one package into a package at a higher dependency level, or classes in the lower levels of one package down to lower packages, can remove feedback dependencies with a simple change to the import statements. Low impact changes should be tried before higher-impact changes that require more invasive code editing or redesign.
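The tangles mentioned in the first strategy above correspond to groups of items that can all reach each other through dependencies. The sketch below finds them with a deliberately simple reachability check, which is adequate for small graphs; a linear-time algorithm such as Tarjan's would be the usual choice for a large codebase. The class name TangleFinder, its methods, and the item names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only.
class TangleFinder {

    /** All items reachable from start by following one or more dependencies. */
    static Set<String> reachable(String start, Map<String, List<String>> dependsOn) {
        Set<String> seen = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(dependsOn.getOrDefault(start, List.of()));
        while (!pending.isEmpty()) {
            String item = pending.pop();
            if (seen.add(item)) {
                pending.addAll(dependsOn.getOrDefault(item, List.of()));
            }
        }
        return seen;
    }

    /** Tangles: sets of two or more items that all use each other, directly or indirectly. */
    static List<Set<String>> tangles(Map<String, List<String>> dependsOn) {
        Map<String, Set<String>> reach = new HashMap<>();
        for (String item : dependsOn.keySet()) {
            reach.put(item, reachable(item, dependsOn));
        }
        List<Set<String>> result = new ArrayList<>();
        Set<String> assigned = new HashSet<>();
        for (String item : new TreeSet<>(dependsOn.keySet())) {
            // An item is on a cycle only if it can reach itself.
            if (assigned.contains(item) || !reach.get(item).contains(item)) {
                continue;
            }
            Set<String> tangle = new TreeSet<>();
            for (String other : reach.get(item)) {
                // Keep only items that both are used by and use this item.
                if (reach.containsKey(other) && reach.get(other).contains(item)) {
                    tangle.add(other);
                }
            }
            assigned.addAll(tangle);
            if (tangle.size() > 1) {
                result.add(tangle);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = Map.of(
                "a", List.of("b"),
                "b", List.of("c"),
                "c", List.of("a", "d"),
                "d", List.of());
        System.out.println(tangles(deps));
    }
}
```

In this made-up graph, d is used by the tangle but uses nothing in it, so it stays outside, which matches the third strategy: an item is only held in a tangle if it both uses and is used by other members.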
The application of these and other strategies will be illustrated with concrete examples.
He has an MSc. in Computer Science from Trinity College Dublin. He has 28 years of experience in commercial software development, notably on large military and aerospace projects in Canada, including 5 years on the International Space Station project. Co-founder of Headway Software and designer of the JOLT winners Structure101 and Restructure101, he has 2 lovely daughters in college and lives on the south-east coast of Ireland.