Reverse Engineering Behavioural Models by Filtering out Utilities from Execution Traces

Title: Reverse Engineering Behavioural Models by Filtering out Utilities from Execution Traces
Authors: Braun, Edna
Date: 2013
Abstract: An important issue in software evolution is the time and effort needed to understand existing applications. Reverse engineering software to recover behavioural models is difficult and is complicated due to the lack of a standardized way of extracting and visualizing knowledge. In this thesis, we study a technique for automatically extracting static and dynamic data from software, filtering and analysing the data, and visualizing the behavioural model of a selected feature of a software application. We also investigate the usefulness of the generated diagrams as documentation for the software. We present a literature review of studies that have used static and dynamic data analysis for software comprehension. A set of criteria is created, and each approach, including this thesis’ technique, is compared using those criteria. We propose an approach to simplify lengthy traces by filtering out software components that are too low level to give a high-level picture of the selected feature. We use static information to identify and remove small and simple (or uncomplicated) software components from the trace. We define a utility method as any element of a program designed for the convenience of the designer and implementer and intended to be accessed from multiple places within a certain scope of the program. Utilityhood is defined as the extent to which a particular method can be considered a utility. Utilityhood is calculated using different combinations of selected dynamic and static variables. Methods with high utilityhood values are detected and removed iteratively. By eliminating utilities, we are left with a much smaller trace which is then visualized using the Use Case Map (UCM) notation. UCM is a scenario language used to specify and explain behaviour of complex systems. By doing so, we can identify the algorithm that generates a UCM closest to the mental model of the designers. Although when analysing our results we did not identify an algorithm that was best in all cases, there is a trend in that three of the best four algorithms (out of a total of eight algorithms investigated) used method complexity and method lines of code in their parameters. We also validated the algorithm results by doing a comparison with a list of methods given to us by the creators of the software and doing precision and recall calculations. Seven out of the eight participants agreed or strongly agreed that using UCM diagrams to visualize reduced traces is valid approach, with none who disagreed.
CollectionThèses, 2011 - // Theses, 2011 -