Please use this identifier to cite or link to this item: http://hdl.handle.net/1959.3/190170
- Title
- Data files used to study the distribution of growth in software systems
- Author(s)
- Vasa, Rajesh
- Abstract
- The evolution of a software system can be studied in terms of how various properties, as reflected by software metrics, change over time. Current models of software evolution have allowed inferences to be drawn about certain attributes of the software system, for instance regarding architecture, complexity, and the impact on development effort. However, an inherent limitation of these models is that they do not provide any direct insight into where growth takes place. In particular, we cannot assess the impact of evolution on the underlying distribution of size and complexity among the various classes. Such an analysis is needed to answer questions such as 'Do developers tend to evenly distribute complexity as systems get bigger?' and 'Do large and complex classes get bigger over time?'. These questions are of more than passing interest: by understanding what typical and successful software evolution looks like, we can identify anomalous situations and take action earlier than might otherwise be possible. Information gained from an analysis of the distribution of growth will also show whether there are consistent boundaries within which a software design structure exists. The specific research questions that we address in Chapter 5 (Growth Dynamics) of the thesis this data accompanies are: What is the nature of the distribution of software size and complexity measures? How does the profile and shape of this distribution change as software systems evolve? Is the rate and nature of change erratic? Do large and complex classes become bigger and more complex as software systems evolve? In our study of metric distributions, we focused on 10 measures that span a range of size and complexity attributes. To assess assigned responsibilities, we use two metrics: Load Instruction Count and Store Instruction Count. Both provide a measure of the frequency of state changes in data containers within a system.
Number of Branches, on the other hand, records all branch instructions and is used to measure structural complexity at the class level. This measure is equivalent to Weighted Method Count (WMC) as proposed by Chidamber and Kemerer (1994) if a weight of 1 is applied for all methods and the complexity measure used is cyclomatic complexity. We use Fan-Out Count and Type Construction Count to obtain insight into the dynamics of the software systems. The former documents the degree of delegation, whereas the latter counts the frequency of object instantiations. The remaining metrics provide structural size and complexity measures. In-Degree Count and Out-Degree Count reveal the coupling of classes within a system. These measures are extracted from the type dependency graph that we construct for each analyzed system. The vertices in this graph are classes, and the edges are directed links between classes. We associate popularity (i.e., the number of incoming links) with In-Degree Count and usage or delegation (i.e., the number of outgoing links) with Out-Degree Count. Number of Methods, Public Method Count, and Number of Attributes are typical object-oriented size measures and provide insight into the extent of data and functionality encapsulation. The raw metric data (4 .txt files and 1 .log file in a .zip file of ~0.5 MB in total) is provided in comma-separated values (CSV) format, with the header on the first line. A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).
- Publication type
- Research dataset
- Research centre
- Swinburne University of Technology. Faculty of Information and Communication Technologies
- Source
- Originally presented as an appendix to: Vasa, R. (2010). Growth and change dynamics in open source software systems. PhD thesis, Appendix E: Growth Dynamics Data Files, p. 204
- Publication year
- 2010
- FOR Code(s)
- 080306 Open Software; 080309 Software Engineering; 8902 Computer Software and Services
- Keyword(s)
- Metrics; Open source software; PhD theses completed in 2010; Software evolution; Software engineering; Software maintenance
- Publisher
- Faculty of Information and Communication Technologies, Swinburne University of Technology
- Coverage
- Australasian Digital Theses collection
- Publisher URL
- http://www.ict.swin.edu.au/personal/rvasa/thesis/data/
- Publisher URL
- http://hdl.handle.net/1959.3/95058
- Copyright
- Copyright © 2010 Rajesh Vasa. The files are made available here with the kind permission of the creator under the terms of a Creative Commons Attribution 3.0 Unported (CC BY 3.0) licence (http://creativecommons.org/licenses/by/3.0/). The full thesis is available from: http://hdl.handle.net/1959.3/95058.
- Thesis Supervisor
- [Jean-Guy Schneider]
- Thesis Note
- [This research dataset accompanies a thesis submitted for the degree of Doctor of Philosophy, Swinburne University of Technology, 2010.]
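
The abstract defines In-Degree Count and Out-Degree Count as the number of incoming and outgoing links of a class in a directed type dependency graph. The following is a minimal Python sketch of that definition only, not the tooling used in the thesis. The function name `degree_counts`, the edge-list input, and the class names are hypothetical, and the choice to count repeated links between the same pair of classes once is an assumption.

```python
from collections import defaultdict


def degree_counts(edges):
    """Return {class_name: (in_degree, out_degree)} for a directed graph.

    `edges` is an iterable of (source_class, target_class) pairs, meaning
    that source_class depends on target_class. Duplicate pairs are counted
    once (an assumption made for this sketch).
    """
    in_degree = defaultdict(int)
    out_degree = defaultdict(int)
    classes = set()
    for source, target in set(edges):
        classes.update((source, target))
        out_degree[source] += 1  # outgoing link: usage or delegation
        in_degree[target] += 1   # incoming link: popularity
    return {name: (in_degree[name], out_degree[name]) for name in sorted(classes)}


# Hypothetical three-class system, for illustration only.
edges = [("Parser", "Token"), ("Parser", "Lexer"), ("Lexer", "Token")]
counts = degree_counts(edges)
for name, (incoming, outgoing) in counts.items():
    print(f"{name}: in-degree={incoming}, out-degree={outgoing}")
```

In this illustrative graph, `Token` is the most "popular" class (two incoming links), while `Parser` delegates the most (two outgoing links).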