Please use this identifier to cite or link to this item: http://hdl.handle.net/1959.3/190177
- Title
- Data files used to study change dynamics in software systems
- Author(s)
- Vasa, Rajesh
- Abstract
- It is widely accepted that evolving software systems change and grow. However, it is less well understood how change is distributed over time, specifically in object-oriented software systems. The patterns and techniques used to measure growth permit developers to identify specific releases where significant change took place, and inform them of the longer-term trend in the distribution profile. This knowledge assists developers in recording systemic and substantial changes to a release, and provides useful input into a potential release retrospective. However, these analysis methods can only be applied after a mature release of the code has been developed. To manage the evolution of complex software systems effectively, it is important to identify change-prone classes as early as possible. Specifically, developers need to know where they can expect change, the likelihood of a change, and the magnitude of these modifications in order to take proactive steps and mitigate any potential risks arising from these changes. Previous research into change-prone classes has identified some common aspects, with different studies suggesting that complex and large classes tend to undergo more changes and that classes that changed recently are likely to undergo modifications in the near future. Though this guidance is helpful, developers need more specific guidance for it to be applicable in practice. Furthermore, the information needs to be available at a level that can help in developing tools that highlight and monitor evolution-prone parts of a system as well as support effort estimation activities. The specific research questions that we address in this chapter are: (1) What is the likelihood that a class will change from a given version to the next? (a) Does this probability change over time? (b) Is this likelihood project-specific, or general? (2) How is modification frequency distributed for classes that change? (3) What is the distribution of the magnitude of change? Are most modifications minor adjustments, or substantive modifications? (4) Does structural complexity make a class susceptible to change? (5) Does popularity make a class more change-prone? We make recommendations that can help developers to proactively monitor and manage change. These are derived from a statistical analysis of change in approximately 55,000 unique classes across all projects under investigation. The analysis methods that we applied took into consideration the highly skewed nature of the metric data distributions. The raw metric data (4 .txt files and 4 .log files in a .zip file measuring ~2 MB in total) is provided in comma-separated values (CSV) format, with the first line containing the header. A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).
- Publication type
- Research dataset
- Research centre
- Swinburne University of Technology. Faculty of Information and Communication Technologies
- Source
- Originally presented as an appendix to: Vasa, R. (2010). Growth and change dynamics in open source software systems. PhD thesis, Appendix F: Change Dynamics Data Files, p. 205
- Publication year
- 2010
- FOR Code(s)
- 080306 Open Software; 080309 Software Engineering; 8902 Computer Software and Services
- Keyword(s)
- Metrics; Open source software; PhD theses completed in 2010; Software evolution; Software engineering; Software maintenance
- Publisher
- Faculty of Information and Communication Technologies, Swinburne University of Technology
- Coverage
- Australasian Digital Theses collection
- Publisher URL
- http://www.ict.swin.edu.au/personal/rvasa/thesis/data/
- Publisher URL
- http://hdl.handle.net/1959.3/95058
- Copyright
- Copyright © 2010 Rajesh Vasa. The files are made available here with the kind permission of the creator under the terms of a Creative Commons Attribution 3.0 Unported (CC BY 3.0) licence (http://creativecommons.org/licenses/by/3.0/). The full thesis is available from: http://hdl.handle.net/1959.3/95058.
- Thesis Supervisor
- [Jean-Guy Schneider]
- Thesis Note
- [This research dataset accompanies a thesis submitted for the degree of Doctor of Philosophy, Swinburne University of Technology, 2010.]