Figure 1. Overall architecture of RepoSense
Parser contains three components:
ArgsParser: Parses the user-supplied command-line arguments into a CliArguments object. RunConfigurationDecider then gets the appropriate RunConfiguration for the CliArguments, which generates the appropriate Config files.
CsvParser: Abstract generic class for CSV parsing functionality. The following three classes extend CsvParser:
  AuthorConfigCsvParser: Parses the author-config.csv config file into a list of AuthorConfiguration for each repository to analyze.
  GroupConfigCsvParser: Parses the group-config.csv config file into a list of GroupConfiguration for each repository to analyze.
  RepoConfigCsvParser: Parses the repo-config.csv config file into a list of RepoConfiguration for each repository to analyze.
JsonParser: Abstract generic class for JSON parsing functionality. The following class extends JsonParser:
  StandaloneConfigJsonParser: Parses the _reposense/config.json config file into a StandaloneConfig.

The Git package contains the wrapper classes for the respective git commands.
GitBlame: Wrapper class for git blame functionality. Traces the revision and author that last modified each line of a file.
GitBranch: Wrapper class for git branch functionality. Gets the name of the working branch of the target repo.
GitCatFile: Wrapper class for git cat-file functionality. Obtains the parent commit hash of the commit identified by the given commit hash.
GitCheckout: Wrapper class for git checkout functionality. Checks out the repository by branch name or commit hash.
GitClone: Wrapper class for git clone functionality. Clones the repository from the given URL or local directory into a temporary folder in order to run the analysis.
GitDiff: Wrapper class for git diff functionality. Obtains the changes between commits.
GitLog: Wrapper class for git log functionality. Obtains the commit logs and the authors' info.
GitRevList: Wrapper class for git rev-list functionality. Retrieves the commit objects in reverse chronological order.
GitRevParse: Wrapper class for git rev-parse functionality. Ensures that the branch of the repo to be analyzed exists.
GitShortlog: Wrapper class for git shortlog functionality. Obtains the list of authors who have contributed to the target repo.
GitShow: Wrapper class for git show functionality. Gets the date of the commit with the given commit hash.
GitUtil: Contains helper functions used by the other Git classes above.
GitVersion: Wrapper class for git --version functionality. Obtains the current Git version of the environment that RepoSense is being run on.

Note that when constructing new commands containing path arguments, use the StringsUtil::addQuotesForFilePath method to safely convert a Java string into an equivalent Bash/CMD argument.
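
To illustrate why path arguments need quoting, the following is a minimal sketch of a wrapper that builds a git blame command line for a file whose path contains a space. The class GitCommandSketch and the methods quotePath and buildBlameCommand are hypothetical names used only for illustration; they are not the RepoSense StringsUtil or GitBlame implementations, and the quoting rule shown here is deliberately simplified compared with what real Bash and CMD escaping requires.

```java
/**
 * Hypothetical sketch of building a git command with a quoted path argument.
 * This is NOT the RepoSense implementation; names and quoting rules are assumptions.
 */
public class GitCommandSketch {

    /**
     * Wraps a path in double quotes and escapes embedded double quotes.
     * Simplified rule for illustration; real shells need platform-specific handling.
     */
    public static String quotePath(String path) {
        return '"' + path.replace("\"", "\\\"") + '"';
    }

    /** Builds a "git blame" command line for a single file. */
    public static String buildBlameCommand(String filePath) {
        // "--" separates options from the path so that odd file names are not read as flags.
        return "git blame --line-porcelain -- " + quotePath(filePath);
    }

    public static void main(String[] args) {
        // prints: git blame --line-porcelain -- "src/my file.java"
        System.out.println(buildBlameCommand("src/my file.java"));
    }
}
```

Without the quotes, the shell would split src/my file.java into two separate arguments, and the command would fail or operate on the wrong file.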

CommitsReporter is responsible for analyzing the commit history and generating a CommitContributionSummary for each repository. CommitContributionSummary contains information such as each author's daily and weekly contribution and the variance of their contribution. CommitsReporter uses CommitInfoExtractor to run the git log command, which generates each commit's statistics within the date range. It generates a CommitInfo for each commit, which contains the infoLine and statLine. It then uses CommitInfoAnalyzer to extract the relevant data from CommitInfo into a CommitResult, such as the number of line insertions and deletions in the commit and the author of the commit. Finally, it uses CommitResultAggregator to aggregate all CommitResult into a CommitContributionSummary.

AuthorshipReporter is responsible for analyzing the whitelisted files, tracing the original author of each line of text/code, and generating an AuthorshipSummary for each repository. AuthorshipSummary contains the analysis results of the whitelisted files and the number of line contributions each author made. AuthorshipReporter uses FileInfoExtractor to traverse the repository to find all relevant files. It generates a FileInfo for each relevant file, which contains the path to the file and a list of LineInfo representing each line of the file. It then uses FileInfoAnalyzer to analyze each file, using git blame or annotations, and find the Author for each LineInfo. It generates a FileResult for each file, which consolidates the authorship results into a Map of each author's line contribution to the file. Finally, it uses FileResultAggregator to aggregate all FileResult into an AuthorshipSummary.

ReportGenerator clones the repositories using the GitClone API in a multi-threaded fashion.
The number of cloning threads can be set with --cloning-threads <threads>.
Each repository is then analyzed using CommitsReporter and AuthorshipReporter in a multi-threaded fashion. CommitsReporter and AuthorshipReporter produce the commit and authorship summary, respectively. The number of analysis threads can be set with --analysis-threads <threads>.
Finally, the JSON files needed to generate the HTML report are generated.

System contains the classes that interact with the operating system and external processes.
CommandRunner creates processes that execute commands on the terminal. It consists of many git commands.
LogsManager uses the java.util.logging package for logging. The LogsManager class is used to manage the logging levels and logging destinations. Log messages are output to the console and to a .log file.
ReportServer starts a server to display the report in the browser. It depends on the net.freeutils.httpserver package.

Model holds the data structures that are commonly used by the different aspects of RepoSense.
Author stores the Git ID of an author. Any contributions or commits made by the author, using their Git ID or aliases, will be attributed to the same Author object. AuthorshipReporter and CommitsReporter use it to attribute the commit and file contributions to the respective authors.
CliArguments stores the parsed command-line arguments supplied by the user. It contains the configuration settings such as the CSV config file to read from, the directory to output the report to, and the date range of commits to analyze. These configuration settings are passed into RepoConfiguration.
FileTypeManager stores the file format to be analyzed and the custom groups specified by the user for any repository.
RepoConfiguration stores the configuration information from the CSV config file for a single repository: the repository's organization, name, branch, list of authors to analyze, date range to analyze commits, and files from CliArguments.
This configuration information is used by:
  GitClone to determine the location to clone the repository from and which branch to check out to.
  AuthorshipReporter and CommitsReporter to determine the range of commits and files to analyze.
  ReportGenerator to determine the directory to output the report.
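
As a rough illustration of how a reporter might consume such configuration, the sketch below filters per-commit results by a configured date range and aggregates insertions per author, loosely mirroring the extract, analyze, and aggregate steps described for CommitsReporter. All names here (CommitSummarySketch, DateRange, CommitRecord, insertionsByAuthor) are hypothetical stand-ins, not the RepoSense classes, and the sample data is made up for the example.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of config-driven commit aggregation; not the RepoSense API. */
public class CommitSummarySketch {

    /** Stand-in for the date range held in a repository's configuration. */
    public static class DateRange {
        final LocalDate since;
        final LocalDate until;

        public DateRange(LocalDate since, LocalDate until) {
            this.since = since;
            this.until = until;
        }

        /** Returns true if the date lies within [since, until], both inclusive. */
        boolean contains(LocalDate date) {
            return !date.isBefore(since) && !date.isAfter(until);
        }
    }

    /** Stand-in for one analyzed commit: author, date, and inserted line count. */
    public static class CommitRecord {
        final String author;
        final LocalDate date;
        final int insertions;

        public CommitRecord(String author, LocalDate date, int insertions) {
            this.author = author;
            this.date = date;
            this.insertions = insertions;
        }
    }

    /** Sums insertions per author, counting only commits inside the configured range. */
    public static Map<String, Integer> insertionsByAuthor(List<CommitRecord> commits, DateRange range) {
        Map<String, Integer> totals = new TreeMap<>(); // sorted by author name
        for (CommitRecord commit : commits) {
            if (range.contains(commit.date)) {
                totals.merge(commit.author, commit.insertions, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        DateRange january = new DateRange(LocalDate.of(2024, 1, 1), LocalDate.of(2024, 1, 31));
        List<CommitRecord> commits = Arrays.asList(
                new CommitRecord("alice", LocalDate.of(2024, 1, 2), 10),
                new CommitRecord("bob", LocalDate.of(2024, 1, 3), 3),
                new CommitRecord("alice", LocalDate.of(2024, 1, 5), 5),
                new CommitRecord("bob", LocalDate.of(2024, 2, 10), 7)); // outside the range
        // prints: {alice=15, bob=3}
        System.out.println(insertionsByAuthor(commits, january));
    }
}
```

Keeping the date range in a configuration object, rather than hard-coding it in the aggregation step, is what allows the same analysis code to be reused across repositories with different settings.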