Appendix: Config files format

    Given below are the details of the various config files used by RepoSense.

    RepoSense ignores the first row (i.e., the column headings) of each CSV config file; that row only provides information for human readers. Because columns are therefore identified by position rather than by heading, the columns in your config files must be in the exact order specified here.
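To illustrate what reading columns by position means, here is a minimal Python sketch. It is not RepoSense's actual parser: the `read_repo_config` helper and the keys in `COLUMNS` are hypothetical names chosen for this illustration, and the sample heading row is deliberately nonsensical to show that its text is never consulted.

```python
import csv
import io

# Hypothetical keys, listed in the column order documented for repo-config.csv.
COLUMNS = [
    "location", "branch", "formats", "ignore_globs",
    "ignore_standalone_config", "ignore_commits", "ignore_authors",
]

def read_repo_config(text):
    """Parse CSV text by column position, skipping the heading row."""
    rows = list(csv.reader(io.StringIO(text)))
    configs = []
    for row in rows[1:]:  # the first row holds headings for human readers only
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        # Pad short rows so that missing trailing cells read as empty strings.
        padded = row + [""] * (len(COLUMNS) - len(row))
        configs.append(dict(zip(COLUMNS, (cell.strip() for cell in padded))))
    return configs

sample = (
    "Any,Heading,Text,Works,Here,Because,It Is Skipped\n"
    "https://github.com/foo/bar.git,master,override:java;css,test/**,yes,,Alice\n"
)
config = read_repo_config(sample)[0]
print(config["branch"], config["ignore_authors"])
```

Swapping two columns in the data row would silently assign values to the wrong settings, which is why the documented order matters.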

    repo-config.csv

    The repo-config.csv file contains repo-level config data. Each row represents a repository's configuration.

    Here is an example:

    Repository's Location | Branch | File formats | Ignore Glob List | Ignore standalone config | Ignore Commits List | Ignore Authors List
    https://github.com/foo/bar.git | master | override:java;css | test/** | yes | 2fb6b9b2dd9fa40bf0f9815da2cb0ae8731436c7;c5a6dc774e22099cd9ddeb0faff1e75f9cf4f151 | Alice

    When the standalone config is in use (i.e., it is not ignored), you can override specific values from it by prefixing the value entered here with override:.


    Column Name | Explanation
    Repository's Location (mandatory) | The GitHub URL or disk path of the git repository, e.g., https://github.com/foo/bar.git or C:\Users\user\Desktop\GitHub\foo\bar
    Branch | The branch to analyze in the target repository, e.g., master. Default: the default branch of the repo
    File formats*+ | The file extensions to analyze. Default: all file formats
    Ignore Glob List*+ | The list of file path globs to ignore during analysis for each author, e.g., test/**;temp/**
    Ignore standalone config | To ignore the standalone config file (if any) in the target repository, enter yes. If the cell is empty, the standalone config file in the repo (if any) takes precedence over configurations provided in the CSV files.
    Ignore Commit List*+ | The list of commits to ignore during analysis. For accurate results, provide each commit's full hash.
    Ignore Authors List*+ | The list of authors to ignore during analysis. Specify authors by their Git Author Name.

    * Multi-value column: multiple values can be entered in this column using a semicolon ; as the separator.
    + Overrideable column: prepend with override: to use entered value(s) instead of value(s) from standalone config.
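The two markers above can be read as a small parsing rule for a cell: strip an optional override: prefix, then split on semicolons. The Python sketch below shows that rule under stated assumptions; `parse_cell` and `OVERRIDE_PREFIX` are hypothetical names, and trimming whitespace around values is an assumption of this sketch rather than documented RepoSense behaviour.

```python
OVERRIDE_PREFIX = "override:"

def parse_cell(cell):
    """Return (is_override, values) for one multi-value CSV cell."""
    cell = cell.strip()
    is_override = cell.startswith(OVERRIDE_PREFIX)
    if is_override:
        cell = cell[len(OVERRIDE_PREFIX):]
    # Split on the semicolon separator and drop empty fragments.
    values = [v.strip() for v in cell.split(";") if v.strip()]
    return is_override, values

print(parse_cell("override:java;css"))
print(parse_cell("test/**;temp/**"))
```

The first call corresponds to the File formats cell in the example row, where the entered formats replace those from the standalone config; the second is a plain multi-value cell with no override.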

    author-config.csv

    Optionally, you can use an author-config.csv file (which should be in the same directory as the repo-config.csv file) to provide more details about the authors to analyze (example). It should contain the following columns:

    Column Name | Explanation
    Repository's Location | Same as repo-config.csv. Default: all the repos in repo-config.csv
    Branch | The branch to analyze for this author, e.g., master. Default: the author is bound to all the repos in repo-config.csv that have the same repo location, regardless of branch
    Author's GitHub ID (mandatory) | GitHub username of the target author, e.g., JohnDoe
    Author's Emails* | Associated GitHub emails of the author. These can be found in your GitHub settings.
    Author's Display Name | The name to display for the author. Default: the author's GitHub username
    Author's Git Author Name* | The meaning of Git Author Name is explained in A note about git author name.
    Ignore Glob List* | Files to ignore for this author, in addition to files ignored by the patterns specified in repo-config.csv

    * Multi-value column: multiple values can be entered in this column using a semicolon ; as the separator.

    If author-config.csv is not given and the repo has not provided author details in a standalone config file, all the authors of the repositories within the date range specified (if any) will be analyzed.

    group-config.csv

    Optionally, you can provide a group-config.csv file (which should be in the same directory as the repo-config.csv file) to specify custom groupings for files in the specified repositories (example). It should contain the following columns:

    Column Name | Explanation
    Repository's Location | Same as repo-config.csv. Default: all the repos in repo-config.csv
    Group Name (mandatory) | Name of the group, e.g., test
    Globs* (mandatory) | The list of file path globs to include for the specified group, e.g., **/test/*;**.java

    * Multi-value column: multiple values can be entered in this column using a semicolon ; as the separator.

    Note that a file in a given repository should be tagged to only one group.
    For example, example.java in example-repo can be in either the test group or the code group, but not in both. If multiple groups are specified for a given file, the latter group (i.e., the code group) is set for the file.
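The "latter group wins" rule can be sketched in Python as follows. This is an illustration only: `assign_group` is a hypothetical helper, and `fnmatch.fnmatchcase` is used as a rough stand-in for glob matching (its `*` also matches `/`), so its results may differ from RepoSense's own matcher for some patterns.

```python
from fnmatch import fnmatchcase

def assign_group(path, groups):
    """Return the name of the last group whose globs match path, or None."""
    chosen = None
    for name, globs in groups:  # groups are in the order they were specified
        if any(fnmatchcase(path, g) for g in globs):
            chosen = name  # a later match replaces an earlier one
    return chosen

# Both groups match the first file; the group listed last is kept.
groups = [("test", ["**/test/*"]), ("code", ["**.java"])]
print(assign_group("src/test/example.java", groups))
print(assign_group("docs/readme.md", groups))
```

To avoid surprises, it is simpler to write globs that do not overlap, so that each file matches at most one group.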

    config.json (standalone config file)

    Repo owners can provide the following additional information to RepoSense using a config file that we call the standalone config file:

    • which files/authors/commits to analyze/omit
    • which git and GitHub usernames belong to which authors
    • the display name of an author

    To use this feature, add a _reposense/config.json to the root of your repo using the format in the example below (another example) and commit it (reason: RepoSense can see committed code only):

    {
      "ignoreGlobList": ["about-us/**", "**index.html"],
      "formats": ["html", "css"],
      "ignoreCommitList": ["90018e49f129ce7e0abdc8b18e91c9813588c601", "67890def"],
      "ignoreAuthorList": ["charlie"],
      "authors":
      [
        {
          "githubId": "alice",
          "emails": ["alice@example.com", "alicet@example.com"],
          "displayName": "Alice T.",
          "authorNames": ["AT", "A"],
          "ignoreGlobList": ["**.css"]
        },
        {
          "githubId": "bob"
        }
      ]
    }

    Note: all fields are optional unless specified otherwise.

    Fields to provide repository-level info:

    • ignoreGlobList: Folders/files to ignore, specified using the glob format.
    • formats: File formats to analyze. Default: all file formats
    • ignoreCommitList: The list of commits to ignore during analysis. For accurate results, the commits should be provided with their full hash.
    • ignoreAuthorList: The list of authors to ignore during analysis. Authors specified in the authors field or in author-config.csv will also be omitted if they are in this list. Authors should be specified by their Git Author Name.

    Fields to provide author-level info:
    Note: authors field should contain all authors that should be captured in the analysis.

    • githubId: GitHub username of the author. mandatory field.
    • emails: Associated GitHub emails of the author. This can be found in your GitHub settings.
    • displayName: Name to display on the report for this author.
    • authorNames: Git Author Name(s) used in the author's commits. By default, RepoSense assumes an author would use her GitHub username as the Git username too. The meaning of Git Author Name is explained in A note about git author name.
    • ignoreGlobList: Additional (i.e., on top of the repo-level ignoreGlobList) folders/files to ignore for a specific author. In the example above, the actual ignoreGlobList for alice would be ["about-us/**", "**index.html", "**.css"]
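The way an author-level ignoreGlobList adds to the repo-level one can be sketched in Python as below. The `effective_ignore_globs` helper is a hypothetical name for this illustration, not part of RepoSense, and the JSON is a trimmed copy of the example config above.

```python
import json

CONFIG_TEXT = """
{
  "ignoreGlobList": ["about-us/**", "**index.html"],
  "authors": [
    {"githubId": "alice", "ignoreGlobList": ["**.css"]},
    {"githubId": "bob"}
  ]
}
"""

def effective_ignore_globs(config, github_id):
    """Combine the repo-level globs with the globs of one author."""
    repo_globs = config.get("ignoreGlobList", [])
    for author in config.get("authors", []):
        if author["githubId"] == github_id:
            # Author-level globs are appended to the repo-level ones.
            return repo_globs + author.get("ignoreGlobList", [])
    raise KeyError(github_id)

config = json.loads(CONFIG_TEXT)
print(effective_ignore_globs(config, "alice"))
print(effective_ignore_globs(config, "bob"))
```

For alice this yields the three-entry list quoted above, while bob, who has no author-level list, is left with the two repo-level globs.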

    To verify your standalone configuration is as intended, add the _reposense/config.json to your local copy of repo and run RepoSense against it as follows:

    • Format : java -jar RepoSense.jar --repo LOCAL_REPO_LOCATION
    • Example: java -jar RepoSense.jar --repo c:/myRepose/foo/bar
    After that, view the report to check that the configuration you specified in the config file is reflected correctly.

    A note about git author name

    Git Author Name refers to the customizable author display name set in the local .gitconfig file. For example, consider the following git log output:

    ...
    commit cd7f610e0becbdf331d5231887d8010a689f87c7
    Author: ConfiguredAuthorName <author@example.com>
    Date: Fri Feb 9 19:14:41 2018 +0800

        Make some changes to show my new author's name

    commit e3f699fd4ef128eebce98d5b4e5b3bb06a512f49
    Author: ActualGitHubId <author@example.com>
    Date: Fri Feb 9 19:13:13 2018 +0800

        Initial commit
    ...

    ActualGitHubId and ConfiguredAuthorName are both Git Author Names of the same author.
    To find the author name that you are currently using in a git repository, run the following command within that repository:

    git config user.name

    To set the author name to the value you want (e.g., to set it to your GitHub username) for your current git repository, you can use the following command (more info):

    git config user.name "YOUR_AUTHOR_NAME"

    To set a default author name for future git repositories, you can use the following command:

    git config --global user.name "YOUR_AUTHOR_NAME"

    RepoSense expects the Git Author Name to be the same as the author's GitHub username. If an author's Git Author Name is different from her GitHub ID, the Git Author Name needs to be specified in the standalone config file. If the author has more than one Git Author Name, multiple values can be entered too.
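One way to picture this is as a lookup from Git Author Name to GitHub ID, built from the authors field of the standalone config. The Python sketch below is an illustration under assumptions: `author_name_index` is a hypothetical helper, and it assumes the GitHub ID still counts as a Git Author Name even when authorNames is given.

```python
def author_name_index(authors):
    """Map each Git Author Name to the GitHub ID it should be credited to."""
    index = {}
    for author in authors:
        github_id = author["githubId"]
        # Assumption: the GitHub ID itself always counts as a Git Author Name.
        names = [github_id] + author.get("authorNames", [])
        for name in names:
            index[name] = github_id
    return index

authors = [
    {"githubId": "alice", "authorNames": ["AT", "A"]},
    {"githubId": "bob"},
]
index = author_name_index(authors)
print(index["AT"], index["bob"])
```

A commit whose author name is missing from such a lookup could not be attributed to any configured author, which is why every Git Author Name an author has used should be listed.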

    Note: Symbols such as ", !, / etc. in your author name will be omitted, which may reduce the accuracy of the analysis if two names in the repository are similar.
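The following Python sketch shows why dropping symbols can make two distinct names indistinguishable. It is only an illustration: the exact set of symbols RepoSense omits is not listed here, so `ASSUMED_SYMBOLS` contains just the three named in the note, and `strip_symbols` is a hypothetical helper.

```python
# Assumption: only these three symbols are removed; the real set may be larger.
ASSUMED_SYMBOLS = '"!/'

def strip_symbols(name):
    """Return name with every assumed symbol removed."""
    return "".join(ch for ch in name if ch not in ASSUMED_SYMBOLS)

# Two different author names that become identical once symbols are dropped.
names = ["Jo/Ann", "JoAnn!"]
stripped = {strip_symbols(n) for n in names}
print(stripped)
```

Because both names reduce to the same string, their commits could no longer be told apart by name alone.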