100-days-mlops-kodekloud

Configure Pre-Commit Hooks for ML Repository

Problem

The xFusionCorp Industries ML team enforces code quality on every commit via pre-commit. A draft .pre-commit-config.yaml exists in the git repository at /root/code/fraud-detection/, but it does not match the team’s standard and pre-commit run --all-files fails against it. Correct the configuration.

  1. A git repository already exists at /root/code/fraud-detection/ with .pre-commit-config.yaml and process.py already tracked. pre-commit is installed system-wide.

  2. The corrected configuration must declare the following five hooks so that pre-commit run --all-files executes every one of them:

    • trailing-whitespace, end-of-file-fixer, and check-yaml – All three sourced from the pre-commit/pre-commit-hooks repository, pinned to a current release;
    • ruff – Sourced from the astral-sh/ruff-pre-commit repository, pinned to a current release;
    • black – Sourced from the psf/black-pre-commit-mirror repository, pinned to a current release.
    The draft .pre-commit-config.yaml that needs to be corrected:
     repos:
     - repo: https://github.com/pre-commit/pre-commit-hooks
       rev: v2.3.0
       hooks:
       - id: trailing-whitespace
       - id: end-of-file-fixer
       - id: check_yaml

     - repo: https://github.com/charliermarsh/ruff-pre-commit
       rev: v0.1.0
       hooks:
       - id: ruff-lint

     - repo: https://github.com/psf/black-pre-commit-mirror
       hooks:
       - id: black
    
  3. Every repository entry in the configuration must include a rev: field.

  4. Review the existing .pre-commit-config.yaml and correct everything that prevents the hooks above from running.

  5. Once the configuration is correct, register the hooks with git and run them against the tracked files:

     pre-commit install
     pre-commit run --all-files
    

Tip: pre-commit autoupdate queries each referenced repository and rewrites the rev: pins to the latest released tag. This is the standard way to discover current versions without looking them up by hand.
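
Before and after running pre-commit autoupdate, it can help to list which release each repository is pinned to. The sketch below is one way to do that with awk. list_pins is a hypothetical helper, not a pre-commit command, and it assumes each repo: and rev: key sits on its own line, as in the draft above. It matches lines rather than parsing YAML, so treat it as a quick check only.

```shell
# Hypothetical helper: print "<repo url> <rev>" for every repository entry
# in a pre-commit config, or "<no rev>" when the entry has no rev: line.
# Assumption: plain "- repo: URL" and "rev: TAG" lines (no inline mappings).
# Usage: list_pins /root/code/fraud-detection/.pre-commit-config.yaml
list_pins() {
  awk '
    # Emit the entry collected so far, if any.
    function flush() {
      if (repo != "") print repo, (rev == "" ? "<no rev>" : rev)
    }
    # A new "- repo: URL" line starts the next entry.
    /^[[:space:]]*-[[:space:]]+repo:/ { flush(); repo = $3; rev = ""; next }
    # "rev: TAG" belongs to the entry currently being collected.
    /^[[:space:]]*rev:/ { rev = $2 }
    END { flush() }
  ' "$1"
}
```

Run against the draft above, it would list the first two repositories with v2.3.0 and v0.1.0 and the black mirror with <no rev>, which is the entry that violates requirement 3.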

Solution

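  1. Open .pre-commit-config.yaml in an editor. Several problems stand out: the psf/black-pre-commit-mirror entry has no rev: field, the ruff entry points at charliermarsh/ruff-pre-commit instead of astral-sh/ruff-pre-commit, two hook ids are wrong (check_yaml should be check-yaml, and ruff-lint should be ruff), and the existing rev: pins are old releases.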

  2. Update the file with the corrected configuration:

     repos:
     - repo: https://github.com/pre-commit/pre-commit-hooks
       rev: v6.0.0
       hooks:
       - id: trailing-whitespace
       - id: end-of-file-fixer
       - id: check-yaml

     - repo: https://github.com/astral-sh/ruff-pre-commit
       rev: v0.15.13
       hooks:
       - id: ruff

     - repo: https://github.com/psf/black-pre-commit-mirror
       rev: 26.5.1
       hooks:
       - id: black
    

    What did we change?

    • Added the missing rev: field to the black entry and moved the other pins to current releases.
    • Pointed the ruff entry at the astral-sh/ruff-pre-commit repository, as the task requires.
    • Corrected the hook ids: check_yaml became check-yaml, and ruff-lint became ruff.
  3. Now we can install the hooks and run them:

     cd /root/code/fraud-detection
     pre-commit install
     pre-commit run --all-files
    

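    If the configuration is correct, every hook runs without errors and the output resembles the following. Display names come from each hook repository, so they can differ between releases; a ruff format line appears only when the ruff-format hook is also configured.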

     pre-commit run --all-files
     trim trailing whitespace.................................................Passed
     fix end of files.........................................................Passed
     check yaml...............................................................Passed
     ruff check...............................................................Passed
     ruff format..............................................................Passed
     black....................................................................Passed
    
  4. If you see a version-specific error, run the following command:

     pre-commit autoupdate
    

    It rewrites the rev: field of each repo to its latest released tag.

  5. Tip: each GitHub repository documents how to configure it, which hook ids it provides, and what its latest release is.
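
Requirement 2 from the problem statement can also be checked mechanically before running the hooks. The following is a minimal sketch, assuming one hook id per line as in the configuration above. missing_hooks is a hypothetical helper, not part of pre-commit, and it prints each required id it cannot find.

```shell
# Hypothetical helper: print every required hook id that does not appear
# as an exact "- id: NAME" line in the given pre-commit config.
# Assumption: one hook id per line; this is a line match, not a YAML parse.
# Usage: missing_hooks /root/code/fraud-detection/.pre-commit-config.yaml
missing_hooks() {
  for id in trailing-whitespace end-of-file-fixer check-yaml ruff black; do
    # Anchor on end of line so "ruff" does not match "ruff-lint".
    grep -Eq "^[[:space:]]*-[[:space:]]+id:[[:space:]]+${id}[[:space:]]*\$" "$1" \
      || echo "$id"
  done
}
```

No output means all five ids are present. Against the original draft it would print check-yaml and ruff, the two ids that were misspelled.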