The xFusionCorp Industries data science team compares multiple training runs with different hyperparameters using DVC experiments. Run three experiments that vary the n_estimators hyperparameter, identify the best-performing one, and promote it to the tracked workspace.
A project exists at /root/code/fraud-detection/ with a parameterised DVC pipeline already in place. params.yaml contains n_estimators: 100 and the baseline pipeline has been run once.
Run three DVC experiments, each with a different value for n_estimators across a reasonable range (for example 50, 200, and 500). Each experiment should produce a fresh metrics.json.
Compare the experiments and choose the one whose f1_score is the highest.
Apply the chosen experiment to the workspace so its n_estimators, metrics.json, and models/model.pkl become the tracked state.
The DVC extension’s EXPERIMENTS section under the DVC view lists every experiment alongside its parameters and metrics. It supports running fresh experiments through the + action and applying a selected experiment to the workspace from the right-click menu. Every operation in this lab can be performed either through the extension UI or with the equivalent dvc exp commands.
To run and compare DVC experiments, follow these steps:
First, ensure you are in the project directory:
cd /root/code/fraud-detection/
Run three DVC experiments with different n_estimators values:
dvc exp run -S n_estimators=50
dvc exp run -S n_estimators=200
dvc exp run -S n_estimators=500
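The three runs above can also be written as a loop. This is a minimal sketch, not part of the lab text: the function name run_sweep and the experiment names (rf-50, rf-200, and so on) are assumptions chosen for illustration. Naming each run with -n makes it easier to refer to later in dvc exp apply.

```shell
# Hypothetical helper: run one DVC experiment per n_estimators value.
# The experiment names (rf-<value>) are an assumption made for this sketch.
run_sweep() {
    for n in "$@"; do
        # -n names the experiment; -S overrides the parameter for this run.
        dvc exp run -n "rf-${n}" -S "n_estimators=${n}" || return 1
    done
}

# Usage, from /root/code/fraud-detection/:
#   run_sweep 50 200 500
```

The loop stops at the first failing run, so a broken pipeline stage is noticed before the remaining experiments are queued up.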
After running the experiments, compare the results using:
dvc exp show
This will display a table of experiments with their parameters and metrics. Identify the experiment with the highest f1_score.
You can sort experiments by f1_score to make the best run easier to spot:
dvc exp show --sort-by metrics.json:f1_score --sort-order desc \
--keep name \
--keep params.yaml:n_estimators \
--keep metrics.json:accuracy \
--keep metrics.json:f1_score
--sort-order desc sorts in descending order, so the highest score appears first.
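If you would rather pick the winner programmatically than read the table, one option is to parse the CSV output of dvc exp show. The sketch below is an illustration under stated assumptions: best_by_column is a hypothetical helper, the CSV is assumed to have no quoted fields containing commas, and the column names Experiment and f1_score are assumptions about the header, so check the first line of dvc exp show --csv on your DVC version before relying on it.

```shell
# Hypothetical helper: read CSV on stdin and print the value of the name
# column ($1) for the row with the highest value in the score column ($2).
# Assumes simple CSV with no quoted fields that contain commas.
best_by_column() {
    awk -F',' -v name_col="$1" -v score_col="$2" '
        NR == 1 {
            # Locate both columns by their header text.
            for (i = 1; i <= NF; i++) {
                if ($i == name_col)  n = i
                if ($i == score_col) s = i
            }
            if (!n || !s) { print "column not found" > "/dev/stderr"; exit 1 }
            next
        }
        # Skip rows with no experiment name or no score (workspace, baseline).
        $n == "" || $s == "" { next }
        !seen || $s + 0 > best { best = $s + 0; winner = $n; seen = 1 }
        END { if (seen) print winner }
    '
}

# Usage (column names are assumptions; inspect the CSV header first):
#   best="$(dvc exp show --csv | best_by_column Experiment f1_score)"
#   dvc exp apply "$best"
```

Rows without an experiment name are skipped on purpose, so only the experiments compete and the baseline row is ignored.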
Once you have identified the best experiment, apply it to the workspace using:
dvc exp apply <experiment_name>
Verify that the selected experiment is now the workspace state:
cat params.yaml
cat metrics.json
dvc status
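The manual check above can be turned into a small pass/fail test. This is a sketch only: yaml_value and check_n_estimators are hypothetical helpers, and they assume n_estimators is a flat top-level entry in params.yaml (as the -S n_estimators=... overrides suggest), with no trailing comment on that line.

```shell
# Hypothetical helper: print the value of a top-level "key: value" entry.
# Assumes flat YAML such as "n_estimators: 100"; nested keys are not handled.
yaml_value() {
    awk -F': *' -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Hypothetical check: does params.yaml carry the expected n_estimators?
check_n_estimators() {
    actual="$(yaml_value params.yaml n_estimators)"
    if [ "$actual" = "$1" ]; then
        echo "OK: n_estimators=$actual"
    else
        echo "MISMATCH: expected $1, found ${actual:-nothing}" >&2
        return 1
    fi
}

# Usage, after applying the experiment that used n_estimators=200:
#   check_n_estimators 200
```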
dvc exp run -S key=value changes a parameter for that experiment only. It does not permanently change the tracked workspace until you run dvc exp apply.
n_estimators: 100 already exists in params.yaml, so you do not need to run another experiment with 100 unless you want to reproduce the baseline.
Choose by f1_score because the problem asks for the best f1_score, not the best accuracy.
dvc exp show --json is useful for automation, but plain dvc exp show or the sorted output is easier for manual comparison.
If main shows FileNotFoundError for metrics.json, the baseline commit likely does not contain that metric file. Experiments can still have metrics if they generated metrics.json.
After dvc exp apply, commit the changed tracked files such as params.yaml, metrics.json, and dvc.lock if the lab expects the promoted result to persist in Git.
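The final commit step might look like the sketch below. commit_promoted is a hypothetical helper and the default commit message is an assumption. Whether metrics.json is tracked by Git directly depends on how the pipeline declares it, so the function stages only the files that actually exist in the working directory.

```shell
# Hypothetical helper: stage whichever promoted files exist, then commit them.
commit_promoted() {
    msg="${1:-Promote best n_estimators experiment}"
    for f in params.yaml metrics.json dvc.lock; do
        if [ -e "$f" ]; then
            git add "$f" || return 1
        fi
    done
    git commit -m "$msg"
}

# Usage, from /root/code/fraud-detection/ after dvc exp apply:
#   commit_promoted "Promote experiment with best f1_score"
```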