100-days-mlops-kodekloud

Log an ML Experiment to MLflow

Problem

An xFusionCorp Industries data scientist needs a training run recorded in MLflow so the team has a baseline record on the tracking dashboard. The non-MLflow scaffolding has already been written at /root/code/log_experiment.py; the MLflow logging calls are left as TODO blocks. Your task is to complete the script so that every element of the run is captured by the MLflow tracking server.

  1. The MLflow tracking server is already running on port 5000. Click the MLflow UI button at the top of the lab to view the dashboard; the Default experiment is present on first load.

  2. Open /root/code/log_experiment.py in the VS Code editor. The script prepares a params dictionary, fits a trivial sklearn model, and advertises a pair of synthetic evaluation scores (accuracy and f1). The three blocks marked # TODO inside the mlflow.start_run() context are the only edits required.

  3. Execute the script once (python3 /root/code/log_experiment.py) after the TODOs are completed. The end state must include:

    • Exactly one new run in the Default experiment.
    • Every hyperparameter in the params dict (n_estimators=100, max_depth=5, random_state=42) recorded as a run parameter.
    • Both advertised scores (accuracy, f1_score) recorded as run metrics.
    • The sklearn model captured as an MLflow model artefact on the run.

The result can be confirmed in the MLflow UI: once the run is opened, the Parameters, Metrics, and Artifacts panels each show the expected content.
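
The same end state could also be checked from Python rather than by eye. The following is a minimal sketch, not part of the lab: the helper names find_problems and fetch_latest_default_run are hypothetical, the tracking URI http://localhost:5000 is assumed from the port mentioned above, and the Default experiment is assumed to have ID "0". It does not check the model artefact, since where that appears can depend on the MLflow version.

```python
# Expected end state, taken from the task description.
# MLflow stores parameter values as strings.
EXPECTED_PARAMS = {"n_estimators": "100", "max_depth": "5", "random_state": "42"}
EXPECTED_METRICS = {"accuracy", "f1_score"}


def find_problems(logged_params, logged_metrics):
    """Return a list of readable problems; an empty list means the run looks right."""
    problems = []
    for key, expected in EXPECTED_PARAMS.items():
        actual = logged_params.get(key)
        if actual is None:
            problems.append(f"missing param: {key}")
        elif str(actual) != expected:
            problems.append(f"param {key}: expected {expected}, got {actual}")
    for name in sorted(EXPECTED_METRICS - set(logged_metrics)):
        problems.append(f"missing metric: {name}")
    return problems


def fetch_latest_default_run(tracking_uri="http://localhost:5000"):
    """Fetch the params and metrics of the newest run in the Default experiment."""
    # Imported lazily so the checker above works without mlflow installed.
    from mlflow.tracking import MlflowClient

    client = MlflowClient(tracking_uri=tracking_uri)
    runs = client.search_runs(
        experiment_ids=["0"],  # assumption: the Default experiment has ID "0"
        order_by=["attributes.start_time DESC"],
        max_results=1,
    )
    if not runs:
        raise RuntimeError("no runs found in the Default experiment")
    return runs[0].data.params, runs[0].data.metrics
```

After the script has been executed, find_problems(*fetch_latest_default_run()) should return an empty list if the parameters and metrics were logged as required.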

Solution

  1. Add the following Python lines under the matching TODO blocks:

     # TODO 1
     mlflow.log_params(params)
     # TODO 2
     mlflow.log_metric("accuracy", accuracy)
     mlflow.log_metric("f1_score", f1)
     # TODO 3
     mlflow.sklearn.log_model(model, name="model")
    

    See the full log_experiment.py script

  2. Execute log_experiment.py:

     python3 /root/code/log_experiment.py
    
  3. Click on the MLflow UI button to open the dashboard and confirm that the run is logged with the expected parameters, metrics, and model artefact.
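
As an aside, MLflow also provides mlflow.log_metrics, which accepts a dictionary, so the two log_metric calls in step 1 could be collapsed into one. The sketch below assumes that approach; the helper log_params_and_metrics and the RecordingTracker class are hypothetical names introduced here for illustration. The tracker is passed in as an argument (normally the mlflow module itself) so the helper can be dry-run without a tracking server.

```python
def log_params_and_metrics(tracker, params, metrics):
    """Log a params dict and a metrics dict through `tracker`.

    `tracker` is anything exposing log_params(dict) and log_metrics(dict);
    the mlflow module fits when this is called inside mlflow.start_run().
    """
    tracker.log_params(params)
    # MLflow metrics must be numeric, so coerce the values defensively.
    tracker.log_metrics({name: float(value) for name, value in metrics.items()})


class RecordingTracker:
    """Hypothetical stand-in for mlflow that only remembers what was logged."""

    def __init__(self):
        self.params = {}
        self.metrics = {}

    def log_params(self, params):
        self.params.update(params)

    def log_metrics(self, metrics):
        self.metrics.update(metrics)
```

Inside the with mlflow.start_run(): block this would be called as log_params_and_metrics(mlflow, params, {"accuracy": accuracy, "f1_score": f1}). The model still needs its own mlflow.sklearn.log_model call, as in TODO 3.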

Good to Know?