MLflow
Overview
MLflow is an open-source platform for managing the machine learning (ML) lifecycle, including experimentation, reproducibility, deployment, and a central model registry. To learn more about MLflow, please refer to its documentation.
Prerequisites
To use MLflow on Summit, load the modules as shown below:
$ module load workflows
$ module load mlflow/1.22.0
Run the following command to verify that MLflow is available:
$ mlflow --version
mlflow, version 1.22.0
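The same check can be made from inside Python, which may be handy at the top of a longer script. The following is a minimal sketch, assuming a Python 3.8 or newer interpreter (for `importlib.metadata`); the helper name `installed_version` is hypothetical and not part of MLflow.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("mlflow")
    if version is None:
        # Most likely the mlflow module has not been loaded in this shell.
        print("mlflow is not available in this Python environment")
    else:
        print("mlflow, version", version)
```

If the modules above are loaded, the reported version should match the one printed by `mlflow --version`.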
Hello world!
To run this MLflow demo on Summit, you will create a directory with two files and then submit a batch job to LSF from a Summit login node.
First, create a directory named mlflow-example to contain two files. The first will be named MLproject:
name: demo

entry_points:
  main:
    command: "python3 demo.py"
The second will be named demo.py:
import mlflow

# Report the MLflow version and where run data will be recorded.
print("MLflow Version:", mlflow.version.VERSION)
print("Tracking URI:", mlflow.tracking.get_tracking_uri())

# Start a run, write a small file, and log it as an artifact of that run.
with mlflow.start_run() as run:
    print("Run ID:", run.info.run_id)
    print("Artifact URI:", mlflow.get_artifact_uri())
    with open("hello.txt", "w") as f:
        f.write("Hello world!")
    mlflow.log_artifact("hello.txt")
Next, create an LSF batch script called mlflow_demo.lsf, and change abc123 to match your own project identifier:
#BSUB -P abc123
#BSUB -W 10
#BSUB -nnodes 1
#BSUB -J mlflow_demo
#BSUB -o mlflow_demo.o%J
#BSUB -e mlflow_demo.e%J
module load git
module load workflows
module load mlflow/1.22.0
jsrun -n 1 mlflow run ./mlflow-example --no-conda
Finally, submit the batch job to LSF by executing the following command from a Summit login node:
$ bsub mlflow_demo.lsf
Congratulations! Once the job completes, check the standard output file (mlflow_demo.o followed by the job ID) for the tracking URI and artifact URI printed by demo.py. These show where MLflow stored the run and its artifacts.
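To pull those values out of the job output programmatically, a short script can scan the file for the labels that demo.py prints. This is a minimal sketch using only the standard library; the helper name `parse_demo_output` is hypothetical, and it assumes the output lines appear exactly as demo.py prints them (a label, a colon, a space, then the value).

```python
from pathlib import Path

# Labels printed by demo.py; each value follows the first ": " on its line.
LABELS = ("MLflow Version", "Tracking URI", "Run ID", "Artifact URI")


def parse_demo_output(text):
    """Return a dict mapping each demo.py label found in `text` to its printed value."""
    found = {}
    for line in text.splitlines():
        label, sep, value = line.partition(": ")
        if sep and label.strip() in LABELS:
            found[label.strip()] = value.strip()
    return found


if __name__ == "__main__":
    # Parse every mlflow_demo.o<jobid> file in the current directory.
    for path in sorted(Path(".").glob("mlflow_demo.o*")):
        print(path.name, parse_demo_output(path.read_text()))
```

Run it from the directory where the job was submitted. Lines that do not start with one of the four labels are ignored.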