DVCLive with DVC

Even though DVCLive does not require DVC, they can integrate in several useful ways:

  • The outputs DVCLive produces can be consumed by dvc plots and dvc metrics, making it easier to add metrics logging to DVC stages. Those same outputs can be visualized in DVC Studio.
  • You can monitor model performance in real time with the HTML report that DVCLive generates when used alongside DVC.
  • DVCLive can also generate the checkpoint signal files used by DVC experiments.

Setup

We will refer to a training script (train.py) already using dvclive:

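# train.py

from dvclive import Live

# NUM_EPOCHS, train_model and evaluate_model are placeholders for your own code.
live = Live()

for epoch in range(NUM_EPOCHS):
    train_model(...)
    metrics = evaluate_model(...)

    # Log every metric value for the current step.
    for metric_name, value in metrics.items():
        live.log(metric_name, value)

    # Mark the end of the current step (here, one epoch).
    live.next_step()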
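
To make the step semantics of this loop concrete, here is a small stand-in that uses only the standard library. ToyLive and fake_evaluate are hypothetical names invented for this sketch; this is not DVCLive's implementation. It only illustrates the assumption that every value logged before a next_step() call is recorded under the same step number.

```python
from collections import defaultdict


class ToyLive:
    """Hypothetical in-memory stand-in for a step-based metrics logger."""

    def __init__(self):
        self.step = 0
        # metric name -> list of (step, value) pairs
        self.history = defaultdict(list)

    def log(self, name, value):
        # Record the value under the current step.
        self.history[name].append((self.step, float(value)))

    def next_step(self):
        # Close the current step; later values belong to the next one.
        self.step += 1

    def summary(self):
        # Latest logged value of every metric.
        return {name: points[-1][1] for name, points in self.history.items()}


def fake_evaluate(epoch):
    # Made-up metric values so the sketch runs without a real model.
    return {"loss": 1.0 / (epoch + 1)}


def run(num_epochs):
    live = ToyLive()
    for epoch in range(num_epochs):
        for name, value in fake_evaluate(epoch).items():
            live.log(name, value)
        live.next_step()
    return live


if __name__ == "__main__":
    live = run(3)
    print(dict(live.history))
    print(live.summary())
```

The history plays the role of the per-metric logs and summary() the role of the latest-values summary described later in this page.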

Let's use dvc stage add to create a stage to wrap this code (don't forget to dvc init first):

$ dvc stage add -n train --live training_metrics \
                -d train.py python train.py

dvc.yaml will contain a new train stage with the DVCLive configuration (in the live field):

stages:
  train:
    cmd: python train.py
    deps:
      - train.py
    live:
      training_metrics:
        summary: true
        html: true

The value passed to --live (training_metrics) is the directory where DVCLive writes its logs, and DVC now tracks that directory. Other supported command options for the DVC integration:

  • --live-no-cache <path> - specify a DVCLive log directory path but don't track it with DVC. Useful if you prefer to track it with Git.
  • --live-no-summary - deactivates summary generation.
  • --live-no-html - deactivates HTML report generation.

Note that summary files are never tracked by DVC.
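
If you create stages from a script, the options above can be combined programmatically. The following is a minimal sketch: build_stage_add_args is a hypothetical helper, not part of DVC, and it only assembles the argument list without running dvc. The flags themselves are the ones listed above.

```python
import shlex


def build_stage_add_args(name, live_dir, deps, command,
                         cache=True, summary=True, html=True):
    """Assemble an argument list for `dvc stage add` with DVCLive options."""
    args = ["dvc", "stage", "add", "-n", name]
    # --live tracks the log directory with DVC; --live-no-cache does not.
    args += ["--live" if cache else "--live-no-cache", live_dir]
    if not summary:
        args.append("--live-no-summary")
    if not html:
        args.append("--live-no-html")
    for dep in deps:
        args += ["-d", dep]
    # The stage command goes last, split into separate arguments.
    args += shlex.split(command)
    return args


if __name__ == "__main__":
    args = build_stage_add_args(
        "train", "training_metrics", ["train.py"], "python train.py", html=False
    )
    # Print the command instead of running it.
    print(shlex.join(args))
```

The resulting list could be passed to subprocess.run(args, check=True) in an initialized DVC project.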

Run the training with dvc repro or dvc exp run:

$ dvc repro train

Outputs

After that's finished, you should see the following content in the project:

$ ls
dvc.lock  training_metrics       train.py
dvc.yaml  training_metrics.json

The .tsv files generated under training_metrics can be visualized with dvc plots.

In addition, training_metrics.json can be used by dvc metrics and visualized with dvc exp show/dvc exp diff.
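
The same files can also be read directly from Python, for example in a notebook. This is a minimal sketch using only the standard library. It assumes that each .tsv file has a header row with a step column plus a column named after the file (for example, loss.tsv with a loss column), and that the summary is a flat JSON object. Check your own files before relying on that layout; the function names are hypothetical.

```python
import csv
import json
from pathlib import Path


def load_summary(path):
    """Return the contents of the JSON summary as a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def load_history(tsv_path, metric):
    """Return [(step, value), ...] from one per-metric .tsv file.

    Assumes a header row with a `step` column and a `metric` column.
    """
    with open(tsv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        return [(int(row["step"]), float(row[metric])) for row in reader]


def load_all_histories(log_dir):
    """Map file stem -> history for every .tsv file under log_dir."""
    return {
        path.stem: load_history(path, path.stem)
        for path in sorted(Path(log_dir).rglob("*.tsv"))
    }


if __name__ == "__main__":
    # Only read the outputs if the stage has already been run here.
    if Path("training_metrics.json").exists():
        print(load_summary("training_metrics.json"))
    if Path("training_metrics").is_dir():
        print(load_all_histories("training_metrics"))
```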

DVC Studio

DVC Studio automatically parses the outputs generated by DVCLive, so you can compare and visualize your DVCLive experiments there.

[Figure: DVCLive plots in DVC Studio]

HTML report

In addition to the outputs described in the Quickstart, DVC generates an HTML report.

If you open training_metrics_dvc_plots/index.html in a browser, you'll see metric plots that update automatically during model training.

Checkpoints

When used alongside DVC, DVCLive can create the checkpoint signal files used by DVC experiments.

This saves the metrics, plots, models, etc. associated with each step.

You can learn more about how to use them in the Checkpoints User Guide.
