
DVCLive with DVC

Even though DVCLive does not require DVC, the two integrate in a couple of useful ways, described in the sections below.

Setup

If you use one of the supported ML Frameworks, you can jump to its corresponding page to find a usage example.

The rest of this page refers to a training script (train.py) that already uses dvclive:

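# train.py

from dvclive import Live

# NUM_EPOCHS, train_model() and evaluate_model() are placeholders for your own code.

# Outputs are written under the "training_metrics" path.
live = Live("training_metrics")

for epoch in range(NUM_EPOCHS):
    train_model(...)
    metrics = evaluate_model(...)

    # Log every metric for the current step.
    for metric_name, value in metrics.items():
        live.log(metric_name, value)

    # Signal that the current step is finished.
    live.next_step()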

Let's use dvc stage add to create a stage to wrap this code (don't forget to dvc init first):

$ dvc stage add \
--name train \
--deps train.py \
--metrics-no-cache training_metrics.json \
--plots-no-cache training_metrics/scalars \
python train.py

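Note that the paths passed to the --metrics-no-cache and --plots-no-cache options are derived from the path passed to Live() in the Python code ("training_metrics"): the metrics summary is training_metrics.json and the metrics history is in training_metrics/scalars.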
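The naming convention in this example can be sketched in plain Python. The helper dvclive_outputs below is hypothetical (it is not part of DVCLive or DVC); it only mirrors the layout shown on this page, assuming the summary is written to <path>.json and the metrics history to <path>/scalars.

```python
from pathlib import Path


def dvclive_outputs(live_path: str) -> dict:
    """Hypothetical helper: mirror the output layout used on this page."""
    base = Path(live_path)
    return {
        # Summary file, as passed to --metrics-no-cache
        "metrics": base.with_name(base.name + ".json").as_posix(),
        # History directory, as passed to --plots-no-cache
        "plots": (base / "scalars").as_posix(),
    }


outputs = dvclive_outputs("training_metrics")
print(outputs["metrics"])
print(outputs["plots"])
```

If you change the path given to Live(), the two options of dvc stage add need to change accordingly.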

dvc.yaml will contain a new train stage using the DVCLive outputs as dvc metrics and dvc plots:

stages:
  train:
    cmd: python train.py
    deps:
      - train.py
    metrics:
      - training_metrics.json:
          cache: false
    plots:
      - training_metrics/scalars:
          cache: false

Run the training with dvc repro or dvc exp run:

$ dvc repro train

Outputs

After that's finished, you should see the following content in the project:

$ tree
├── dvc.lock
├── dvc.yaml
├── training_metrics
│   ├── report.html
│   └── scalars
│       ├── acc.tsv
│       └── loss.tsv
├── training_metrics.json
└── train.py

The metrics summary in training_metrics.json can be used by dvc metrics and visualized with dvc exp show/dvc exp diff.
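Because the summary is a plain JSON file, it can also be inspected without DVC. The following is a minimal sketch using only the standard library; load_summary is a hypothetical helper name, and the exact keys in the file depend on what your script logged.

```python
import json
from pathlib import Path


def load_summary(path="training_metrics.json"):
    """Return the metrics summary as a dict, or {} if the file is missing."""
    summary_file = Path(path)
    if not summary_file.exists():
        return {}
    with summary_file.open() as f:
        return json.load(f)


# Print whatever the training run wrote to the summary file.
for name, value in load_summary().items():
    print(f"{name}: {value}")
```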

The metrics history training_metrics/scalars can be visualized with dvc plots.
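The history files can likewise be read directly. This sketch assumes each .tsv file has a header row with a step column and a column named after the metric (for example acc in acc.tsv); that layout is an assumption about the file format, so check one of your own files before relying on it. read_history is a hypothetical helper, not a DVCLive function.

```python
import csv
from pathlib import Path


def read_history(tsv_path):
    """Return (step, value) pairs from one metric's TSV file."""
    tsv_path = Path(tsv_path)
    metric = tsv_path.stem  # assumed column name, e.g. "acc" for acc.tsv
    with tsv_path.open(newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        return [(int(row["step"]), float(row[metric])) for row in reader]


scalars_dir = Path("training_metrics/scalars")
if scalars_dir.is_dir():
    for tsv_file in sorted(scalars_dir.glob("*.tsv")):
        print(tsv_file.stem, read_history(tsv_file))
```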

The HTML report in training_metrics/report.html contains all the logged data and is updated automatically during training, each time the step is updated.

If you don't update the step number, the HTML report won't be generated unless you call Live.make_report() directly.

Iterative Studio

Iterative Studio automatically parses the outputs generated by DVCLive, allowing you to compare and visualize the experiments that use it.


Checkpoints

When used alongside DVC, DVCLive can create the checkpoint signal files used by DVC experiments.

These checkpoints save all the outputs (metrics, plots, models, etc.) associated with each step.

You can learn more about how to use them in the Checkpoints User Guide.

If you don't update the step number, checkpoints won't be created.
