Get Started: Experiments

New in DVC 2.0

Experiments proliferate quickly in ML projects where there are many parameters to tune or other permutations of the code. We can organize such projects and keep only what we ultimately need with dvc experiments. DVC can track experiments for you so there's no need to commit each one to Git. This way your repo doesn't become polluted with all of them. You can discard experiments once they're no longer needed.

📖 See Experiment Management for more information on DVC's approach.

Running experiments

Previously, we learned how to tune ML pipelines and compare the changes. Let's further increase the number of features in the featurize stage to see how it compares.

dvc exp run makes it easy to change hyperparameters and run a new experiment:

$ dvc exp run --set-param featurize.max_features=3000
💡 What happens under the hood:

dvc exp run is similar to dvc repro but with some added conveniences for running experiments. The --set-param (or -S) flag sets the values for parameters in params.yaml from the command line.

Check that the featurize.max_features value has been updated in params.yaml:

 featurize:
-  max_features: 1500
+  max_features: 3000

Any edits to dependencies (parameters or source code) will be reflected in the experiment run.
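To make the effect of --set-param more concrete, here is a minimal Python sketch of updating a dotted parameter path in a nested mapping. It only illustrates the idea and is not DVC's implementation: the set_param helper is hypothetical, and the dictionary stands in for the relevant part of params.yaml.

```python
from copy import deepcopy


def set_param(params, dotted_key, value):
    """Return a copy of `params` with the nested key `dotted_key` set to `value`.

    Hypothetical helper for illustration only; not part of DVC.
    """
    updated = deepcopy(params)
    node = updated
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        # Walk down the nested mapping, creating levels that don't exist yet.
        node = node.setdefault(key, {})
    node[leaf] = value
    return updated


# Stand-in for the featurize section of params.yaml before the experiment.
params = {"featurize": {"max_features": 1500}}

updated = set_param(params, "featurize.max_features", 3000)
print(updated)
```

The original mapping is left untouched, which mirrors how the old and new values can still be compared after the run.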

dvc exp diff compares experiments:

$ dvc exp diff
Path         Metric    Value    Change
scores.json  avg_prec  0.56191  0.009322
scores.json  roc_auc   0.93345  0.018087

Path         Param                   Value    Change
params.yaml  featurize.max_features  3000     1500
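The Change column appears to be the new value minus the baseline value (for the parameter, 3000 - 1500 = 1500). The sketch below reproduces that arithmetic for the metrics. It assumes the baseline is the master row of the experiments table shown later on this page, and it uses the rounded values from that table, so the results differ slightly from the full-precision output above. The metric_changes function is a hypothetical name, not a DVC API.

```python
# Rounded metrics taken from this page: master (baseline) and the new experiment.
baseline = {"avg_prec": 0.55259, "roc_auc": 0.91536}
experiment = {"avg_prec": 0.56191, "roc_auc": 0.93345}


def metric_changes(old, new):
    """Return new - old for every metric present in both mappings."""
    return {name: new[name] - old[name] for name in new if name in old}


changes = metric_changes(baseline, experiment)
for name, delta in changes.items():
    print(f"{name}: {experiment[name]} ({delta:+.5f})")
```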

Queueing experiments

So far, we have been tuning the featurize stage, but there are also parameters for the train stage (which trains a random forest classifier).

These are the train parameters from params.yaml:

train:
  seed: 20170428
  n_est: 50
  min_split: 2

Let's set up experiments with different hyperparameters. We can use the --queue flag to define all the combinations we want to try without executing anything (yet):

$ dvc exp run --queue -S train.min_split=8
Queued experiment 'd3f6d1e' for future execution.
$ dvc exp run --queue -S train.min_split=64
Queued experiment 'f1810e0' for future execution.
$ dvc exp run --queue -S train.min_split=2 -S train.n_est=100
Queued experiment '7323ea2' for future execution.
$ dvc exp run --queue -S train.min_split=8 -S train.n_est=100
Queued experiment 'c605382' for future execution.
$ dvc exp run --queue -S train.min_split=64 -S train.n_est=100
Queued experiment '0cdee86' for future execution.
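Typing one command per combination gets tedious as the grid grows. As a rough sketch, the same five commands could be generated in Python with itertools.product. The variable names are illustrative, the baseline combination (n_est=50, min_split=2) is skipped because it matches the current params.yaml, and train.n_est is passed explicitly even where the commands above rely on its current value of 50. The script only prints the commands; it does not run DVC.

```python
import shlex
from itertools import product

# Hyperparameter grid from this section.
n_est_values = [50, 100]
min_split_values = [2, 8, 64]

# The combination already present in params.yaml, so it is not queued again.
baseline = (50, 2)

commands = []
for n_est, min_split in product(n_est_values, min_split_values):
    if (n_est, min_split) == baseline:
        continue
    commands.append([
        "dvc", "exp", "run", "--queue",
        "-S", f"train.min_split={min_split}",
        "-S", f"train.n_est={n_est}",
    ])

for argv in commands:
    # Print a shell-ready command line instead of executing it.
    print(shlex.join(argv))
```

Each argument list could be handed to subprocess.run to queue the experiments for real, assuming dvc is installed and the script runs inside the project.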

Next, run all (--run-all) queued experiments in parallel (using --jobs):

$ dvc exp run --run-all --jobs 2

Comparing many experiments

To compare all of these experiments, we need more than diff. dvc exp show compares any number of experiments in one table:

$ dvc exp show --no-timestamp \
               --include-params train.n_est,train.min_split
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
โ”ƒ neutral:**Experiment**    โ”ƒ metric:**avg_prec** โ”ƒ metric:**roc_auc** โ”ƒ param:**train.n_est**โ”ƒ param:**train.min_split** โ”ƒ
โ”กโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ฉ
โ”‚ **workspace**     โ”‚  **0.56191** โ”‚ **0.93345** โ”‚ **50**         โ”‚ **2**               โ”‚
โ”‚ **master**        โ”‚  **0.55259** โ”‚ **0.91536** โ”‚ **50**         โ”‚ **2**               โ”‚
โ”‚ โ”œโ”€โ”€ exp-bfe64 โ”‚  0.57833 โ”‚ 0.95555 โ”‚ 50         โ”‚ 8               โ”‚
โ”‚ โ”œโ”€โ”€ exp-b8082 โ”‚  0.59806 โ”‚ 0.95287 โ”‚ 50         โ”‚ 64              โ”‚
โ”‚ โ”œโ”€โ”€ exp-c7250 โ”‚  0.58876 โ”‚ 0.94524 โ”‚ 100        โ”‚ 2               โ”‚
โ”‚ โ”œโ”€โ”€ exp-b9cd4 โ”‚  0.57953 โ”‚ 0.95732 โ”‚ 100        โ”‚ 8               โ”‚
โ”‚ โ”œโ”€โ”€ exp-98a96 โ”‚  0.60405 โ”‚  0.9608 โ”‚ 100        โ”‚ 64              โ”‚
โ”‚ โ””โ”€โ”€ exp-ad5b1 โ”‚  0.56191 โ”‚ 0.93345 โ”‚ 50         โ”‚ 2               โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Each experiment is given an arbitrary name by default (although we can specify one with dvc exp run -n). We can see that exp-98a96 performed best on both of our metrics, with 100 estimators and a minimum of 64 samples to split a node.

See dvc exp show --help for more info on its options.
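Picking the winner by eye works for a handful of rows. For larger tables, the same comparison can be scripted. The sketch below copies the metric values from the table above into a plain dictionary (it does not read anything from DVC) and finds the best experiment for each metric.

```python
# Metrics copied from the experiments table above.
experiments = {
    "exp-bfe64": {"avg_prec": 0.57833, "roc_auc": 0.95555},
    "exp-b8082": {"avg_prec": 0.59806, "roc_auc": 0.95287},
    "exp-c7250": {"avg_prec": 0.58876, "roc_auc": 0.94524},
    "exp-b9cd4": {"avg_prec": 0.57953, "roc_auc": 0.95732},
    "exp-98a96": {"avg_prec": 0.60405, "roc_auc": 0.9608},
    "exp-ad5b1": {"avg_prec": 0.56191, "roc_auc": 0.93345},
}

# For each metric, find the experiment with the highest value.
best_by_metric = {
    metric: max(experiments, key=lambda name: experiments[name][metric])
    for metric in ("avg_prec", "roc_auc")
}

for metric, name in best_by_metric.items():
    print(f"{metric}: {name} ({experiments[name][metric]})")
```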

Persisting experiments

Now that we know the best parameters, let's keep that experiment and ignore the rest.

dvc exp apply rolls the workspace back (or forward) to the specified experiment:

$ dvc exp apply exp-98a96
Changes for experiment 'exp-98a96' have been applied to your workspace.
💡 What happens under the hood:

dvc exp apply is similar to dvc checkout, but works with experiments instead. DVC tracks everything in the pipeline for each experiment (parameters, metrics, dependencies, and outputs), retrieving things later as needed.

Check that scores.json reflects the metrics in the table above:

{ "avg_prec": 0.6040544652105823, "roc_auc": 0.9608017142900953 }
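This check can also be done programmatically. The sketch below parses the JSON shown above from an inline string (rather than reading scores.json from disk) and compares it with the exp-98a96 row of the table. Formatting to five significant digits is an assumption made here to match how the table displays values; it is not a documented DVC setting.

```python
import json

# Contents of scores.json as shown above, inlined so the example is self-contained.
scores_text = '{ "avg_prec": 0.6040544652105823, "roc_auc": 0.9608017142900953 }'
scores = json.loads(scores_text)

# Values displayed for exp-98a96 in the experiments table.
table_row = {"avg_prec": "0.60405", "roc_auc": "0.9608"}

# Assumption: the table shows five significant digits without trailing zeros.
rounded = {name: format(value, ".5g") for name, value in scores.items()}
matches = rounded == table_row

print(rounded)
print("matches table:", matches)
```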

Once an experiment has been applied to the workspace, it is no different from reproducing the result without dvc exp run. Let's make it persistent in our regular pipeline by committing it in our Git branch:

$ git add dvc.lock params.yaml prc.json roc.json scores.json
$ git commit -a -m "Preserve best random forest experiment"

Sharing experiments

After committing the best experiments to our Git branch, we can store and share them remotely like any other iteration of the pipeline.

$ dvc push
$ git push
💡 Important information on storing experiments remotely.

The commands in this section require both a default DVC remote (set with dvc remote default) and a Git remote. A DVC remote stores the experiment data, and a Git remote stores the code, parameters, and other metadata associated with the experiment. DVC supports various types of remote storage (local file system, SSH, Amazon S3, Google Cloud Storage, HTTP, HDFS, etc.). The Git remote is often a central Git server (GitHub, GitLab, Bitbucket, etc.).

Experiments that have not been made persistent will not be stored or shared remotely through dvc push or git push.

dvc exp push enables storing and sharing any experiment remotely.

$ dvc exp push gitremote exp-bfe64
Pushed experiment 'exp-bfe64' to Git remote 'gitremote'.

dvc exp list shows all experiments that have been saved.

$ dvc exp list gitremote --all
72ed9cd:
        exp-bfe64

dvc exp pull retrieves the experiment from a Git remote.

$ dvc exp pull gitremote exp-bfe64
Pulled experiment 'exp-bfe64' from Git remote 'gitremote'.

All these commands take a Git remote as an argument. A default DVC remote (set with dvc remote default) is also required to share the experiment data.

Cleaning up

Let's take another look at the experiments table:

$ dvc exp show --no-timestamp \
               --include-params train.n_est,train.min_split
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
โ”ƒ neutral:**Experiment** โ”ƒ metric:**avg_prec** โ”ƒ metric:**roc_auc** โ”ƒ param:**train.n_est**โ”ƒ param:**train.min_split** โ”ƒ
โ”กโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ฉ
โ”‚ **workspace**  โ”‚  **0.60405** โ”‚  **0.9608** โ”‚ **100**        โ”‚ **64**              โ”‚
โ”‚ **master**     โ”‚  **0.60405** โ”‚  **0.9608** โ”‚ **100**        โ”‚ **64**              โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Where did all the experiments go? By default, dvc exp show only shows experiments since the last commit, but don't worry. The experiments remain cached and can be shown or applied. For example, use -n to show experiments from the previous n commits:

$ dvc exp show -n 2 --no-timestamp \
                    --include-params train.n_est,train.min_split
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
โ”ƒ neutral:**Experiment**    โ”ƒ metric:**avg_prec** โ”ƒ metric:**roc_auc** โ”ƒ param:**train.n_est**โ”ƒ param:**train.min_split** โ”ƒ
โ”กโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ฉ
โ”‚ **workspace**     โ”‚  **0.60405** โ”‚  **0.9608** โ”‚ **100**        โ”‚ **64**              โ”‚
โ”‚ **master**        โ”‚  **0.60405** โ”‚  **0.9608** โ”‚ **100**        โ”‚ **64**              โ”‚
โ”‚ **64d74b2**       โ”‚  **0.55259** โ”‚ **0.91536** โ”‚ **50**         โ”‚ **2**               โ”‚
โ”‚ โ”œโ”€โ”€ exp-bfe64 โ”‚  0.57833 โ”‚ 0.95555 โ”‚ 50         โ”‚ 8               โ”‚
โ”‚ โ”œโ”€โ”€ exp-b8082 โ”‚  0.59806 โ”‚ 0.95287 โ”‚ 50         โ”‚ 64              โ”‚
โ”‚ โ”œโ”€โ”€ exp-c7250 โ”‚  0.58876 โ”‚ 0.94524 โ”‚ 100        โ”‚ 2               โ”‚
โ”‚ โ”œโ”€โ”€ exp-98a96 โ”‚  0.60405 โ”‚  0.9608 โ”‚ 100        โ”‚ 64              โ”‚
โ”‚ โ”œโ”€โ”€ exp-b9cd4 โ”‚  0.57953 โ”‚ 0.95732 โ”‚ 100        โ”‚ 8               โ”‚
โ”‚ โ””โ”€โ”€ exp-ad5b1 โ”‚  0.56191 โ”‚ 0.93345 โ”‚ 50         โ”‚ 2               โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Eventually, old experiments may clutter the experiments table.

dvc exp gc removes all references to old experiments:

$ dvc exp gc --workspace
$ dvc exp show -n 2 --no-timestamp \
                    --include-params train.n_est,train.min_split
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
โ”ƒ neutral:**Experiment** โ”ƒ metric:**avg_prec** โ”ƒ metric:**roc_auc** โ”ƒ param:**train.n_est**โ”ƒ param:**train.min_split** โ”ƒ
โ”กโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ฉ
โ”‚ **workspace**  โ”‚  **0.60405** โ”‚  **0.9608** โ”‚ **100**        โ”‚ **64**              โ”‚
โ”‚ **master**     โ”‚  **0.60405** โ”‚  **0.9608** โ”‚ **100**        โ”‚ **64**              โ”‚
โ”‚ **64d74b2**    โ”‚  **0.55259** โ”‚ **0.91536** โ”‚ **50**         โ”‚ **2**               โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

dvc exp gc only removes references to the experiments, not the cached objects associated with them. To clean up the cache, use dvc gc.
