All of the above can be combined into experiments to run and compare many iterations of your ML project.
First, let's see the mechanism to capture values for these ML attributes. Add a final evaluation stage to our earlier pipeline:
$ dvc stage add -n evaluate \
    -d src/evaluate.py -d model.pkl -d data/features \
    -o eval \
    python src/evaluate.py model.pkl data/features
evaluate.py uses DVCLive to write scalar metrics values (e.g. AUC) and plots data (e.g. ROC curve) to files in the eval directory that DVC can parse to compare and visualize across iterations. By default, DVCLive will configure metrics and plots for you in dvc.yaml, but in this example we customize them by editing dvc.yaml to combine train and test plots.
Let's run and save these changes:
$ dvc repro
$ git add .gitignore dvc.yaml dvc.lock eval
$ git commit -a -m "Create evaluation stage"
You can view metrics and plots from the command line, or you can load your project in VS Code and use the DVC Extension to view metrics, plots, and more.
You can view tracked metrics with dvc metrics show:
$ dvc metrics show
Path               avg_prec.test  avg_prec.train  roc_auc.test  roc_auc.train
eval/metrics.json  0.94496        0.97723         0.96191       0.98737
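The dotted column names in this output correspond to nested keys in the JSON metrics file. As a rough illustration, the sketch below writes a nested metrics file using only the Python standard library and then flattens its keys into dotted names. It is not DVCLive and not the actual evaluate.py: the helper names write_metrics and flatten are hypothetical, and the file is written to a temporary directory rather than to the project's eval directory.

```python
import json
import tempfile
from pathlib import Path


def write_metrics(path, metrics):
    """Write a nested metrics dict as JSON, creating parent directories."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metrics, indent=2))


def flatten(nested, prefix=""):
    """Flatten nested keys into dotted names: {"a": {"b": 1}} -> {"a.b": 1}."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Values copied from the `dvc metrics show` output above.
metrics = {
    "avg_prec": {"test": 0.94496, "train": 0.97723},
    "roc_auc": {"test": 0.96191, "train": 0.98737},
}

# Hypothetical location: a temporary stand-in for eval/metrics.json.
with tempfile.TemporaryDirectory() as tmp:
    metrics_file = Path(tmp) / "eval" / "metrics.json"
    write_metrics(metrics_file, metrics)
    loaded = json.loads(metrics_file.read_text())

flat = flatten(loaded)
for name in sorted(flat):
    print(name, flat[name])
```

Grouping train and test values under one metric name keeps related numbers together in the file while still giving each one a distinct dotted name in the report.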
You can view plots with dvc plots show (shown below), which generates an HTML file you can open in a browser.
$ dvc plots show
file:///Users/dvc/example-get-started/dvc_plots/index.html
Later we will see how to compare and visualize different pipeline iterations. For now, let's see how to capture another important piece of information which will be useful for comparison: parameters.
It's pretty common for data science pipelines to include configuration files that define adjustable parameters to train a model, do pre-processing, etc. DVC provides a mechanism for stages to depend on the values of specific sections of such a config file (YAML, JSON, TOML, and Python formats are supported).
featurize:
  cmd: python src/featurization.py data/prepared data/features
  deps:
    - data/prepared
    - src/featurization.py
  params:
    - featurize.max_features
    - featurize.ngrams
  outs:
    - data/features
The params section defines the parameter dependencies of the featurize stage. By default, DVC reads those values (featurize.max_features and featurize.ngrams) from a params.yaml file. But as with metrics and plots, parameter file names and structure can also be user- and case-defined.
Here are the contents of our params.yaml file:
prepare:
  split: 0.20
  seed: 20170428

featurize:
  max_features: 100
  ngrams: 1

train:
  seed: 20170428
  n_est: 50
  min_split: 2
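To illustrate what it means for a stage to depend on specific parameter values rather than on the whole file, here is a minimal sketch that resolves dotted keys such as featurize.ngrams against a nested structure. The parameters are written as a Python dict mirroring the file above, and the helpers lookup and stage_params are hypothetical names for this example only; DVC's own resolution logic is not shown here.

```python
from functools import reduce

# Mirrors the params.yaml contents shown above, written as a Python dict.
params = {
    "prepare": {"split": 0.20, "seed": 20170428},
    "featurize": {"max_features": 100, "ngrams": 1},
    "train": {"seed": 20170428, "n_est": 50, "min_split": 2},
}

# The keys listed in the params section of the featurize stage.
featurize_keys = ["featurize.max_features", "featurize.ngrams"]


def lookup(tree, dotted_key):
    """Walk a nested dict following a dotted key such as 'featurize.ngrams'."""
    return reduce(lambda node, part: node[part], dotted_key.split("."), tree)


def stage_params(tree, keys):
    """Collect only the values that a stage declares as parameter dependencies."""
    return {key: lookup(tree, key) for key in keys}


before = stage_params(params, featurize_keys)

# Changing a parameter the stage does not list leaves its view unchanged.
params["train"]["n_est"] = 100
after_unrelated = stage_params(params, featurize_keys)

# Changing a listed parameter alters the values the stage depends on.
params["featurize"]["ngrams"] = 2
after_related = stage_params(params, featurize_keys)

print(before)
print(after_unrelated == before)
print(after_related == before)
```

In this simplified picture, only the last edit would give the featurize stage a reason to run again.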
We are definitely not happy with the AUC value we got so far! Let's edit the params.yaml file to use bigrams and increase the number of features:
 featurize:
-  max_features: 100
-  ngrams: 1
+  max_features: 200
+  ngrams: 2
The beauty of dvc.yaml is that all you need to do now is run:
$ dvc repro
It'll analyze the changes, use existing results from the run cache, and execute only the commands needed to produce new results (model, metrics, plots).
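One way to picture the change analysis is as a comparison between content hashes recorded at the last run and hashes of the files as they are now. The sketch below is a toy model under that assumption, not DVC's implementation: the helpers file_hash, snapshot, and changed_deps are hypothetical, SHA-256 is an arbitrary choice for the example, and a plain dict stands in for the recorded state.

```python
import hashlib
import tempfile
from pathlib import Path


def file_hash(path):
    """Return a hex digest of the file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def snapshot(deps):
    """Record the current hash of every dependency."""
    return {str(dep): file_hash(dep) for dep in deps}


def changed_deps(deps, recorded):
    """List dependencies whose current hash differs from the recorded one."""
    current = snapshot(deps)
    return [dep for dep, digest in current.items() if recorded.get(dep) != digest]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for a stage's source file.
    script = Path(tmp) / "featurization.py"
    script.write_text("print('v1')\n")
    deps = [script]

    recorded = snapshot(deps)                 # state saved after a run
    unchanged = changed_deps(deps, recorded)  # nothing edited yet

    script.write_text("print('v2')\n")        # edit the source code
    changed = changed_deps(deps, recorded)
    changed_names = [Path(dep).name for dep in changed]

print(unchanged)
print(changed_names)
```

A stage with no changed dependencies can be skipped, while a stage with at least one changed dependency needs to run again, along with the stages that consume its outputs.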
The same logic applies to other possible adjustments, such as editing source code or updating datasets: you make the changes, run dvc repro, and DVC runs what needs to be run.
Finally, let's see how the updates improved performance. DVC has a few commands to inspect and visualize changes in metrics, parameters, and plots. These commands can work on a single pipeline iteration or across several. Let's compare the current "bigrams" run with the last committed "baseline" iteration:
$ dvc params diff
Path         Param                   HEAD  workspace
params.yaml  featurize.max_features  100   200
params.yaml  featurize.ngrams        1     2
dvc params diff can show how params in the workspace differ vs. the last commit.
dvc metrics diff does the same for metrics:
$ dvc metrics diff
Path               Metric          HEAD     workspace  Change
eval/metrics.json  avg_prec.test   0.9014   0.925      0.0236
eval/metrics.json  avg_prec.train  0.95704  0.97437    0.01733
eval/metrics.json  roc_auc.test    0.93196  0.94602    0.01406
eval/metrics.json  roc_auc.train   0.97743  0.98667    0.00924
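The Change column is the workspace value minus the HEAD value for each metric. The following sketch reproduces that arithmetic from the numbers in the table above. The function metrics_diff is a hypothetical helper written for this example, and rounding to five decimal places is an assumption made to keep floating-point noise out of the result.

```python
# Values copied from the HEAD and workspace columns of the table above.
head = {
    "avg_prec.test": 0.9014,
    "avg_prec.train": 0.95704,
    "roc_auc.test": 0.93196,
    "roc_auc.train": 0.97743,
}
workspace = {
    "avg_prec.test": 0.925,
    "avg_prec.train": 0.97437,
    "roc_auc.test": 0.94602,
    "roc_auc.train": 0.98667,
}


def metrics_diff(old, new, digits=5):
    """Return new minus old for every metric present in both mappings."""
    shared = sorted(old.keys() & new.keys())
    return {name: round(new[name] - old[name], digits) for name in shared}


changes = metrics_diff(head, workspace)
for name, delta in changes.items():
    print(f"{name:<16} {head[name]:<8} {workspace[name]:<8} {delta}")
```

Every difference here is positive, meaning each metric is higher in the workspace than at HEAD.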
And finally, we can compare all plots with a single command (we show only some of them for simplicity):
$ dvc plots diff
file:///Users/dvc/example-get-started/plots.html
See dvc plots diff for more info on its options.
All these commands also accept Git revisions (commits, tags, branch names) to compare.