dag

Visualize the pipeline(s) in dvc.yaml as one or more graph(s) of connected stages.

Synopsis

usage: dvc dag [-h] [-q | -v] [--dot] [--full] [-o] [target]

positional arguments:
  target          Stage or output to show pipeline for (optional)
                  Uses all stages in the workspace by default.

Description

Displays the stages of a pipeline up to the target stage, that is, the target and its ancestors (the stages it depends on, directly or indirectly). If no target is given, the full project DAG is shown.
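To illustrate what "up to the target stage" means, here is a minimal Python sketch. It is not DVC's implementation: the `deps` mapping and the `ancestors` helper are hypothetical, and the stage names and edges are borrowed from the example at the end of this page.

```python
# Hypothetical stage graph: each stage maps to the stages it depends on.
deps = {
    "prepare": [],
    "featurize": ["prepare"],
    "train": ["featurize"],
    "evaluate": ["featurize", "train"],
}


def ancestors(graph, target):
    """Return the target plus every stage it depends on, directly or indirectly."""
    seen = set()
    stack = [target]
    while stack:
        stage = stack.pop()
        if stage not in seen:
            seen.add(stage)
            # Walk upstream: visit the stages this one depends on.
            stack.extend(graph[stage])
    return seen


print(sorted(ancestors(deps, "train")))  # ['featurize', 'prepare', 'train']
```

Under these assumptions, targeting train leaves out evaluate, because evaluate is downstream of train rather than one of its ancestors.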

Directed acyclic graph

A data pipeline, in general, is a series of data processing stages (for example, console commands that take an input and produce an outcome). The connections between stages are formed by the output of one turning into the dependency of another. A pipeline may produce intermediate data, and has a final result.

Data science and machine learning pipelines typically start with large raw datasets, include intermediate featurization and training stages, and produce a final model, as well as accuracy metrics.

In DVC, pipeline stages and commands, their data I/O, interdependencies, and results (intermediate or final) are specified in dvc.yaml, which can be written manually or built using the helper command dvc run. This allows DVC to restore one or more pipelines later (see dvc repro).

To do this, DVC builds a dependency graph (a directed acyclic graph, or DAG) from the stages and their connections.
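The idea that stages are connected when the output of one is a dependency of another can be sketched in Python as follows. This is an illustration under stated assumptions, not DVC's code: the `stages` dictionary, its file names (including `data/raw.csv`), and the `build_graph` helper are hypothetical and are not read from a real dvc.yaml.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stages; paths are illustrative only.
stages = {
    "prepare": {"deps": ["data/raw.csv"], "outs": ["data/prepared"]},
    "featurize": {"deps": ["data/prepared"], "outs": ["data/features"]},
    "train": {"deps": ["data/features"], "outs": ["model.pkl"]},
    "evaluate": {
        "deps": ["data/features", "model.pkl"],
        "outs": ["scores.json", "prc.json"],
    },
}


def build_graph(stages):
    """Map each stage to the set of stages that produce its dependencies."""
    producer = {out: name for name, s in stages.items() for out in s["outs"]}
    return {
        name: {producer[d] for d in s["deps"] if d in producer}
        for name, s in stages.items()
    }


graph = build_graph(stages)
# A valid execution order lists every stage after the stages it depends on.
order = list(TopologicalSorter(graph).static_order())
print(order)  # ['prepare', 'featurize', 'train', 'evaluate']
```

Dependencies that no stage produces (here `data/raw.csv`) create no edge. `TopologicalSorter` raises `CycleError` if the graph contains a cycle, which is the "acyclic" requirement in practice.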

Paginating the output

This command's output is automatically piped to less if available in the terminal (the exact command used is less --chop-long-lines --clear-screen). If less is not available (e.g. on Windows), the output is simply printed out.

It's also possible to enable less on Windows.

Note that this also applies to dvc exp show.

Providing a custom pager

It's possible to override the default pager via the DVC_PAGER environment variable. For example, the following command will replace the default pager with more, for a single run:

$ DVC_PAGER=more dvc dag

For a persistent change, define DVC_PAGER in the shell configuration. For example, in Bash we could add the following line to ~/.bashrc:

export DVC_PAGER=more
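The pager selection described above could be sketched in Python roughly like this. It is a simplified illustration, not DVC's actual logic: the `choose_pager` function is hypothetical, and the assumption that a missing custom pager also falls back to plain printing is mine.

```python
import os
import shutil

# Default pager command, as quoted in the text above.
DEFAULT_PAGER = "less --chop-long-lines --clear-screen"


def choose_pager(environ=None):
    """Return the pager command line to use, or None to print output directly."""
    environ = os.environ if environ is None else environ
    # DVC_PAGER, when set and non-empty, overrides the default pager.
    command = environ.get("DVC_PAGER") or DEFAULT_PAGER
    executable = command.split()[0]
    # Assumption: print directly when the pager program is not on PATH.
    return command if shutil.which(executable) else None


print(choose_pager({"DVC_PAGER": "more"}))
```

The result depends on which programs are installed on the machine, so no expected output is shown.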

Options

  • --full - show the full DAG that the target stage belongs to, instead of showing only its ancestors.
  • --dot - show the DAG in DOT format. The output can be passed to third-party visualization utilities.
  • -o, --outs - show a DAG of chained dependencies and outputs instead of the stages themselves. The graph may be significantly different.
  • -h, --help - print the usage/help message, and exit.
  • -q, --quiet - do not write anything to standard output. Exit with 0 if no problems arise, otherwise 1.
  • -v, --verbose - display detailed tracing information.
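To give a feel for the DOT format mentioned under --dot, the following Python sketch renders a list of stage connections as a DOT digraph. The `to_dot` helper and the `edges` list are hypothetical, and the exact text that dvc dag --dot emits may differ from this output.

```python
def to_dot(edges):
    """Render (upstream, downstream) stage pairs as a DOT digraph."""
    lines = ["digraph {"]
    for upstream, downstream in edges:
        # Quote node names so that names containing "/" or "." stay valid DOT.
        lines.append(f'    "{upstream}" -> "{downstream}";')
    lines.append("}")
    return "\n".join(lines)


# Illustrative edges, matching the stage example on this page.
edges = [
    ("prepare", "featurize"),
    ("featurize", "train"),
    ("featurize", "evaluate"),
    ("train", "evaluate"),
]
print(to_dot(edges))
```

Text in this format can be saved to a file and rendered by a tool that reads DOT, such as Graphviz.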

Example: Visualize a DVC Pipeline

Visualize the prepare, featurize, train, and evaluate stages of a pipeline as defined in dvc.yaml:

$ dvc dag
         +---------+
         | prepare |
         +---------+
              *
              *
              *
        +-----------+
        | featurize |
        +-----------+
         **        **
       **            *
      *               **
+-------+               *
| train |             **
+-------+            *
         **        **
           **    **
             *  *
        +----------+
        | evaluate |
        +----------+

The pipeline can also be seen from the point of view of how stage outputs/dependencies are connected (using the --outs option). Notice that the resulting graph may be different:

$ dvc dag --outs
                  +---------------+
                  | data/prepared |
                  +---------------+
                          *
                          *
                          *
                  +---------------+
                  | data/features |
                **+---------------+**
            ****          *          *****
       *****              *               ****
   ****                   *                   ****
***                 +-----------+                 ***
  **                | model.pkl |                **
    **              +-----------+              **
      **           **           **           **
        **       **               **       **
          **   **                   **   **
      +-------------+            +----------+
      | scores.json |            | prc.json |
      +-------------+            +----------+