Using DVC Commands

DVC is a command-line tool. A typical DVC workflow goes as follows:

  • In an existing Git repository, initialize a DVC project with dvc init.
  • Copy the source data files for modeling into the repository and track them with DVC using the dvc add command.
  • Process raw data by running your own data processing and modeling code through the dvc run command, using its --outs option to declare the outputs that DVC should also track once the code has executed.
  • Sharing the Git repository shares the source code of your ML pipeline, but not the project's cache. Set up remote storage and use dvc push to share the cache (the data tracked by DVC).
  • Use dvc repro to reproduce your full pipeline automatically, re-running it whenever input data or source code changes.
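
The steps above can be sketched as a small shell script. This is an illustrative sketch under stated assumptions, not an official recipe: the paths (data/raw.csv, src/train.py, model.pkl), the stage name train, the remote name myremote, and its location /tmp/dvc-storage are all hypothetical placeholders. The step helper is introduced here only so that the script prints each command by default and executes it only when DVC_EXECUTE=1 is set. The -n stage name is assumed to be needed by DVC releases that use dvc.yaml; check dvc run --help for the version you have installed.

```shell
#!/bin/sh
# Sketch of the typical DVC workflow; every path and name is a placeholder.

# Print a command, and run it only when DVC_EXECUTE=1 is set.
step() {
    echo "+ $*"
    if [ "${DVC_EXECUTE:-0}" = "1" ]; then
        "$@" || exit 1
    fi
}

dvc_workflow() {
    # 1. Initialize a DVC project inside an existing Git repository.
    step dvc init

    # 2. Track a (hypothetical) input data file with DVC.
    step dvc add data/raw.csv

    # 3. Run your own code as a stage: -d declares dependencies,
    #    --outs declares an output that DVC tracks after the run.
    step dvc run -n train -d src/train.py -d data/raw.csv --outs model.pkl python src/train.py

    # 4. Configure a default remote (assumed local directory) and upload the cache.
    step dvc remote add -d myremote /tmp/dvc-storage
    step dvc push

    # 5. Reproduce the pipeline after data or code changes.
    step dvc repro
}

dvc_workflow
```

Run the script as-is to preview the commands. To execute them, set DVC_EXECUTE=1 and run it from the root of a Git repository that contains the placeholder files, committing the small metafiles DVC generates with Git as you go.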

The command references in this section provide a precise specification, a complete description, and isolated usage examples for each command of the dvc CLI tool. These are our most technical documentation pages, similar to man pages on Linux.
