DVC is a command line tool. The typical DVC workflow goes as follows:
- In an existing Git repository, initialize a DVC project with `dvc init`.
- Copy source code files for modeling into the repository and track the files
with DVC using the `dvc add` command.
- Process raw data with your own data processing and modeling code, using the
`dvc run` command, along with its `--outs` option for outputs that should also
be tracked by DVC after the code is executed.
- A shared Git repository with the source code of your ML pipeline does not
include the project's cache. Use remote storage and `dvc push` to share this
cache (data tracked by DVC).
- Use `dvc repro` to automatically reproduce your full pipeline, iteratively, as input data or source code changes.

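The steps above can be sketched as a small shell script. This is an illustration only: the file names (`data/raw.csv`, `prepare.py`, `data/prepared.csv`), the stage name `prepare`, and the remote name and location are hypothetical placeholders, and the `-n` and `-d` flags assume a DVC release that still provides `dvc run` with named stages. By default the script only prints the commands; set `DRY_RUN=0` to execute them inside an existing Git repository.

```shell
#!/bin/sh
# Sketch of a typical DVC workflow. All paths, the stage name, and the
# remote are hypothetical placeholders, not part of any real project.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

dvc_workflow() {
    # 1. Initialize a DVC project inside an existing Git repository.
    run dvc init || return 1

    # 2. Track an input file with DVC (hypothetical path).
    run dvc add data/raw.csv || return 1

    # 3. Run your own processing code as a stage; --outs marks the output
    #    that DVC should track once the command has finished.
    run dvc run -n prepare -d prepare.py -d data/raw.csv \
        --outs data/prepared.csv python prepare.py || return 1

    # 4. Configure remote storage (a local directory here, as an
    #    assumption) and upload the cache so collaborators can fetch it.
    run dvc remote add -d storage /tmp/dvc-storage || return 1
    run dvc push || return 1

    # 5. Reproduce the pipeline after input data or source code changes.
    run dvc repro || return 1
}

dvc_workflow
```

Keeping the commands behind a dry-run switch makes it possible to review the sequence before it touches a repository; adapt the paths and flags to your own project and DVC version.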
The command references under this section provide a precise specification,
complete description, and isolated usage examples for the
dvc CLI tool. These
are our most technical documentation pages, similar to
man-pages in Linux.