Get Started with DVC
Before we begin, settle on a directory for this guide. Everything we do will be self-contained there.
Initializing a project
Inside your chosen directory, we will use the current working directory as a
DVC project. Let's initialize it by running dvc init inside a Git repository:
$ dvc init
A few internal files are created that should be added to Git:
$ git status
Changes to be committed:
        new file:   .dvc/.gitignore
        new file:   .dvc/config
        ...
$ git commit -m "Initialize DVC"
Now you're ready to DVC!
Following This Guide
To help you understand and use DVC better, this guide is organized around two use cases: data management and experiment tracking. Pick either one to start learning how DVC helps with that scenario.
Choose a trail to jump into its first chapter:
Data Management - Track and version large amounts of data along with your code, and use DVC as a build system for reproducible, data-driven pipelines.
Experiment Management - Easily track your experiments and their progress by only instrumenting your code, and collaborate on ML experiments like software engineers do for code.
Feel free to "choose your own adventure" and follow the chapters that answer your specific needs. If you're unsure where to start, we recommend starting with data management.