.dvc Files

You can use dvc add to track data files or directories located in your current workspace, or in supported external locations. Additionally, dvc import and dvc import-url let you bring data from external locations to your project, and start tracking it locally.

See Data Versioning and Data Access for more info.

Files ending with the .dvc extension ("dot DVC file") are created by these commands as data placeholders that can be versioned with Git. They contain the information needed to track the target data over time. Here's an example:

outs:
  - md5: a304afb96060aad90176268345e10355
    path: data.xml
    desc: Cats and dogs dataset

# Comments and user metadata are supported.
meta:
  name: 'Devee Bird'
  email: devee@dvc.org

These files use the YAML 1.2 file format, and a human-friendly schema described below. We encourage you to get familiar with it so you may modify, write, or generate .dvc files on your own.

See also How to Merge Conflicts.

Specification

These are the fields that are accepted at the root level of the .dvc file schema:

| Field | Description |
| --- | --- |
| outs | (Required) List of output entries (details below) that represent the files or directories tracked with DVC. Typically there is only one (but several can be added or combined manually). |
| deps | List of dependency entries (details below). Only present when dvc import or dvc import-url are used to generate this .dvc file. Typically there is only one (but several can be added manually). |
| wdir | Working directory for the outs and deps paths (relative to the .dvc file's location). It defaults to . (the file's location). |
| md5 | (Only for imports) MD5 hash of the .dvc file itself. |
| meta | (Optional) Arbitrary user metadata can be added manually with this field. Any YAML content is supported. meta contents are ignored by DVC. |

Comments can be entered using the # comment format.

meta fields and # comments are preserved across executions of dvc repro and dvc commit, but they are lost when the file is overwritten by dvc add, dvc move, dvc import, or dvc import-url.
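
To illustrate the root-level fields, here is a minimal sketch of a hand-edited .dvc file that sets wdir. The directory name, the meta key, and the comment text are hypothetical placeholders, and the hash is reused from the example above purely for illustration:

```yaml
# Hypothetical example: the values below are placeholders.
wdir: data             # outs paths are resolved relative to this directory
outs:
  - md5: a304afb96060aad90176268345e10355
    path: data.xml     # resolves to data/data.xml from the .dvc file's location
meta:
  owner: data-team     # arbitrary user metadata, ignored by DVC
```

Because meta and comments are not preserved when the file is regenerated (for example by dvc add), such annotations are best kept to files you maintain by hand.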

Output entries

The following subfields may be present under outs entries:

| Field | Description |
| --- | --- |
| path | (Required) Path to the file or directory (relative to wdir, which defaults to the file's location) |
| md5, etag, checksum | Hash value for the file or directory being tracked with DVC. MD5 is used for most locations (local file system and SSH); ETag for HTTP, S3, or Azure external outputs; and a special checksum for HDFS and WebHDFS. |
| size | Size of the file or directory (sum of all files). |
| nfiles | If this output is a directory, the number of files inside (recursive). |
| isexec | Whether this is an executable file. DVC preserves execute permissions upon dvc checkout and dvc pull. This has no effect on directories, or in general on Windows. |
| cache | Whether or not this file or directory is cached (true by default). See the --no-commit option of dvc add. |
| persist | Whether the output file/dir should remain in place while dvc repro runs (false by default: outputs are deleted when dvc repro starts) |
| desc | (Optional) User description for this output (supported in metrics and plots too). This doesn't affect any DVC operations. |
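
As a rough illustration of these subfields, an outs entry for a tracked directory might look like the sketch below. The hash, size, file count, path, and description are made-up placeholders. The .dir suffix on the hash reflects how DVC commonly marks directory hashes; treat it as an assumption and check the files your DVC version generates:

```yaml
outs:
  - md5: 0123456789abcdef0123456789abcdef.dir   # placeholder hash for a directory
    size: 43442212       # sum of the sizes of all files (placeholder)
    nfiles: 1800         # number of files inside, counted recursively (placeholder)
    path: images         # relative to wdir
    desc: Training images
```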

Dependency entries

The following subfields may be present under deps entries:

| Field | Description |
| --- | --- |
| path | (Required) Path to the dependency (relative to wdir, which defaults to the file's location) |
| md5, etag, checksum | Hash value for the file or directory being tracked with DVC. MD5 is used for most locations (local file system and SSH); ETag for HTTP, S3, or Azure external dependencies; and a special checksum for HDFS and WebHDFS. See dvc import-url for more information. |
| size | Size of the file or directory (sum of all files). |
| nfiles | If this dependency is a directory, the number of files inside (recursive). |
| repo | This entry is only for external dependencies created with dvc import, and can contain url, rev, and rev_lock (detailed below). |
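
For example, a .dvc file produced by dvc import-url from an HTTP location could resemble the following sketch. The URL and all hash values are placeholders, and the exact set of fields written may vary between DVC versions:

```yaml
# Sketch of an import-url .dvc file (placeholder values throughout).
md5: 0123456789abcdef0123456789abcdef      # hash of this .dvc file (imports only)
deps:
  - etag: '"0123456789abcdef0123456789abcdef"'   # ETag, since the source is HTTP
    path: https://example.com/data/data.xml      # hypothetical source URL
outs:
  - md5: a304afb96060aad90176268345e10355        # MD5, since the output is local
    path: data.xml
```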

Dependency repo subfields:

| Field | Description |
| --- | --- |
| url | URL of Git repository with source DVC project |
| rev | Only present when the --rev option of dvc import is used. Specific commit hash, branch or tag name, etc. (a Git revision) used to import the dependency from. |
| rev_lock | Git commit hash of the external DVC repository at the time of importing or updating the dependency (with dvc update) |
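
Putting the repo subfields together, a .dvc file created by dvc import with --rev might look roughly like this. The repository URL, revision name, paths, and hashes are hypothetical placeholders, and real files may contain additional fields:

```yaml
# Sketch of a dvc import .dvc file (placeholder values throughout).
md5: 0123456789abcdef0123456789abcdef      # hash of this .dvc file (imports only)
deps:
  - path: get-started/data.xml             # path inside the source repository
    repo:
      url: https://github.com/example/dataset-registry     # hypothetical repository
      rev: v1.0                                            # present only when --rev was given
      rev_lock: 0123456789abcdef0123456789abcdef01234567   # commit pinned at import or update time
outs:
  - md5: a304afb96060aad90176268345e10355
    path: data.xml
```

Here rev records the revision that was requested, while rev_lock records the exact commit that was imported, so dvc update can move rev_lock forward without changing rev.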