Here we provide help for some of the problems that DVC users might stumble upon.
Users may encounter errors when running dvc pull and dvc fetch, like WARNING: Cache 'xxxx' not found. or ERROR: failed to pull data from the cloud. The most common cause is changes pushed to Git without the corresponding data being uploaded to the DVC remote. Make sure to dvc push from the original project, and try again.
A known problem some users run into with the dvc fetch and dvc push commands is [Errno 24] Too many open files (most common for S3 remotes on macOS). The more --jobs specified, the more file descriptors need to be open on the host file system for each download thread, and the limit may be reached, causing this error.
To solve this, it's often possible to increase the open file descriptors limit, with ulimit on UNIX-like systems (for example ulimit -n 1024), or by increasing the Handles limit on Windows. Otherwise, please try using a lower --jobs value.
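As a rough illustration of the ulimit approach on UNIX-like systems, the sketch below raises the soft open-file limit for the current shell session without exceeding the hard limit. The helper name raise_nofile_limit and the target of 4096 are assumptions chosen for the example, not values DVC requires.

```shell
# raise_nofile_limit is a hypothetical helper; 4096 below is an arbitrary target.
# It raises the soft open-file limit for this shell, capped by the hard limit.
raise_nofile_limit() {
    target="$1"
    soft="$(ulimit -S -n)"
    hard="$(ulimit -H -n)"
    # Nothing to do if the soft limit is already unlimited or high enough.
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$target" ]; then
        return 0
    fi
    # Never ask for more than the hard limit allows.
    if [ "$hard" != "unlimited" ] && [ "$hard" -lt "$target" ]; then
        target="$hard"
    fi
    ulimit -S -n "$target"
}

raise_nofile_limit 4096
echo "open file limit is now: $(ulimit -S -n)"

# If the limit still cannot be raised far enough, retry with fewer jobs, e.g.:
#   dvc pull --jobs 2
```

The change only applies to the current shell and the processes started from it, so the DVC command has to be run from the same session.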
Unable to detect supported link types, as the cache directory doesn't exist. It is usually created automatically by DVC commands that need it, but you can create it manually (e.g. mkdir .dvc/cache) to enable this check.
You may encounter an error message saying Unable to acquire lock if you have another DVC process running in the project. If that is not the case, it usually means that DVC was terminated abruptly and manually removing the lock file in .dvc/tmp/lock should resolve the issue.
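The manual cleanup can be sketched as a small shell function. remove_stale_lock is a hypothetical helper name, and it assumes you have already confirmed that no other DVC process is using the project.

```shell
# remove_stale_lock is a hypothetical helper, not a DVC command.
# Only use it after checking that no other DVC process is running in the project.
remove_stale_lock() {
    # Project root defaults to the current directory.
    lock_file="${1:-.}/.dvc/tmp/lock"
    if [ -e "$lock_file" ]; then
        rm -f -- "$lock_file"
        echo "removed stale lock: $lock_file"
    else
        echo "no lock file at: $lock_file"
    fi
}
```

Call it from the project root as remove_stale_lock, or pass the project path as the first argument.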
If the issue persists, you may be running DVC on a network filesystem such as NFS or Lustre. In that case, enable core.hardlink_lock by running the following command:
$ dvc config core.hardlink_lock true
You may encounter this error if DVC cannot find a valid file link type to use when linking data files from cache into your workspace. To resolve the issue, you may need to reconfigure DVC to use alternative link types which are supported on your machine.
After reconfiguring cache types, you can re-link data files in your workspace using:
$ dvc checkout --relink
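The two steps can be combined in a short sketch. It assumes the cache.type config option is the setting being changed; set_link_types is a hypothetical helper name, and the hardlink,symlink value in the comment is only an example of a link type list.

```shell
# set_link_types is a hypothetical helper, not a DVC command.
# Argument: a comma-separated list of link types, e.g. "hardlink,symlink".
set_link_types() {
    # Store the preferred link types in the project config, then
    # re-link the workspace files only if the config change succeeded.
    dvc config cache.type "$1" &&
        dvc checkout --relink
}
```

For example, set_link_types "hardlink,symlink" would prefer hardlinks and fall back to symlinks; pick the types your file system actually supports.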
DVC does not currently support authentication with Git credentials. This means that unless the Git server allows unauthenticated HTTP write/read, you should use an SSH Git URL for Git remotes used for listing, pulling or pushing experiments.
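As an illustration of switching a Git remote to SSH, the sketch below rewrites an HTTP(S) URL into the scp-like SSH form. https_to_ssh_url is a hypothetical helper, and the git@ user is an assumption that matches common hosts such as GitHub and GitLab; check what your Git server expects.

```shell
# https_to_ssh_url is a hypothetical helper, not part of Git or DVC.
# It turns https://host/path into git@host:path (the "git" user is an assumption).
https_to_ssh_url() {
    printf '%s\n' "$1" | sed -E 's#^https?://([^/]+)/(.+)$#git@\1:\2#'
}

# Example with a made-up repository URL:
https_to_ssh_url "https://github.com/example/repo.git"

# To apply it to an existing remote named origin:
#   git remote set-url origin "$(https_to_ssh_url "$(git remote get-url origin)")"
```

URLs that do not start with http:// or https:// are printed unchanged, so running the helper on a remote that already uses SSH is harmless.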
You may encounter this error when using DVC on different Python versions with the same DVC project directory, for example having created the project on Python 3.8 in one environment and later attempting to update it from a Python 3.7 environment. This is due to temporary internal directories that can be incompatible with older Python versions once created.
In these rare situations, it is safe to remove the corresponding tmp directory and retry the DVC command. Specifically, one of:
This often occurs in transient remote environments such as Continuous Integration (CI) jobs, which use shallow clones by default. In those cases, change their configuration to avoid shallow cloning. Common examples:
GitHub Actions: set fetch-depth to 0 in the actions/checkout step:
- uses: actions/checkout@v3
  with:
    fetch-depth: 0
See the GitHub Actions docs for more information.
GitLab CI/CD: set the GIT_DEPTH env var to 0:
variables:
  GIT_DEPTH: '0'
See the GitLab CI/CD docs for more information.
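Where the CI configuration cannot be changed, one possible workaround (an assumption on our part, not something the settings above require) is to fetch the missing history inside the job. The sketch uses git rev-parse --is-shallow-repository, which needs Git 2.15 or later; unshallow_if_needed is a hypothetical helper name.

```shell
# unshallow_if_needed is a hypothetical helper, not a Git or DVC command.
# It fetches the full history only when the current clone is shallow.
unshallow_if_needed() {
    if [ "$(git rev-parse --is-shallow-repository)" = "true" ]; then
        git fetch --unshallow
    fi
}
```

Run unshallow_if_needed from inside the repository before the DVC commands that need the full Git history.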