Returns the URL to the storage location of a data file or directory tracked in a DVC project.
def get_url(path: str, repo: str = None, rev: str = None, remote: str = None) -> str
    import dvc.api

    resource_url = dvc.api.get_url(
        'get-started/data.xml',
        repo='https://github.com/iterative/dataset-registry'
    )
    # resource_url is now "https://remote.dvc.org/dataset-registry/a3/04afb96060aad90176268345e10355"
Returns the URL string of the storage location (in a DVC remote) where a target file or directory, specified by its path in a DVC project, is stored.
The URL is formed by reading the project's remote configuration and the .dvc file where the given path is found (outs field). The scheme of the returned URL depends on the storage type of the remote (see the Parameters section).
If the target is a directory, the returned URL will end in .dir. Refer to Structure of cache directory and dvc add to learn more about how DVC handles data directories.
This function does not check for the actual existence of the file or directory in the remote storage.
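Because get_url only computes the address, a caller who needs to know whether the data is really there has to check separately. The following is a minimal sketch of one way to do that for HTTP(S) remotes, using only the Python standard library. The helper name url_exists is hypothetical and not part of dvc.api. It also assumes the remote answers HEAD requests, which not every server does.

```python
import urllib.error
import urllib.request


def url_exists(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if a HEAD request to `url` succeeds, False on HTTP 404.

    Hypothetical helper (not part of dvc.api). Only meaningful for
    http:// and https:// URLs; other HTTP errors are re-raised because
    they do not tell us whether the object exists.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


# Possible usage (requires DVC and network access):
#   import dvc.api
#   url = dvc.api.get_url('get-started/data.xml',
#                         repo='https://github.com/iterative/dataset-registry')
#   print(url_exists(url))
```

For other remote types (for example object stores or SSH), the check would need the matching client library instead.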
path (required) - location and file name of the target, relative to the root of the project (repo).
repo - specifies the location of the DVC project. It can be a URL or a file system path. Both HTTP and SSH protocols are supported for online Git repos (e.g. [user@]server:project.git). Default: the current project (found by walking up the directory tree from the current working directory).
rev - Git commit (any revision such as a branch or tag name, commit hash, or experiment name). If repo is not a Git repo, this option is ignored. Default: None (the current working tree will be used).
remote - name of the DVC remote to use to form the returned URL string. Default: the default remote of repo.
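As an illustration of how rev and remote combine, the sketch below resolves the same path at several revisions. The helper urls_by_revision and the revision names in the usage comment are hypothetical. The get_url argument exists only so that the function can be exercised without DVC installed. By default it calls dvc.api.get_url with the parameters described above.

```python
def urls_by_revision(path, repo, revs, remote=None, get_url=None):
    """Map each revision in `revs` to the storage URL of `path` at that revision.

    Hypothetical helper built on dvc.api.get_url; `get_url` can be
    replaced by a stand-in with the same keyword arguments.
    """
    if get_url is None:
        import dvc.api  # imported lazily so the helper can be defined without DVC
        get_url = dvc.api.get_url
    return {
        rev: get_url(path, repo=repo, rev=rev, remote=remote)
        for rev in revs
    }


# Possible usage (requires DVC and network access; the revision names
# below are placeholders, not known to exist in that repository):
#   urls = urls_by_revision(
#       'get-started/data.xml',
#       repo='https://github.com/iterative/dataset-registry',
#       revs=['main', 'some-tag'],
#   )
```

Since the example further down shows the URL being derived from the file hash, two revisions that return the same URL most likely point at identical content.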
Example: Getting the URL to a DVC-tracked file
    import dvc.api

    resource_url = dvc.api.get_url(
        'get-started/data.xml',
        repo='https://github.com/iterative/dataset-registry',
    )
    print(resource_url)
The script above prints:

    https://remote.dvc.org/dataset-registry/a3/04afb96060aad90176268345e10355
This URL represents the location where the data is stored, and is built by reading the corresponding .dvc file (get-started/data.xml.dvc) where the md5 file hash is stored,
    outs:
      - md5: a304afb96060aad90176268345e10355
        path: get-started/data.xml
and the project configuration (.dvc/config) where the remote URL is saved:
    ['remote "storage"']
        url = https://remote.dvc.org/dataset-registry
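To make the relationship between these two pieces and the printed URL concrete, here is a minimal sketch that reproduces the layout seen in this example: the first two characters of the hash become a directory, and the rest becomes the file name. The function storage_url is hypothetical and is not how DVC is implemented. Other DVC versions or remote types may lay out storage differently, so dvc.api.get_url remains the reliable way to obtain the URL.

```python
def storage_url(remote_url: str, md5: str) -> str:
    """Join a remote base URL and a file hash following this example's layout.

    Hypothetical illustration only: <remote>/<first 2 hash chars>/<rest of hash>.
    """
    base = remote_url.rstrip("/")  # avoid a double slash if the base ends in "/"
    return f"{base}/{md5[:2]}/{md5[2:]}"


# Values taken from the .dvc file and .dvc/config shown above.
url = storage_url(
    "https://remote.dvc.org/dataset-registry",
    "a304afb96060aad90176268345e10355",
)
print(url)
# https://remote.dvc.org/dataset-registry/a3/04afb96060aad90176268345e10355
```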