Download a file or directory from a supported URL (for example `ssh://`; see the full list of supported protocols below) into the local file system.
See `dvc get` to download data/model files or directories from other DVC repositories (e.g. hosted on GitHub).
```
usage: dvc get-url [-h] [-q | -v] [-j <number>] url [out]

positional arguments:
  url         (See supported URLs in the description.)
  out         Destination path to put files in.
```
In some cases it's convenient to get a file or directory from a remote location into the local file system. The `dvc get-url` command helps the user do just that.
The `url` argument should provide the location of the data to be downloaded, while `out` can be used to specify the directory and/or file name desired for the downloaded data. If an existing directory is specified, the file or directory will be placed inside it.
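The destination rule above can be sketched in Python. This is only an illustration of the documented behavior, not DVC's actual implementation, and `resolve_out` is a hypothetical helper name:

```python
import os

def resolve_out(url, out=None):
    """Illustrative mirror of the documented destination rule (not DVC code)."""
    name = url.rstrip("/").rsplit("/", 1)[-1]  # last component of the URL
    if out is None:
        return name                     # no `out`: use the URL's base name
    if os.path.isdir(out):
        return os.path.join(out, name)  # existing directory: place inside it
    return out                          # otherwise `out` is the exact target path

print(resolve_out("https://example.com/path/to/data.csv"))  # -> data.csv
```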
See `dvc list-url` for a way to browse the external location for files and directories to download.
DVC supports several types of (local or) remote data sources (protocols):

- Microsoft Azure Blob Storage
- Google Cloud Storage
- HDFS to file*
- HTTP to file*
- WebDAV to file*
- HDFS REST API (WebHDFS) to file*
If you installed DVC via `pip` and plan to use cloud services as remote storage, you might need to install these optional dependencies, for example `[ssh]`. Alternatively, use `[all]` to include them all. The command should look like this: `pip install "dvc[s3]"`. (This example installs the `boto3` library along with DVC to support S3 storage.)
* Notes on remote locations:
- HDFS, HTTP, WebDAV, and WebHDFS do not support downloading entire directories, only single files.
For comparison, downloading a single file over plain HTTP (the URL is a placeholder):

```
$ wget https://example.com/path/to/data.csv
```
- `-j <number>`, `--jobs <number>` - parallelism level for DVC to download data from the source. The default value is `4 * cpu_count()`. Using more jobs may speed up the operation.
- `-h`, `--help` - prints the usage/help message, and exits.
- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no problems arise, otherwise 1.
- `-v`, `--verbose` - displays detailed tracing information.
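The default `--jobs` value described above can be computed as a quick sanity check. A minimal sketch; note that `os.cpu_count()` may return `None` in restricted environments, so this falls back to 1:

```python
import os

# Default parallelism for `dvc get-url`, per the option description above:
# 4 * cpu_count(), falling back to a single CPU if the count is unknown.
default_jobs = 4 * (os.cpu_count() or 1)
print(default_jobs)
```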