
Amazon S3 and Compatible Servers

Start with dvc remote add to define the remote. Set a name and a valid S3 URL:

$ dvc remote add -d myremote s3://<bucket>/<key>

When you run dvc push (or any other command that needs the remote), DVC will try to authenticate using your AWS CLI config. It reads the default AWS credentials file (if available) or environment variables.

The AWS user needs the following permissions: s3:ListBucket, s3:GetObject, s3:PutObject, s3:DeleteObject.
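The permissions above could be granted with an IAM policy along the following lines. This is a minimal sketch rather than an official DVC policy: the bucket name `mybucket` and the helper `dvc_s3_policy` are hypothetical, and it assumes the usual IAM convention that `s3:ListBucket` is granted on the bucket ARN while the object actions are granted on the objects inside it.

```python
import json


def dvc_s3_policy(bucket: str) -> dict:
    """Build an IAM policy document granting the four permissions listed above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it targets the bucket ARN.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Reading, writing and deleting target the objects in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


policy = dvc_s3_policy("mybucket")  # "mybucket" is a placeholder name
print(json.dumps(policy, indent=2))
```

The printed JSON can be adapted to your own bucket before attaching it to the AWS user or role that DVC runs as.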

To use custom auth or further configure your DVC remote, set any supported config param with dvc remote modify.

Cloud versioning

Cloud versioning requires S3 Versioning to be enabled on the bucket, and the AWS user needs the following additional permissions: s3:ListBucketVersions, s3:GetObjectVersion, s3:DeleteObjectVersion.

$ dvc remote modify myremote version_aware true

version_aware (true or false) enables cloud versioning features for this remote. This lets you explore the bucket files under the same structure you see in your project directory locally.

Custom authentication

Use these configuration options if you don't have the AWS CLI set up in your environment, if you want to override its values, or if you want to change the auth method.

The dvc remote modify --local flag is needed to write sensitive user info to a Git-ignored config file (.dvc/config.local) so that no secrets are leaked through Git. See dvc config.

To use custom AWS CLI config or credential files, or to specify a profile name, use configpath, credentialpath, or profile:

$ dvc remote modify --local myremote \
                    configpath 'path/to/config'
# or
$ dvc remote modify --local myremote \
                    credentialpath 'path/to/credentials'
# and (optional)
$ dvc remote modify myremote profile 'myprofile'
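For reference, the file that credentialpath points to uses the AWS shared credentials format, with one INI section per profile. The sketch below writes such a file to a temporary directory and reads one profile back. The profile name `myprofile` matches the example above; the key values, the file location, and the helper `read_profile` are placeholders for illustration.

```python
import configparser
import tempfile
from pathlib import Path

# Placeholder values; real keys come from your AWS account.
CREDENTIALS = """\
[default]
aws_access_key_id = default-id
aws_secret_access_key = default-secret

[myprofile]
aws_access_key_id = myid
aws_secret_access_key = mysecret
"""


def read_profile(path: Path, profile: str) -> dict:
    """Return the key/value pairs stored under one profile section."""
    parser = configparser.ConfigParser()
    parser.read(path)
    return dict(parser[profile])


with tempfile.TemporaryDirectory() as tmp:
    credentials_file = Path(tmp) / "credentials"
    credentials_file.write_text(CREDENTIALS)
    profile = read_profile(credentials_file, "myprofile")

print(sorted(profile))
```

Checking a file this way before pointing DVC at it can help catch a misspelled profile name early.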

Another option is to use an AWS access key ID (access_key_id) and secret access key (secret_access_key) pair, and if required, an MFA session token (session_token):

$ dvc remote modify --local myremote \
                    access_key_id 'myid'
$ dvc remote modify --local myremote \
                    secret_access_key 'mysecret'
$ dvc remote modify --local myremote \
                    session_token 'mysecret'

S3-compatible servers (non-Amazon)

Set the endpointurl parameter to the URL of the S3-compatible service (e.g. MinIO or IBM Cloud Object Storage). For example, let's set up a DigitalOcean Space (equivalent to a bucket in S3) called mystore in the nyc3 region:

$ dvc remote add -d myremote s3://mystore/path
$ dvc remote modify myremote endpointurl \
                    https://nyc3.digitaloceanspaces.com

Any other S3 parameter can also be set for S3-compatible storage. Whether they're effective depends on each storage platform.

More configuration parameters

See dvc remote modify for more command usage details.

  • url - modify the remote location (see above for details)

  • region - specific AWS region

    $ dvc remote modify myremote region 'us-east-2'

  • read_timeout - time in seconds until a timeout exception is thrown when attempting to read from a connection (60 by default)

  • connect_timeout - time in seconds until a timeout exception is thrown when attempting to make a connection (60 by default)

  • listobjects (true or false) - whether to use the list_objects() S3 API method instead of the default list_objects_v2(). Useful for Ceph and other S3 emulators

  • use_ssl (true or false) - whether to use SSL. Enabled by default.

  • ssl_verify - whether to verify SSL certificates (true or false), or a path to a custom CA certificates bundle to do so (implies true). Any certs found in the AWS CLI config file (ca_bundle) are used by default.

    $ dvc remote modify myremote ssl_verify false
    # or
    $ dvc remote modify myremote \
                        ssl_verify 'path/to/ca_bundle.pem'

  • sse (AES256 or aws:kms) - server-side encryption algorithm to use. None by default.

    $ dvc remote modify myremote sse 'AES256'

  • sse_kms_key_id - encryption key ID (or alias) when using SSE-KMS (see sse)

  • sse_customer_key - key used to encrypt uploaded data when using customer-provided keys (SSE-C) instead of sse. The value should be a base64-encoded 256-bit key.

  • sse_customer_algorithm - algorithm to use with sse_customer_key. AES256 by default

  • acl - object-level access control list (ACL) such as private, public-read, etc. None by default. Cannot be used with the grant_ params below.

    $ dvc remote modify myremote \
                        acl 'bucket-owner-full-control'

  • grant_read - grant READ permissions at object-level ACL to specific grantees. Cannot be used with acl.

    $ dvc remote modify myremote grant_read \
          'id=myuser,id=anotheruser'

  • grant_read_acp - grant READ_ACP permissions at object-level ACL to specific grantees. Cannot be used with acl.

  • grant_write_acp - grant WRITE_ACP permissions at object-level ACL to specific grantees. Cannot be used with acl.

  • grant_full_control - grant FULL_CONTROL permissions at object-level ACL to specific grantees. Cannot be used with acl.
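The sse_customer_key value described above has to be a base64-encoded 256-bit key. One way to produce a value of that shape is sketched below using only the Python standard library; the function name `generate_sse_customer_key` is illustrative and not part of DVC.

```python
import base64
import secrets


def generate_sse_customer_key() -> str:
    """Return a random 256-bit (32-byte) key, base64-encoded as text."""
    raw_key = secrets.token_bytes(32)  # 32 bytes * 8 = 256 bits
    return base64.b64encode(raw_key).decode("ascii")


key = generate_sse_customer_key()
print(len(key))  # length of the base64 text, not of the raw key
```

The resulting string could then be set with dvc remote modify --local so that it stays out of Git like other secrets. With SSE-C the storage service does not keep the key, so it is worth storing it somewhere safe.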

Environment variables

Authentication and other configuration can also be set through boto3 environment variables. DVC falls back to these if no config params are set. Example:

$ dvc remote add -d myremote s3://mybucket
$ export AWS_ACCESS_KEY_ID='myid'
$ export AWS_SECRET_ACCESS_KEY='mysecret'
$ dvc push
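The same environment-variable approach can also be used when DVC is launched from a script. The sketch below builds the environment for a child process without modifying the parent's. The helper names `aws_env` and `dvc_push` are hypothetical, and `dvc_push` is defined but not called here because it needs DVC installed and a configured project.

```python
import os
import subprocess


def aws_env(access_key_id: str, secret_access_key: str) -> dict:
    """Copy the current environment and add AWS credentials to the copy."""
    env = dict(os.environ)
    env["AWS_ACCESS_KEY_ID"] = access_key_id
    env["AWS_SECRET_ACCESS_KEY"] = secret_access_key
    return env


def dvc_push(env: dict) -> None:
    """Run `dvc push` with the given environment; raises if the command fails."""
    subprocess.run(["dvc", "push"], env=env, check=True)


env = aws_env("myid", "mysecret")  # placeholder credentials
# Inside a DVC project with a default remote, call: dvc_push(env)
```

Passing the credentials only to the child process keeps them out of the shell session and out of any DVC config file.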