

Returns the contents of a tracked file.

This is similar to the dvc get command in our CLI.

def read(path: str,
         repo: str = None,
         rev: str = None,
         remote: str = None,
         mode: str = "r",
         encoding: str = None,
         config: dict = None)


import dvc.api

modelpkl = dvc.api.read(


This function wraps dvc.api.open() to provide a simple way to return the complete contents of a file tracked in a DVC project. The file can be tracked by DVC (as an output) or by Git.

The returned contents can be a string or a bytes object. They are loaded directly into memory (without using any disk space).

The type returned depends on the mode used. For more details, please refer to Python's open() built-in, which is used under the hood.
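Because the return type follows the built-in open(), the difference can be illustrated without DVC. The following is a minimal sketch using only the standard library and a temporary file (the file name and contents are made up for illustration). It shows the convention that mode selects between str and bytes; it does not call dvc.api.read() itself.

```python
import os
import tempfile

# Write a small sample file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w", encoding="utf-8") as fd:
        fd.write("héllo")

    # Text mode ("r"): the file is decoded with `encoding` and a str is returned.
    with open(sample, mode="r", encoding="utf-8") as fd:
        text = fd.read()

    # Binary mode ("rb"): the raw, undecoded bytes are returned.
    with open(sample, mode="rb") as fd:
        raw = fd.read()

print(type(text).__name__, type(raw).__name__)  # prints: str bytes
```

Note that the non-ASCII character takes two bytes in UTF-8, so the two results differ in length as well as in type.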


  • path (required) - location and file name of the target to read, relative to the root of the project (repo).

  • repo - specifies the location of the DVC project. It can be a URL or a file system path. Both HTTP and SSH protocols are supported for online Git repos (e.g. [user@]server:project.git). Default: The current project is used (the current working directory tree is walked up to find it).

  • rev - Git commit (any revision such as a branch or tag name, commit hash, or experiment name). If repo is not a Git repo, this option is ignored. Default: None (current working tree will be used)

  • remote - name of the DVC remote in which to look for the target data. Default: The default remote of repo is used if a remote argument is not given. For local projects, the cache is tried before the default remote.

  • mode - specifies the mode in which the file is opened. Defaults to "r" (read). Mirrors the namesake parameter in builtin open().

  • encoding - codec used to decode the file contents to a string. This should only be used in text mode. Defaults to "utf-8". Mirrors the namesake parameter in builtin open().

  • config - config dictionary to pass to the DVC project. This is merged with the existing project config and can be used to, for example, provide credentials to the remote. See dvc.api.open for examples.


  • dvc.exceptions.FileMissingError - file in path is missing from repo.

  • dvc.exceptions.PathMissingError - path cannot be found in repo.

  • dvc.exceptions.NoRemoteError - no remote is found.
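One possible way to handle the first two exceptions listed above is a small wrapper that falls back to a default value. This is only a sketch: the helper name read_or_default is hypothetical and not part of DVC, and the imports are placed inside the function (an assumption made here so that the module can be loaded even where DVC is not installed).

```python
def read_or_default(path, default=None, **read_kwargs):
    """Return the contents of a tracked file, or `default` if it is missing.

    Hypothetical helper for illustration; extra keyword arguments
    (repo, rev, remote, mode, encoding, config) are passed to dvc.api.read().
    """
    # Imported lazily so that defining this helper does not require DVC.
    import dvc.api
    from dvc.exceptions import FileMissingError, PathMissingError

    try:
        return dvc.api.read(path, **read_kwargs)
    except (FileMissingError, PathMissingError):
        # The path is not present in the repo (or revision) being read.
        return default
```

Other errors, such as a missing remote, are deliberately left to propagate so that configuration problems are not silently hidden.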

Example: Load data from a DVC repository

Any file tracked in a DVC project (and stored remotely) can be loaded directly in your Python code with this API. For example, let's say that you want to load and deserialize a binary model from a repo on GitHub:

import pickle
import dvc.api

data = dvc.api.read(
model = pickle.loads(data)

We're using 'rb' mode here for compatibility with pickle.loads().
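The reason for binary mode can be checked locally. In the sketch below, pickle.dumps() stands in for the bytes that a read in 'rb' mode would return (the dictionary is made-up sample data, not a real model), so the snippet runs without any repository:

```python
import pickle

# Stand-in for bytes obtained by reading a pickled file in "rb" mode.
original = {"weights": [0.1, 0.2], "bias": 0.5}
data = pickle.dumps(original)

# pickle.loads() accepts a bytes-like object ...
model = pickle.loads(data)

# ... and rejects a str, which is what text mode ("r") would produce.
try:
    pickle.loads("not bytes")
    rejected_str = False
except TypeError:
    rejected_str = True
```

Here model equals the original dictionary, while the str input is rejected with a TypeError.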

