Returns the contents of a tracked file.
This is similar to the
dvc get command in our CLI.
def read(path: str, repo: str = None, rev: str = None, remote: str = None, mode: str = "r", encoding: str = None)
import dvc.api

modelpkl = dvc.api.read(
    'model.pkl',
    repo='https://github.com/iterative/example-get-started',
    mode='rb'
)
This function wraps
dvc.api.open(), for a simple way to return the complete
contents of a file tracked in a DVC project. The file can be
tracked by DVC (as an output) or by Git.
The returned contents can be a string or a bytes object. They are loaded directly into memory (without using any disk space).
The type returned depends on the
mode used. For more details, please refer to
open() built-in, which is used under the hood.
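To illustrate how the mode determines the returned type, here is a minimal sketch that uses only the built-in open() on a temporary local file. It does not call DVC at all; the file name sample.txt is a stand-in chosen for this illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample file, created only for this illustration.
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("hello")

    # Text mode decodes the contents and returns a str.
    with open(sample, mode="r", encoding="utf-8") as f:
        text = f.read()

    # Binary mode returns the raw, undecoded bytes.
    with open(sample, mode="rb") as f:
        raw = f.read()

print(type(text).__name__, type(raw).__name__)
```

The same distinction should carry over to the mode argument described on this page: text modes yield a string, binary modes yield raw bytes.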
path (required) - location and file name of the target to read, relative to the root of the project (
repo - specifies the location of the DVC project. It can be a URL or a file system path. Both HTTP and SSH protocols are supported for online Git repos (e.g. [user@]server:project.git). Default: The current project is used (the current working directory tree is walked up to find it).
rev - Git commit (any revision such as a branch or tag name, commit hash, or experiment name). If repo is not a Git repo, this option is ignored. Default: None (current working tree will be used)
remote - name of the DVC remote to look for the target data. Default: The default remote of repo is used if a remote argument is not given. For local projects, the cache is tried before the default remote.
mode - specifies the mode in which the file is opened. Defaults to "r" (read). Mirrors the namesake parameter in builtin
encoding - codec used to decode the file contents to a string. This should only be used in text mode. Defaults to "utf-8". Mirrors the namesake parameter in builtin
dvc.exceptions.FileMissingError - file in path is missing from
path cannot be found in
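One possible way to handle a missing target is to wrap the call and turn the error into a None result. The sketch below is not part of the DVC API: read_or_none and fake_reader are hypothetical names introduced here, and the reader function is passed in so that the example runs without DVC installed. The commented lines show how it might be combined with dvc.api.read and the FileMissingError listed above.

```python
from typing import Any, Callable, Optional, Tuple, Type


def read_or_none(
    reader: Callable[..., Any],
    path: str,
    missing_errors: Tuple[Type[BaseException], ...],
    **kwargs: Any,
) -> Optional[Any]:
    """Return reader(path, **kwargs), or None if a 'missing' error is raised."""
    try:
        return reader(path, **kwargs)
    except missing_errors:
        return None


# Intended use with DVC (not executed here; needs DVC and network access):
#   import dvc.api
#   from dvc.exceptions import FileMissingError
#   data = read_or_none(
#       dvc.api.read, "model.pkl", (FileMissingError,),
#       repo="https://github.com/iterative/example-get-started", mode="rb",
#   )


# Stand-in reader (hypothetical) so the sketch runs without DVC.
def fake_reader(path, **kwargs):
    if path != "model.pkl":
        raise FileNotFoundError(path)
    return b"contents"


found = read_or_none(fake_reader, "model.pkl", (FileNotFoundError,), mode="rb")
missing = read_or_none(fake_reader, "other.pkl", (FileNotFoundError,), mode="rb")
```

Returning None keeps the calling code simple, but it hides the reason for the failure; re-raising or logging the exception may be preferable when that reason matters.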
Example: Load data from a DVC repository
Any file tracked in a DVC project (and stored remotely) can be loaded directly in your Python code with this API. For example, let's say that you want to load and deserialize a binary model from a repo on GitHub:
import pickle

import dvc.api

data = dvc.api.read(
    'model.pkl',
    repo='https://github.com/iterative/example-get-started',
    mode='rb'
)
model = pickle.loads(data)
'rb' mode here for compatibility with
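As a small illustration of why binary mode matters when unpickling, the sketch below round-trips an object through pickle without involving DVC. The dictionary is a stand-in for a real model, used only for this example.

```python
import pickle

# Stand-in for a trained model (hypothetical data for illustration).
model = {"weights": [0.1, 0.2], "bias": 0.5}

# pickle.dumps() produces bytes, which is what a binary-mode read returns.
data = pickle.dumps(model)
restored = pickle.loads(data)

# pickle.loads() expects a bytes-like object; passing a str raises TypeError.
try:
    pickle.loads("not bytes")
    text_accepted = True
except TypeError:
    text_accepted = False
```

Reading the file in text mode would hand pickle a decoded string (or fail while decoding), so the contents need to arrive as raw bytes.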