December ’20 Heartbeat

Monthly updates are here! Read all about our brand-new video docs, the DVC Udemy course, open jobs with our team, and essential reading about Git-flow with DVC.

  • Elle O'Brien
  • December 18, 2020
  • 3 min read

This holiday season, show your loved ones you care with our new shirt.

News

Welcome to the December Heartbeat! Let's dive in with some news from the team.

We're still hiring

Our search continues for two roles:

  • A Senior Software Engineer for the core DVC team: someone with strong Python development skills who can build and ship essential DVC features.

  • A Developer Advocate to support and inspire developers by creating new content like blogs, tutorials, and videos, and to lead outreach through meetups and conferences.

Does this sound like you or someone you know? Be in touch!

Video docs complete!

As you may have heard last month, we've been working on adding complete video docs to the "Getting Started" section of the DVC site. We now have 100% coverage! We have videos that mirror the tutorials for:

  • Data versioning - how to use Git and DVC together to track different versions of a dataset

  • Data access - how to share models and datasets across projects and environments

  • Pipelines - how to create reproducible pipelines to transform datasets to features to models

  • Experiments - how to do a git diff for models that compares and visualizes metrics

The full playlist is on our YouTube channel, where, by the way, we've recently passed 2,000 subscribers! Thanks so much for your support. There's much more coming up soon.

Collaboration with GitLab

We recently released a new blog post with GitLab, all about using CML with GitLab CI.

You may notice that the tweet spelled our name differently, and since Twitter doesn't have an edit button, I think that means we're "Interative" now. Hurry up and get your merch!


Workshops

We gave a workshop at a virtual meetup held by the Toronto Machine Learning Society, and you can catch the video recording if you missed it. This workshop was all about getting started with GitHub Actions and CML! It starts with a high-level overview and then gets into live-coding.


From the community

There's no shortage of cool things to report from the community:

The DVC Udemy Course

Now you can learn the fundamentals of machine learning engineering, from experiment tracking to data management to continuous integration, with DVC and Udemy! Data scientists/DVC ambassadors Mikhail Rozhkov and Marcel Ribeiro-Dantas created a course full of practical tips and tricks for learners of all levels.

Machine Learning Experiments and Engineering with DVC

Automate machine learning experiments, pipelines and model deployment (CI/CD, MLOps) with Data Version Control (DVC).

A proposal for Git-flow with DVC

Fabian Rabe at Universität Augsburg wrote a killer doc about his team's tried-and-true approach to creating a workflow for a DVC project. He writes,

Over the past couple of months we have started using DVC in our small team. With a handful of developers all coding, training models & committing in the same repository, we soon realized the need for a workflow.

The post outlines three strategies his team adopted:

  1. Create a "debugging dataset" containing a subset of your data, with which you can test your complete DVC pipeline locally on a developer's machine

  2. Use CI-Runners to execute the DVC pipeline on the full dataset

  3. Adopt a naming convention for Git branches that correspond to machine learning experiments, in addition to the usual feature branches

Agree? Disagree? Fabian is actively soliciting feedback on his proposal (and possible solutions for some unresolved issues), so please read and chime in on our discussion board.
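
For a concrete feel, here is a minimal Python sketch of strategies 1 and 3. It is not taken from Fabian's post: the hash-based sampling rule, the `exp/` and `feature/` branch prefixes, and all function names are assumptions made up for illustration, so adapt them to whatever your team agrees on.

```python
import hashlib
import re
from typing import Iterable, List, Optional

# Hypothetical branch prefixes; the post does not prescribe these exact names.
BRANCH_KINDS = {"exp": "experiment", "feature": "feature"}
BRANCH_PATTERN = re.compile(r"^(exp|feature)/([a-z0-9][a-z0-9._-]*)$")


def in_debug_subset(name: str, percent: int = 10) -> bool:
    """Return True for roughly `percent`% of names, chosen deterministically.

    Hashing the name (instead of drawing random numbers) means the same
    files land in the subset on every machine and on every run.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < percent


def debug_subset(names: Iterable[str], percent: int = 10) -> List[str]:
    """Filter a list of file names down to the debugging subset."""
    return [n for n in names if in_debug_subset(n, percent)]


def classify_branch(branch: str) -> Optional[str]:
    """Map a branch name to 'experiment' or 'feature', or None if it
    does not follow the (assumed) naming convention."""
    match = BRANCH_PATTERN.match(branch)
    return BRANCH_KINDS[match.group(1)] if match else None
```

Because the subset is keyed on file names rather than random draws, every developer gets the same debugging dataset without having to share a seed. A check like `classify_branch` could run in CI to decide whether a push should trigger the full pipeline.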

Git Flow for DVC

Fabian Rabe

Channel 9 talks Machine Learning and Python

The AI Show on Channel 9, part of the Microsoft DevRel universe, put out an episode all about ML and scientific computing with Python featuring Tania Allard and Seth Juarez. Their episode includes how DVC can fit in this development toolkit, so check it out!

A nice tweet

We'll end on a tweet we love:

This beautiful diagram, made by Joy Heron in response to a talk by Dr. Larysa Visengeriyeva about MLOps, is a wonderful encapsulation of the many considerations (at many scales) that go into ML engineering. Do you see DVC in there? 🕵️

Thank you for reading, and happy holidays to you! ❄️ 🎁 ☃️
