July '21 Heartbeat

This month you will find:

  • 📈 DVC + Streamlit = ❤️,
  • 🇯🇵 DVC in Japanese,
  • 📖 A new Udacity Course that includes DVC,
  • 🧑🏽‍💻 More and more jobs requiring DVC
  • 🧪 June Meetup on Experiments,
  • 🚀 New team member, a secret code and more!

Jeny De Figueiredo
July 16, 2021 · 8 min read

Welcome to Summer!

It's summer!

From the Community

As usual we have a ton of goodness from the Community! Let's jump in!

Antoine Toubhans' Post Combining Streamlit and DVC!

Antoine Toubhans of Sicara wrote a fantastic and detailed tutorial entitled "How to Build Customizable Web UI for Machine Learning with Streamlit and DVC", which integrates DVC with Streamlit to provide a customizable UI. The tutorial goes through setting up a pipeline, splitting a dataset, training and evaluating a model, tracking changes to data and model, and using dvc metrics and plots, then bridges the gap in visualizations using Streamlit. You won't want to miss this one!

DVC + Streamlit = ♥️! Source link
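
To give a feel for the kind of pipeline steps the tutorial covers, here is a minimal sketch of a seeded dataset split and a metrics file written as JSON, a format that can be declared as a DVC metrics file. It is not code from Antoine's tutorial: the function names `split_dataset`, `accuracy`, and `write_metrics` and the `metrics.json` file name are all assumptions made for illustration.

```python
import json
import random
from pathlib import Path


def split_dataset(rows, test_ratio=0.2, seed=42):
    """Shuffle rows with a fixed seed and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predictions, labels):
    """Return the fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def write_metrics(path, metrics):
    """Write a metrics dictionary to a JSON file (hypothetical file name)."""
    Path(path).write_text(json.dumps(metrics, indent=2))


if __name__ == "__main__":
    train, test = split_dataset(range(100))
    # Stand-in "model": predict that every label is 1.
    labels = [row % 2 for row in test]
    predictions = [1 for _ in test]
    write_metrics("metrics.json", {"accuracy": accuracy(predictions, labels)})
```

Fixing the seed keeps the split identical between runs, so a change in the metric reflects a change in the data or the model rather than a different shuffle.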

DVC and CML in Japanese!

For our friends that speak Japanese, these slides created by Yusuke Shibui walk you through a machine learning to production project using DVC and CML. We love seeing our tools being used all around the world! 🌏

DVC and CML in Japanese! Source link

Miguel Méndez' DVC Tutorial

Miguel Méndez and his team at Gradiant struggled with reproducibility before using DVC for versioning their image dataset and annotations. The dataset and annotations are held in a shared storage space and used by the whole team. DVC enables the team to track changes and know which versions of the dataset produce the best results. His tutorial walks you through the steps to set it up!
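
The idea of pinning down which dataset version produced a result can be illustrated with a short sketch. DVC identifies tracked data by content hashes. The code below is not DVC's implementation and is not from Miguel's tutorial: it simply fingerprints a directory with `hashlib` and compares two snapshots. The choice of SHA-256 and the names `file_digest`, `dataset_manifest`, and `changed_files` are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dataset_manifest(root):
    """Map each file path (relative to root) to its content digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old, new):
    """List paths that were added, removed, or modified between two manifests."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
```

Comparing the manifest saved alongside a training run with the manifest of the current shared folder shows exactly which images or annotation files differ.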

Version Control Your Dataset with DVC

Miguel Méndez' tutorial on using DVC for versioning datasets and providing reproducibility

Jobs requiring DVC!

We have been seeing an uptick in the number of jobs requiring knowledge of DVC. It's exciting to see that our tools are helping these companies in their MLOps workflows! 🎉

Learning Opportunities

With all those DVC job opportunities out there, you better get on it! 😉

A New Udacity Course Incorporating DVC!

Just this month, Udacity released a new nanodegree program entitled Machine Learning DevOps Engineer that teaches DVC as part of the curriculum. The course includes sections on:

  • Clean Code Principles
  • Building a Reproducible Model Workflow
  • Deploying a Scalable ML Pipeline in Production
  • Automated Model Scoring and Monitoring

Machine Learning DevOps Engineer

A new nanodegree program offered by Udacity teaching DVC as part of the curriculum

DVC Learn

This week we kicked off our new DVC Learn Meetup series with Milecia McGregor. This set of three short, half-hour classes is designed to get you up and running with DVC. If you are just getting started with DVC or kicking the tires, this Meetup series is for you! Our next class on August 4th will get you started with experiments.

If you are interested in weighing in on what kinds of educational content you would like to see from us, we'd be grateful if you'd fill out this survey to help us plan! 🙏🏼

DVC Learn - Getting Started: Experiments

The next DVC Learn Meetup, taught by Milecia McGregor, is designed to get you started with DVC Experiments.

Data Science Journal Article on Reproducibility Practices in Research

New research presented in the Data Science Journal aims to provide best practices for providing reproducibility in research datasets. This is necessary to pinpoint the version of the dataset that grounds any research. In this work the authors reviewed 39 use cases from 33 organizations to arrive at six principles for versioning datasets. These include Revision, Release, Granularity, Manifestation, Provenance and Citation. See the full work below. 👇🏼

Versioning Data is About More Than Revisions: A Conceptual Framework and Proposed Principles

The authors analyze 39 use cases in 33 organizations to arrive at proposed principles for versioning data.

June Office Hours Meetup

The June Office Hours Meetup was 🔥! Amazing discussion on experiments ignited by Sami Jawhar of Kernel around experiment use cases and workflows.
You can find the repo for his presentation here and watch all the great DVC discussion below.

DVC News

Summer and vaccinations mean travel! ☀️💉 And that travel has enabled some of our team members to get together! Pictured below are Dmitry Petrov, Alexander Guschin, Max Shmakov, Mikhail Rozhkov, Sergey Kryukov, Mikhail Sveshnikov, and Guro Bokum… But not necessarily in that order.

The first person to guess the correct order of our teammates, starting from the upper right of the picture and moving clockwise, and to reply to the corresponding Heartbeat post on Twitter will win some DVC SWAG! Hint: If you've been wondering why there are random purple letters in this blog post, they're a clue to this cipher. 🧐

Team Meetup in Moscow! (hand signals obscured for our UK friends, because we care! 🤗)

New Team Member

David de la Iglesia Castro is the third teammate joining us from Spain! 🇪🇸 And also the third David! He hails from Galicia and has been an active member of our Community for over two years. We are so excited to have him join the team as a software engineer, where he will work to improve DVC Live. When he's not contributing to DVC, David likes to go climbing, surfing or just hiking whenever he can! Welcome David!

Open Positions

And yes indeed, we are still hiring! Use this link to find details of all the positions including:

  • Senior Front-End Engineer (TypeScript, Node, React)
  • Senior Software Engineer (ML, Dev Tools, Python)
  • Senior Software Engineer (ML, Data Infra, GoLang)
  • Machine Learning Engineer/Field Data Scientist
  • Developer Advocate (ML)
  • Director/VP of Engineering (ML, DevTools)
  • Director/VP of Product (ML, Data Infra, SaaS)
  • Director/VP of Operations/Chief of Staff

Please pass this info on to anyone you know who may fit the bill. We look forward to new team members! 🎉

Next Meetup

Don't miss our Meetup on July 28th at 2:00 pm UTC (7:00 am PDT), where João Santiago of Billie will present "DVThis", a set of utility functions for DVC pipelines using R scripts. Additionally, the project aims to document the usual workflows of a DVC pipeline using these scripts and to create templates for using DVC and R together.

Following Santiago, team member Tapa Dipti Sitaula will give a demo of DVC Studio! Bring your questions; we look forward to seeing you!

DVThis

At the July DVC Office Hours, João Santiago of Billie shows us how to use R with DVC by presenting DVThis, and Tapa Dipti Sitaula shares a demo of DVC Studio.


Do you have any use case questions or need support? Join us in Discord!

Head to the DVC Forum to discuss your ideas and best practices.