We have renamed Views to Projects in Iterative Studio.
Accordingly, the Views dashboard is now called the Projects dashboard, View settings are now called Project settings, and so on.
You can change your hyperparameters or select a different dataset and re-run your model training using Iterative Studio.
Iterative Studio uses your regular CI/CD setup (e.g. GitHub Actions) to run the experiments. This means that to enable experimentation from Iterative Studio, you should do the following:
First, integrate your Git repository with a CI/CD setup that includes a model training process. You can use the wizard provided by Iterative Studio to automatically generate the CI script, or you can write it yourself.
Then, set up the environment variables of your YAML workflow as secrets. This is needed so that your CI workflow can launch the runner in your desired cloud provider.
Now, you can submit your experiments from Iterative Studio. Each submission will invoke your CI/CD setup, triggering the model training process.
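As a rough sketch, a CI workflow that launches a cloud runner from secrets might look like the following. The workflow name, secret names, and cloud settings here are illustrative assumptions, not values generated by Iterative Studio; adapt them to your setup.

```yaml
# .github/workflows/train.yml -- illustrative sketch only
name: train
on: [push]
jobs:
  launch-runner:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: iterative/setup-cml@v1
      - name: Launch a self-hosted runner in the cloud
        env:
          # Stored as repository secrets so the workflow can
          # provision instances in your cloud account
          REPO_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          cml runner launch \
            --cloud=aws \
            --cloud-region=us-west \
            --cloud-type=m \
            --labels=cml-runner
```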
Due to access restrictions, you cannot run experiments on the demo project (`example-get-started`) that is provided to you by default. Once you connect your own ML project repositories, you can follow the instructions given below to run experiments directly from Iterative Studio.
Select a commit and click Run. You will see a message that invites you to set up your CI.
The CI setup wizard has two sections, pre-filled with default values:
- A left section with two sets of parameters.
- A right section which displays the generated YAML to be used in your CI setup. It reflects all your input parameters. Use the `Copy to clipboard and paste in your CI Workflow file` link to copy the generated YAML and create your CI script.
That's it! At this point, you should have CML in place within your CI/CD setup to run your experiments. Next, proceed to submit your experiments as described below.
This step is responsible for launching a self-hosted runner within your cloud vendor. The parameters listed here are a subset of the parameters for CML self-hosted runners.
|Whether you want to launch a spot cloud instance, cutting down the costs of your training.|
|Your cloud provider.|
|Cloud-vendor specific region or a CML synthetic region (an abstraction across all the cloud vendors).|
|Cloud-vendor specific instance type or a CML synthetic type.|
|Hard disk size in GB. We highly recommend entering a sufficiently large value (e.g., 100) to avoid unexpected runner termination due to hard disk exhaustion.|
|Values for the CML flags.|
|Text labels to distinguish your CML runners from other self-hosted runners that you might have.|
|This is the script that your runner executes for your job, which commonly includes training your model. The default template is a common combination of CML and DVC commands, since DVC enables you to make the most of Iterative Studio. You can update this script to reflect your exact model training process, whether you use DVC or not.|
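For illustration, a training job running on the launched runner might combine CML and DVC roughly as follows. The job name, runner labels, and pipeline commands are assumptions, not the wizard's actual default template; adapt them to your project.

```yaml
  train:
    needs: launch-runner
    runs-on: [self-hosted, cml-runner]   # matches the labels given to the runner
    steps:
      - uses: actions/checkout@v3
      - uses: iterative/setup-dvc@v1
      - name: Train model
        run: |
          dvc pull    # fetch tracked data from remote storage
          dvc repro   # reproduce the training pipeline
          dvc push    # upload updated outputs and metrics
```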
Watch this video for an overview of how you can run experiments from Iterative Studio, or read below for details.
Note that we have renamed DVC Studio to Iterative Studio and Views to Projects.
To run experiments from Iterative Studio, first you need to determine the Git
commit (experiment) on which you want to iterate. Select the commit that you
want to use and click the
Run button. A form will let you specify all the
changes that you want to make to your experiment. On this form, there are two
types of inputs that you can change:
- Input data files: For example, in the `example-get-started` ML project, you can change the `data.xml` file. Iterative Studio identifies all the files used in your ML project, which means that if you select the `Show all input parameters (including hidden)` option, then you can also change hidden files such as the `model.pkl` model file and the `scores.json` metrics file. You can also choose not to change any input data files if you only wish to change the values of one or more hyperparameters.
- Hyperparameters: For example, in the `example-get-started` ML project, you can change `max_features` (the maximum number of features that the model uses), `ngrams`, etc. You can also choose not to change any hyperparameters if you only wish to change one or more input data files.
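For context, hyperparameters like these live in the project's `params.yaml` file; a rough excerpt might look like the following (the values shown are illustrative and may differ from the actual project):

```yaml
# params.yaml (illustrative excerpt)
featurize:
  max_features: 100   # maximum number of features the model uses
  ngrams: 1           # size of n-grams used during featurization
```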
The default values of the input data files and hyperparameters in this form are extracted from your selected commit.
Once you have made all the required changes, enter your Git commit message and description.
If your CI job creates a new Git commit to write the experiment results to your
Git repository, you may want to hide the Git commit that you created when
submitting the experiment from your project table. In this case, add
[skip studio] in the commit message. For details, refer to Display
preferences -> Hide commits.
Then, select the branch to commit to. You can commit to either the base branch
or a new branch. If you commit to a new branch, a Git pull request will
automatically be created from the new branch to the base branch. Now, submit your changes.
At this point, the new experiment appears in the project's experiment table. If you just committed to a new branch, then a new pull request will also have been created from the new branch to the base branch.
If your ML project is integrated with a CI/CD setup (e.g. GitHub Actions), the CI/CD setup will get invoked. If this setup includes a model training process, it will be triggered, which means that your ML experiment will run automatically. The model training can happen on any cloud or Kubernetes. For more details on how to set up CI/CD pipelines for your ML project, refer to CML. You can also create CML reports with metrics, plots or other details at the end of each experiment run.
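As a sketch, a CI step that produces such a CML report might look like the following. The step name, base branch, and report contents are assumptions; include whatever metrics or plots matter for your project.

```yaml
      - name: Create CML report
        env:
          REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          # Compare metrics against the base branch and post them as a comment
          dvc metrics diff main --md >> report.md
          cml comment create report.md
```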
Once the experiment completes, its metrics will be available in the project's
experiment table. You can then generate plots and trend charts for it, or
compare it with the other experiments. If a CML report has been defined in your
CI/CD flow, you can access the report by clicking on the CML report icon next to
the Git commit message in the table. The `CML Report` tooltip appears over the CML report icon on mouse hover.