Managing Machine Learning Experiments

I run machine learning experiments for a living, averaging about 50 experiments per stage of a project. Each experiment means writing code to train models, picking the right test cases and metrics, finding the right preprocessors, and so on.

So how should I manage these experiments? Here are my criteria:

  1. Compatible with Git: I manage all my code with Git, and I want the experiment manager to keep track of how my code changes over time.
  2. Version control for data: I want to be able to work with multiple versions of my test and training datasets. I need to know whether any two datasets are duplicates of each other, so that I can be sure I am running my tests on the same data.
  3. Model management: When I run experiments and store models, I want each stored model to be associated with an experiment rather than with a particular run. I need metadata attached to the model that tells me how it was created, what data it was trained on, and so on. (This doubles as the experiment metadata.)
  4. Metrics: I want to be able to store the output metrics for each experiment, and to create new metrics by running the same model over different test datasets.
  5. (Optional) Running experiments: I want to be able to run experiments with a single command, whether that is a batch of experiments or a single one. I don’t want to have to worry about containers, Docker, and the logistics around them.
  6. (Optional) Experiment optimization: I have so many variations and tests to try out. If a system could automatically try these variations for me, going beyond tuning the hyper-parameters of an algorithm, I would love to try it.
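Criteria 2 and 3 can be made a little more concrete. The following is only a minimal sketch of one way to do it, not how any of the tools below actually work: it fingerprints a dataset file by hashing its bytes, and bundles that fingerprint with a code version and metrics into a metadata record. The names `dataset_fingerprint` and `experiment_record`, the record layout, and the sample commit hash are all hypothetical choices made for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def dataset_fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def experiment_record(name, code_version, train_path, test_path, metrics):
    """Bundle what is needed to trace a model back to its code and data.

    code_version is supplied by the caller (for example a Git commit hash);
    this sketch does not call Git itself.
    """
    return {
        "experiment": name,
        "code_version": code_version,
        "train_data": dataset_fingerprint(train_path),
        "test_data": dataset_fingerprint(test_path),
        "metrics": metrics,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp, "test_v1.csv")
        copy = Path(tmp, "test_copy.csv")
        changed = Path(tmp, "test_v2.csv")
        original.write_text("x,y\n1,2\n")
        copy.write_text("x,y\n1,2\n")
        changed.write_text("x,y\n1,3\n")

        # Identical bytes give identical fingerprints; any change gives a new one.
        print(dataset_fingerprint(original) == dataset_fingerprint(copy))
        print(dataset_fingerprint(original) == dataset_fingerprint(changed))

        # "abc1234" is a made-up placeholder for a commit hash.
        record = experiment_record(
            "baseline", "abc1234", original, changed, {"accuracy": 0.9}
        )
        print(json.dumps(record, indent=2))
```

A byte-level hash only catches exact duplicates: the same rows in a different order, or the same table saved in a different file format, would get different fingerprints. Whether that is strict enough depends on how the datasets are produced.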

I went on a quest to find a solution to my problems. Along the way, I discovered further criteria that I had not considered when I first started evaluating tools.

Here are some products that I have been looking at:

Product             Pricing
Comet               Paid
neptune.ml          Paid
Tensordash          Paid
Weights and Biases  Free for individuals and academics, Paid
Valohai             Paid
FloydHub            Paid
Verta.ai            Not Launched Yet
SirioML             Not Launched Yet


Below is an impressive list of open-source tools in this space for running, managing, and analyzing experiments.

MLFlow                Free  Yes
DVC                   Free  iterative.ai
Guild.ml              Free  Yes
MLModelScope          Free
Machine Learning Lab  Free  Yes
ModelChimp            Free
Trains                Free  Yes
ModelDB               Free  Yes
Omniboard             Free  Yes
Datmo                 Free  Yes
Randopt               Free  Yes
StudioML              Free  Yes
KubeFlow              Free  Yes
Lore                  Free  Yes
Featureforge          Free  Yes
pachyderm             Free  Yes
PolyAxon              Free  Yes
Runway                Free  paper: http://www.jsntsay.com/publications/tsay-sysml2018.pdf
Sacred                Free  Yes
Sumatra               Free  Yes