I run machine-learning experiments for a living, averaging about 50 experiments per stage of a project. For each experiment I write code for training models, identifying the right test cases and metrics, finding the right preprocessors - the list goes on.
So how do I manage all these experiments? Here are my criteria:
- Compatible with Git: I manage all my code with Git, and I want the experiment manager to keep track of how my code changes over time.
- Version control for data: I want to work with multiple versions of my test and training datasets. I need to know whether any two datasets are duplicates of each other, so that I am always running my tests on the same data.
- Model management: When I run experiments and store models, I want each model associated with an experiment rather than a particular run. Each model needs metadata describing how it was created, what data it was trained on, and so on. (This is also the experiment metadata.)
- Metrics: I want to store the output metrics for each experiment, and create new metrics by running the same model over different test datasets.
- (Optional) Running experiments: I want to run experiments with a single command - whether a batch of experiments or a single one - without having to worry about containers, Docker, and the surrounding logistics.
- (Optional) Experiment optimization: I have many variations and tests to try out. If a system could automatically try these variations for me, going beyond tuning an algorithm's hyper-parameters, I would love to use it.
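As a minimal sketch of the second and third criteria above (the function names and JSON layout here are my own illustration, not taken from any of the tools below), a content hash of a dataset file detects duplicates, and a JSON sidecar next to the model records the experiment metadata:

```python
import hashlib
import json
from pathlib import Path


def dataset_fingerprint(path, chunk_size=1 << 20):
    """SHA-256 digest of a file's bytes; equal digests mean duplicate datasets."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_experiment_metadata(model_path, train_data_path, params, metrics):
    """Write a JSON sidecar next to the model recording how it was created."""
    meta = {
        "model": str(model_path),
        "train_data_sha256": dataset_fingerprint(train_data_path),
        "params": params,    # e.g. hyper-parameters used for this experiment
        "metrics": metrics,  # e.g. test-set scores
    }
    sidecar = Path(str(model_path) + ".meta.json")
    sidecar.write_text(json.dumps(meta, indent=2))
    return sidecar
```

Most of the tools below implement some richer version of exactly this: content-addressed data plus structured run metadata.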
I went on a quest to find a solution to my problems, and along the way I discovered some criteria I had not previously considered when evaluating tools.
Here are some products that I have been looking at:
| Product | Pricing | Notes |
|---|---|---|
| Comet | Paid | |
| neptune.ml | Paid | |
| Tensordash | Paid | |
| Weights and Biases | Free for individuals and academics; paid otherwise | |
| Valohai | Paid | |
| FloydHub | Paid | |
| Verta.ai | Not Launched Yet | |
| SirioML | Not Launched Yet | |
Below is an impressive list of open-source tools in this space for running, managing, and analyzing experiments.
| Product | Pricing | Notes |
|---|---|---|
| MLFlow | Free | Yes |
| DVC | Free | iterative.ai |
| Guild.ml | Free | Yes |
| MLModelScope | Free | |
| Machine Learning Lab | Free | Yes |
| ModelChimp | Free | |
| Trains | Free | Yes |
| ModelDB | Free | Yes |
| Omniboard | Free | Yes |
| Datmo | Free | Yes |
| Randopt | Free | Yes |
| StudioML | Free | Yes |
| KubeFlow | Free | Yes |
| Lore | Free | Yes |
| Featureforge | Free | Yes |
| pachyderm | Free | Yes |
| PolyAxon | Free | Yes |
| Runway | Free | paper: http://www.jsntsay.com/publications/tsay-sysml2018.pdf |
| Sacred | Free | Yes |
| Sumatra | Free | Yes |