letstune

Hyper-parameter tuning for the masses!


Machine learning algorithms have many hyperparameters that must be chosen by the user, such as the number of layers or the learning rate.

Choosing them requires a lot of trial and error.

letstune automatically tries various parameter configurations and gives you back the best model.

How does it differ from GridSearchCV?

letstune gives you a better model in less time than classical hyperparameter tuning algorithms. It works in three steps:

  1. Generate random parameter configurations.
  2. Evaluate each configuration with a small time budget.
  3. Automatically drop low performers; only good performers stay in the pool.

The third step is the distinguishing feature of letstune: other algorithms dutifully train weak models to completion, without a good reason.
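
To make this concrete, here is a minimal sketch of the successive-halving idea in plain Python (illustrative only; letstune's actual scheduler, pool sizes, and budgets differ):

import random

def successive_halving(evaluate, n_candidates=16, rounds=3):
    # 1. Generate random parameter configurations (here a single
    #    log-uniform learning rate; letstune draws them from Params).
    pool = [{"lr": 10 ** random.uniform(-4, -1)} for _ in range(n_candidates)]
    budget = 1
    for _ in range(rounds):
        # 2. Evaluate every candidate with the current (small) budget;
        #    evaluate(params, budget) returns a score, higher is better.
        scored = sorted(pool, key=lambda p: evaluate(p, budget), reverse=True)
        # 3. Keep the better half; low performers are dropped early.
        pool = scored[: max(1, len(scored) // 2)]
        budget *= 2  # survivors earn a larger budget next round
    return pool[0]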

Ergonomics

Common tasks in letstune are Python one-liners:

The best model:

model = tuning[0].best_epoch.checkpoint.load_pickle()

Pandas summary dataframe with parameters and metric values:

df = tuning.to_df()
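
Since to_df returns a regular pandas DataFrame, the usual pandas API applies, for example:

df.head()  # peek at the summary table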

Great! How to use it?

Install with pip:

pip install letstune

First, define your parameters:

import letstune
from letstune import Params, rand
from sklearn.linear_model import SGDClassifier

class SGDClassifierParams(Params):
    model_cls = SGDClassifier

    average: bool
    l1_ratio: float = rand.uniform(0, 1)  # sampled uniformly from [0, 1]
    alpha: float = rand.uniform(1e-2, 1e0, log=True)  # sampled log-uniformly from [0.01, 1.0]
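
During tuning, letstune draws a random instance of these params for every trial; conceptually a trial sees something like the following (values are illustrative, and whether Params supports direct keyword construction like this is an assumption):

params = SGDClassifierParams(average=True, l1_ratio=0.37, alpha=0.05)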

Then define a trainer. A trainer is an object that knows how to train a model!

class DigitsTrainer(letstune.SimpleTrainer):
    params_cls = SGDClassifierParams
    metric = "accuracy"

    def load_dataset(self, dataset):
        self.X_train, self.X_test, self.y_train, self.y_test = dataset

    def train(self, params):
        # params has type SGDClassifierParams

        # Params provides the create_model method,
        # which returns an SGDClassifier
        model = params.create_model(
            loss="hinge",
            penalty="elasticnet",
            fit_intercept=True,
            random_state=42,
        )
        model.fit(self.X_train, self.y_train)

        accuracy = model.score(self.X_test, self.y_test)

        return model, {"accuracy": accuracy}


trainer = DigitsTrainer()  # new instance!

Neural network and gradient boosting trainings can be based on letstune.EpochTrainer, which has a train_epoch method.
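
For illustration, here is a rough shape of an epoch-based trainer for the same digits task. Only train_epoch is confirmed above; the create_model hook name is an assumption inferred from the SimpleTrainer example, so consult the documentation for the exact interface:

import numpy as np

class DigitsEpochTrainer(letstune.EpochTrainer):
    params_cls = SGDClassifierParams
    metric = "accuracy"

    def load_dataset(self, dataset):
        self.X_train, self.X_test, self.y_train, self.y_test = dataset
        self.classes = np.unique(self.y_train)

    def create_model(self, params):  # assumed hook: build the model once per trial
        self.model = params.create_model(loss="hinge")

    def train_epoch(self, epoch):
        # One pass over the training data; partial_fit is standard scikit-learn API.
        self.model.partial_fit(self.X_train, self.y_train, classes=self.classes)
        return {"accuracy": self.model.score(self.X_test, self.y_test)}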

Finally, let's tune!
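
The dataset splits passed below can be prepared with plain scikit-learn, for example:

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)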

tuning = letstune.tune(
    trainer,
    16,  # number of tested random parameters
    dataset=(X_train, X_test, y_train, y_test),
    results_dir="digits_tuning",
)

Our model is ready to use:

model = tuning[0].checkpoint.load_pickle()
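
The loaded object is a fitted SGDClassifier, so the usual scikit-learn API applies:

predictions = model.predict(X_test)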

Don't forget to check out the examples directory! 👀

Documentation is here!

Additionally

Works with your favourite ML library 🐍 - it's library-agnostic!

Resumes work from the point where the program was stopped.

Permissive business-friendly MIT license.

References

A System for Massively Parallel Hyperparameter Tuning by Li et al.; arXiv:1810.05934

Overview of various hyperparameter-tuning algorithms. letstune implements a variant of Successive Halving.

Contributing

Issues are tracked on GitHub.

Changelog

Please see CHANGELOG.md.