WarmupScheduler

class lib.training.lr_warmup.WarmupScheduler(optimizer: Optimizer, steps: int, last_epoch: int = -1)

Bases: LRScheduler

Updates the model’s learning rate during learning rate warmup.

Parameters:
  • optimizer (Optimizer) – The torch optimizer in use

  • steps (int) – The number of iterations over which to warm up the learning rate

  • last_epoch (int) – The last step that was run (last_epoch is a misnomer inherited from PyTorch and actually refers to steps in our use case). Default: -1 (not yet started)

Methods Summary

get_last_lr()

Get the most recent learning rates computed by this scheduler.

get_lr()

Get the learning rate for the current step.

load_state_dict(state_dict)

Load the scheduler's state.

state_dict()

Return the state of the scheduler as a dict.

step([epoch])

Update the model's learning rate if an update is required; otherwise do nothing.

Methods Documentation

get_last_lr() → list[float | Tensor]

Get the most recent learning rates computed by this scheduler.

Returns:

A list of learning rates with entries for each of the optimizer’s param_groups, with the same types as their group["lr"]s.

Return type:

list[float | Tensor]

Note

The returned Tensors are copies, and never alias the optimizer’s group["lr"]s.

get_lr() → list[float | Tensor]

Get the learning rate for the current step.

Returns:

The learning rate to apply to each parameter group at the next step.

Return type:

list[float | Tensor]
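
This page does not state the shape of the warmup curve. As a minimal sketch, assuming a linear ramp up to each group's base learning rate over `steps` iterations, the per-group computation might look like the following. The function name `linear_warmup_lrs` and its arguments are hypothetical and are not part of `lib.training.lr_warmup`; the real scheduler may use a different curve.

```python
def linear_warmup_lrs(base_lrs, step, warmup_steps):
    """Return one learning rate per parameter group for the given step.

    Assumption: linear warmup. WarmupScheduler itself may behave differently.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        # Warmup finished (or disabled): use the base learning rates unchanged.
        return list(base_lrs)
    # The scale grows linearly from 1/warmup_steps at step 0 up to 1.0.
    scale = (step + 1) / warmup_steps
    return [lr * scale for lr in base_lrs]


# Two parameter groups, warmed up over 4 steps, evaluated for 6 steps.
schedule = [linear_warmup_lrs([0.1, 0.01], s, 4) for s in range(6)]
```

Under this assumption the list always has one entry per parameter group, and every step at or beyond `warmup_steps` returns the base learning rates unchanged.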

load_state_dict(state_dict: dict[str, Any]) → None

Load the scheduler’s state.

Parameters:

state_dict (dict) – scheduler state. Should be an object returned from a call to state_dict().

Return type:

None

state_dict() → dict[str, Any]

Return the state of the scheduler as a dict.

It contains an entry for every variable in self.__dict__ which is not the optimizer.

Return type:

dict[str, Any]
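
The save and restore behaviour described above can be sketched without PyTorch. The class `SchedulerStateSketch` below is a hypothetical stand-in, not the real `WarmupScheduler`; it only illustrates a state dict that holds every attribute except the optimizer, and a restore that copies those attributes back.

```python
class SchedulerStateSketch:
    """Hypothetical stand-in that mimics only the state handling."""

    def __init__(self, optimizer, steps, last_epoch=-1):
        self.optimizer = optimizer
        self.steps = steps
        self.last_epoch = last_epoch

    def state_dict(self):
        # Every entry of self.__dict__ except the optimizer reference.
        return {k: v for k, v in self.__dict__.items() if k != "optimizer"}

    def load_state_dict(self, state_dict):
        # Overwrite the current attributes with the saved values.
        self.__dict__.update(state_dict)


# Save the state part-way through warmup, then restore it into a fresh object.
saved = SchedulerStateSketch(optimizer=object(), steps=100, last_epoch=41).state_dict()
restored = SchedulerStateSketch(optimizer=object(), steps=100)
restored.load_state_dict(saved)
```

Because the optimizer is excluded, its state would need to be checkpointed separately, typically through the optimizer's own `state_dict()`.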

step(epoch=None) → None

Update the model’s learning rate if an update is required; otherwise do nothing.

Parameters:

epoch – Deprecated argument from PyTorch that should always be None. Default: None

Return type:

None
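
The update-or-skip behaviour of `step()` can be illustrated with a small stand-in. This sketch assumes that an update is "required" only while the step counter is below `steps`, and that the ramp is linear; neither detail is confirmed by this page. `WarmupStepSketch` and its `updates` counter are hypothetical names used for illustration only.

```python
class WarmupStepSketch:
    """Hypothetical stand-in showing only the update-or-skip decision."""

    def __init__(self, param_groups, steps):
        self.param_groups = param_groups
        self.base_lrs = [group["lr"] for group in param_groups]
        self.steps = steps
        self.last_epoch = -1  # -1 means "not yet started"
        self.updates = 0      # counts how often the LR was actually written
        self.step()           # apply the learning rate for step 0

    def step(self, epoch=None):
        self.last_epoch += 1
        if self.last_epoch >= self.steps:
            return  # warmup is over: leave the learning rate untouched
        scale = (self.last_epoch + 1) / self.steps  # assumed linear ramp
        for group, base_lr in zip(self.param_groups, self.base_lrs):
            group["lr"] = base_lr * scale
        self.updates += 1


groups = [{"lr": 0.5}]
sched = WarmupStepSketch(groups, steps=2)
lr_after_init = groups[0]["lr"]    # half of the base learning rate
sched.step()
lr_after_warmup = groups[0]["lr"]  # full base learning rate
groups[0]["lr"] = 0.3              # e.g. changed by something else after warmup
sched.step()
lr_after_noop = groups[0]["lr"]    # unchanged by the scheduler
```

In this sketch, calling `step()` after warmup has finished is safe: it advances the counter but never overwrites a learning rate set elsewhere.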

steps

The total number of steps over which to warm up the learning rate.