Optimizer

class lib.training.optimizer.Optimizer(model: Model, config: type[OptConfig], mixed_precision: bool = False, warmup_steps: int = 0)

Bases: object

Object for managing the selected Torch optimizer

Parameters:
  • model (Model) – The model that is to be trained

  • config (type[OptConfig]) – The user configuration options for the optimizer

  • mixed_precision (bool) – True to train using mixed precision. Default: False

  • warmup_steps (int) – The number of steps over which to warm up the learning rate. Default: 0
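
To illustrate what warmup_steps controls, the sketch below ramps a learning rate linearly up to its target value. The linear shape and the helper name warmup_lr are assumptions made for this example only; the schedule the class actually applies is not specified on this page.

```python
def warmup_lr(base_lr: float, step: int, warmup_steps: int) -> float:
    """Hypothetical helper: linearly scale base_lr during the warmup phase."""
    if warmup_steps <= 0 or step >= warmup_steps:
        # No warmup configured, or the warmup phase has already finished.
        return base_lr
    # Steps are 0-indexed, so step 0 already uses a small non-zero rate.
    return base_lr * (step + 1) / warmup_steps


# Learning rates for the first six steps with a four-step warmup.
schedule = [warmup_lr(1e-3, step, warmup_steps=4) for step in range(6)]
```

With warmup_steps=0 (the documented default) the helper returns the base learning rate unchanged at every step.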

Methods Summary

backward(loss)

Perform the optimizer's backward pass

find_learning_rate(trainer, steps, start_lr, ...)

Use the Learning Rate Finder to discover the optimal learning rate

load_state_dict(state_dict)

Load the serialized data from a state dict into this object

set_lr(lr)

Manually assign the optimizer's learning rate with the given value

state_dict()

Serialized data as a dict for relevant options contained in this class

step()

Perform the optimizer step if valid and zero the gradients.

to(device)

Place the optimizer onto the given device

Methods Documentation

backward(loss: Tensor) → None

Perform the optimizer’s backward pass

Parameters:

loss (Tensor) – The loss scalar from the forward pass

Return type:

None

find_learning_rate(trainer: Trainer, steps: int, start_lr: float, end_lr: float, strength: T.Literal['default', 'aggressive', 'extreme'], mode: T.Literal['set', 'graph_and_set', 'graph_and_exit']) → bool

Use the Learning Rate Finder to discover the optimal learning rate

Parameters:
  • trainer (Trainer) – The training loop with the loaded training plugin

  • steps (int) – The number of iterations to run the learning rate finder for

  • start_lr (float) – The learning rate to start scanning from

  • end_lr (float) – The final learning rate to scan until

  • strength (T.Literal['default', 'aggressive', 'extreme']) – How aggressively to set the optimal learning rate

  • mode (T.Literal['set', 'graph_and_set', 'graph_and_exit']) – The mode to run the Learning Rate Finder in

Returns:

True if an optimal learning rate was discovered

Return type:

bool
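
A learning rate finder typically sweeps the rate from start_lr to end_lr over the given number of steps while recording the loss. The sketch below shows one common way to generate such a sweep, using exponential spacing. The helper name lr_sweep and the exponential spacing are assumptions for illustration; how this class spaces its scan, and how strength selects the final value, is not documented here.

```python
def lr_sweep(start_lr: float, end_lr: float, steps: int) -> list[float]:
    """Hypothetical helper: exponentially spaced rates from start_lr to end_lr."""
    if steps < 2:
        # A single step cannot span a range, so only the start value is used.
        return [start_lr]
    # Constant multiplicative factor between consecutive learning rates.
    ratio = (end_lr / start_lr) ** (1.0 / (steps - 1))
    return [start_lr * ratio ** i for i in range(steps)]


# Six rates covering five orders of magnitude, one decade apart.
rates = lr_sweep(start_lr=1e-6, end_lr=1e-1, steps=6)
```

Exponential spacing gives every order of magnitude the same number of iterations, which is why it is a common choice when the useful range of learning rates is unknown.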

load_state_dict(state_dict: dict[str, Any]) → None

Load the serialized data from a state dict into this object

Parameters:

state_dict (dict[str, Any]) – The serialized data to load

Return type:

None

set_lr(lr: float) → None

Manually assign the optimizer’s learning rate with the given value

Parameters:

lr (float) – The learning rate to apply to the optimizer

Return type:

None

state_dict() → dict[str, Any]

Serialized data as a dict for relevant options contained in this class

Returns:

The serialized data for this object, for saving and loading

Return type:

dict[str, Any]
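
The state_dict() and load_state_dict() pair follows the usual save and restore pattern. The sketch below shows the round trip with a hypothetical stand-in class, ToyOptimizer, that only shares the two method names; the keys it stores ("lr", "steps_taken") are invented for the example and are not the keys this class uses.

```python
from typing import Any


class ToyOptimizer:
    """Hypothetical stand-in exposing the same two method names."""

    def __init__(self, lr: float) -> None:
        self.lr = lr
        self.steps_taken = 0

    def state_dict(self) -> dict[str, Any]:
        # Return plain data only, so the result can be stored with a checkpoint.
        return {"lr": self.lr, "steps_taken": self.steps_taken}

    def load_state_dict(self, state_dict: dict[str, Any]) -> None:
        # Overwrite the current settings with the saved ones.
        self.lr = state_dict["lr"]
        self.steps_taken = state_dict["steps_taken"]


# Save from one instance and restore into a freshly constructed one.
saved = ToyOptimizer(lr=5e-5)
saved.steps_taken = 1200
restored = ToyOptimizer(lr=1e-3)
restored.load_state_dict(saved.state_dict())
```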

step() → None

Perform the optimizer step if valid and zero the gradients.

Handles gradient accumulation, scaling for mixed precision and gradient clipping

Return type:

None
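
The interaction between gradient accumulation and gradient clipping can be sketched in plain Python. Everything below is a hypothetical stand-in (ToyStepper, clip_by_global_norm, plain SGD on a list of floats), and it assumes that a "valid" step means the accumulation boundary has been reached. Mixed precision scaling is left out, and unlike the documented method, this step() returns a bool so the example can show when an update happened.

```python
import math


def clip_by_global_norm(grads: list[float], max_norm: float) -> list[float]:
    """Scale gradients so that their L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


class ToyStepper:
    """Hypothetical stand-in: applies an SGD update every `accumulate` calls."""

    def __init__(self, weights: list[float], lr: float,
                 accumulate: int, max_norm: float) -> None:
        self.weights = list(weights)
        self.lr = lr
        self.accumulate = accumulate
        self.max_norm = max_norm
        self.grads = [0.0] * len(weights)
        self.calls = 0

    def backward(self, grads: list[float]) -> None:
        # Add this micro-batch's gradients, averaged over the accumulation window.
        self.grads = [a + g / self.accumulate for a, g in zip(self.grads, grads)]

    def step(self) -> bool:
        self.calls += 1
        if self.calls % self.accumulate != 0:
            return False  # Still accumulating, so no update is applied.
        clipped = clip_by_global_norm(self.grads, self.max_norm)
        self.weights = [w - self.lr * g for w, g in zip(self.weights, clipped)]
        self.grads = [0.0] * len(self.weights)  # Zero the gradients.
        return True


stepper = ToyStepper(weights=[1.0, 1.0], lr=0.1, accumulate=2, max_norm=1.0)
stepper.backward([3.0, 4.0])
first = stepper.step()   # Accumulation window not yet full.
stepper.backward([3.0, 4.0])
second = stepper.step()  # Window full: clip, update, then reset gradients.
```

In the usage above, the averaged gradient [3.0, 4.0] has norm 5 and is clipped to norm 1 before the update is applied.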

to(device: torch.device) → None

Place the optimizer onto the given device

Parameters:

device (torch.device) – The device to place the optimizer onto

Return type:

None
